CN-122023585-A - High-resolution remote sensing image generation method for space-based remote sensing model training

Abstract

The application relates to the technical field of artificial intelligence and discloses a high-resolution remote sensing image generation method, device, equipment and medium for space-based remote sensing model training. The method comprises: dividing an image generation text prompt input by a user into at least one sub-prompt; generating a global noisy image matched with the image generation text prompt, generating local image blocks matched with each sub-prompt, and dividing the global noisy image according to the number of the local image blocks to obtain noisy image blocks; performing alignment optimization on the local image blocks and the noisy image blocks to obtain target local image blocks to be spliced; and splicing all the target local image blocks to obtain a high-resolution remote sensing image. The technical solution provided by the application addresses the problems of insufficient spatial consistency and insufficient high-resolution generation quality in remote sensing image generation.

Inventors

LI CHAO
YAO KELU
XUE JUNXIAO
LV CHENGHUI
XU NUO

Assignees

Zhejiang Lab (之江实验室)

Dates

Publication Date: 20260512
Application Date: 20260206

Claims (10)

1. A method for generating a high-resolution remote sensing image for space-based remote sensing model training, the method comprising: dividing an image generation text prompt input by a user into at least one sub-prompt; generating a global noisy image matched with the image generation text prompt, generating local image blocks matched with each sub-prompt, and dividing the global noisy image according to the number of the local image blocks to obtain noisy image blocks; performing alignment optimization on the local image blocks and the noisy image blocks to obtain target local image blocks to be spliced; and splicing all the target local image blocks to obtain a high-resolution remote sensing image.
2. The method of claim 1, wherein dividing the image generation text prompt input by the user into at least one sub-prompt comprises: performing semantic analysis on the image generation text prompt using a large language model to obtain the at least one sub-prompt, wherein each sub-prompt corresponds to one image block in the high-resolution remote sensing image to be generated.
3. The method of claim 1, wherein performing alignment optimization on the local image blocks and the noisy image blocks to obtain the target local image blocks to be spliced comprises: spatially aligning each local image block with the corresponding noisy image block, and optimizing the local image block based on a similarity between the aligned local image block and the noisy image block to obtain an optimized local image block; and optimizing pixels in the boundary overlapping region of any adjacent optimized local image blocks to obtain the target local image blocks to be spliced.
4. The method of claim 3, wherein optimizing the local image block based on the similarity between the aligned local image block and the noisy image block to obtain the optimized local image block comprises: determining pixel pairs between the aligned local image block and the noisy image block; determining, for each pixel pair, a similarity of the pixel pair; and iteratively updating pixels of the local image block with maximizing the similarity as the objective function to obtain the optimized local image block.
5. The method of claim 3, wherein optimizing pixels in the boundary overlapping region of any adjacent optimized local image blocks to obtain the target local image blocks to be spliced comprises: extracting the boundary overlapping region of any adjacent optimized local image blocks; determining boundary pixel difference values of the adjacent optimized local image blocks in the boundary overlapping region; and iteratively updating boundary pixels of the optimized local image blocks with minimizing the boundary pixel difference values as the objective function to obtain the target local image blocks to be spliced.
6. The method of claim 1, wherein splicing all the target local image blocks to obtain the high-resolution remote sensing image comprises: splicing all the target local image blocks to obtain a spliced image; and decoding the spliced image to obtain the high-resolution remote sensing image.
7. The method of claim 1, wherein generating the global noisy image matched with the image generation text prompt and generating the local image blocks matched with each sub-prompt are performed by a trained two-channel diffusion model, the two-channel diffusion model being trained as follows: acquiring image generation text prompt training samples and high-resolution remote sensing image samples; and training the two-channel diffusion model using the image generation text prompt training samples and the high-resolution remote sensing image samples.
8. A high-resolution remote sensing image generation apparatus for space-based remote sensing model training, the apparatus comprising: a sub-prompt generation unit, configured to divide an image generation text prompt input by a user into at least one sub-prompt; an image block generation unit, configured to generate a global noisy image matched with the image generation text prompt, generate local image blocks matched with each sub-prompt, and divide the global noisy image according to the number of the local image blocks to obtain noisy image blocks; an alignment optimization unit, configured to perform alignment optimization on the local image blocks and the noisy image blocks to obtain target local image blocks to be spliced; and a remote sensing image obtaining unit, configured to splice all the target local image blocks to obtain a high-resolution remote sensing image.
9. A computer device, comprising: a memory and a processor communicatively coupled to each other, the memory having stored therein computer instructions that, when executed by the processor, perform the high-resolution remote sensing image generation method for space-based remote sensing model training of any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the high-resolution remote sensing image generation method for space-based remote sensing model training of any one of claims 1 to 7.
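
The two iterative updates recited in claims 4 and 5 can be pictured with a small sketch. The claims do not specify the similarity measure, the boundary difference measure, or the update rule, so everything concrete below is an assumption made for illustration: similarity is taken as the negative squared difference of each pixel pair, the boundary difference as the squared difference over the overlapping columns of two horizontally adjacent blocks, and the update as plain gradient steps. The names `align_to_reference`, `smooth_boundary`, `step` and `iters` are hypothetical, and blocks are plain nested lists so the example has no dependencies.

```python
from typing import List, Tuple

Block = List[List[float]]  # one single-channel image block, indexed [row][col]


def align_to_reference(local: Block, noisy: Block,
                       step: float = 0.25, iters: int = 10) -> Block:
    """Iteratively move each local pixel toward its paired noisy-block pixel.

    Assumed similarity per pixel pair (p, q): -(p - q)**2.
    Its gradient with respect to p is 2 * (q - p), so each iteration is
    one gradient-ascent step on the summed similarity.
    """
    out = [row[:] for row in local]  # copy so the input block is not modified
    for _ in range(iters):
        for y, row in enumerate(out):
            for x, p in enumerate(row):
                q = noisy[y][x]  # pixel pair at the same aligned position
                row[x] = p + step * 2.0 * (q - p)
    return out


def smooth_boundary(left: Block, right: Block, overlap: int,
                    step: float = 0.25, iters: int = 10) -> Tuple[Block, Block]:
    """Iteratively shrink the pixel difference in the overlap of two blocks.

    Assumed layout: the last `overlap` columns of `left` cover the same
    ground area as the first `overlap` columns of `right`.
    Assumed objective per overlapping pair: 0.5 * (l - r)**2, minimized by
    gradient descent on both blocks at once.
    """
    l_out = [row[:] for row in left]
    r_out = [row[:] for row in right]
    width = len(l_out[0])
    for _ in range(iters):
        for y in range(len(l_out)):
            for k in range(overlap):
                lx = width - overlap + k        # column in the left block
                d = l_out[y][lx] - r_out[y][k]  # boundary pixel difference
                l_out[y][lx] -= step * d        # d(0.5*d^2)/dl = d
                r_out[y][k] += step * d         # d(0.5*d^2)/dr = -d
    return l_out, r_out
```

With `step=0.25`, each iteration halves the remaining gap in both functions, so a handful of iterations is enough for the toy case. A real implementation would presumably operate on diffusion latents and balance these terms against the sub-prompt content, which this sketch does not attempt.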

Description

High-resolution remote sensing image generation method for space-based remote sensing model training

Technical Field

The application relates to the technical field of artificial intelligence, and in particular to a high-resolution remote sensing image generation method, device, equipment and medium for space-based remote sensing model training.

Background

With the increasingly wide application of high-resolution remote sensing images in fields such as geographic information systems, environmental monitoring and urban planning, the geographic detail these images provide has become critical to surface change monitoring and scientific decision making. However, traditional acquisition by satellite or unmanned aerial vehicle is costly, strongly affected by weather and terrain, and slow, which makes it difficult to meet the needs of real-time and large-scale applications. Deep learning techniques such as generative adversarial networks, variational autoencoders and diffusion models therefore offer a new approach to remote sensing image generation. These methods can generate images with good visual quality and rich detail, and diffusion models in particular stand out for the stability of their output quality.

However, generating ultra-high-resolution remote sensing images faces two major challenges. The first is spatial consistency: it is difficult to maintain global consistency and smooth local transitions in large-scale complex scenes, and existing methods may produce boundary breaks or inconsistent styles. The second is insufficient detail quality: models emphasize overall generation and have limited ability to finely reproduce local features such as terrain, vegetation and buildings, which limits their use in precise monitoring.
In view of this, how to remedy the deficiencies in spatial consistency and high-resolution generation quality in remote sensing image generation is a technical problem that currently needs to be solved.

Disclosure of Invention

The application provides a high-resolution remote sensing image generation method, device, equipment and medium for space-based remote sensing model training, which address the technical problems of insufficient spatial consistency and insufficient high-resolution generation quality in remote sensing image generation.

To achieve the above purpose, the main technical solution adopted by the application is as follows.

In a first aspect, an embodiment of the present application provides a method for generating a high-resolution remote sensing image for space-based remote sensing model training, the method including: dividing an image generation text prompt input by a user into at least one sub-prompt; generating a global noisy image matched with the image generation text prompt, generating local image blocks matched with each sub-prompt, and dividing the global noisy image according to the number of the local image blocks to obtain noisy image blocks; performing alignment optimization on the local image blocks and the noisy image blocks to obtain target local image blocks to be spliced; and splicing all the target local image blocks to obtain a high-resolution remote sensing image.

This embodiment provides a high-resolution remote sensing image generation method for space-based remote sensing model training. The method analyzes the image generation text prompt input by the user and generates corresponding sub-prompts, which drive the subsequent image generation flow. All the aligned target local image blocks are then spliced according to their global coordinates to obtain a high-quality, high-resolution remote sensing image.
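
The dividing and splicing steps of the first aspect can be illustrated with a minimal sketch. The application only states that the global noisy image is divided according to the number of local image blocks and that the blocks are spliced by global coordinates. The regular, non-overlapping `rows` by `cols` grid, the row-major block order, and the function names `split_into_blocks` and `stitch_blocks` are assumptions made here for illustration, and images are plain nested lists to keep the example dependency-free.

```python
from typing import List

Image = List[List[float]]  # single-channel image, indexed [row][col]


def split_into_blocks(image: Image, rows: int, cols: int) -> List[Image]:
    """Divide an image into rows * cols equal blocks, in row-major order."""
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the grid size")
    bh, bw = height // rows, width // cols  # block height and width
    blocks = []
    for r in range(rows):
        for c in range(cols):
            block = [line[c * bw:(c + 1) * bw]
                     for line in image[r * bh:(r + 1) * bh]]
            blocks.append(block)
    return blocks


def stitch_blocks(blocks: List[Image], rows: int, cols: int) -> Image:
    """Place each block back at its global grid position (inverse of split)."""
    if len(blocks) != rows * cols:
        raise ValueError("number of blocks must equal rows * cols")
    bh = len(blocks[0])
    stitched = []
    for r in range(rows):
        for y in range(bh):
            line = []
            for c in range(cols):
                # block (r, c) contributes its y-th row to this output row
                line.extend(blocks[r * cols + c][y])
            stitched.append(line)
    return stitched
```

Under these assumptions, calling `split_into_blocks(image, 2, 3)` on a 4 by 6 image yields six 2 by 2 blocks, one per sub-prompt, and `stitch_blocks(blocks, 2, 3)` reassembles them into the original layout. Overlapping blocks, as implied by the boundary overlapping region in the claims, would need an extra margin per block that this sketch leaves out.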
The method markedly improves the overall quality, information integrity and usability of the generated image, and effectively addresses the problems of insufficient spatial consistency and insufficient high-resolution generation quality in remote sensing image generation.

In one embodiment, dividing the image generation text prompt input by the user into at least one sub-prompt includes: performing semantic analysis on the image generation text prompt using a large language model to obtain the at least one sub-prompt, wherein each sub-prompt corresponds to one image block in the high-resolution remote sensing image to be generated.

In this embodiment, the text prompt input by the user undergoes deep semantic understanding and spatial decomposition by a large language model. Through this analysis, the large language model breaks the macro-level description down into a plurality of semantically and spatially discrete sub-prompts, each of which precisely corresponds to and controls the generated content of a specific image block in the final high-resolution remote sensing image, so that the fuzzy generation process is converted into