CN-121981890-A - Image processing method, device and storage medium for fragment identification
Abstract
The invention discloses an image processing method, device, and storage medium for fragment identification. It relates to the technical field of image processing and aims to solve the spatial-spectral distortion and severe artifacts that arise in hyperspectral super-resolution reconstruction of fragments. The method comprises: acquiring a high-resolution RGB image and a low-resolution hyperspectral image of the same scene; generating a deformation field with a deformable cross-modal registration module to register the two images; learning a spatial blur kernel and a spectral blur kernel to generate a degraded low-resolution RGB image pair; inputting the low-resolution RGB image pair into a dual-branch degraded-image spectral reconstruction network to generate two rough hyperspectral image estimates; and inputting the two rough estimates together with the high-resolution RGB image into a hyperspectral super-resolution reconstruction network, which outputs the high-resolution hyperspectral image. The invention adopts an unsupervised deep learning approach that requires neither priors nor training data, and it avoids the severe spatial-spectral distortion and artifacts found in hyperspectral super-resolution reconstruction of fragments.
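As a minimal sketch of the degradation step summarized in the abstract, the following NumPy code builds a non-negative, normalized spectral kernel and spatial kernel and applies them to produce a low-resolution RGB image pair. It is an illustration under stated assumptions, not the patented implementation: the function names (`constrain`, `spectral_degrade`, `spatial_degrade`), the clip-and-rescale way of enforcing the constraints, the 31-band input, and the use of a blur kernel whose size equals the downsampling factor are all hypothetical choices.

```python
import numpy as np

def constrain(raw, axis=None):
    """Map free parameters to a non-negative kernel that sums to 1 (assumed scheme)."""
    k = np.clip(raw, 0.0, None) + 1e-12   # non-negativity; epsilon avoids division by zero
    return k / k.sum(axis=axis, keepdims=True)

def spectral_degrade(hsi, srf_raw):
    """Project an (h, w, C) hyperspectral image to (h, w, 3) with a learned SRF-like kernel."""
    srf = constrain(srf_raw, axis=0)      # each RGB channel's band weights sum to 1
    return hsi @ srf                      # (h, w, C) @ (C, 3) -> (h, w, 3)

def spatial_degrade(rgb, psf_raw):
    """Blur and downsample an (H, W, 3) image with an (s, s) PSF-like kernel, stride s."""
    psf = constrain(psf_raw)              # whole kernel sums to 1
    s = psf.shape[0]
    H, W, ch = rgb.shape
    # Crop to a multiple of s, then split into non-overlapping s-by-s blocks.
    blocks = rgb[:H - H % s, :W - W % s].reshape(H // s, s, W // s, s, ch)
    return np.einsum("isjtc,st->ijc", blocks, psf)   # weighted sum inside each block

# Hypothetical sizes: 8x8 hyperspectral image with 31 bands, 32x32 RGB image (factor 4).
rng = np.random.default_rng(0)
Y = rng.random((8, 8, 31))
Z = rng.random((32, 32, 3))
rgb_from_hsi = spectral_degrade(Y, rng.normal(size=(31, 3)))   # spectrally degraded
rgb_from_rgb = spatial_degrade(Z, rng.normal(size=(4, 4)))     # spatially degraded
```

Both outputs have the same shape, so they can be compared directly in a loss that drives the two kernels. In a trainable version the raw parameters would be updated by gradient descent while `constrain` keeps the effective kernels valid.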
Inventors
- ZHAO DONGE
- YU PEIYUN
- MA YAYUN
- CHU WENBO
- ZHANG BIN
Assignees
- North University of China (中北大学)
Dates
- Publication Date
- 20260505
- Application Date
- 20260123
Claims (10)
- 1. An image processing method for fragment identification, characterized by comprising the steps of: step one, acquiring a high-resolution RGB image and a low-resolution hyperspectral image of the same scene; step two, generating a deformation field through a deformable cross-modal registration module, and spatially transforming the low-resolution hyperspectral image to register it with the high-resolution RGB image; step three, learning a spatial blur kernel and a spectral blur kernel, applying non-negativity and normalization constraints to the parameters of the spatial blur kernel and the spectral blur kernel, and generating a degraded low-resolution RGB image pair; step four, inputting the low-resolution RGB image pair into a dual-branch degraded-image spectral reconstruction network, capturing cross-band dependencies through a degradation spectrum learning Transformer module, expanding the number of spectral channels with an up-sampling module, and generating two rough hyperspectral image estimates from features fused through cross-branch interaction; step five, inputting the two rough hyperspectral image estimates and the high-resolution RGB image into a hyperspectral super-resolution reconstruction network, achieving multi-scale feature alignment through the deformable cross-modal registration module, selecting key features with a residual attention feature fusion block, enhancing spatial details through an up-sampling convolution feature fusion block, and finally outputting the high-resolution hyperspectral image.
- 2. The image processing method for fragment identification according to claim 1, wherein step two comprises formulas in which: Y denotes the low-resolution hyperspectral image, a real-valued array in which h and w are the length and width of the low-resolution hyperspectral image and C is the number of hyperspectral bands; Z denotes the high-resolution RGB image, in which H and W are the length and width of the high-resolution hyperspectral image; Q denotes the query matrix obtained from Y, K denotes the key matrix obtained from Z, and V denotes the value matrix obtained from Y; and the remaining symbols denote, respectively: the query weight matrix; the key weight matrix; the value weight matrix; the deformation field; the mapping function of the U-Net; the concatenation operation; the registered low-resolution hyperspectral image; the deformable cross-modal registration module; and the action of the deformation field on X.
- 3. The image processing method for fragment identification according to claim 1, wherein step three comprises formulas, ending with formula (6), in which the symbols denote, respectively: the spectrally degraded low-resolution RGB image; the spectral degradation process; the registered low-resolution hyperspectral image; the parameters of the SRF; the spatially degraded low-resolution RGB image; Z, the high-resolution RGB image; the parameters of the PSF; and the first loss function, whose two weighting coefficients are each set to 0.5; SSIM denotes the structural similarity index loss; and the norm used is the Frobenius norm, abbreviated as F-norm.
- 4. The image processing method for fragment identification according to claim 1, wherein in step four the dual-branch degraded-image spectral reconstruction network is expressed by a formula in which the symbols denote, respectively: the two low-resolution hyperspectral images obtained by spectral reconstruction from the two degraded low-resolution RGB images; the degraded-image spectral reconstruction network; the spectrally degraded low-resolution RGB image; and the spatially degraded low-resolution RGB image.
- 5. The image processing method for fragment identification according to claim 1, wherein in step four the cross-band dependencies are captured by the degradation spectrum learning Transformer module, calculated according to formulas (8) to (11), in which the symbols denote, respectively: the query matrix, key matrix, and value matrix of an indexed feature of an indexed branch; the query weight matrix; the key weight matrix; the value weight matrix; the indexed feature itself, a real-valued array whose dimensions are the length of the hyperspectral image, the width of the hyperspectral image, and C, the number of hyperspectral bands; the attention computed for the indexed feature of the indexed branch; the normalized exponential function (softmax); the dimension of the key matrix; the updated feature with a given index; the feature with that index before updating; the rectified linear unit (ReLU); and the convolution operation.
- 6. The image processing method for fragment identification according to claim 1, wherein in step four a loss function is established for the degraded-image spectral reconstruction network, in which the symbols denote, respectively: the second loss function; and the two low-resolution hyperspectral images obtained by spectral reconstruction from the two degraded low-resolution RGB images; and wherein the two rough hyperspectral image estimates are generated by a formula in which the symbols denote, respectively: the two rough hyperspectral image estimates; the degraded-image spectral reconstruction network; and Z, the high-resolution RGB image.
- 7. The image processing method for fragment identification according to claim 1, wherein in step five the hyperspectral super-resolution reconstruction network comprises two encoder-decoder networks with skip connections and one deep decoder network, wherein the upper-branch encoder-decoder network extracts high-resolution spatial features, the lower-branch encoder-decoder network extracts spectral features, and the deep decoder network is located downstream of the dual-branch encoder and receives the fused cross-modal features; and wherein the multi-scale feature alignment is achieved by the deformable cross-modal registration module according to formulas in which: Q denotes the query matrix obtained from Y, K denotes the key matrix obtained from Z, and V denotes the value matrix obtained from Y; Y denotes the low-resolution hyperspectral image and Z denotes the high-resolution RGB image; and the remaining symbols denote, respectively: the query weight matrix; the key weight matrix; the final rough estimate of the high-resolution hyperspectral image; the unregistered features of the rough estimate and of Z at the k-th scale in the encoder; the deformation field at the k-th scale; the mapping function of the U-Net; the registered features at the k-th scale; the deformable cross-modal registration module; and the action of the deformation field on X.
- 8. The image processing method for fragment identification according to claim 1, wherein in step five a loss function is adopted in which: P denotes the spatial blur kernel and R denotes the spectral blur kernel; one term denotes the error between the spatially downsampled X and the registered low-resolution hyperspectral image; the other term denotes the error between the spectrally downsampled X and Z; X denotes the reconstructed high-resolution hyperspectral image; and Z denotes the high-resolution RGB image.
- 9. An image processing apparatus for fragment identification, comprising: an acquisition unit for acquiring a high-resolution RGB image and a low-resolution hyperspectral image of the same scene; a registration unit for generating a deformation field through a deformable cross-modal registration module and spatially transforming the low-resolution hyperspectral image to register it with the high-resolution RGB image; a learning unit for learning a spatial blur kernel and a spectral blur kernel, and applying non-negativity and normalization constraints to the parameters of the spatial blur kernel and the spectral blur kernel to generate a degraded low-resolution RGB image pair; a generation unit for inputting the low-resolution RGB image pair into a dual-branch degraded-image spectral reconstruction network, capturing cross-band dependencies through a degradation spectrum learning Transformer module, expanding the number of spectral channels with an up-sampling module, and generating two rough hyperspectral image estimates from features fused through cross-branch interaction; and an output unit for inputting the two rough hyperspectral image estimates and the high-resolution RGB image into a hyperspectral super-resolution reconstruction network, achieving multi-scale feature alignment through the deformable cross-modal registration module, selecting key features with a residual attention feature fusion block, enhancing spatial details through an up-sampling convolution feature fusion block, and finally outputting the high-resolution hyperspectral image.
- 10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program, wherein the computer program, when executed by a processor, implements the steps of the method of any of claims 1 to 8.
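The claims name the ingredients of the cross-band attention step: query, key, and value matrices formed with weight matrices, a normalized exponential function (softmax), and the dimension of the key matrix. A minimal NumPy sketch follows, assuming the standard scaled dot-product form and assuming that each spectral band is treated as one token. The function names, the tensor sizes, and the random weights are hypothetical, and the ReLU and convolution stages mentioned in claim 5 are left out.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable normalized exponential function."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def band_attention(feat, w_q, w_k, w_v):
    """Scaled dot-product attention over the band axis of an (h, w, c) feature map.

    Assumption: every band is flattened to one token of length h*w, so the
    attention weights form a (c, c) matrix of band-to-band dependencies.
    """
    h, w, c = feat.shape
    tokens = feat.reshape(h * w, c).T                    # (c, h*w), one row per band
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v   # each (c, d)
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))    # (c, c), rows sum to 1
    return weights, weights @ v                          # attended band features (c, d)

# Hypothetical sizes: a 4x4 feature map with 6 bands, projected to d = 8.
rng = np.random.default_rng(0)
feat = rng.random((4, 4, 6))
n, d = 16, 8
w_q, w_k, w_v = (rng.normal(size=(n, d)) for _ in range(3))
weights, out = band_attention(feat, w_q, w_k, w_v)
```

Row i of `weights` shows how strongly band i attends to every other band, which is one way to read "cross-band dependency" in this setting.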
Description
Image processing method, device and storage medium for fragment identification
Technical Field
The present invention relates to the field of image processing technologies, and in particular to an image processing method, device, and storage medium for fragment identification.
Background
Accurate detection and identification of fragments is important for detecting and tracing explosive residues, evaluating performance, and designing effective explosion-protection measures. In recent years, progress in remote sensing technology has driven the widespread use of unmanned aerial vehicle (UAV) remote sensing platforms for multidimensional data acquisition. A UAV platform can carry different sensors to capture multi-source images of fragments, for example high-definition data from an RGB camera or spectral information from a hyperspectral camera. Current research on fragments relies on single-modality imaging. RGB cameras offer high spatial resolution but poor spectral resolution, so they cannot discriminate materials. Hyperspectral imaging (HSI) offers narrow-band spectral resolution, but hardware limitations of the imaging sensor force a trade-off among spatial resolution, spectral resolution, and signal-to-noise ratio. With current sensors, acquiring high-resolution hyperspectral images (HR-HSI) is quite challenging, because protecting the equipment requires a large stand-off distance and imaging small fragments requires high magnification. The conflict between the need to capture both rich spatial detail and spectral information of fragments and the inability of current imaging technology to deliver both at once has created a demand for multi-source image fusion technology.
In image enhancement, a common approach is to increase the resolution of one image with the help of another image captured of the same scene. Because RGB images have properties complementary to HSI, an HR-HSI can be obtained by combining the spectral discrimination of HSI with the spatial detail of RGB. This approach overcomes the limits of a single sensor, enables joint analysis of fragment material composition and microstructure, and provides key technical support for precise fragment identification in complex battlefield environments. Fusion-based super-resolution reconstruction methods for hyperspectral (HS) remote sensing images fall into three main categories: detail-injection-based methods, optimization-based methods, and deep learning (DL) based methods. Detail-injection-based methods originate from pansharpening, which fuses a low-resolution multispectral image with a high-resolution panchromatic (PAN) image. Although easy to apply, they are prone to spatial-spectral distortion. Optimization-based methods treat hyperspectral super-resolution reconstruction as an ill-posed inverse problem, and researchers focus on designing regularizers, such as total variation and non-local similarity, to constrain the solution space accurately. These methods rely on hand-crafted regularization: if the prior is too simple, performance is unsatisfactory; if it is too complex, the model becomes difficult to optimize. Deep learning has also made impressive progress on hyperspectral super-resolution reconstruction. Existing deep-learning-based methods are generally divided into two groups, supervised and unsupervised, according to their training paradigm.
In particular, supervised approaches learn the nonlinear mapping from the input multispectral and hyperspectral images to the ground-truth high-resolution hyperspectral image through one or more network branches, without requiring a hand-crafted prior. However, because real high-resolution hyperspectral images are unavailable in real scenes, researchers are forced to synthesize reduced-scale training triplets in which the raw data are spatially downsampled so that the observed low-resolution hyperspectral image (LR-HSI) can be treated as the ground truth. This approach ignores the actual degradation model, lacks adequate interpretability, and is impractical because it requires a large number of training samples and a complex network structure. Unsupervised super-resolution reconstruction methods have received less study. One prior art uses the spectral response function (SRF) as a known prior to constrain the generated high-resolution hyperspectral image, but ignores the point spread function (PSF). Another prior art is based on the deep image prior (DIP) and des