CN-121999141-A - Metasurface snapshot hyperspectral reconstruction model, training method thereof and electronic device


Abstract

The invention discloses a metasurface snapshot hyperspectral reconstruction model, a training method thereof and an electronic device, and relates to the field of metasurface snapshot imaging. The invention constructs a metasurface snapshot hyperspectral reconstruction model comprising a segmentation module, a spectral stability evaluation module, an inter-spectral feature encoder, a conditional feature generation module and a hyperspectral generation module, thereby achieving end-to-end reconstruction from a low-channel multispectral snapshot image to a high-dimensional hyperspectral data cube. The model effectively suppresses interference from background and occluded areas, explicitly models spectral stability and the dependencies between channels, generates multi-channel hyperspectral features through iterative autoregressive modeling based on conditional features and random noise, and finally outputs the hyperspectral data cube.

Inventors

Ju Fayin
LI NING

Assignees

浙江优众新材料科技有限公司

Dates

Publication Date: 20260508
Application Date: 20260408

Claims (10)

1. A metasurface snapshot hyperspectral reconstruction model, comprising: a segmentation module for automatically segmenting a multispectral snapshot image acquired by a metasurface to generate a target area mask; a spectral stability evaluation module for evaluating, based on the effective pixels indicated by the target area mask, the response consistency of each effective pixel on each spectral channel of the multispectral snapshot image, and generating structured features based on the evaluation result; an inter-spectral feature encoder for extracting features from the structured features, modeling cross-channel dependencies among the spectral channels based on the extracted features, and outputting spatial-spectral fusion features; a conditional feature generation module for generating conditional features by fusing spatial structure information and spectral constraints based on the spatial-spectral fusion features and the structured features; and a hyperspectral generation module for generating multi-channel hyperspectral features through iterative autoregressive modeling based on the conditional features and random noise, and converting the multi-channel hyperspectral features into a hyperspectral data cube for output.
2. The hyperspectral reconstruction model as claimed in claim 1, wherein evaluating the response consistency of each effective pixel on each spectral channel of the multispectral snapshot image and generating the structured features based on the evaluation result specifically comprises: calculating the standard deviation of the spectral response of each effective pixel on each spectral channel, and dividing the target area mask into a static stable area and a dynamic unstable area according to the standard deviation of the spectral response; encoding the static stable area as a first binary mask and the dynamic unstable area as a second binary mask; concatenating the multispectral snapshot image, the first binary mask and the second binary mask along the channel dimension to obtain a multispectral content tensor; and fusing the target area mask with the first binary mask and the second binary mask to generate a guide mask indicating the static stable area and the dynamic unstable area.
3. The hyperspectral reconstruction model as claimed in claim 2, wherein the inter-spectral feature encoder comprises: a depth convolution layer for extracting local spatial-spectral features of the multispectral content tensor and outputting a preliminary feature representation; an inter-spectral attention layer for splitting the preliminary feature representation into a plurality of channel feature matrices along the spectral channel dimension, calculating interdependence weights among the spectral channels through an attention mechanism based on the channel feature matrices, and performing, at each spatial position, a weighted fusion of the feature vectors from the channel feature matrices based on the weights to generate spectrum-enhanced features; and a multi-head attention layer for performing global context modeling on the spectrum-enhanced features across spatial positions and outputting the spatial-spectral fusion features.
4. The hyperspectral reconstruction model as claimed in claim 3, wherein the conditional feature generation module comprises: a layer encoder for extracting spatial structural features from the spatial-spectral fusion features and outputting a structural feature representation; a control encoder for resizing the guide mask so that its spatial size matches the spatial dimensions of the structural feature representation, and encoding the resized guide mask into a spectral constraint feature through a convolution and a nonlinear activation function; and a cross-attention fusion unit for performing a cross-attention operation based on the structural feature representation and the spectral constraint feature, and performing residual fusion of the result of the operation with the structural feature representation to generate the conditional features.
5. The hyperspectral reconstruction model as claimed in claim 4, wherein the hyperspectral generation module comprises: an initialization unit for performing feature alignment on the conditional features and random noise to generate initial generation features; and an iterative generation unit for generating the multi-channel hyperspectral features, based on the initial generation features and the conditional features, through a multi-round autoregressive process comprising upsampling and linear transformation.
6. The hyperspectral reconstruction model as claimed in claim 5, wherein the autoregressive process divides the spectral channels of the multi-channel hyperspectral features to be generated into a plurality of channel groups in band order and processes them round by round in a preset channel group order to generate the multi-channel hyperspectral features, wherein: in the first round, a first channel group feature is generated based on the initial generation features and the conditional features; and in the second and subsequent rounds, the channel group feature output by the previous round is spatially upsampled, the upsampling result and the conditional features are each linearly transformed and then added, and the sum is processed by a Softmax activation function to output the channel group feature corresponding to the current round.
7. The hyperspectral reconstruction model as claimed in claim 1, wherein converting the multi-channel hyperspectral features into a hyperspectral data cube for output specifically comprises: performing a convolution operation on the multi-channel hyperspectral features, mapping the feature map of each spectral channel to a single-band intensity image at the corresponding wavelength, and stacking the single-band intensity images in ascending order of wavelength to form the hyperspectral data cube.
8. A method of training the hyperspectral reconstruction model as claimed in any one of claims 1 to 7, comprising: acquiring a multispectral snapshot image captured by a metasurface, acquiring a ground-truth hyperspectral data cube registered with the multispectral snapshot image, constructing a paired sample, preprocessing the paired sample to obtain a training sample, and forming a data set; and training the metasurface snapshot hyperspectral reconstruction model with the data set and a joint loss function, wherein during training the parameters of the layers in the inter-spectral feature encoder other than the inter-spectral attention layer are frozen.
9. The method for training a hyperspectral reconstruction model as claimed in claim 8, wherein the joint loss function comprises: a pixel-level L1-norm loss for constraining the absolute error, on each spectral channel intensity value of each pixel, between the hyperspectral data cube output by the metasurface snapshot hyperspectral reconstruction model and the ground-truth hyperspectral data cube; a spectral angle constraint loss for constraining, at each pixel, the directional consistency between the spectral curves formed by the spectral channel intensity values of the hyperspectral data cube and of the ground-truth hyperspectral data cube; a cross-entropy loss for supervising the prediction consistency between the channel group features generated by the hyperspectral generation module in each round of the autoregressive process and the corresponding spectral channels in the ground-truth hyperspectral data cube; and a distribution consistency constraint term for constraining the consistency, across all pixels, of the spectral channel intensity value distributions of the hyperspectral data cube and the ground-truth hyperspectral data cube.
10. An electronic device comprising a processor and a memory storing a program, wherein the program comprises instructions that when executed by the processor cause the processor to perform the training method of claim 8.
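Two of the loss terms named in claim 9, the pixel-level L1-norm loss and the spectral angle constraint loss, have standard definitions and can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions: the cube is flattened to a list of per-pixel spectra, the function names and the weights `w_l1` and `w_sam` are hypothetical, and the cross-entropy and distribution consistency terms are omitted because the text above does not specify their form.

```python
import math

def l1_loss(pred, truth):
    """Mean absolute error over every pixel and every spectral channel."""
    total, count = 0.0, 0
    for pred_spectrum, true_spectrum in zip(pred, truth):
        for p, t in zip(pred_spectrum, true_spectrum):
            total += abs(p - t)
            count += 1
    return total / count

def spectral_angle_loss(pred, truth, eps=1e-8):
    """Mean angle in radians between predicted and reference spectra.

    The angle depends only on the direction of each spectral curve,
    not on its overall intensity.
    """
    angles = []
    for pred_spectrum, true_spectrum in zip(pred, truth):
        dot = sum(p * t for p, t in zip(pred_spectrum, true_spectrum))
        norm = (math.sqrt(sum(p * p for p in pred_spectrum))
                * math.sqrt(sum(t * t for t in true_spectrum)))
        # Guard against zero-length spectra and rounding outside [-1, 1].
        cosine = max(-1.0, min(1.0, dot / max(norm, eps)))
        angles.append(math.acos(cosine))
    return sum(angles) / len(angles)

def joint_loss(pred, truth, w_l1=1.0, w_sam=0.1):
    """Weighted sum of the two terms; the weights are illustrative only."""
    return w_l1 * l1_loss(pred, truth) + w_sam * spectral_angle_loss(pred, truth)

# Two pixels, two bands: the first pixel is scaled but points the same way.
pred = [[1.0, 0.0], [0.0, 2.0]]
truth = [[2.0, 0.0], [0.0, 2.0]]
print(l1_loss(pred, truth), spectral_angle_loss(pred, truth))
```

In this example the first pixel differs from the reference only by a scale factor, so it contributes to the L1 term but not to the spectral angle term, which is why the two losses are complementary.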

Description

Metasurface snapshot hyperspectral reconstruction model, training method thereof and electronic device

Technical Field

The invention relates to the field of metasurface snapshot imaging, and in particular to a metasurface snapshot hyperspectral reconstruction model, a training method thereof and an electronic device.

Background

Traditional hyperspectral imaging systems usually rely on dispersive elements such as prisms, gratings or filter wheels and must acquire the complete spectral information through scanning or multi-frame acquisition, so the equipment is bulky, structurally complex and unable to meet the real-time imaging requirements of dynamic scenes. In contrast, metasurface-based hyperspectral imaging uses a single integrated optical device to capture spatial information and partial spectral information simultaneously in a single exposure, which significantly improves the compactness and imaging speed of the system. Its output, however, is only a multispectral snapshot image with a limited number of channels (for example, 4 to 8), far below the spectral resolution of tens to hundreds of bands required in practical applications. A deep-learning-based spectral reconstruction technique is therefore needed to map the low-channel snapshot to a hyperspectral data cube.
However, existing reconstruction methods share several shortcomings. First, most models do not effectively segment the target area in the input image, so background or occluded areas are included in the reconstruction and introduce noise. Second, they lack a mechanism for evaluating the consistency of responses across the multispectral channels and cannot distinguish static stable areas from dynamic unstable areas, so the reconstructed spectrum is distorted at unstable pixels. Third, existing network structures often ignore the cross-channel dependencies among spectral channels and do not explicitly model inter-spectral correlation, which readily causes channel crosstalk. In addition, although some generative methods attempt to introduce autoregressive or condition-guided strategies to improve detail fidelity, they do not fuse spatial structure priors with spectral stability constraints, and their training processes are not optimized for small-sample scenarios, so high-fidelity and physically consistent hyperspectral reconstruction is difficult to achieve with limited data. Therefore, there is a need for a reconstruction framework that can jointly perceive target areas, evaluate spectral stability, model inter-spectral dependencies and support condition-driven iterative generation, so as to recover hyperspectral data cubes efficiently and accurately under the hardware constraints of metasurface snapshot imaging.
Disclosure of Invention

In order to solve the problem that, in the prior art, metasurface snapshot hyperspectral reconstruction struggles to combine spatial structure perception, spectral stability modeling and inter-spectral dependency learning, the invention provides a metasurface snapshot hyperspectral reconstruction model, comprising: a segmentation module for automatically segmenting a multispectral snapshot image acquired by a metasurface to generate a target area mask; a spectral stability evaluation module for evaluating, based on the effective pixels indicated by the target area mask, the response consistency of each effective pixel on each spectral channel of the multispectral snapshot image, and generating structured features based on the evaluation result; an inter-spectral feature encoder for extracting features from the structured features, modeling cross-channel dependencies among the spectral channels based on the extracted features, and outputting spatial-spectral fusion features; a conditional feature generation module for generating conditional features by fusing spatial structure information and spectral constraints based on the spatial-spectral fusion features and the structured features; and a hyperspectral generation module for generating multi-channel hyperspectral features through iterative autoregressive modeling based on the conditional features and random noise, and converting the multi-channel hyperspectral features into a hyperspectral data cube for output.
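As a rough illustration of what the spectral stability evaluation module produces, the following Python sketch follows one possible reading of the standard-deviation rule stated in claim 2. Several details are assumptions rather than part of the disclosure: the standard deviation is taken across the channel responses of a single pixel, a single hypothetical `threshold` separates stable from unstable pixels, and the guide mask uses the encoding 0 (outside the target area), 1 (static stable) and 2 (dynamic unstable). Nested lists stand in for image tensors so that the example needs only the standard library.

```python
from statistics import pstdev

def evaluate_stability(image, target_mask, threshold):
    """Split the target area into stable and unstable pixels.

    image:       H x W x C nested lists of per-channel responses
    target_mask: H x W nested lists of 0/1 (1 = effective pixel)
    threshold:   hypothetical cut-off on the per-pixel standard deviation
    Returns (stable_mask, unstable_mask, content_tensor, guide_mask).
    """
    height, width = len(image), len(image[0])
    stable = [[0] * width for _ in range(height)]
    unstable = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not target_mask[y][x]:
                continue  # background or occluded pixel: in neither region
            spread = pstdev(image[y][x])  # spread of the channel responses
            if spread <= threshold:
                stable[y][x] = 1
            else:
                unstable[y][x] = 1
    # Concatenate the image channels and the two binary masks per pixel.
    content = [[list(image[y][x]) + [stable[y][x], unstable[y][x]]
                for x in range(width)] for y in range(height)]
    # Assumed guide encoding: 0 = outside target, 1 = stable, 2 = unstable.
    guide = [[stable[y][x] + 2 * unstable[y][x]
              for x in range(width)] for y in range(height)]
    return stable, unstable, content, guide

# 2 x 2 image with 4 channels; the lower-left pixel is outside the target.
image = [[[1.0, 1.0, 1.0, 1.0], [0.0, 4.0, 0.0, 4.0]],
         [[5.0, 5.0, 5.0, 5.0], [2.0, 2.0, 2.0, 2.0]]]
target_mask = [[1, 1], [0, 1]]
stable, unstable, content, guide = evaluate_stability(image, target_mask, 0.5)
print(guide)  # [[1, 2], [0, 1]]
```

The content tensor here has C + 2 channels per pixel (the original responses plus the two masks), which matches the channel-wise concatenation described in claim 2; how the masks are actually fused into the guide mask is not specified in the text above.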
Further, evaluating the response consistency of each effective pixel on each spectral channel of the multispectral snapshot image and generating the structured features based on the evaluation result specifically comprises: calculating the standard deviation of the spectral response of each effective pixel on each spectral channel, and dividing the target area mask into a static stable area and a dyna