CN-121982533-A - Transformer-based unsupervised spectral-band recognition pairing and full-spectrum synthesis method
Abstract
The invention discloses a Transformer-based unsupervised spectral-band recognition pairing and full-spectrum synthesis method. It relates to the technical field of hyperspectral image processing and computer vision, and in particular to spectral-band recognition pairing and full-spectrum synthesis. The invention aims to solve the problems of low hyperspectral target detection precision, indistinguishable weak spectral differences, and difficult fusion of different spectral bands in the prior art. The method comprises: constructing a space-spectrum attention fusion mechanism; extracting spectrum curves; calculating the correlation weights between neighborhood spectra and the central spectrum; modeling the spectral sequence in depth with a Transformer; training the encoder with a self-supervised learning strategy; and realizing pixel-level target detection with cosine similarity. Group spectral features are extracted, and cross-image target-type matching is performed with the Hungarian algorithm. Finally, a continuous, complete, high-quality full spectrum is generated through an asynchronous target registration and full-spectrum weighted-fusion framework.
Inventors
- GAO GUOMING
- CHEN BANGYI
- MIAO JING
- MA TIANSHU
- GU YANFENG
Assignees
- Harbin Institute of Technology (哈尔滨工业大学)
- Space Star Technology Co., Ltd. (航天恒星科技有限公司)
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-01-23
Claims (10)
- 1. A Transformer-based unsupervised spectral-band recognition pairing and full-spectrum synthesis method, characterized by the following specific process: Step one: acquiring hyperspectral image 1 to be detected and constructing an original spectrum matrix based on it; acquiring hyperspectral image 2 to be detected and constructing an original spectrum matrix based on it; the hyperspectral image 1 to be detected is a visible-light image, the hyperspectral image 2 to be detected is a short-wave infrared image, and two types of targets to be detected exist in each of the two images. Step two: for each original spectrum matrix, constructing a space-spectrum attention weight matrix based on the neighborhood spectrum relation, and obtaining a fusion spectrum matrix based on the space-spectrum attention weight matrix. Step three: for each fusion spectrum matrix, carrying out overlapping spectral division on each spectrum to obtain the overlapping patches into which each spectrum is divided; mapping each divided spectrum into feature vectors through a linear embedding layer; forming the feature vectors corresponding to all spectra of the matrix into a sequence; splicing a learnable global embedding at the front end of the sequence; and adding a learnable position embedding element by element to obtain the final sequence. Step four: randomly sampling a number of pixel spectra from each sequence as training samples; applying Gaussian blur to each sampled pixel, each blurred pixel serving as an enhanced sample; inputting the training samples into an online encoder and the enhanced samples into a momentum encoder; the output features of each encoder are mapped into embedded vectors by a projection head; the projection head comprises, in order, a first MLP and a second MLP. Step five: calculating the InfoNCE loss based on the embedded vectors to obtain the trained online encoders. Step six: inputting each fusion spectrum matrix into its trained online encoder, which outputs features. Step seven: mapping the prior spectra of the target categories of hyperspectral image 1 and of hyperspectral image 2 to embedded vectors. Step eight: calculating the cosine similarity between the spectrum of each pixel of the output features and the embedded prior spectra; generating a target response map based on the cosine similarities of all pixel spectra; normalizing each pixel of the target response map to obtain its response value; setting a threshold; if the response value of a pixel is greater than or equal to the threshold, the pixel belongs to a target, otherwise it does not; marking the pixels corresponding to each detected target with a rectangular box. Step nine: for the targets of each image, calculating group features from the pixel spectrum matrix inside each rectangular box and performing Z-score standardization on them to obtain the standardized group spectral feature vector of each target; calculating the cosine distances between the group spectral feature vectors of hyperspectral image 1 and those of hyperspectral image 2, and forming a cost matrix based on the cosine distances; solving the cost matrix with the Hungarian algorithm to obtain the optimal matching, the optimal matching comprising the type correspondence, the spectral distance, and the matching confidence. Step ten: obtaining a full-spectrum hyperspectral image based on the optimal matching obtained in step nine.
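Steps eight and nine above can be sketched as follows. This is a minimal illustration, assuming min-max normalization of the response map, a threshold of 0.8, and a brute-force assignment solver standing in for the Hungarian algorithm on the tiny cost matrices involved; all of these specifics are assumptions, not the patent's exact formulation.

```python
import numpy as np
from itertools import permutations

def cosine(a, b):
    """Cosine similarity between two spectra."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def target_response(features, prior, threshold=0.8):
    """Step eight: per-pixel cosine similarity to a prior spectrum,
    min-max normalized; pixels at or above the threshold are targets."""
    sims = np.array([cosine(f, prior) for f in features])
    resp = (sims - sims.min()) / (sims.max() - sims.min() + 1e-12)
    return resp >= threshold

def match_types(groups1, groups2):
    """Step nine: Z-score the group spectral features, build a
    cosine-distance cost matrix, and solve the assignment (brute force
    stands in for the Hungarian algorithm on these small matrices)."""
    z = lambda g: (g - g.mean()) / (g.std() + 1e-12)
    cost = np.array([[1.0 - cosine(z(a), z(b)) for b in groups2] for a in groups1])
    n = cost.shape[0]
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i, p[i]] for i in range(n)))
    return list(enumerate(best)), cost
```

For example, matching two target types across the two images amounts to calling `match_types` on the two stacks of standardized group spectral feature vectors and reading off the type correspondence from the returned pairs.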
- 2. The Transformer-based unsupervised spectral-band recognition pairing and full-spectrum synthesis method according to claim 1, wherein the specific process of step one is: Step 1.1: acquiring hyperspectral image 1 to be measured, whose entries are real numbers and whose dimensions are the image length, the image width, and the number of wave bands, and in which two types of targets to be detected exist; arranging its pixels as rows and the wave bands of each pixel spectrum as columns to construct the original spectrum matrix of hyperspectral image 1. Step 1.2: acquiring hyperspectral image 2 to be measured, in which two types of targets to be detected exist, and constructing its original spectrum matrix in the same way.
- 3. The method according to claim 2, wherein the specific process of step two is: Step 2.1.1: for each pixel of the original spectrum matrix of image 1, extracting the 9 spectral vectors in its 3 × 3 neighborhood. Step 2.1.2: calculating the cosine similarity between each neighborhood spectral vector and the center spectrum, the norms in the denominator being 2-norms. Step 2.1.3: constructing the space-spectrum attention weight matrix of the neighborhood spectrum relation based on these cosine similarities. Step 2.1.4: obtaining the fusion spectrum matrix based on the space-spectrum attention weight matrix, the upper corner mark denoting transposition. Step 2.2: applying steps 2.1.1 to 2.1.4 to the original spectrum matrix of image 2 to obtain its fusion spectrum matrix.
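The neighborhood fusion of claim 3 can be sketched as follows; the softmax normalization of the attention weights and the edge padding at the image border are assumptions made for illustration, since the stripped formulas do not specify them.

```python
import numpy as np

def fuse_spectra(cube):
    """For each pixel of an H x W x B hyperspectral cube, weight the 9
    spectra in its 3x3 neighborhood by cosine similarity to the center
    spectrum (softmax-normalized) and sum them into a fused spectrum."""
    H, W, B = cube.shape
    pad = np.pad(cube, ((1, 1), (1, 1), (0, 0)), mode="edge")
    fused = np.empty_like(cube, dtype=float)
    for i in range(H):
        for j in range(W):
            nb = pad[i:i + 3, j:j + 3].reshape(9, B)   # 9 neighborhood spectra
            c = cube[i, j]                             # center spectrum
            sims = nb @ c / (np.linalg.norm(nb, axis=1) * np.linalg.norm(c) + 1e-12)
            w = np.exp(sims) / np.exp(sims).sum()      # attention weights
            fused[i, j] = w @ nb                       # weighted fusion
    return fused
```

A spatially uniform cube is a useful sanity check: every neighborhood spectrum then equals the center spectrum, so fusion leaves the cube unchanged.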
- 4. The method according to claim 3, wherein the specific process of step three is: Step 3.1.1: passing the fusion spectrum matrix of image 1 through a 1-dimensional depthwise convolution layer of given kernel size, stride, and patch length to generate the spectral embedding tokens, i.e., dividing each spectrum into a number of overlapping spectral patches. Step 3.1.2: mapping each patch of the given length through a fully connected layer (the linear embedding layer) into a feature vector. Step 3.1.3: forming the feature vectors obtained from all spectra of the fusion spectrum matrix into a sequence, splicing a learnable global embedding at the front end of the sequence, and adding a learnable position embedding element by element to obtain the final sequence. Step 3.2: applying steps 3.1.1 to 3.1.3 to the fusion spectrum matrix of image 2 to obtain its final sequence.
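The overlapping spectral division and embedding of claim 4 can be sketched as follows, with the 1-D depthwise convolution modelled as strided windowing plus a shared linear map. The band count B = 16, patch length P = 4, stride S = 2, and embedding dimension D = 8 are illustrative choices, not values taken from the patent, and the weights are random stand-ins for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
B, P, S, D = 16, 4, 2, 8                 # bands, patch length, stride, embed dim
spectrum = rng.normal(size=B)            # one pixel spectrum

n_patch = (B - P) // S + 1               # number of overlapping patches
patches = np.stack([spectrum[k * S: k * S + P] for k in range(n_patch)])

W_embed = rng.normal(size=(P, D))        # shared linear embedding layer
tokens = patches @ W_embed               # (n_patch, D) feature vectors

cls = np.zeros((1, D))                   # learnable global embedding (CLS token)
seq = np.concatenate([cls, tokens])      # spliced at the front of the sequence
pos = rng.normal(size=seq.shape)         # learnable position embedding
seq = seq + pos                          # element-by-element addition
```

With these toy sizes the spectrum yields 7 overlapping patches, so the final sequence has 8 rows: the global embedding followed by one token per patch.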
- 5. The method according to claim 4, wherein the specific process of step four is: Step 4.1.1: randomly sampling a number of pixel spectra from the sequence of image 1 as training samples; applying Gaussian blur to each sampled pixel, each blurred pixel serving as an enhanced sample. Step 4.1.2: inputting the training samples into the online encoder, whose specific process is: the training sample passes through a normalization layer and then a multi-head attention mechanism; the result is added element by element to the input; the sum then passes through a normalization layer, a fully connected layer, a 1-dimensional depthwise convolution, and a fully connected layer; this block (normalization layer, multi-head attention, element-by-element residual addition, normalization layer, fully connected layer, 1-dimensional depthwise convolution, fully connected layer) is applied a second time to produce the output features. Step 4.1.3: inputting the enhanced samples into the momentum encoder, which has the same structure as the online encoder and produces its output features in the same way. Step 4.1.4: mapping the output features of the online encoder and of the momentum encoder into embedded vectors through the projection head, the projection head comprising, in order, a first MLP and a second MLP. Step 4.2: applying steps 4.1.1 to 4.1.4 to the sequence of image 2.
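One encoder block of claim 5 can be sketched in numpy as follows. It uses a single attention head, random weights, and fixed toy sizes; the exact dimensions and the choice of sequence axis for the 1-D depthwise convolution are assumptions, since the stripped formulas do not specify them.

```python
import numpy as np

rng = np.random.default_rng(1)
L, D = 8, 16                                   # sequence length, model dim
x = rng.normal(size=(L, D))                    # input token sequence

def layer_norm(h):
    """Normalization layer (per-token, no learned affine)."""
    return (h - h.mean(-1, keepdims=True)) / (h.std(-1, keepdims=True) + 1e-6)

def softmax(z):
    e = np.exp(z - z.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))
W1, W2 = rng.normal(size=(D, D)), rng.normal(size=(D, D))
k_dw = rng.normal(size=3)                      # depthwise conv kernel (width 3)

h = layer_norm(x)                              # normalization layer
attn = softmax((h @ Wq) @ (h @ Wk).T / np.sqrt(D)) @ (h @ Wv)  # attention
h = x + attn                                   # element-by-element addition
f = layer_norm(h) @ W1                         # fully connected layer
f = np.stack([np.convolve(f[:, d], k_dw, mode="same")          # 1-D depthwise conv
              for d in range(D)], axis=1)
out = f @ W2                                   # second fully connected layer
```

Per the claim, the full encoder applies this block twice, and the resulting features pass through a projection head of two stacked MLPs to become embedded vectors.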
- 6. The method according to claim 5, wherein the specific process of step five is: Step 5.1.1: storing the momentum-encoder embedded vectors sequentially in a FIFO queue Q of length K = 1000 to form the negative sample set. Step 5.1.2: calculating the InfoNCE loss of each pixel spectrum based on the online and momentum embedded vectors, the temperature coefficient, and the negative samples; optimizing the online encoder parameters based on the InfoNCE loss; the momentum encoder parameters undergo no independent gradient update and are only synchronized with the online encoder parameters through a weighted average governed by the momentum coefficient; repeating steps 5.1.1 and 5.1.2 until the embedded vectors corresponding to all training samples have been processed, yielding the trained online encoder for image 1. Step 5.2: applying the same procedure to the embedded vectors of image 2 to obtain its trained online encoder.
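The InfoNCE loss and momentum update of claim 6 can be sketched as follows. The FIFO queue length K = 1000 comes from the claim; the temperature tau = 0.07 and momentum coefficient m = 0.999 are illustrative defaults, since the claim's actual values were lost with the stripped symbols.

```python
import numpy as np
from collections import deque

def info_nce(q, k_pos, negatives, tau=0.07):
    """InfoNCE loss for one pixel spectrum: a positive pair (q, k_pos)
    against the queued negative samples, with temperature tau."""
    q, k_pos = q / np.linalg.norm(q), k_pos / np.linalg.norm(k_pos)
    logits = np.array([q @ k_pos] +
                      [q @ (n / np.linalg.norm(n)) for n in negatives]) / tau
    return -logits[0] + np.log(np.exp(logits).sum())

def momentum_update(theta_q, theta_k, m=0.999):
    """Momentum encoder parameters track the online encoder by weighted
    average only; no gradient flows into theta_k."""
    return m * theta_k + (1.0 - m) * theta_q

queue = deque(maxlen=1000)   # K = 1000 FIFO negative-sample queue, per the claim
```

After each step, the new momentum embeddings are appended to `queue`, and the oldest entries fall out automatically once K is exceeded.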
- 7. The Transformer-based unsupervised span recognition pairing and full-spectrum synthesis method as claimed in claim 6, wherein in step seven the prior spectra of the two target classes of the hyperspectral image 1 to be detected are mapped to embedded vectors, and the prior spectra of the two target classes of the hyperspectral image 2 to be detected are mapped to embedded vectors; the specific process is as follows: step seven-one, mapping the prior spectrum of the first target class of the hyperspectral image 1 to be detected to an embedded vector, the specific process being: constructing an original spectrum matrix based on the hyperspectral image 1 to be detected; carrying out overlapped spectrum division on each spectrum in the fusion spectrum matrix to obtain the overlapped patches into which each spectrum is divided; mapping each overlap-divided spectrum into a feature vector through the linear embedding layer; forming the feature vectors corresponding to all spectra in the fusion spectrum matrix into a sequence; splicing a learnable global embedding at the front of the sequence; adding the resulting sequence and the learnable position embedding element by element to obtain the input sequence; and inputting this sequence into the online encoder trained in step five-one to obtain the embedded vector; step seven-two, mapping the prior spectrum of the second target class of the hyperspectral image 1 to be detected to an embedded vector by the same process as step seven-one; step seven-three, mapping the prior spectrum of the first target class of the hyperspectral image 2 to be detected to an embedded vector, the process differing from step seven-one only in that the original spectrum matrix is constructed based on the hyperspectral image 2 to be detected and the sequence is input into the online encoder trained in step five-two; step seven-four, mapping the prior spectrum of the second target class of the hyperspectral image 2 to be detected to an embedded vector by the same process as step seven-three.
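The sequence construction repeated throughout claim 7 (overlapped division of a spectrum into patches, linear embedding, prepending a learnable global embedding, adding a learnable position embedding) can be sketched in numpy as follows; the parameter names and shapes are assumptions for illustration, not the patent's notation:

```python
import numpy as np

def embed_spectrum(spectrum, patch_len, stride, W_embed, cls_token, pos_embed):
    """Build the encoder input sequence for one pixel spectrum (sketch).

    spectrum  : (B,) one spectrum from the fusion spectrum matrix
    patch_len : length of each overlapped patch
    stride    : step between patch starts (stride < patch_len => overlap)
    W_embed   : (patch_len, d) linear embedding layer weights
    cls_token : (d,) learnable global embedding spliced at the front
    pos_embed : (n_patches + 1, d) learnable position embedding
    Returns the (n_patches + 1, d) sequence fed to the online encoder.
    """
    starts = range(0, len(spectrum) - patch_len + 1, stride)
    patches = np.stack([spectrum[s:s + patch_len] for s in starts])
    tokens = patches @ W_embed           # map each patch to a feature vector
    seq = np.vstack([cls_token, tokens])  # prepend the global embedding
    return seq + pos_embed                # element-wise position embedding
```

With `stride < patch_len` adjacent patches share bands, which is the "overlapped spectrum division" of the claims.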
- 8. The Transformer-based unsupervised span recognition pairing and full-spectrum synthesis method as claimed in claim 7, wherein in step eight the cosine similarities between each pixel spectrum of the features of the two hyperspectral images to be detected and the embedded prior vectors of their target classes are calculated, target response maps are generated based on the cosine similarities corresponding to the spectra of all pixels, each pixel in a target response map is normalized to obtain its response value, and a threshold value is set: if the response value of a pixel is greater than or equal to the threshold, the pixel belongs to the corresponding target class; if the response value is smaller than the threshold, it does not; the pixels corresponding to each detected target are marked with rectangular boxes; the specific process is as follows: step eight-one, detecting the two target classes of the hyperspectral image 1 to be detected, the specific process being: step eight-one-one, calculating the cosine similarity between each pixel spectrum x_i of the feature of the hyperspectral image 1 and the embedded prior vector v of the first target class, expressed as: cos(x_i, v) = (x_i · v) / (‖x_i‖ · ‖v‖); arranging the cosine similarities of all pixel spectra row by row in the spatial order of the pixels, moving to the next row whenever a row is filled to the image width, until all cosine similarities are placed, thereby obtaining a two-dimensional target response map; normalizing each pixel in the target response map to obtain the response value of each pixel; setting a threshold value; if the response value of a pixel is greater than or equal to the threshold, the pixel belongs to the first target class, otherwise it does not; step eight-one-two, calculating the cosine similarity between each pixel spectrum of the feature and the embedded prior vector of the second target class, and generating, normalizing and thresholding the two-dimensional target response map in the same way as step eight-one-one; step eight-one-three, marking the pixels corresponding to the targets detected in step eight-one-one and step eight-one-two with rectangular boxes; step eight-two, detecting the two target classes of the hyperspectral image 2 to be detected, the specific process being: step eight-two-one, calculating the cosine similarity between each pixel spectrum of the feature of the hyperspectral image 2 and the embedded prior vector of its first target class, and generating, normalizing and thresholding the two-dimensional target response map as in step eight-one-one; step eight-two-two, repeating step eight-two-one with the embedded prior vector of the second target class of the hyperspectral image 2; step eight-two-three, marking the pixels corresponding to the targets detected in step eight-two-one and step eight-two-two with rectangular boxes.
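The per-class detection of claim 8 (cosine similarity of every pixel spectrum against an embedded prior vector, reshaped into a 2-D response map, min-max normalized, then thresholded) can be sketched in numpy as follows; the argument names, the min-max normalization choice, and the default threshold are illustrative assumptions:

```python
import numpy as np

def target_response_map(pixels, prior_embed, height, width, thresh=0.8):
    """Detect target pixels by cosine similarity to a prior embedding (sketch).

    pixels      : (H*W, d) per-pixel embeddings/spectra, row-major order
    prior_embed : (d,)     embedded prior spectrum of one target class
    thresh      : detection threshold on the normalized response value
    Returns the (H, W) normalized response map and a boolean detection mask.
    """
    p = pixels / np.linalg.norm(pixels, axis=1, keepdims=True)
    t = prior_embed / np.linalg.norm(prior_embed)
    cos = p @ t                             # cosine similarity per pixel
    resp = cos.reshape(height, width)       # 2-D target response map
    resp = (resp - resp.min()) / (resp.max() - resp.min() + 1e-12)
    return resp, resp >= thresh             # True -> pixel belongs to target
```

A bounding rectangle around each connected True region of the mask then gives the rectangular boxes the claim marks.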
- 9. The Transformer-based unsupervised span recognition pairing and full-spectrum synthesis method as claimed in claim 8, wherein in step nine group features are respectively calculated from the pixel spectrum matrices inside the rectangular boxes of the two targets of the hyperspectral image 1 to be detected, and Z-score standardization is performed on the group features to obtain the group spectral feature vectors of the two target classes of the standardized hyperspectral image 1 to be detected; group features are likewise respectively calculated from the pixel spectrum matrices inside the rectangular boxes of the two targets of the hyperspectral image 2 to be detected and standardized to obtain the group spectral feature vectors of the two target classes of the standardized hyperspectral image 2 to be detected; the cosine distances between the group spectral feature vectors of the standardized hyperspectral image 1 to be detected and those of the hyperspectral image 2 to be detected are calculated, a cost matrix is formed based on the cosine distances, and the Hungarian algorithm is applied to the cost matrix to solve for the optimal matching, the optimal matching comprising the class correspondence, the spectral distance and the matching confidence; the specific process is as follows: step nine-one, respectively calculating the group features from the pixel spectrum matrices inside the rectangular boxes of the two targets of the hyperspectral image 1 to be detected and performing Z-score standardization on them, the specific process being: 1) the group features comprise a mean spectrum, a standard-deviation spectrum, a coefficient of variation and a first-derivative mean; the mean spectrum is μ_b = (1/N) Σ_{i=1}^{N} x_{i,b}, wherein x_{i,b} denotes band b of one pixel within all target boxes of the same target class in a single image and N denotes the total number of pixels of that class in the image; the standard-deviation spectrum is σ_b = sqrt( (1/N) Σ_{i=1}^{N} (x_{i,b} − μ_b)² ); the coefficient of variation is CV_b = σ_b / μ_b; the first-derivative mean is d = (1/(B−1)) Σ_{b=1}^{B−1} (μ_{b+1} − μ_b), wherein μ_{b+1} and μ_b denote the mean values of the (b+1)-th and b-th bands of the same target class in one image and B denotes the number of bands; the mean spectrum, the standard-deviation spectrum, the coefficient of variation and the first-derivative mean are concatenated into one vector to form the group features; 2) Z-score standardization is performed on the group features to obtain the group spectral feature vectors of the standardized hyperspectral image 1 to be detected, one feature vector being obtained for each class of the hyperspectral image 1; step nine-two, respectively calculating the group features from the pixel spectrum matrices inside the rectangular boxes of the two targets of the hyperspectral image 2 to be detected and performing Z-score standardization on them, one feature vector being obtained for each class of the hyperspectral image 2; step nine-three, calculating the cosine distances between the group spectral feature vectors of the standardized hyperspectral image 1 to be detected and those of the hyperspectral image 2 to be detected, forming a cost matrix based on the cosine distances, and applying the Hungarian algorithm to the cost matrix to solve for the optimal
matching, the optimal matching comprising the class correspondence, the spectral distance and the matching confidence; the specific process is as follows: step nine-three-one, calculating the cosine distances between the group spectral feature vectors of the standardized hyperspectral image 1 to be detected and those of the standardized hyperspectral image 2 to be detected, and forming a cost matrix D based on the cosine distances, the specific process being: the cost matrix D is a 2×2 matrix, D = [[D_11, D_12], [D_21, D_22]], wherein D_11 denotes the cosine distance between the 1st group feature vector of the standardized hyperspectral image 1 to be detected and the 1st group feature vector of the standardized hyperspectral image 2 to be detected; D_12 denotes the cosine distance between the 1st group feature vector of the standardized hyperspectral image 1 and the 2nd group feature vector of the standardized hyperspectral image 2; D_21 denotes the cosine distance between the 2nd group feature vector of the standardized hyperspectral image 1 and the 1st group feature vector of the standardized hyperspectral image 2; D_22 denotes the cosine distance between the 2nd group feature vector of the standardized hyperspectral image 1 and the 2nd group feature vector of the standardized hyperspectral image 2; step nine-three-two, applying the Hungarian algorithm to the cost matrix D to solve for the optimal matching, the specific process being: 1) performing row normalization on each element of the cost matrix D to obtain the row-normalized cost matrix; 2) performing column normalization on each element of the row-normalized cost matrix to obtain the column-normalized cost matrix; 3) processing the column-normalized cost matrix by searching for the minimum number of covering lines; if the minimum number of covering lines equals the order of the column-normalized cost matrix, namely 2, the positions of the current zero elements in the column-normalized cost matrix constitute the optimal matching; otherwise, repeating 1)-3) until the optimal matching is obtained.
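The group-feature extraction and class pairing of claim 9 can be sketched in numpy as follows. The feature layout and the tiny epsilon terms are assumptions; for the 2×2 cost matrix of the claim, exhaustive search over permutations yields the same optimal assignment as the Hungarian algorithm, so it stands in for the row/column-reduction procedure here:

```python
import numpy as np
from itertools import permutations

def group_features(spectra):
    """Group spectral features of one target class (sketch of step nine-one).

    spectra : (N, B) pixel spectra inside all boxes of one class.
    Returns mean spectrum, standard-deviation spectrum, coefficient of
    variation, and first-derivative mean, concatenated into one vector.
    """
    mu = spectra.mean(axis=0)            # mean spectrum per band
    sigma = spectra.std(axis=0)          # standard-deviation spectrum
    cv = sigma / (mu + 1e-12)            # coefficient of variation
    d1 = np.diff(mu).mean()              # first-derivative mean
    return np.concatenate([mu, sigma, cv, [d1]])

def zscore(v):
    """Z-score standardization of a group feature vector."""
    return (v - v.mean()) / (v.std() + 1e-12)

def match_classes(feats1, feats2):
    """Optimal class pairing by cosine-distance cost + exhaustive assignment."""
    def cosdist(a, b):
        return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    n = len(feats1)
    cost = np.array([[cosdist(feats1[i], feats2[j]) for j in range(n)]
                     for i in range(n)])
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i, p[i]] for i in range(n)))
    return cost, list(enumerate(best))   # [(class in img1, class in img2), ...]
```

The matched cost values serve directly as the spectral distances of the claim, and a monotone function of them (e.g. 1 minus the distance) can play the role of the matching confidence.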
- 10. The Transformer-based unsupervised span recognition pairing and full-spectrum synthesis method as claimed in claim 9, wherein the full-spectrum hyperspectral image is obtained based on the optimal matching obtained in step nine; the specific process is as follows: 1) covering the rectangular box areas of the targets in the two hyperspectral images with masks; performing Canny edge detection on one of the two hyperspectral images to obtain a first binarized edge image, and on the other to obtain a second binarized edge image; performing Shi-Tomasi corner detection on the first binarized edge image to obtain a feature point set P1, and on the second to obtain a feature point set P2; screening initial matching pairs from the feature point sets P1 and P2 with the FLANN method and removing ambiguous matching pairs to obtain an initial matching pair set; rejecting the outliers of the initial matching pair set with the RANSAC method to obtain the inlier set and the corresponding affine matrix; performing affine transformation on one of the two hyperspectral images with the affine matrix, thereby achieving ROI spatial alignment of the two hyperspectral images; 2) overlapping bands 81-120 of the ROI-aligned visible-light image with bands 1-40 of the short-wave infrared image, and processing the overlapping bands with a weighted-average method to obtain the full-spectrum hyperspectral image, the specific process being: 21) calculating the weighted value of each pixel point, expressed as: I(x) = (SNR_vis · I_vis(x) + SNR_swir · I_swir(x)) / (SNR_vis + SNR_swir), wherein SNR_vis is the sensor signal-to-noise ratio of the visible-light image, SNR_swir is the sensor signal-to-noise ratio of the short-wave infrared image, I_vis(x) is a pixel point of the visible-light image, I_swir(x) is a pixel point of the short-wave infrared image, and I(x) is the weighted value of the pixel point; 22) smoothly eliminating the spectral discontinuities at the joints between the overlapping area and the non-overlapping areas of the bands with five-point Gaussian filtering to obtain the full-spectrum hyperspectral image, the specific process being: the kernel window size of the five-point Gaussian filter is 5 with fixed Gaussian kernel weights; after bands 81-120 of the ROI-aligned visible-light image are overlapped with bands 1-40 of the short-wave infrared image, band 80 of the visible-light image and band 41 of the short-wave infrared image are the two joint positions; 2 bands before and after each joint position are selected as the filtering area, and a weighted average with the five-point Gaussian kernel is performed on each pixel in the filtering area, thereby obtaining the smoothed, joint-treated band 80 of the visible-light image and band 41 of the short-wave infrared image; splicing bands 1-80 of the joint-treated visible-light image, the weighted values of all pixel points, and bands 41-120 of the joint-treated short-wave infrared image to obtain the full-spectrum hyperspectral image with complete bands 1-200; 3) extracting the mean spectrum of each target of the hyperspectral image 1 and the mean spectrum of each target of the hyperspectral image 2 from the targets in the two hyperspectral images; processing the mean spectrum of each target of the hyperspectral image 1 together with the mean spectrum of its matched target of the hyperspectral image 2 by step 2) to obtain a pixel with bands 1-200 for each matched class; inserting the obtained 2 pixels with bands 1-200 into the mask positions of the corresponding target classes in the full-spectrum hyperspectral image, thereby obtaining the full-spectrum hyperspectral image containing the targets.
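The band splicing of claim 10 can be sketched in numpy as follows. The SNR-weighted combination rule and the binomial [1, 4, 6, 4, 1]/16 kernel are assumptions (the patent's exact weights were lost in extraction), and the spatial registration step (Canny/Shi-Tomasi/FLANN/RANSAC) is taken as already done:

```python
import numpy as np

def fuse_full_spectrum(vis, swir, snr_vis, snr_swir):
    """Splice a 200-band cube from registered visible and SWIR cubes (sketch).

    vis  : (H, W, 120) visible-light cube; bands 81-120 overlap SWIR bands 1-40
    swir : (H, W, 120) short-wave infrared cube, spatially registered to vis
    The overlap is combined by an SNR-weighted average; the two seams are
    then smoothed along the band axis with a 5-point Gaussian kernel.
    """
    w = snr_vis / (snr_vis + snr_swir)        # assumed SNR weighting rule
    overlap = w * vis[..., 80:120] + (1 - w) * swir[..., 0:40]
    cube = np.concatenate([vis[..., :80], overlap, swir[..., 40:]], axis=-1)

    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # assumed 5-point kernel
    smoothed = cube.copy()
    for seam in (79, 119):                    # 0-based indices of bands 80, 120
        for b in range(seam - 2, seam + 3):   # 2 bands either side of each seam
            smoothed[..., b] = np.tensordot(cube[..., b - 2:b + 3], k,
                                            axes=([-1], [0]))
    return smoothed                           # (H, W, 200) full-spectrum cube
```

Bands far from the two seams pass through unchanged; only the filtering areas around the joints are re-weighted, matching the claim's "2 bands before and after each joint position".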
Description
Transformer-based unsupervised span identification pairing and full-spectrum synthesis method Technical Field The invention relates to the technical field of hyperspectral image processing and computer vision, and in particular to a span (cross-spectral-band) recognition pairing and full-spectrum synthesis method. Background With the continuous development of remote sensing, unmanned aerial vehicle and sensor technology, hyperspectral imaging is becoming an important means of target identification and monitoring. Hyperspectral images are characterized by extremely high spectral resolution and can acquire the spectral information of a target in hundreds of continuous bands; compared with traditional RGB or multispectral images, they therefore have unmatched advantages in material discrimination, fine recognition and detection of tiny differences in target objects. Currently, Earth observation is undergoing a profound shift from traditional single-satellite detailed survey to multi-satellite networked collaborative observation. A single satellite, limited by its orbital revisit period, can hardly achieve high-frequency revisits of a specific area and cannot meet the demand for continuous monitoring of dynamic targets. A satellite cluster or constellation, through the cooperative deployment of multiple satellites in different orbits and self-organizing networking, can greatly improve the temporal resolution of observation and realize wide-area inspection, regional census and tracking monitoring of key regions. This technical evolution has enabled "relay" continuous tracking of dynamic targets.
However, this also presents a new technical challenge: after a target is captured by satellite A in the constellation at a certain time and within a certain spectral range, how to accurately re-identify it and synthesize full-spectrum data in subsequent images acquired by satellite B, which may differ in spectral range, target pose, viewing angle, illumination, and appearance owing to changes in atmospheric conditions. This challenge involves several difficulties. Spectral features are fine, similar targets are hard to distinguish, and traditional methods struggle to capture local spectral details. Existing deep-learning-based hyperspectral target detection methods are mostly supervised and require a large number of labeled target and background samples; however, labeling hyperspectral images is costly and difficult, and accurately labeled data from real scenes are scarce, which limits the applicability of such methods in unlabeled scenes and makes it hard to meet practical application requirements. Spectra acquired by different sensors (such as visible light and infrared) suffer from viewing-angle differences, illumination differences and atmospheric attenuation, making data fusion difficult. The prior art struggles to effectively extract features within extremely small neighborhoods and to complete high-precision cross-band matching and fusion.
Disclosure of Invention The invention aims to solve the problems in the prior art that hyperspectral targets are detected with low precision, weak spectral differences are hard to distinguish, and different spectral bands are difficult to fuse, and provides a Transformer-based unsupervised span recognition pairing and full-spectrum synthesis method for realizing high-precision target detection, cross-band target class matching and full-spectrum data generation. The Transformer-based unsupervised span recognition pairing and full-spectrum synthesis method comprises the following specific process: step one, acquiring a hyperspectral image 1 to be detected and constructing an original spectrum matrix based on the hyperspectral image 1; acquiring a hyperspectral image 2 to be detected and constructing an original spectrum matrix based on the hyperspectral image 2; the hyperspectral image 1 to be detected is a visible-light image and the hyperspectral image 2 to be detected is a short-wave infrared image; two classes of targets to be detected exist in the hyperspectral image 1 to be detected, and two classes of targets to be detected exist in the hyperspectral image 2 to be detected; step two, for each original spectrum matrix, constructing a space-spectrum attention weight matrix based on the neighborhood spectrum relation and obtaining a fusion spectrum matrix based on the space-spectrum attention weight matrix; step three, fusing spec