CN-122023975-A - SAR long-tail image target classification method based on topological structure
Abstract
The invention relates to a SAR long-tail image target classification method based on a topological structure, which comprises the following steps: extracting a backbone network feature map from the SAR image; performing frequency-aware feature decomposition on the initial features; extracting a stable geometric prior with a structural branch; learning texture semantics with reliability weights using a noise-aware texture branch; and finally predicting category-dependent fusion weights with a dynamic tail class balancing module and inputting the fused representation into a classifier to classify the target in the SAR image. The application achieves explicit decoupling and information retention of structure and texture through a frequency-domain-constrained structural branch and a noise-aware texture branch, adaptively adjusts the fusion proportion of the two branches according to category density and representation stability, and provides a tail-class-oriented, structure-driven prototype system that forms a stable semantic structure, thereby markedly improving the recognition performance of tail classes with few samples.
Inventors
- WANG RUIQI
- LU MINGMING
- TAO CHAO
- LI HAIFENG
- SONG WEI
- YANG SHUO
- ZHANG ZHEXUAN
Assignees
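- Central South University (中南大学)
- Aerospace Science and Industry Intelligent Operations Research and Information Security Research Institute (Wuhan) Co., Ltd. (航天科工智能运筹与信息安全研究院(武汉)有限公司)
- Beijing Electro-Mechanical Engineering Institute (北京机电工程研究所)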
Dates
- Publication Date: 2026-05-12
- Application Date: 2025-12-12
Claims (6)
- 1. A SAR long-tail image target classification method based on a topological structure, characterized by comprising the following steps: a backbone network extracts a feature map from the input SAR image, the image being described by its spatial height, width and channel dimensions; taking this backbone feature map as the initial feature, a frequency-aware feature decomposition is performed, in which a frequency-domain decomposition operation splits the input feature into two parts, a structure-dominant feature response and a texture-dominant feature response; a structural branch then extracts a stable geometric prior; a texture branch with noise perception capability learns texture semantics carrying reliability weights; finally, a dynamic tail class balancing module predicts category-dependent fusion weights for each class and uses them to form a fused representation; the fused representation is input into a classifier for target classification of the SAR image.
- 2. The topological structure-based SAR long-tail image target classification method according to claim 1, wherein performing the frequency-aware feature decomposition specifically comprises the following steps: the initial feature is transformed into the frequency domain by a two-dimensional discrete Fourier transform; a learnable low-pass mask is obtained from a lightweight CNN followed by a sigmoid function; the structure-dominant feature is obtained by multiplying the frequency-domain feature element-wise by the low-pass mask and applying the inverse transform, which returns the structural information in the frequency domain to the spatial domain; a regularization term encourages the low-pass mask to maintain consistent low-frequency filtering behavior and suppresses high-frequency leakage, thereby preventing false texture components from entering the structural branch; the strongest scattering centers are extracted and a topological graph is constructed, wherein the node set comprises the spatial coordinates of the scattering centers, each entry of the adjacency matrix is defined from the spatial coordinates of the corresponding pair of scattering centers, and a hyper-parameter controls the decay rate of the similarity in the adjacency matrix; global structural relationships are encoded using a graph neural network (GNN), wherein the encoding involves a multi-scale structural feature pyramid; to address the problem that tail classes have unstable structural semantics due to insufficient samples, a consistency constraint loss based on an exponential moving average is added, which is computed on the structural feature extracted by the structural branch, that is, the geometric information the model extracts through the structural branch, and which uses a gradient-freezing (stop-gradient) operator so that the frozen structural feature is not changed during training; this loss is used to stabilize the representation of the structural feature during training.
- 3. The topological structure-based SAR long-tail image target classification method according to claim 2, wherein the texture branch learning, with noise perception capability, texture semantics with reliability weights specifically comprises the following steps: band-pass texture decomposition is performed using a complementary, learnable band-pass mask obtained from a lightweight CNN followed by a sigmoid function, thereby obtaining the texture-dominant feature; to ensure that the band-pass filter has an effective passband, a regularization term is added, which enhances the band-pass characteristics of the filter and thereby helps distinguish texture details; the texture-dominant feature is encoded using a dual-path residual encoder, wherein one path models local high-frequency features through depthwise separable convolution and the other applies non-local attention, that is, attention to the relationships between distant pixels in the image, to enhance global perception; to attenuate the noise response, a reliability score is calculated for each token by a multi-layer perceptron from the feature representation of that token in the texture branch, each token corresponding to a local region of the image; the final texture representation is obtained from the token features and their reliability scores; positive sample pairs are constructed from reliable tokens and negative sample pairs from unreliable or cross-class tokens, and a loss function is defined over the texture features of the paired samples using a similarity function, a temperature parameter that controls the smoothness of the similarity, and an index running over the negative samples, that is, the dissimilar tokens; this loss function prevents the model from imitating tail samples dominated by noise.
- 4. The topological structure-based SAR long-tail image target classification method according to claim 3, wherein the dynamic tail class balancing module performs the following steps: for each category, a normalized density is computed from the number of samples of that category and the numbers of samples of the categories in the dataset; the stability of each category on the structural features and on the texture features is calculated from the variance of its structural features and the variance of its texture features, respectively; the category statistics are mapped to fusion weights, namely the weight of category c on the structural branch feature and the weight of category c on the texture branch feature, such that tail classes depend more on structural features and head classes depend more on texture features; the texture representation and the structural relationships are fused with these weights to obtain a task-adaptive category representation; this category representation is used to directly alleviate the feature collapse problem of the tail classes.
- 5. The topological structure-based SAR long-tail image target classification method according to claim 3, wherein the dynamic tail class balancing module performs the following steps: for each category, the number of samples is first counted; the intrinsic stability of the texture features and of the structure is measured by variance at the batch level, using the variance of the structural features and the variance of the texture features of that category; the fusion weights are calculated by a function g(·) implemented as a two-layer MLP with softmax normalization; the texture representation and the structural relationships are fused with these weights, and the resulting fused feature representation is input into the classifier.
- 6. The topological structure-based SAR long-tail image target classification method according to claim 5, wherein the structural features guide the construction of class prototypes, and the prototypes are further refined through a multi-stage geometric aggregation and diffusion process, specifically comprising: for each sample of a category, its structural feature and its texture feature are taken; variance-weighted structure pooling is performed, in which the pooled structural representation of the category is defined using a stability weight calculated from the mean and variance of the structural features of the category, and the pooled result is taken as the structural anchor point of the category; within each class c, a kNN graph is built on the structural features, wherein the nodes of the graph are the structural features of all samples in class c and the adjacency matrix is a similarity matrix between the samples in class c, constructed by a similarity measure; geometric similarity is encoded in the edge weights, which are computed from the structural features of each pair of samples in class c, with a hyper-parameter controlling the smoothness of the Gaussian similarity; a refined structural prototype is then obtained using a learnable parameter; the structural affinity between class prototypes is calculated such that classes k with similar geometric structures obtain higher affinity weights; tail classes borrow structural information from geometrically similar classes through diffusion with a diffusion coefficient γ' and normalized cross-class affinity weights; given the final structural prototype, the fused feature is aligned to the prototype of its class to enhance representation learning: for each sample, whose class label denotes the class c to which it belongs, a prototype alignment loss is calculated; the output of the classifier is further adjusted according to prototype similarity, wherein a coefficient controls the strength of the prototype-based enhancement and a weight vector is used to weight and fuse the features and thereby influences the output of the classifier; the complete loss function includes the structural consistency, texture discrimination and prototype alignment terms, whose contributions are balanced by three respective coefficients.
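By way of non-limiting illustration, the frequency-aware decomposition of claims 1 and 2 (two-dimensional DFT, element-wise low-pass masking, inverse transform) might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the fixed radial sigmoid mask stands in for the learnable mask that the claims obtain from a lightweight CNN, the texture part is taken as the plain residual rather than the separately learned band-pass output of claim 3, and the names `soft_lowpass_mask`, `frequency_decompose`, `cutoff` and `sharpness` are hypothetical.

```python
import numpy as np


def radial_frequency(h, w):
    """Normalised radial frequency of every 2-D DFT bin (0 at the DC bin)."""
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    return np.sqrt(fy ** 2 + fx ** 2)


def soft_lowpass_mask(h, w, cutoff=0.1, sharpness=50.0):
    """Sigmoid-shaped low-pass mask with values in (0, 1).

    Assumption: a fixed radial profile replaces the CNN-predicted mask.
    """
    r = radial_frequency(h, w)
    return 1.0 / (1.0 + np.exp(sharpness * (r - cutoff)))


def frequency_decompose(x, mask):
    """Split an (H, W, C) feature map into structure and texture parts."""
    spectrum = np.fft.fft2(x, axes=(0, 1))           # 2-D DFT per channel
    masked = spectrum * mask[:, :, None]             # element-wise masking
    structure = np.fft.ifft2(masked, axes=(0, 1)).real
    texture = x - structure                          # residual (assumption)
    return structure, texture
```

Because the mask depends only on the radial frequency, it is symmetric under frequency negation, so the inverse transform of a real input is real up to rounding and taking `.real` discards only numerical noise.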
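The topological graph of claim 2 (nodes at the strongest scattering centers, adjacency decaying with spatial distance) can be illustrated in the same hedged spirit. The sketch below assumes a Gaussian kernel on squared Euclidean distance for the adjacency matrix and simply takes the k largest pixels as scattering centers; the extracted text does not preserve the actual formula or the peak-extraction rule, and `scattering_graph`, `k` and `sigma` are hypothetical names.

```python
import numpy as np


def scattering_graph(amplitude, k=5, sigma=4.0):
    """Build (coords, adjacency) from the k strongest pixels of a 2-D map.

    coords    : (k, 2) array of (row, col) positions, strongest first.
    adjacency : (k, k) Gaussian similarity of the positions; sigma sets
                how quickly similarity decays with distance (assumption).
    """
    flat_idx = np.argsort(amplitude, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat_idx, amplitude.shape)
    coords = np.stack([rows, cols], axis=1).astype(float)

    diff = coords[:, None, :] - coords[None, :, :]   # pairwise offsets
    sq_dist = (diff ** 2).sum(axis=-1)
    adjacency = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return coords, adjacency
```

The resulting adjacency matrix is symmetric with a unit diagonal and could be passed to any graph neural network layer; a smaller `sigma` yields a sparser effective graph.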
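For the reliability-weighted texture representation and the positive/negative pair loss of claim 3, one plausible reading is reliability-weighted token pooling followed by an InfoNCE-style loss with cosine similarity and a temperature. The sketch below rests on those assumptions; the source does not preserve the exact formulas, the raw scores would in the claimed method come from a multi-layer perceptron, and `reliability_pool`, `info_nce` and `tau` are hypothetical names.

```python
import numpy as np


def reliability_pool(tokens, raw_scores):
    """Average (N, D) token features weighted by sigmoid reliability scores."""
    r = 1.0 / (1.0 + np.exp(-raw_scores))            # reliability in (0, 1)
    return (r[:, None] * tokens).sum(axis=0) / r.sum()


def cosine(a, b, eps=1e-8):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style loss: low when the anchor is close to the positive
    token and far from every negative token (assumed loss form)."""
    pos = cosine(anchor, positive) / tau
    neg = np.array([cosine(anchor, n) / tau for n in negatives])
    logits = np.concatenate([[pos], neg])
    m = logits.max()                                 # stabilise log-sum-exp
    log_denominator = m + np.log(np.exp(logits - m).sum())
    return log_denominator - pos
```

Tokens with a strongly negative raw score contribute almost nothing to the pooled representation, which is the intended noise-attenuation effect.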
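The dynamic tail class balancing module of claims 4 and 5 maps per-class statistics (sample density and feature variances) to a pair of fusion weights. The sketch below is illustrative only: it normalises density by the largest class count and replaces the two-layer MLP g(·) with a hand-written softmax over two logits, chosen so that low-density (tail) classes lean on the structural branch. Both choices, and the names `class_statistics`, `fusion_weights` and `fuse`, are assumptions rather than the claimed formulas.

```python
import numpy as np


def class_statistics(struct_feats, tex_feats, labels, num_classes):
    """Per-class normalised density and mean feature variance per branch."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    density = counts / counts.max()                  # assumption: max-normalised
    var_s = np.zeros(num_classes)
    var_t = np.zeros(num_classes)
    for c in range(num_classes):
        idx = labels == c
        if idx.sum() > 1:                            # variance needs >= 2 samples
            var_s[c] = struct_feats[idx].var(axis=0).mean()
            var_t[c] = tex_feats[idx].var(axis=0).mean()
    return density, var_s, var_t


def fusion_weights(density, var_s, var_t):
    """Softmax over (structure, texture) logits; stand-in for the MLP g."""
    logits = np.stack([(1.0 - density) - var_s, density - var_t], axis=1)
    logits -= logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)          # column 0: structure


def fuse(struct_feats, tex_feats, labels, weights):
    """Fuse branch features with the weights of each sample's class."""
    w = weights[labels]
    return w[:, :1] * struct_feats + w[:, 1:] * tex_feats
```

In this sketch the weights are looked up from training labels; how the weights are selected at inference time is not recoverable from the extracted claims.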
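The prototype construction of claim 6 (variance-weighted pooling into a structural anchor, then cross-class diffusion so that tail classes borrow from geometrically similar classes) might be sketched as below. The stability weight, the cosine-softmax affinity and the convex mixing rule are all assumed forms, since the original formulas did not survive extraction; the kNN refinement step and the learnable parameter are omitted, and `class_prototypes`, `diffuse_prototypes`, `gamma` and `tau` are hypothetical names.

```python
import numpy as np


def class_prototypes(struct_feats, labels, num_classes, eps=1e-6):
    """Variance-weighted pooling: samples near the class mean weigh more."""
    dim = struct_feats.shape[1]
    protos = np.zeros((num_classes, dim))
    for c in range(num_classes):
        x = struct_feats[labels == c]
        if len(x) == 0:
            continue                                 # empty class stays zero
        mu = x.mean(axis=0)
        total_var = x.var(axis=0).sum()
        sq_dev = ((x - mu) ** 2).sum(axis=1)
        w = np.exp(-sq_dev / (2.0 * (total_var + eps)))  # stability weight
        protos[c] = (w[:, None] * x).sum(axis=0) / w.sum()
    return protos


def diffuse_prototypes(protos, gamma=0.3, tau=1.0, eps=1e-8):
    """Mix each prototype with an affinity-weighted average of the others."""
    unit = protos / (np.linalg.norm(protos, axis=1, keepdims=True) + eps)
    sim = unit @ unit.T / tau                        # cosine affinity
    np.fill_diagonal(sim, -np.inf)                   # exclude self-affinity
    sim -= sim.max(axis=1, keepdims=True)
    aff = np.exp(sim)
    aff /= aff.sum(axis=1, keepdims=True)            # normalised per class
    return (1.0 - gamma) * protos + gamma * aff @ protos
```

The diffusion step requires at least two classes, because each row of the affinity matrix is normalised over the other classes only.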
Description
SAR long-tail image target classification method based on topological structure
Technical Field
The invention belongs to the technical field of radar target recognition, and particularly relates to a SAR long-tail image target classification method based on a topological structure.
Background
Synthetic aperture radar (SAR) target recognition has long been a cornerstone of all-weather, day-and-night intelligent sensing systems. With the rapid deployment of SAR sensors in monitoring, mapping and autonomous platforms, the need for reliable automatic target recognition (ATR) has increased significantly. Recent deep-learning-based SAR ATR frameworks have made significant progress. However, their effectiveness in real environments is still severely limited by long-tailed class distributions: a few head classes account for the vast majority of samples, while many classes that are very important in practical tasks appear only sparsely.
Traditional long-tail learning methods typically address data imbalance through class-balanced weighting, decoupling of representation and classifier, dual-branch balancing, decision-boundary adjustment, or distribution-aware modeling. However, these techniques are primarily designed for natural images and implicitly assume that minority classes suffer only from insufficient samples. This assumption does not hold in SAR imaging. Unlike optical images, the imaging characteristics of SAR result from the combined effects of complex factors such as geometry, target body structure, electromagnetic scattering and speckle noise. It is generally accepted in the SAR field that SAR data naturally contains two distinct layers of information: structural information, which is physically stable, and texture information, which is highly sensitive to noise and imaging conditions.
This asymmetry leads to a critical but long-neglected phenomenon in long-tail SAR data: reliable texture information decays faster than structural information as class frequency decreases. Tail classes typically exhibit unstable, noise-dominated textures and therefore fail to form compact and meaningful semantic clusters. Furthermore, mainstream SAR ATR algorithms tend to learn holistic features without respecting the hierarchical nature of the SAR signal. Meanwhile, structure-texture decoupling, frequency-domain modeling and feature stabilization strategies, which have proven effective for natural images, remain little studied in SAR.
Disclosure of Invention
Long-tailed SAR target recognition is affected not only by data scarcity but, more critically, by a long-neglected information imbalance between structural stability and texture fragility. Existing resampling and re-weighting strategies distort the data distribution and risk destroying the physical authenticity of SAR data, while unified feature learning pipelines mix inherently heterogeneous structure and texture cues, so that noise-dominated textures dominate the feature representation of tail classes. To address this fundamental bottleneck, the present application treats long-tail SAR recognition as both a data imbalance problem and an information imbalance problem between robust structural information and fragile texture information. This perspective calls for a paradigm shift: SAR representations must be explicitly decomposed, selectively preserved and adaptively balanced according to their physical properties. To this end, the present application proposes the structure-texture dual-branch network (STDB), a new information-preserving paradigm that explicitly decouples structure and texture and models them in coordination.
A frequency-domain-constrained structural branch extracts robust geometric topology information, and a texture branch with noise perception capability suppresses unreliable scattering texture patterns through a self-discriminative learning mechanism. In addition, a dynamic tail class balancing module (DTBM) adaptively fuses the outputs of the two branches based on class density and the stability of the feature representations. The application also introduces the first tail-oriented, structure-driven prototype system, which constructs stable class prototypes from geometric information and thereby alleviates the long-standing instability of tail semantics. Extensive experiments on multiple long-tail SAR benchmarks show that the method significantly improves tail-class recognition accuracy and overall classification balance, and offers greater interpretability through cross-branch structure-texture cooperation. The application consists of two cooperative branches, namely a frequency-domain-constrained structural branch for extracting stable geometric structures and a texture branch with noise perception capability for f