
CN-121980501-A - Intelligent data identification and classification method based on multi-mode feature fusion

CN 121980501 A

Abstract

The invention belongs to the technical field of intelligent recognition and classification of multi-modal data, and discloses an intelligent data recognition and classification method based on multi-modal feature fusion. The method achieves accurate unified characterization of heterogeneous modalities through a semantic-anchored dual-space architecture and a two-stage alignment mechanism: the shared semantic space relies on the pre-trained semantic system of an LLM, with triplet anchors constraining the feature conversion direction, while the modality-specific space retains modality-unique attributes, and dynamic weight adjustment between the two spaces balances semantic consistency against specificity. Two-stage alignment combines coarse matching on semantic anchors with fine optimization using adaptive temperature coefficients, followed by spatio-temporal synchronization verification to correct timing offsets, avoiding semantic misalignment and error accumulation. A multi-dimensional credibility assessment system and a dynamic completion strategy are constructed: completion modes such as transfer learning, hybrid generation, or semantic guidance are matched according to credibility grade, and a quality feedback mechanism ensures the semantic validity of completed features, reducing semantic deviation after completion.

Inventors

  • Zhou Wei
  • Wang Guoxuan
  • Zhang Wenhuan
  • Lin Yaoting

Assignees

  • Guangdong Ocean University (广东海洋大学)

Dates

Publication Date
2026-05-05
Application Date
2026-01-20

Claims (10)

  1. An intelligent data identification and classification method based on multi-modal feature fusion, characterized by comprising an acquisition preprocessing stage, a feature enhancement distillation stage, a characterization alignment stage, an association feature enhancement stage, an evaluation completion stage, a fusion scheduling stage, and a classification evolution stage; the acquisition preprocessing stage comprises acquiring multi-source heterogeneous data, executing modality-adapted preprocessing strategies, screening and repairing data through semantic annotation and a four-dimensional quality grading system, and outputting standardized data; the feature enhancement distillation stage comprises constructing a dual-branch enhancement network, combining semantically guided feature screening with parameter optimization of dedicated extractors, and extracting single-modality key features; the characterization alignment stage comprises constructing a semantic-anchored dual-space architecture, constraining the feature conversion direction through triplet anchors, and dynamically adjusting the feature contribution weights of the two spaces based on a two-stage alignment mechanism and spatio-temporal verification for correcting timing offsets, to realize cross-modal semantically consistent characterization; the association feature enhancement stage comprises constructing an association graph model, mining high-order semantic associations, strengthening feature distinctiveness through a hybrid enhancement strategy, and obtaining enhanced cross-modal association features after confidence screening; the evaluation completion stage comprises establishing a multi-dimensional credibility assessment system, determining comprehensive weights through a combined weighting method, dynamically matching a completion strategy according to credibility, and combining prototype pool updating with a quality feedback mechanism to verify feature validity and output a multi-modal feature set; the fusion scheduling stage comprises constructing a cross-level closed-loop fusion architecture, dynamically adjusting the weight of each level by reinforcement learning through three-level progressive fusion across the original feature level, association feature level, and decision level, integrating semantic consistency and reliability indexes to optimize decisions, and outputting highly discriminative comprehensive features; and the classification evolution stage comprises optimizing classification output through multi-model ensembling and semantic verification, constructing a dynamic prototype pool learning mechanism, generating virtual samples conforming to the data distribution, dynamically updating parameters through a full-pipeline self-evolution closed loop, responding to scene changes and newly added categories, and outputting an intelligent data classification result.
  2. The intelligent data identification and classification method based on multi-modal feature fusion according to claim 1, wherein the acquisition preprocessing stage adopts a multi-channel synchronous acquisition module to acquire multi-source heterogeneous data including images, text, audio, and sensor signals, records acquisition-related information and modality type labels, constructs a heterogeneous collaborative preprocessing unit, and designs adaptive processing strategies for the characteristics of different modalities; an integrated cross-modal semantic labeling module generates unified semantic tags for data of all modalities through an LLM fine-tuned on domain corpora, converts sensor time-series features into interpretable semantic tags, and establishes a four-dimensional data quality grading system covering signal-to-noise ratio, integrity, consistency, and semantic consistency, wherein semantic consistency is quantified by the cosine similarity between the computed features and the semantic tags; and modality semantic migration repair is started for low-quality data, abnormal regions are located through an attention mechanism, and abnormal fluctuations of the low-quality modality are calibrated using semantic information of high-credibility modalities to obtain standardized multi-modal data.
  3. The intelligent data identification and classification method based on multi-modal feature fusion according to claim 2, wherein the feature enhancement distillation stage constructs a dual-branch feature enhancement network based on the standardized multi-modal data; the lower branch adopts adversarial enhancement with modality-specific adaptive parameters and follows the distribution rules of the modality data; the upper branch adopts LLM-semantically-guided attention distillation, constructs semantic anchors by combining core semantic keywords of the task scene, computes the mutual information between features and the semantic anchors, screens a core feature subset to remove redundant information, configures dedicated modality feature extractors, and extracts the semantically relevant features of each modality; and the parameters of the feature extractors are iteratively optimized by a parameter optimizer to generate single-modality key feature vectors.
  4. The intelligent data identification and classification method based on multi-modal feature fusion according to claim 3, wherein the characterization alignment stage takes the single-modality key feature vectors as input and constructs a semantic-anchored dual-space architecture to realize unified characterization of heterogeneous modalities, wherein the shared semantic space converts the key features of each modality into the LLM pre-trained semantic space, the conversion direction is constrained by object-category, action, and environment triplet semantic anchors whose anchor library is dynamically and iteratively updated, the modality-specific space retains the attribute features of each modality, and a dynamic weight adjustment layer is configured to realize bidirectional feature flow between the two spaces through a gating mechanism and to dynamically adjust the feature contribution weights of the two spaces according to the real-time data distribution; a two-stage cross-modal alignment mechanism is executed, wherein coarse alignment matches cross-modal semantic units based on semantic anchors, and fine alignment adopts a contrastive learning mechanism that introduces modality-adaptive temperature coefficients and dynamically adjusts inter-class and intra-class distance weights according to modality credibility; and timing offset deviations are corrected through a spatio-temporal semantic synchronization checking algorithm to obtain cross-modal features with unified characterization and dynamic alignment.
  5. The intelligent data identification and classification method based on multi-modal feature fusion according to claim 4, wherein the association feature enhancement stage constructs a dynamic association graph model based on the aligned cross-modal features, uses the single-modality key features and the semantic anchors generated by the LLM as a mixed node set in which semantic anchor nodes carry higher weights than single-modality feature nodes, and computes a comprehensive association strength based on semantic relatedness, spatio-temporal association, and credibility weight as the basis for edge weights; the graph model is adopted to realize efficient neighborhood aggregation, high-order cross-modal semantic associations are mined to form multi-modal semantic chains, and a prototype-and-semantics hybrid enhancement strategy is executed to fuse modality association prototypes, new association features, and semantic anchor features, strengthening differentiated association information; and an association confidence evaluation mechanism is introduced, the confidence of the association features is computed by inference, the confidence threshold is dynamically adjusted according to the difficulty of the classification task, and high-confidence associations are retained to obtain enhanced cross-modal association features.
  6. The intelligent data identification and classification method based on multi-modal feature fusion according to claim 5, wherein the evaluation completion stage combines the cross-modal association features and the single-modality key features to establish a multi-dimensional credibility assessment system comprising four dimensions: data integrity, feature quality, consistency with other modalities, and completion feasibility, wherein the comprehensive credibility weights are determined through the entropy weight method and the analytic hierarchy process, and completion feasibility is assessed by a prediction model trained on historical completion data; the completion strategy is dynamically matched with a prototype pool updating mechanism, wherein each credibility grade adopts a corresponding completion mode, and the generated missing-modality representations are incorporated into the corresponding prototype pool; and a completion quality feedback mechanism is established, validity is verified through the degree of matching between the completed features and the semantic anchors, and invalid completion triggers secondary completion or a modality-missing mark, so as to obtain the multi-modal feature set.
  7. The intelligent data identification and classification method based on multi-modal feature fusion according to claim 6, wherein the fusion scheduling stage takes the multi-modal feature set as the processing object and constructs a cross-level closed-loop fusion architecture to realize three-level progressive fusion; original-feature-level fusion performs element-wise weighted summation on the preprocessed original modality features, with weights jointly determined by modality credibility and semantic consistency, and a dynamic threshold is set so that the original feature weights are adjusted by backtracking when the feature distinctiveness after fusion falls below the threshold; decision-level fusion adopts a Bayesian weighted fusion strategy, models the uncertainty of the classification results of each modality based on probability distributions, and performs weighted computation combining the independent classification results of each modality with the cross-modal fused classification result, with weights positively correlated with the classification confidence and completion credibility of each modality; and a dynamic weight scheduler is configured, which takes classification accuracy, completion quality, and computational efficiency as multi-objective optimization directions, introduces the semantic consistency index, and dynamically adjusts the fusion weights of each level according to real-time modality states and classification task requirements to obtain a dynamically fused comprehensive feature vector.
  8. The intelligent data identification and classification method based on multi-modal feature fusion according to claim 7, wherein the classification evolution stage, based on the comprehensive feature vector, adopts a multi-model ensemble combined with a confidence calibration strategy, integrates the outputs of multiple base classifiers, optimizes the base classifier weights and calibrates the classification confidence, and integrates an LLM semantic verification module that scores the rationality of classification results and triggers secondary classification when the score does not reach the standard; a dynamic prototype pool learning mechanism is constructed to realize few-shot incremental learning, wherein the dynamic prototype pool comprises old-class prototypes, new-class prototypes, and completed-feature prototypes, the prototype pool is optimized by an elimination mechanism and a weight adjustment strategy, prototype distances are constrained by deep metric learning, virtual samples conforming to the data distribution are generated by a generative technique, and old-class knowledge is distilled through the prototype pool; and a full-pipeline self-evolution closed loop is established, which feeds classification error samples, newly added category data, and environmental changes back to the above steps, dynamically updates the key parameters of each module, adjusts the preprocessing parameters, feature extractor configuration, and fusion weights, adopts a distributed training framework to improve the efficiency of full-module parameter optimization, starts full-module parameter optimization when classification performance degrades or newly added categories reach a preset condition, adapts to dynamic data scenarios, and outputs the classification result.
  9. The intelligent data identification and classification method based on multi-modal feature fusion according to claim 8, wherein the dynamic prototype pool adopts a liveness-threshold screening mechanism to eliminate low-contribution prototypes whose liveness is lower than a preset liveness threshold.
  10. The intelligent data identification and classification method based on multi-modal feature fusion according to claim 9, wherein in the cross-level fusion weight adjustment, the weight ratio of the semantic consistency index is not lower than 20%.
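Claim 2 quantifies semantic consistency as the cosine similarity between a modality's features and its LLM-generated semantic tag, with the description setting a repair threshold of 0.6. The claims give no formulas, so the following is only a minimal sketch of that check; the function names and the embedding shapes are illustrative assumptions, not part of the patent.

```python
import numpy as np

def semantic_consistency(feature: np.ndarray, tag_embedding: np.ndarray) -> float:
    """Cosine similarity between a modality feature vector and the embedding
    of its LLM-generated semantic tag (claim 2's fourth quality dimension)."""
    num = float(np.dot(feature, tag_embedding))
    den = float(np.linalg.norm(feature) * np.linalg.norm(tag_embedding)) + 1e-12
    return num / den

def needs_repair(feature: np.ndarray, tag_embedding: np.ndarray,
                 threshold: float = 0.6) -> bool:
    """Flag low-quality data for modality semantic migration repair when the
    consistency score falls below the description's 0.6 threshold."""
    return semantic_consistency(feature, tag_embedding) < threshold
```

In practice the two vectors would come from a modality encoder and a text-embedding model respectively; here they are just raw NumPy arrays.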
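Claim 4's fine alignment uses contrastive learning with modality-adaptive temperature coefficients. The patent does not specify the loss; a common choice for such cross-modal alignment is an InfoNCE-style objective, sketched below under the assumption that higher modality credibility maps to a lower (sharper) temperature. Both function names and the temperature range are illustrative.

```python
import numpy as np

def info_nce_loss(anchors: np.ndarray, positives: np.ndarray,
                  temperature: float) -> float:
    """InfoNCE over L2-normalized cross-modal pairs: row i of `anchors`
    (one modality) should match row i of `positives` (another modality)."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (N, N) scaled similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))      # diagonal pairs are positives

def adaptive_temperature(credibility: float,
                         t_min: float = 0.05, t_max: float = 0.5) -> float:
    """Map modality credibility in [0, 1] to a temperature: a more credible
    modality gets a lower temperature, tightening intra-class distances."""
    return t_max - credibility * (t_max - t_min)
```

The per-modality temperature is the knob claim 4 describes: adjusting it changes how strongly inter-class and intra-class distances are weighted for that modality.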
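Claim 6 determines comprehensive credibility weights "through the entropy weight method and the analytic hierarchy process". The entropy-weight half is a standard objective weighting scheme and can be sketched directly; the AHP half requires expert pairwise comparisons and is omitted. The four-dimension interpretation of the input columns is the patent's; everything else (shapes, names) is illustrative.

```python
import numpy as np

def entropy_weights(X: np.ndarray) -> np.ndarray:
    """Entropy weight method: criteria whose values vary more across samples
    (lower entropy) receive larger weights. X is (n_samples, n_criteria)
    with non-negative benefit-type scores."""
    n = X.shape[0]
    col_sums = X.sum(axis=0, keepdims=True)
    p = X / np.where(col_sums == 0, 1, col_sums)   # column-normalized proportions
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(p > 0, p * np.log(p), 0.0)
    e = -plogp.sum(axis=0) / np.log(n)             # per-criterion entropy in [0, 1]
    d = 1.0 - e                                    # degree of divergence
    return d / d.sum()

def credibility_score(dims: np.ndarray, weights: np.ndarray) -> float:
    """Weighted sum over the four assessment dimensions (data integrity,
    feature quality, cross-modal consistency, completion feasibility)."""
    return float(dims @ weights)
```

A criterion that is identical across all samples carries no discriminating information, so the method assigns it zero weight; the combined weighting of claim 6 would then blend these objective weights with subjective AHP weights.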
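Claim 10 constrains the cross-level fusion weight adjustment of claim 7: the semantic consistency index must keep at least 20% of the weight. One simple way to enforce such a floor, shown below as an assumed scheduler post-processing step (the dictionary keys are hypothetical names, not the patent's), is to clamp the semantic weight and rescale the remaining weights proportionally so they still sum to one.

```python
def apply_semantic_floor(weights: dict, floor: float = 0.20) -> dict:
    """Enforce claim 10's constraint: if the semantic-consistency weight is
    below `floor`, raise it to the floor and rescale the other fusion weights
    proportionally so the total remains 1."""
    w = dict(weights)
    if w["semantic_consistency"] >= floor:
        return w
    others = {k: v for k, v in w.items() if k != "semantic_consistency"}
    scale = (1.0 - floor) / sum(others.values())
    out = {k: v * scale for k, v in others.items()}
    out["semantic_consistency"] = floor
    return out
```

Proportional rescaling preserves the relative priorities the dynamic scheduler assigned to the other objectives (accuracy, completion quality, efficiency) while satisfying the hard floor.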
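Claims 8 and 9 describe a dynamic prototype pool pruned by a liveness threshold. The patent does not define how liveness evolves, so the sketch below assumes a simple scheme: a prototype's liveness resets to 1 when its class is matched and decays multiplicatively otherwise. Class names, the decay factor, and the threshold are all illustrative.

```python
from dataclasses import dataclass

@dataclass
class Prototype:
    class_id: str
    vector: list
    liveness: float = 1.0      # decays while the prototype goes unmatched

class PrototypePool:
    """Dynamic prototype pool with claim 9's liveness-threshold screening:
    prototypes whose liveness decays below the threshold are eliminated."""
    def __init__(self, liveness_threshold: float = 0.2, decay: float = 0.9):
        self.threshold = liveness_threshold
        self.decay = decay
        self.prototypes: list[Prototype] = []

    def add(self, proto: Prototype) -> None:
        self.prototypes.append(proto)

    def touch(self, class_id: str) -> None:
        """Reinforce prototypes of the matched class; decay all others."""
        for p in self.prototypes:
            if p.class_id == class_id:
                p.liveness = 1.0
            else:
                p.liveness *= self.decay

    def prune(self) -> int:
        """Eliminate low-contribution prototypes; return how many were removed."""
        before = len(self.prototypes)
        self.prototypes = [p for p in self.prototypes
                           if p.liveness >= self.threshold]
        return before - len(self.prototypes)
```

In the patent's pipeline this pool would also hold completed-feature prototypes and feed old-class knowledge distillation; only the eviction mechanics are shown here.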

Description

Intelligent data identification and classification method based on multi-modal feature fusion

Technical Field

The invention belongs to the technical field of intelligent recognition and classification of multi-modal data, and particularly relates to an intelligent data recognition and classification method based on multi-modal feature fusion.

Background

Artificial intelligence is increasingly applied to complex scenarios such as autonomous driving, intelligent monitoring, and industrial inspection, where accurate identification and classification of multi-modal data such as images, text, audio, and sensor signals has become a core requirement for efficient decision-making by intelligent systems. Such data is characterized by heterogeneous sources, marked structural differences, missing modalities, and asynchronous timing, and the prior art still has the following technical problems. Traditional fixed conversion modes find it difficult to preserve both modality-specific information and cross-modal semantic consistency, easily causing semantic misalignment; in dynamic data scenarios, alignment errors accumulate continuously as the data distribution changes, affecting the stability of classification accuracy. Existing completion means rely on a single generative model or static interpolation, have limited adaptability to complex missing modalities, and the completed features often carry semantic deviations; the two lack a collaborative optimization mechanism, so when the model faces newly added categories with few samples, confusion between new and old categories easily occurs, and the incremental learning effect is unsatisfactory. Existing self-evolution mechanisms cover only a single module and do not form a system for dynamically optimizing the parameters of the whole pipeline, so the model's adaptability to complex scene changes is limited and stable performance is difficult to maintain over the long term.

Disclosure of Invention

The invention aims to provide an intelligent data identification and classification method based on multi-modal feature fusion to solve the problems in the background art. To this end, the invention provides the following technical scheme: an intelligent data identification and classification method based on multi-modal feature fusion, comprising an acquisition preprocessing stage, a feature enhancement distillation stage, a characterization alignment stage, an association feature enhancement stage, an evaluation completion stage, a fusion scheduling stage, and a classification evolution stage.

Preferably, the acquisition preprocessing stage adopts a multi-channel synchronous acquisition module to synchronously acquire multi-source heterogeneous data including images, text, audio, and sensor signals, and records the acquisition timing, spatial coordinates, and modality type labels of the data; a heterogeneous collaborative preprocessing unit is constructed, and adaptive processing strategies are designed for the characteristics of different modalities, wherein the image modality adopts semantic-aware noise filtering, the text modality adopts context-based error correction and normalization, and the sensor signals adopt time-series detrending. Meanwhile, a cross-modal semantic labeling module is integrated: unified semantic tags are generated for data of all modalities through an LLM fine-tuned on domain corpora, sensor time-series features are converted into interpretable semantic tags, and a four-dimensional data quality grading system covering signal-to-noise ratio, integrity, consistency, and semantic consistency is established, with semantic consistency quantified by computing the cosine similarity between the features of each modality and the semantic tags generated by the LLM. For low-quality data with semantic consistency below 0.6, modality semantic migration repair is started: abnormal regions are located through an attention mechanism, abnormal fluctuations of the low-quality modality are calibrated using semantic information of high-credibility modalities, and standardized multi-modal data is obtained.

Preferably, the feature enhancement distillation stage constructs a dual-branch feature enhancement network based on the standardized multi-modal data from the acquisition preprocessing stage; the lower branch adopts adversarial enhancement with modality-specific adaptive parameters: the image rotation angle is dynamically adjusted based on the entropy of the data distribution, text synonym substitution is constrained by a domain dictionary and maintains semantic consistency, and audio speed change and sensor signal distortion strictly follow the modality data distribution rules. The upper branch adopts LLM-semantically-guided attention distillation, and a semantic anchor is constructed by combining a c