CN-122021779-A - Advanced active learning system based on multi-strategy fusion and sampling method


Abstract

The application relates to the technical field of machine learning and discloses an advanced active learning system and a sampling method based on multi-strategy fusion. The system comprises a feature module, a strategy module, a judging module and an optimization module. The feature module extracts feature representations of unlabeled samples. The strategy module comprises a plurality of learnable strategy agents that evaluate, in parallel, the information content of each sample against different objectives such as classification uncertainty, feature-space diversity and degree of influence on the model. The judging module dynamically fuses the strategy scores through an attention-based coordination network and outputs a sample selection probability distribution. The optimization module generates a meta-gradient signal from the performance improvement of the main task model on a validation set, and all modules are jointly optimized through this signal. Dynamic intelligent cooperation among the strategies and endogenous evolution of the system are thereby realized, so that sample selection efficiency and model performance can be improved.
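
By way of illustration only, the score fusion summarized in the abstract might be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names, the toy numbers, the use of a softmax as the decision output layer, and the fixed `attention_logits` (standing in for the output of the learnable attention sub-network) are all hypothetical, and the context vector describing the main task model state is omitted.

```python
import math

def normalized_entropy(probs):
    """Entropy of a class distribution, divided by log(K) so it lies in [0, 1]."""
    k = len(probs)
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k) if k > 1 else 0.0

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_scores(score_vectors, attention_logits):
    """Weight each agent's score vector by its attention weight and sum.

    score_vectors: one list of per-sample scores per strategy agent.
    attention_logits: one raw logit per agent (assumed stand-in for the
    attention sub-network output).
    Returns (attention weights, selection probabilities over samples).
    """
    weights = softmax(attention_logits)
    n = len(score_vectors[0])
    fused = [sum(w * s[i] for w, s in zip(weights, score_vectors))
             for i in range(n)]
    # Assumption: the decision output layer is modeled as a plain softmax.
    return weights, softmax(fused)

# Toy pool of three samples; every number below is illustrative.
class_probs = [[0.5, 0.5], [0.9, 0.1], [1.0, 0.0]]
uncertainty = [normalized_entropy(p) for p in class_probs]
diversity = [0.2, 0.9, 0.4]   # assumed, already normalized distances
influence = [0.1, 0.3, 0.8]   # assumed, already normalized gradient norms

weights, selection_probs = fuse_scores(
    [uncertainty, diversity, influence], attention_logits=[2.0, 0.5, 0.5]
)
```

In this toy setting the uncertainty agent receives the largest attention weight, so the sample with the most ambiguous class prediction ends up with the highest selection probability. In the system as described, the weights would instead be produced by the coordination network and would change with the state of the main task model.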

Inventors

RUI JIANWEN
GU HAIHUA
LI XINZHUO
LU GUANMING

Assignees

南京信息职业技术学院
南京东江砼创科技有限公司

Dates

Publication Date: 20260512
Application Date: 20260413

Claims (10)

1. An advanced active learning system based on multi-strategy fusion, characterized by comprising a feature module, a strategy module, a judging module and an optimization module; the feature module is used for extracting features from an unlabeled sample pool and outputting sample feature representations; the strategy module comprises a plurality of different strategy agents, wherein each strategy agent receives the sample feature representations, evaluates the information content of the samples from a different underlying objective, and outputs a strategy scoring vector; the judging module receives the strategy scoring vectors output by all strategy agents and the current state of a main task model, evaluates and fuses the strategy scores through a coordination network, predicts the performance improvement that different sample selection schemes would bring to the main task model, and outputs a sample selection probability distribution; and the optimization module generates a meta-gradient signal according to the improvement of the main task model on a validation set after the selected samples are labeled.
2. The advanced active learning system based on multi-strategy fusion as claimed in claim 1, wherein: the unlabeled sample pool is a set of machine learning samples for which labeling information has not yet been obtained; and the feature module performs feature extraction through a feature coding network that is trained independently of the main task model, wherein the feature coding network is pre-trained by self-supervised learning or by an auxiliary task related to the main task, and maps each original sample to a feature representation of fixed dimension.
3. The advanced active learning system based on multi-strategy fusion as claimed in claim 1, wherein: each strategy agent in the strategy module is a lightweight learnable neural network; and each strategy agent independently receives the same batch of sample feature representations output by the feature module and processes them through its own input layer and feature processing layer, so as to adapt them to its evaluation objective.
4. The advanced active learning system based on multi-strategy fusion as claimed in claim 3, wherein: the process by which each strategy agent evaluates the sample information content from its underlying objective and outputs a strategy scoring vector comprises parallel computation and independent mapping; in the parallel computation, each strategy agent generates an initial evaluation value according to its objective and the input sample feature set; in the independent mapping, each strategy agent maps the initial evaluation value to a standardized, comparable scoring interval through a fully connected network layer; and each strategy agent outputs a strategy scoring vector for the whole unlabeled sample pool, wherein each dimension of the strategy scoring vector corresponds to the standardized score of one sample and quantifies the information content of that sample from the perspective of the agent's objective.
5. The advanced active learning system based on multi-strategy fusion as claimed in claim 4, wherein: the strategy module comprises a classification uncertainty agent, a feature space diversity agent and a main task model influence agent; the classification uncertainty agent analyzes the predicted probability distribution obtained after the sample features pass through the classifier of the main task model, calculates the entropy of the distribution, normalizes the result and uses it as its strategy score, wherein a higher score indicates a more uncertain classification of the sample; the feature space diversity agent analyzes the distribution density of the sample features in the feature space, calculates the average distance from the sample feature to the feature center of the labeled sample set or of the samples already selected in the current batch, normalizes the distance value and uses it as its strategy score, wherein a higher score indicates that the sample lies in a sparser feature region and contributes more to diversity; and the main task model influence agent adopts a lightweight approximation of the training loss gradient, in which the loss gradient vector of the last layer of the main task model classifier is calculated, and, for an unlabeled pool with a very large number of samples, a subset is randomly drawn, the gradients are calculated on the subset and the result is extrapolated to the full sample pool; the agent estimates the norm of the expected influence of the loss gradient vector on the overall loss gradient of the main task model on the validation set, normalizes the norm and uses it as its strategy score, wherein a higher score indicates a greater expected influence of the sample on the model update.
6. The advanced active learning system based on multi-strategy fusion as claimed in claim 1, wherein: the coordination network in the judging module is a learnable neural network with an attention mechanism; all strategy scoring vectors output by the strategy module are concatenated with a context vector reflecting the current state of the main task model to form a comprehensive input vector; the coordination network processes the comprehensive input vector through an attention sub-network and calculates and outputs an attention weight for each strategy agent, wherein the magnitude of the attention weight represents the relative importance, under the current state of the main task model, of the evaluation information provided by the corresponding strategy agent; the strategy scoring vectors are weighted and summed with the attention weights as coefficients to obtain a fused global scoring vector; the global scoring vector is mapped through a decision output layer into a normalized selection probability for each sample in the unlabeled sample pool, thereby forming the sample selection probability distribution; and the parameters of the coordination network, together with the parameters of the feature module and the strategy module, receive the meta-gradient signal from the optimization module for end-to-end joint optimization.
7. The advanced active learning system based on multi-strategy fusion as claimed in claim 1, wherein: the meta-gradient signal is back-propagated through a differentiable path to jointly optimize the parameters of the judging module, all strategy agents and the feature module.
8. The advanced active learning system based on multi-strategy fusion as claimed in claim 7, wherein the optimization module generates the meta-gradient signal by: adding the labeled selected samples to the training set and updating the main task model to obtain a new model; calculating the loss function values, on a fixed validation set, of the main task model before the update and of the new model, and taking the difference as the true performance improvement of the main task model; constructing, through a differentiable approximation, the true improvement into a meta-gradient signal with respect to the quality of the sample selection probability distribution output by the judging module, wherein the meta-gradient signal is directly related to the contribution of the learnable parameters in the judging module, the strategy module and the feature module to the final performance improvement; distributing the meta-gradient signal through the back-propagation path to the coordination network of the judging module, each strategy agent in the strategy module and the feature module; and performing a gradient descent step with the meta-gradient signal to jointly optimize the parameters of the judging module, the strategy module and the feature module, so as to generate optimized sample selection decisions.
9. The advanced active learning system based on multi-strategy fusion as claimed in claim 7, wherein the optimization module generates the meta-gradient signal by: constructing an approximate estimate of the influence that labeling a single sample has on the validation-set performance of the main task model, and taking the directional consistency between the overall loss gradient on the validation set and the loss gradient of the single sample as the measure of the expected contribution of the sample, wherein a higher directional consistency indicates a greater expected improvement in the generalization performance of the model after the current sample is labeled; taking the sample selection probability distribution output by the judging module as weights and computing the weighted sum of the expected contributions of all unlabeled samples to construct a differentiable surrogate loss function; decomposing batch sampling without replacement into a plurality of single-sample sampling steps carried out in sequence, and generating, at each step, a selection result according to the samples not yet selected and their current selection probability distribution, wherein the selection result remains differentiable; using the discrete sampling results to update the main task model in the forward propagation stage, and using continuous gradient information for parameter updates in the back-propagation stage, so that the meta-gradient signal can be passed back in full through the sampling operation; differentiating the surrogate loss function with respect to the sample selection probability distribution to generate the meta-gradient signal, wherein each component of the meta-gradient signal indicates the adjustment of the selection probability of the corresponding sample, the selection probability being increased for samples with a large expected contribution and decreased for samples with a small expected contribution; and transmitting the meta-gradient signal through the back-propagation path, in sequence, to the coordination network of the judging module, each strategy agent of the strategy module and the feature coding network of the feature module, thereby driving the joint update of the parameters of each module.
10. An advanced active learning sampling method based on multi-strategy fusion, using the advanced active learning system based on multi-strategy fusion as claimed in any one of claims 1-9, comprising: S1, extracting features from the original samples in an unlabeled sample pool through a feature coding network to obtain sample feature representations; S2, inputting the sample feature representations to a plurality of different strategy agents, wherein each strategy agent evaluates, in parallel, the information content of the samples from a different underlying objective to generate its own strategy scoring vector; S3, obtaining the strategy scoring vectors output by all strategy agents and the current state of a main task model, dynamically evaluating and fusing the strategy scoring vectors through a coordination network, predicting the performance improvement that different sample selection schemes would bring to the main task model, outputting a sample selection probability distribution, selecting the sample batch to be labeled from the unlabeled sample pool by random sampling without replacement according to the distribution, and renormalizing the probability distribution of the remaining samples after each draw; and S4, labeling the selected sample batch and updating the main task model, generating a meta-gradient signal according to the performance improvement of the updated main task model on a validation set, and back-propagating the meta-gradient signal through a differentiable path to jointly optimize the learnable parameters of the coordination network, all strategy agents and the feature coding network.
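
By way of illustration only, the no-replacement sampling of step S3 (with renormalization of the remaining probabilities) and the weighted-sum surrogate loss of claim 9 might be sketched as follows. This is a minimal sketch under stated assumptions: the function names and toy values are hypothetical, cosine similarity is assumed as the measure of directional consistency, and the weighted sum is negated so that it can be minimized. The differentiable relaxation that lets gradients pass through the discrete draw is not implemented here.

```python
import math
import random

def sample_without_replacement(probs, batch_size, rng):
    """Draw batch_size distinct indices, one at a time.

    After each draw the chosen index is removed and the remaining
    probabilities are renormalized before the next draw.
    """
    remaining = dict(enumerate(probs))
    chosen = []
    for _ in range(batch_size):
        total = sum(remaining.values())
        indices = list(remaining)
        renormalized = [remaining[i] / total for i in indices]
        pick = rng.choices(indices, weights=renormalized, k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen

def cosine(u, v):
    """Cosine similarity, assumed here as the directional-consistency measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0

def surrogate_loss_and_grad(probs, sample_grads, val_grad):
    """Surrogate L = -sum_i p_i * c_i with c_i = cos(sample_grad_i, val_grad).

    Returns (L, dL/dp). L is linear in p, so dL/dp_i = -c_i.
    The minus sign is an assumption made so that L is minimized.
    """
    contrib = [cosine(g, val_grad) for g in sample_grads]
    loss = -sum(p * c for p, c in zip(probs, contrib))
    return loss, [-c for c in contrib]

# Toy values; every number below is illustrative.
probs = [0.5, 0.3, 0.2]
rng = random.Random(0)
batch = sample_without_replacement(probs, batch_size=2, rng=rng)

val_grad = [1.0, 0.0]                                   # validation-set gradient
sample_grads = [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]    # per-sample gradients
loss, grad = surrogate_loss_and_grad(probs, sample_grads, val_grad)
```

A gradient descent step along `grad` raises the selection probability of the first sample, whose gradient is aligned with the validation gradient, and lowers that of the third, whose gradient opposes it. In the claimed system this signal would be propagated further back to the coordination network, the strategy agents and the feature coding network rather than applied to the probabilities directly.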

Description

Advanced active learning system based on multi-strategy fusion and sampling method

Technical Field

The application relates to the technical field of machine learning, and discloses an advanced active learning system based on multi-strategy fusion and a sampling method.

Background

Active learning intelligently selects the most valuable samples for labeling, so as to minimize labeling cost and maximize model performance. However, existing advanced active learning techniques have several shortcomings when they fuse multiple strategies.

First, the multi-strategy fusion mechanism is rigid and lacks adaptive capability. The prior art generally combines strategies such as uncertainty and diversity using preset fixed weights or simple rules. Such static fusion cannot perceive or adapt to the dynamically changing learning state of the main task model during training, or to abrupt changes in the external data distribution. For example, uncertainty may be given too high a weight in an exploration stage in which the model most needs diversity, or vice versa, so that sample selection efficiency is low.

Second, strategy evaluation in the prior art generally depends on intermediate results output by the model or on independently calculated data statistics, and no direct, optimizable relationship to the actual improvement in model performance is established. The selected samples may appear useful, yet their actual contribution to improving model performance is limited.

Some existing schemes attempt to fuse uncertainty and diversity strategies, but all of them adopt fixed weights or simple rules and do not consider the dynamic change of the state of the main task model during training. Meanwhile, the strategy evaluation network and the feature extraction network of the prior art have no joint optimization mechanism, so the feature representation cannot adapt to the needs of evaluating sample information for the main task model.

Disclosure of Invention

This section is intended to outline some aspects of embodiments of the application and to briefly introduce some preferred embodiments. Some simplifications or omissions may be made in this section as well as in the description and the title of the application, and such simplifications or omissions may not be used to limit the scope of the application.

The application provides a multi-strategy fusion-based advanced active learning system and a sampling method, which aim to solve the technical problems in the prior art of a rigid fusion mechanism, low computational efficiency, poor adaptability to different scenarios and fragmented optimization objectives. The application realizes dynamic intelligent collaboration among multiple strategies and endogenous evolution of the system, adapts to large-scale unlabeled sample pools through a lightweight design, supports classification and regression tasks as well as class-imbalanced and dynamic sample pools through a multi-scene adaptation module, and ultimately improves sample selection efficiency, reduces labeling cost and improves the generalization performance of the main task model.

In one aspect, the application provides an advanced active learning system based on multi-strategy fusion, which comprises a feature module, a strategy module, a judging module and an optimization module; the feature module is used for extracting features from an unlabeled sample pool and outputting sample feature representations; the strategy module comprises a plurality of different strategy agents, wherein each strategy agent receives the sample feature representations, evaluates the information content of the samples from a different underlying objective, and outputs a strategy scoring vector; the judging module receives the strategy scoring vectors output by all strategy agents and the current state of a main task model, evaluates and fuses the strategy scores through a coordination network, predicts the performance improvement that different sample selection schemes would bring to the main task model, and outputs a sample selection probability distribution; and the optimization module generates a meta-gradient signal according to the improvement of the main task model on a validation set after the selected samples are labeled.

As a preferred scheme of the advanced active learning system based on multi-strategy fusion of the application: the unlabeled sample pool is a set of machine learning samples for which labeling information has not yet been obtained; and the feature module performs feature extraction through a feature coding network that is trained independently of the main task model, wherein the feature coding network is pre-trained by self-supervised learning or by an auxiliary task related to the main task, and maps each original sample to a feature representation of fixed dimension.

As a preferred scheme of the advanced active learning system based on multi-strategy fusion of the applicati