CN-122020187-A - Training data screening method and device for supervised fine tuning of model
Abstract
The application provides a training data screening method and device for supervised fine tuning of a model, applicable to the technical field of model training and optimization. The method comprises: performing sparse projection processing on a hidden state representation output by a target network layer of a source model to obtain multi-dimensional sparse activation features; performing a feature screening operation on the multi-dimensional sparse activation features to obtain a candidate feature set; performing an operation based on adaptive decoding weights on an activation scalar to obtain a feature influence vector; performing residual weight enhancement on the source model using the feature influence vector to obtain an enhanced source model; obtaining significant forward information from the original result output by the source model and the enhanced result output by the enhanced source model; performing a screening operation on the candidate feature set using the significant forward information to obtain a target feature set; and performing data screening processing based on feature activation magnitude on the original training data set using the target feature set to obtain a training data set for supervised fine tuning of the target model.
Inventors
- XIONG DEYI
- SHI LING
Assignees
- 天津大学 (Tianjin University)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-14
Claims (10)
- 1. A training data screening method for supervised fine tuning of a model, the method comprising: extracting a hidden state representation from a processing result of a text synthesis sequence by a target network layer of a source model, and calling a sparse computation accelerator to perform sparse projection processing on the hidden state representation to obtain multi-dimensional sparse activation features, wherein the text synthesis sequence comprises a text description and an execution result of a target task; performing a feature screening operation based on an activation frequency threshold on the features in each dimension of the multi-dimensional sparse activation features to obtain a candidate feature set, wherein the candidate feature set is stored in a memory with a multi-level caching mechanism; performing an operation based on adaptive decoding weights on an activation scalar to obtain a feature influence vector, and performing residual weight enhancement on the source model using the feature influence vector to obtain an enhanced source model, wherein the activation scalar is obtained by performing an operation based on adaptive coding weights on a processing result of verification data by the target network layer; obtaining, from the original result of executing the target task output by the source model and the enhanced result of executing the target task output by the enhanced source model, significant forward information for verifying the causal relationship between the input and output of the source model, and performing a screening operation on the candidate feature set using the significant forward information to obtain a target feature set; and performing data screening processing based on feature activation magnitude on the original training data set using the target feature set to obtain the training data set for supervised fine tuning of the target model.
- 2. The method of claim 1, wherein calling the sparse computation accelerator to perform sparse projection processing on the hidden state representation to obtain the multi-dimensional sparse activation features comprises: deploying a trained sparse autoencoder into a hardware acceleration device to obtain the sparse computation accelerator, wherein the trained sparse autoencoder is used for performing a space mapping on hidden features generated in the process of executing the target task by the source model, and the hardware acceleration device comprises a graphics processing device, a tensor processing device, a neural network processing device, or an application-specific integrated circuit; and calling the sparse computation accelerator to acquire adaptive coding weights in the process of processing the text synthesis sequence by the target network layer, and performing linear activation processing on the hidden state representation and the adaptive coding weights to realize the sparse projection processing on the hidden state representation, thereby obtaining the multi-dimensional sparse activation features.
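As an illustrative, non-limiting sketch of the sparse projection step described in claim 2: a hidden state is passed through the trained sparse autoencoder's coding weights with a linear activation (ReLU here). The dimensions `d` and `m`, the weights `W_enc`/`b_enc`, and the input are hypothetical stand-ins, not values from the application.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: hidden-state width d, SAE dictionary size m (m >> d).
d, m = 8, 32

# Stand-ins for the trained sparse autoencoder's adaptive coding weights.
W_enc = rng.standard_normal((d, m)) / np.sqrt(d)
b_enc = 0.1 * rng.standard_normal(m)

def sparse_projection(h):
    """Linear activation of a hidden state with the encoder weights:
    a = ReLU(h @ W_enc + b_enc), the multi-dimensional sparse activations."""
    return np.maximum(h @ W_enc + b_enc, 0.0)

h = rng.standard_normal(d)   # hidden state from the target network layer
a = sparse_projection(h)     # m-dimensional sparse activation vector
```

In practice this projection would run on the hardware accelerator (GPU, TPU, NPU, or ASIC) named in the claim; NumPy is used here only to show the arithmetic.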
- 3. The method of claim 1, wherein performing the feature screening operation based on the activation frequency threshold on the features in each dimension of the multi-dimensional sparse activation features to obtain the candidate feature set comprises: obtaining a marker position of a keyword element in a target task corpus to which the text synthesis sequence belongs, wherein the target task corpus is stored in a first cache memory, and the keyword element is used for connecting the text description and the execution result of the target task; calculating, using the marker position, a feature activation frequency over the target task corpus for each dimension of the multi-dimensional sparse activation features, wherein the feature activation frequency represents the proportion of the number of samples in the target task corpus on which the feature is activated to the total number of samples in the target task corpus; comparing the feature activation frequency with the activation frequency threshold to obtain a comparison result; and in the case that the feature activation frequency is greater than or equal to the activation frequency threshold, taking the corresponding feature as a candidate feature, thereby obtaining the candidate feature set.
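A minimal sketch of the activation-frequency screening of claim 3, under stated assumptions: the activation matrix `acts` stands in for SAE activations collected at the keyword-marker position of each corpus sample, and `freq_threshold` is an arbitrary illustrative value.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in sparse activations at the keyword-marker position of each sample:
# rows are corpus samples, columns are SAE feature dimensions.
acts = np.maximum(rng.standard_normal((100, 16)) - 1.0, 0.0)

freq_threshold = 0.1  # hypothetical activation frequency threshold

# Activation frequency: share of corpus samples on which each feature fires.
act_freq = (acts > 0).mean(axis=0)

# Candidate features: dimensions whose frequency reaches the threshold.
candidate_features = np.flatnonzero(act_freq >= freq_threshold)
```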
- 4. The method of claim 1, wherein performing the operation based on adaptive decoding weights on the activation scalar to obtain the feature influence vector comprises: acquiring a verification hidden state representation generated in the process of processing, by the target network layer, verification data stored in a first cache memory; invoking a sparse computation accelerator deployed with a trained sparse autoencoder to perform an operation based on adaptive coding weights on the verification hidden state representation to obtain the activation scalar; and, in the process of processing the verification data by the target network layer, calling the sparse computation accelerator to acquire the adaptive decoding weights of the current verification data processing stage, and operating on the adaptive decoding weights and the activation scalar to obtain the feature influence vector.
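An illustrative sketch of claim 4, assuming the common sparse-autoencoder layout in which the activation scalar comes from the encoder column of a feature and the influence vector is that scalar times the feature's decoder row; all weights and data below are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(2)
d, m = 8, 32  # hidden-state width and SAE dictionary size (hypothetical)

W_enc = rng.standard_normal((d, m)) / np.sqrt(d)  # adaptive coding weights
W_dec = rng.standard_normal((m, d)) / np.sqrt(m)  # adaptive decoding weights

h_val = rng.standard_normal(d)  # hidden state for one verification sample

feature_id = 5                                  # a candidate feature under test
alpha = max(h_val @ W_enc[:, feature_id], 0.0)  # activation scalar (encoder side)
influence = alpha * W_dec[feature_id]           # feature influence vector
```

The influence vector lives in the hidden-state space, which is what lets it be added back into the model's residual stream in the next step.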
- 5. The method of claim 1, wherein performing residual weight enhancement on the source model using the feature influence vector to obtain the enhanced source model comprises: extracting a verification hidden state representation from a processing result of the target network layer on verification data stored in a first cache memory, and operating on the verification hidden state representation and the feature influence vector to obtain an enhanced verification hidden state representation; and performing enhancement processing on the weights of a residual layer of the source model using the enhanced verification hidden state representation to obtain the enhanced source model, and storing the parameters of the enhanced source model in a second cache memory.
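A minimal sketch of the enhancement operation in claim 5, read as residual-stream steering: the influence vector is scaled and added to the verification hidden state so that downstream layers see the steered representation. The strength `gamma` and all vectors are assumed values for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 8                                      # hidden-state width (hypothetical)
h_val = rng.standard_normal(d)             # verification hidden state at the target layer
influence = 0.2 * rng.standard_normal(d)   # feature influence vector (stand-in)
gamma = 2.0                                # assumed enhancement strength

# Residual enhancement: steer the representation along the feature's
# influence direction before it re-enters the residual stream.
h_enhanced = h_val + gamma * influence
```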
- 6. The method of claim 5, wherein obtaining, from the original result of executing the target task output by the source model and the enhanced result of executing the target task output by the enhanced source model, the significant forward information for verifying the causal relationship between the input and output of the source model comprises: scoring the original result and the enhanced result stored in a third cache memory using a preset evaluation index, respectively, to obtain a score of the original result and a score of the enhanced result; and operating on the score of the original result and the score of the enhanced result to obtain the significant forward information.
- 7. The method of claim 6, wherein performing the screening operation on the candidate feature set using the significant forward information to obtain the target feature set comprises: sorting the candidate feature set in descending order using the significant forward information to obtain a descending-sorted candidate feature set, and selecting a plurality of top-ranked candidate features from the descending-sorted candidate feature set to construct the target feature set.
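Claims 6 and 7 can be sketched together as follows, under one assumption that the claims do not pin down: the "operation" on the two scores is taken here to be a simple difference (score gain of the enhanced model over the source model). The feature IDs, scores, and `k` are hypothetical.

```python
import numpy as np

# Hypothetical candidate features and evaluation scores (e.g., BLEU for a
# translation task) of the source model's original result and of each
# feature's enhanced-model result.
candidate_features = np.array([3, 7, 12, 19, 25])
score_original = 0.35
score_enhanced = np.array([0.33, 0.41, 0.36, 0.50, 0.30])

# Significant forward information, read here as the per-feature score gain.
gain = score_enhanced - score_original

# Descending sort by gain; the top-k features form the target feature set.
k = 2
target_features = candidate_features[np.argsort(-gain)[:k]]
```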
- 8. The method of claim 3, wherein performing the data screening processing based on feature activation magnitude on the original training data set using the target feature set to obtain the target training data set comprises: splicing, based on the marker position, each input sample in the original training data set with the label value corresponding to that input sample to obtain a spliced training data set; calculating, using the target feature set, a feature activation magnitude at the marker position for each piece of spliced training data in the spliced training data set to obtain a feature resonance score of each piece of spliced training data; sorting and screening the spliced training data set according to the feature resonance score of each piece of spliced training data to obtain an initial target training data set; and performing a decoupling operation at the marker position on the target input sample of each piece of initial target training data in the initial target training data set and the label value corresponding to that target input sample to obtain the target training data set for supervised fine tuning of the model.
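A sketch of the resonance-scoring step of claim 8, assuming the feature resonance score is the summed activation magnitude of the target features at the marker position (one plausible reading; the claim does not fix the aggregation). All data and sizes are stand-ins.

```python
import numpy as np

rng = np.random.default_rng(4)
n_samples, m = 50, 32                 # corpus size and SAE dictionary size
target_features = np.array([19, 7])   # hypothetical target feature set

# Stand-in sparse activations of each spliced sample (input joined to its
# label at the keyword marker), taken at the marker position.
acts = np.maximum(rng.standard_normal((n_samples, m)) - 1.0, 0.0)

# Feature resonance score: total activation magnitude over the target features.
resonance = acts[:, target_features].sum(axis=1)

# Keep the highest-scoring samples as the supervised fine-tuning training set;
# the final step of the claim would split each kept sample back into its
# input and label at the marker position.
keep = 10
selected = np.argsort(-resonance)[:keep]
```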
- 9. The method according to any one of claims 1-8, wherein the target task comprises at least one of a translation task, a summary generation task, and a mathematical reasoning task; the source model comprises a translation source model, a summary generation source model, and a mathematical reasoning source model; the target model comprises a translation target model, a summary generation target model, and a mathematical reasoning target model; the text synthesis sequence comprises at least one of a translation text sequence, a summary generation text sequence, and a mathematical reasoning text sequence; and the verification data comprises at least one of a text sample for the translation task, a text sample for the summary generation task, and a text sample for the mathematical reasoning task.
- 10. A training data screening apparatus for supervised fine tuning of a model, the apparatus comprising: a multi-dimensional sparse activation feature acquisition module, configured to extract a hidden state representation from a processing result of a text synthesis sequence by a target network layer of a source model, and call a sparse computation accelerator to perform sparse projection processing on the hidden state representation to obtain multi-dimensional sparse activation features, wherein the text synthesis sequence comprises a text description and an execution result of a target task; a candidate feature set acquisition module, configured to perform a feature screening operation based on an activation frequency threshold on the features in each dimension of the multi-dimensional sparse activation features to obtain a candidate feature set, wherein the candidate feature set is stored in a memory with a multi-level caching mechanism; a source model enhancement module, configured to perform an operation based on adaptive decoding weights on an activation scalar to obtain a feature influence vector, and perform residual weight enhancement on the source model using the feature influence vector to obtain an enhanced source model, wherein the activation scalar is obtained by performing an operation based on adaptive coding weights on a processing result of verification data by the target network layer; a target feature set acquisition module, configured to obtain, from the original result of executing the target task output by the source model and the enhanced result of executing the target task output by the enhanced source model, significant forward information for verifying the causal relationship between the input and output of the source model, and perform a screening operation on the candidate feature set using the significant forward information to obtain a target feature set; and a training data set screening module, configured to perform data screening processing based on feature activation magnitude on the original training data set using the target feature set to obtain the training data set for supervised fine tuning of the target model.
Description
Training data screening method and device for supervised fine tuning of a model

Technical Field

The application relates to the technical field of model training and optimization, and in particular to a training data screening method and device for supervised fine tuning of a model.

Background

Mechanistic interpretability (MI) is a research approach to understanding how a model (e.g., a large language model) works by reverse-engineering the internal computing mechanisms of the neural network, with the goal of revealing the specific paths of information flow, representation formation, and decision generation when the model processes a task. However, existing mechanistic interpretability research still stays at the level of "explaining" model behavior, or is only applied to inference-time intervention; inference-time intervention methods often face high latency and instability in practical applications, and no general procedure yet exists to convert deep insights into model internals (i.e., the specific mechanisms by which information is represented, transferred, and transformed) into active guidance signals that directly optimize the model construction process at the training stage. Meanwhile, existing data screening methods have notable limitations: on the one hand, they generally treat the model to be trained as a black box and rely mainly on external signals to judge the effectiveness of data, ignoring direct feedback from the model's internal state on the data; on the other hand, as the data scale grows, some data screening methods based on external indexes struggle to surpass simple random selection, calling their effectiveness into question.
Disclosure of Invention

In view of the foregoing, the present application provides a training data screening method and apparatus for supervised fine tuning of a model, for solving at least one of the foregoing problems. A first aspect of the application provides a training data screening method for supervised fine tuning of a model, which comprises: extracting a hidden state representation from a processing result of a text synthesis sequence by a target network layer of a source model, and calling a sparse computation accelerator to perform sparse projection processing on the hidden state representation to obtain multi-dimensional sparse activation features; performing a feature screening operation based on an activation frequency threshold on the features in each dimension of the multi-dimensional sparse activation features to obtain a candidate feature set, wherein the candidate feature set is stored in a memory with a multi-level caching mechanism; performing an operation based on adaptive decoding weights on an activation scalar to obtain a feature influence vector, and performing residual weight enhancement on the source model using the feature influence vector to obtain an enhanced source model, wherein the activation scalar is obtained by performing an operation based on adaptive coding weights on a processing result of verification data by the target network layer; obtaining, from the original result of executing the target task output by the source model and the enhanced result of executing the target task output by the enhanced source model, significant forward information for verifying the causal relationship between the input and output of the source model, and performing a screening operation on the candidate feature set using the significant forward information to obtain a target feature set; and performing data screening processing based on feature activation magnitude on the original training data set using the target feature set to obtain the training data set for supervised fine tuning of the target model.
According to an embodiment of the application, calling the sparse computation accelerator to perform sparse projection processing on the hidden state representation to obtain the multi-dimensional sparse activation features comprises: deploying a trained sparse autoencoder into a hardware acceleration device to obtain the sparse computation accelerator, wherein the trained sparse autoencoder is used for performing a space mapping on hidden features generated in the process of executing a target task by the source model, and the hardware acceleration device comprises a graphics processing device, a tensor processing device, a neural network processing device, or an application-specific integrated circuit; and calling the sparse computation accelerator to acquire adaptive coding weights in the process of processing a text synthesis sequence by the target network layer, and performing linear activation processing on the hidden state representation and the adaptive coding weights to realize the sparse projection processing on the hidden state representation, thereby obtaining the multi-dimensional sparse activation features. According to the embodiment of the application, the feature