CN-122021741-A - Data-driven sample model training method and system
Abstract
The invention discloses a data-driven sample model training method and system in the technical field of model training. An initial sample set is acquired and input, together with a noise vector, into a generator to produce initial prediction samples. A composite feature tensor is constructed from the initial samples, and the feature-space difference between the prediction samples and the discriminator's reference labels, together with the trajectory of the generator's total loss, is monitored to output a training termination condition tensor. This tensor is fed back to the generator as a condition constraint to update the prediction samples, which are input with the initial sample set into the discriminator to execute adversarial training. The generator's total loss is the weighted sum of the adversarial loss and a regularization loss, and the network parameters are updated by back-propagation of this weighted sum. While the parameters are continuously updated, feature resampling is triggered to dynamically correct the training data, and the final generator network is output when the difference between two consecutive training rounds reaches a convergence threshold. The method thereby realizes closed-loop correction with adaptive judgment of training convergence.
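The weighted total loss described in the abstract can be illustrated with a minimal sketch. The weight values `lambda_adv` and `lambda_reg` are illustrative assumptions; the patent does not fix them:

```python
def generator_total_loss(adv_loss, reg_loss, lambda_adv=1.0, lambda_reg=0.1):
    """Generator total loss as a weighted sum of the adversarial loss and
    the regularization loss, per the abstract. Weights are illustrative."""
    return lambda_adv * adv_loss + lambda_reg * reg_loss

# Example: adversarial loss 0.8, regularization loss 2.0
total = generator_total_loss(0.8, 2.0)
print(total)  # approximately 1.0
```

In practice the back-propagated gradient is taken with respect to this combined scalar, so the two weights control how strongly the regularization term (built from the feature difference indices in the claims) steers the generator.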
Inventors
- CHENG HONGJU
- CHEN KEFEI
- CHEN JIA
Assignees
- FUZHOU UNIVERSITY (福州大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-14
Claims (10)
- 1. A data-driven sample model training method, the method comprising: acquiring an initial sample set, inputting the initial sample set and a noise vector into a generator network, and generating an initial prediction sample; constructing a composite feature tensor based on the initial sample set, monitoring the trajectory of the generator's total loss according to the feature difference, in the discriminator feature space, between the initial prediction sample output by the generator in each training round and the reference sample label, judging that training has converged when the fluctuation amplitude of the loss change sequence within a sliding window is below a convergence tolerance threshold, triggering an iterative-training termination instruction, and outputting a training termination condition tensor; inputting the updated initial prediction sample and the initial sample set into a discriminator network to execute adversarial training, wherein in the adversarial training the generator's total loss is a weighted sum of the adversarial loss and a regularization loss, and the network parameters of the generator network and the discriminator network are updated by back-propagation of the total loss, the network parameters comprising at least convolution kernel weights, bias terms and normalization parameters; and triggering feature resampling to update the training data while the network parameters are continuously updated, and outputting a trained generator network when the difference between two consecutive training rounds reaches a convergence threshold.
- 2. The data-driven sample model training method of claim 1, wherein the process of acquiring the initial prediction sample comprises: acquiring an initial sample set, wherein the initial sample set comprises at least a sample label, a spatial position code associated with the sample label, and a local stress amplitude; and converting the initial sample set and a noise vector into an input tensor and transforming it layer by layer through a fully connected layer, a convolution layer, an up-sampling layer and an activation layer of the generator network, wherein each layer performs weighted summation and nonlinear mapping on the input tensor, gradually changing its shape and numerical distribution, and an output layer generates a multidimensional floating-point tensor as the initial prediction sample.
- 3. The data-driven sample model training method of claim 2, wherein the process of constructing the composite feature tensor comprises: based on the semantic relatedness among sample labels, and combining the spatial position codes and local stress amplitudes, constructing a difference feature set sample by sample, wherein the difference feature set refers to the difference values among sample labels, the difference values comprising difference values between first sample labels, difference values between second sample labels, and difference values between third sample labels; performing cosine conversion on the difference values between the first sample labels to obtain a first feature vector, applying a convolution operation to the local stress amplitude, and computing the feature gradient point by point to obtain the norm of the feature gradient; combining the first feature vector, the norm of the feature gradient and the sample label under the same spatial index to form a local feature matrix; performing cosine conversion on the difference values between the third sample labels to obtain a second feature vector, and extracting the key points of all samples in the initial sample set to obtain a key point set; establishing, centered on each key point, a local neighborhood of fixed radius in the feature space, reading the similarity sequence of the spatial position codes within the local neighborhood, and performing a neighborhood averaging operation to obtain the local feature density parameter of each key point, thereby forming a feature coupling matrix; and combining the feature coupling matrix with the local feature matrix to form the composite feature tensor.
- 4. The data-driven sample model training method of claim 3, wherein the process of triggering the iterative-training termination instruction comprises: for each initial prediction sample output by the generator network, calculating the feature difference between the initial prediction sample and the reference sample label in the discriminator network's feature space to obtain a plurality of groups of feature distances, and calculating statistics of the initial prediction samples over each group of feature distances to form a consistency index; recording the trajectory of the generator's adversarial loss in each training round to obtain a loss change sequence; and setting a sliding window centered on the current training round, giving a convergence tolerance threshold, and defining a convergence judgment function over the loss change sequence, so as to judge whether training has converged and to trigger the iterative-training termination instruction.
- 5. The data-driven sample model training method of claim 4, wherein the process of constructing the training termination condition tensor comprises: if the fluctuation amplitude of the loss change sequence within the sliding window is below the convergence tolerance threshold, judging that training has reached a convergence state and setting the output of the convergence judgment function to 1 to obtain a convergence confirmation mark; extracting the maximum value of the consistency index as the final consistency index of the current training round, and verifying the final consistency index against the convergence confirmation mark to obtain the convergence decision result of each training round; and incorporating the convergence decision results of each training round into the composite feature tensor to form the training termination condition tensor.
- 6. The method of claim 5, wherein inputting the training termination condition tensor as a condition constraint to the input of the generator network, updating the initial prediction samples with the conditional generative network model, inputting the updated initial prediction samples and the initial sample set into the discriminator network, and executing the adversarial training comprises: combining the training termination condition tensor and the noise vector to form the input condition for adversarial training, and loading the input condition synchronously onto the input of the generator network, wherein the generator network adopts a conditional generative network structure based on stacked convolution layers, the structure comprising a plurality of convolution layers, an up-sampling layer and a nonlinear activation layer, and each convolution layer establishes a mapping from the low-dimensional noise space to the high-dimensional feature space through local convolution, feature fusion and conditional normalization, so as to output a new prediction sample at the output; summing the absolute values of the feature distances and normalizing by feature dimension to obtain a total feature difference index; identifying the key point set in the new prediction sample, calculating the local change rate of the spatial position codes within the fixed-radius neighborhood of each key point to obtain the local feature density parameter of each key point, and calculating the difference between prediction features at the same key point to obtain the offset of the key point; and constructing key point pairs, calculating for each key point pair a change-parameter difference and an offset difference from the local feature density parameters and key point offsets to generate a feature difference field, and performing a gradient calculation on the feature difference field to obtain a feature-space gradient norm.
- 7. The method of claim 6, wherein inputting the training termination condition tensor as a condition constraint to the input of the generator network, updating the initial prediction samples with the conditional generative network model, inputting the updated initial prediction samples and the initial sample set into the discriminator network, and executing the adversarial training further comprises: inputting the new prediction sample and the initial sample set into a discriminator network, wherein the discriminator network comprises at least a fully connected layer, multiple convolution layers and an up-sampling layer, each convolution layer performing a convolution operation to extract feature maps, and an output layer generating a discrimination probability matrix; in each training batch, obtaining from the discrimination probability matrix the adversarial loss functions corresponding to the generator network and the discriminator network respectively; combining the regularization loss, formed from the total feature difference index and the feature-space gradient norm, with the generator network's adversarial loss function to obtain the total loss function of the generator network; and, based on the total loss function of the generator network and the adversarial loss function of the discriminator network, calculating the gradient of each loss function with respect to the network parameters by back-propagation, and updating the network parameters of the generator network and the discriminator network along the gradient descent direction, the network parameters comprising at least convolution kernel weights, bias terms and normalization parameters.
- 8. The data-driven sample model training method of claim 7, wherein triggering feature resampling to update the training data while the network parameters are continuously updated, and outputting a trained generator network when the difference between two consecutive training rounds reaches a convergence threshold, comprises: combining the total feature difference index and the feature-space gradient norm into a difference evaluation set, scanning the difference evaluation set point by point upon a training-state evaluation instruction, marking a high-difference region if the corresponding feature point satisfies the condition that the feature-space gradient norm is larger than the total feature difference index, and triggering a feature resampling command to re-extract local feature data; and updating the corresponding features in the initial sample set based on the local feature data to obtain a new difference evaluation set.
- 9. The data-driven sample model training method of claim 8, wherein triggering feature resampling to update the training data while the network parameters are continuously updated, and outputting a trained generator network when the difference between two consecutive training rounds reaches a convergence threshold, further comprises: taking the item-by-item difference between the new difference evaluation set and the previous round's difference evaluation set to obtain a training-state feedback set; and if the differences of the total feature difference index and the local feature difference field over two consecutive training rounds reach the convergence threshold, judging that the adversarial training is stable, triggering a training termination instruction, stopping training, and storing the final network parameter set, thereby completing closed-loop training and archiving.
- 10. A data-driven sample model training system for implementing the data-driven sample model training method according to any one of claims 1 to 9, comprising: a generator initial generation module, configured to acquire an initial sample set, input the initial sample set and a noise vector into a generator network, and generate an initial prediction sample; a training pre-convergence module, configured to construct a composite feature tensor based on the initial sample set, monitor the trajectory of the generator's total loss according to the feature difference, in the discriminator feature space, between the initial prediction sample output by the generator in each training round and the reference sample label, judge that training has converged when the fluctuation amplitude of the loss change sequence within a sliding window is below a convergence tolerance threshold, trigger an iterative-training termination instruction, and output a training termination condition tensor; a loss analysis module, configured to input the training termination condition tensor as a condition constraint to the input of the generator network, update the initial prediction sample with the conditional generative network model, input the updated initial prediction sample and the initial sample set into the discriminator network, and execute adversarial training; and a generator output module, configured to trigger feature resampling to update the training data while the network parameters are continuously updated, and output a trained generator network when the difference between two consecutive training rounds reaches a convergence threshold.
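The sliding-window convergence test that runs through claims 1, 4 and 5 can be sketched as follows. The window size, tolerance value and max-minus-min fluctuation measure are illustrative assumptions; the claims do not fix them:

```python
def has_converged(loss_history, window=5, tolerance=1e-3):
    """Return 1 (the convergence confirmation mark) if the fluctuation
    amplitude of the loss change sequence within the sliding window is
    below the convergence tolerance threshold, else 0."""
    if len(loss_history) < window:
        return 0  # not enough rounds recorded yet
    recent = loss_history[-window:]
    fluctuation = max(recent) - min(recent)  # illustrative amplitude measure
    return 1 if fluctuation < tolerance else 0

# A loss trajectory that flattens out triggers the termination instruction.
losses = [2.0, 1.2, 0.9, 0.80, 0.8002, 0.8001, 0.8003, 0.8002]
print(has_converged(losses))  # 1: the last 5 values fluctuate by < 1e-3
```

Compared with a preset maximum iteration count or a single loss threshold, this fluctuation-based test fires only once the loss trajectory has actually stabilized, which is the adaptive behavior the claims describe.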
Description
Data-driven sample model training method and system
Technical Field
The invention relates to the technical field of model training, and in particular to a data-driven sample model training method and system.
Background
In machine learning applications, conditional generative networks are widely used to generate multidimensional data such as time-series data, text data, or simulated data with complex features. The input data typically includes label information, spatial or temporal coding, and associated auxiliary parameters, while the generator network converts low-dimensional noise and condition information into high-dimensional sample outputs through multiple layers of nonlinear mapping. The difference between generated samples and training samples in the feature space is quantitatively evaluated to guide the generator network in optimizing its output, so that the generated samples realistically reproduce a specific feature distribution. This supports artificial-intelligence inference, prediction and simulation tasks in computer systems. In the prior art, adversarial training methods based on a generator and a discriminator suffer from several problems. First, traditional generative networks are prone to inconsistency between generated samples and training samples in the feature space during training, making the model difficult to converge and the quality of the output samples unstable.
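The background's notion of a generator converting low-dimensional noise and condition information into a high-dimensional sample through layered nonlinear mappings can be illustrated with a minimal numpy sketch. The layer sizes, the tanh activation, and the condition vector values are illustrative assumptions, not taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    """One layer: weighted summation followed by a nonlinear mapping,
    which changes the shape and numerical distribution of the input."""
    return np.tanh(x @ w + b)

# Illustrative dimensions: 8-d noise + 4-d condition -> 64-d sample.
noise_dim, cond_dim, hidden_dim, sample_dim = 8, 4, 32, 64
w1 = rng.normal(size=(noise_dim + cond_dim, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(size=(hidden_dim, sample_dim))
b2 = np.zeros(sample_dim)

noise = rng.normal(size=noise_dim)
condition = np.array([1.0, 0.0, 0.5, 0.2])  # e.g. label code + auxiliary parameters
x = np.concatenate([noise, condition])
sample = layer(layer(x, w1, b1), w2, b2)
print(sample.shape)  # (64,)
```

A real generator of the kind the patent describes would use convolution, up-sampling and conditional normalization layers rather than two dense layers, but the input/output shapes follow the same pattern: noise and condition in, high-dimensional prediction sample out.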
Second, under multidimensional feature-condition constraints, existing methods often rely on empirical weights or manually set regularization strategies, so the model struggles to actively capture and accurately reflect the spatial coupling and interdependence between different features, and thus to fully learn the complex associations between samples. Meanwhile, the judgment of training convergence often depends on a preset maximum number of iterations or a single loss threshold, lacking an adaptive sliding-window monitoring mechanism based on the fluctuation amplitude of the loss change sequence, which easily leads to underfitting or overfitting.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a data-driven sample model training method and system that solve the problems identified in the background. To achieve the above purpose, the invention is realized by the following technical scheme. In a first aspect, a data-driven sample model training method includes the steps of: acquiring an initial sample set, inputting the initial sample set and a noise vector into a generator network, and generating an initial prediction sample; constructing a composite feature tensor based on the initial sample set, monitoring the trajectory of the generator's total loss according to the feature difference, in the discriminator feature space, between the initial prediction sample output by the generator in each training round and the reference sample label, judging that training has converged when the fluctuation amplitude of the loss change sequence within a sliding window is below a convergence tolerance threshold, triggering an iterative-training termination instruction, and outputting a training termination condition tensor; inputting the updated initial prediction sample and the initial sample set into a discriminator network to execute adversarial training, wherein in the adversarial training the generator's total loss is a weighted sum of the adversarial loss and a regularization loss, and the network parameters of the generator network and the discriminator network are updated by back-propagation of the total loss, the network parameters comprising at least convolution kernel weights, bias terms and normalization parameters; and triggering feature resampling to update the training data while the network parameters are continuously updated, and outputting a trained generator network when the difference between two consecutive training rounds reaches a convergence threshold. In a second aspect, a data-driven sample model training system comprises a generator initial generation module, a training pre-convergence module, a loss analysis module and a generator output module. The generator initial generation module is configured to acquire an initial sample set, input the initial sample set and a noise vector into a generator network, and generate an initial prediction sample; the training pre-convergence module is configured to construct a composite feature tensor based on the initial sample set, monitoring the trajectory of the generator's total loss according to the feature difference between the initial prediction sample output by the generator in each training round and the reference sample label in the feature space of the discriminator