CN-121981213-A - Model training method combining parameter adjustment record screening and large model and related product

Abstract

The present disclosure provides a model training method, and related products, that combine parameter adjustment record screening with a large model. In the method, a simplified set of historical parameter adjustment records is obtained by screening the historical parameter adjustment records. The large model analyzes how the simplified historical parameter adjustment records change and outputs the next-round hyperparameter configuration, which is then used for the next round of training. The method reduces the number of tokens the large model must analyze, speeds up generation of the next-round hyperparameter configuration, and makes that configuration interpretable. The feature distance between two hyperparameter configurations is determined from the feature sensitivity of each hyperparameter of the target machine learning model to the training evaluation results, and the parameter adjustment records are screened based on the feature distances between hyperparameter configurations, so that high-value simplified parameter adjustment records are obtained. The next-round hyperparameter configuration that the large model derives from these more representative, high-value simplified parameter adjustment records is easier to understand and more stable.

Inventors

  • YANG BOCHEN
  • Shi Liangpeng
  • LUO JIAN
  • CHEN CHANGRU

Assignees

  • 百融云创科技股份有限公司

Dates

Publication Date
20260505
Application Date
20260123

Claims (16)

  1. A model training method combining parameter adjustment record screening and a large model, comprising: acquiring a historical parameter adjustment record set of a target machine learning model, wherein each historical parameter adjustment record comprises a historical hyperparameter configuration and the training evaluation result corresponding to that historical hyperparameter configuration; performing a screened parameter adjustment record generation operation based on the historical parameter adjustment record set to obtain a screened parameter adjustment record set, wherein the screened parameter adjustment record generation operation comprises determining, based on the historical parameter adjustment record set, the feature sensitivity of each hyperparameter of the target machine learning model to the training evaluation results, and generating the screened parameter adjustment record set based on each feature sensitivity and the historical parameter adjustment record set; generating a prompt according to a preset prompt template, wherein the prompt comprises an input data part indicating the screened parameter adjustment record set as input data and a task target part instructing the large model to generate a next-round hyperparameter configuration based on the input data; inputting the prompt into the large model, wherein the large model is used to generate the next-round hyperparameter configuration of the target machine learning model based on the screened parameter adjustment record set; and training the target machine learning model according to the next-round hyperparameter configuration.
  2. The method of claim 1, wherein determining, based on the historical parameter adjustment record set, the feature sensitivity of each hyperparameter of the target machine learning model to the training evaluation results comprises: determining first parameter adjustment records in the historical parameter adjustment record set according to a first preset rule, wherein the first preset rule is used to determine parameter adjustment records whose training evaluation results are greater than a preset evaluation result index; and determining the feature sensitivity of each hyperparameter of the target machine learning model to the training evaluation results based on each first parameter adjustment record.
  3. The method of claim 2, wherein generating the screened parameter adjustment record set based on each feature sensitivity and the historical parameter adjustment record set comprises: for the hyperparameter configurations in any two first parameter adjustment records, determining the feature distance between the two hyperparameter configurations based on each feature sensitivity; determining representative first parameter adjustment records among the first parameter adjustment records based on the feature distance between the hyperparameter configurations in any two first parameter adjustment records; and generating the screened parameter adjustment record set based on each representative first parameter adjustment record.
  4. The method of claim 3, wherein the screened parameter adjustment record generation operation further comprises: determining second parameter adjustment records in the historical parameter adjustment record set according to a second preset rule; and wherein generating the screened parameter adjustment record set based on each representative first parameter adjustment record comprises: generating the screened parameter adjustment record set based on each representative first parameter adjustment record and each second parameter adjustment record.
  5. The method of claim 4, wherein the screened parameter adjustment record generation operation further comprises: obtaining recent parameter adjustment records, which are a first preset number of historical parameter adjustment records in the historical parameter adjustment record set that have not been determined to be second parameter adjustment records or representative first parameter adjustment records and for which the time difference between the training time and the current time satisfies a preset recent-record time condition; and wherein generating the screened parameter adjustment record set based on each representative first parameter adjustment record and each second parameter adjustment record comprises: generating the screened parameter adjustment record set based on each representative first parameter adjustment record, each second parameter adjustment record, and each recent parameter adjustment record.
  6. The method of claim 3, wherein determining the representative first parameter adjustment records among the first parameter adjustment records based on the feature distance between the hyperparameter configurations in any two first parameter adjustment records comprises: determining an initially selected first parameter adjustment record among the first parameter adjustment records; and performing the following representative first parameter adjustment record selection operation until no fewer than a second preset number of first parameter adjustment records have been selected: determining, for each unselected first parameter adjustment record, the feature distance between that record and the selected first parameter adjustment records based on the feature distances between the hyperparameter configuration of that record and the hyperparameter configuration of each selected first parameter adjustment record; and determining, among the unselected first parameter adjustment records, at least one record to be selected based on the feature distance between each unselected first parameter adjustment record and the selected first parameter adjustment records.
  7. The method of claim 6, wherein determining, for each unselected first parameter adjustment record, the feature distance between that record and the selected first parameter adjustment records comprises: for each unselected first parameter adjustment record, determining, among the feature distances between the hyperparameter configuration of that record and the hyperparameter configuration of each selected first parameter adjustment record, a first feature distance that satisfies a first condition, and taking the determined first feature distance as the feature distance between that record and the selected first parameter adjustment records; and wherein determining, among the unselected first parameter adjustment records, at least one record to be selected comprises: determining, as a selected first parameter adjustment record, the unselected first parameter adjustment record whose feature distance to the selected first parameter adjustment records satisfies a second condition.
  8. The method of claim 6, wherein determining, for each unselected first parameter adjustment record, the feature distance between that record and the selected first parameter adjustment records comprises: for each unselected first parameter adjustment record, determining the average of the feature distances between the hyperparameter configuration of that record and the hyperparameter configuration of each selected first parameter adjustment record, and taking the determined average distance as the feature distance between that record and the selected first parameter adjustment records; and wherein determining, among the unselected first parameter adjustment records, at least one record to be selected comprises: determining, as a selected first parameter adjustment record, the unselected first parameter adjustment record whose feature distance to the selected first parameter adjustment records satisfies a third condition.
  9. The method of claim 2, wherein determining the first parameter adjustment records in the historical parameter adjustment record set according to the first preset rule comprises: selecting, according to the loss in the training evaluation results of the historical parameter adjustment record set, a third preset proportion or a third preset number of historical parameter adjustment records from the historical parameter adjustment record set as first parameter adjustment records, wherein the loss of each first parameter adjustment record is smaller than the loss of every historical parameter adjustment record not determined to be a first parameter adjustment record.
  10. The method of claim 1, further comprising: performing performance evaluation on the target machine learning model to obtain a training evaluation result; generating a parameter adjustment record based on the next-round hyperparameter configuration and the training evaluation result; and adding the generated parameter adjustment record to the historical parameter adjustment record set.
  11. A model training device combining parameter adjustment record screening and a large model, comprising: a parameter adjustment record acquisition module configured to acquire a historical parameter adjustment record set of a target machine learning model, wherein each historical parameter adjustment record comprises a historical hyperparameter configuration and the training evaluation result corresponding to that historical hyperparameter configuration; a parameter adjustment record screening module configured to perform a screened parameter adjustment record generation operation based on the historical parameter adjustment record set to obtain a screened parameter adjustment record set, wherein the screened parameter adjustment record generation operation comprises determining, based on the historical parameter adjustment record set, the feature sensitivity of each hyperparameter of the target machine learning model to the training evaluation results, and generating the screened parameter adjustment record set based on each feature sensitivity and the historical parameter adjustment record set; a prompt generation module configured to generate a prompt according to a preset prompt template, wherein the prompt comprises an input data part indicating the screened parameter adjustment record set as input data and a task target part instructing the large model to generate a next-round hyperparameter configuration based on the input data; a hyperparameter optimization module configured to input the prompt into the large model, wherein the large model is used to generate the next-round hyperparameter configuration of the target machine learning model based on the screened parameter adjustment record set; and a model training module configured to train the target machine learning model according to the next-round hyperparameter configuration.
  12. A data classification method, comprising: acquiring data to be classified; and inputting the data to be classified into a data classification model to obtain a classification result corresponding to the data to be classified, wherein the data classification model is trained in advance through the following training steps: acquiring a historical parameter adjustment record set of the data classification model, wherein each historical parameter adjustment record comprises a historical hyperparameter configuration and the data classification model training evaluation result corresponding to that historical hyperparameter configuration; performing a screened parameter adjustment record generation operation based on the historical parameter adjustment record set to obtain a screened parameter adjustment record set, wherein the screened parameter adjustment record generation operation comprises determining, based on the historical parameter adjustment record set, the data classification feature sensitivity of each hyperparameter of the data classification model to the data classification model training evaluation results, and generating the screened parameter adjustment record set based on each data classification feature sensitivity and the historical parameter adjustment record set; generating a prompt according to a preset prompt template, wherein the prompt comprises an input data part indicating the screened parameter adjustment record set as input data and a task target part instructing generation of a next-round hyperparameter configuration based on the input data; inputting the prompt into a large model, wherein the large model is used to generate the next-round hyperparameter configuration of the data classification model based on the screened parameter adjustment record set; and training the data classification model according to the next-round hyperparameter configuration.
  13. A data classification apparatus, comprising: a data acquisition module configured to acquire data to be classified; and a data classification module configured to input the data to be classified into a data classification model to obtain a classification result corresponding to the data to be classified, wherein the data classification model is trained in advance through the following training steps: acquiring a historical parameter adjustment record set of the data classification model, wherein each historical parameter adjustment record comprises a historical hyperparameter configuration and the data classification model training evaluation result corresponding to that historical hyperparameter configuration; performing a screened parameter adjustment record generation operation based on the historical parameter adjustment record set to obtain a screened parameter adjustment record set, wherein the screened parameter adjustment record generation operation comprises determining, based on the historical parameter adjustment record set, the data classification feature sensitivity of each hyperparameter of the data classification model to the data classification model training evaluation results, and generating the screened parameter adjustment record set based on each data classification feature sensitivity and the historical parameter adjustment record set; generating a prompt according to a preset prompt template, wherein the prompt comprises an input data part indicating the screened parameter adjustment record set as input data and a task target part instructing generation of a next-round hyperparameter configuration based on the input data; inputting the prompt into a large model, wherein the large model is used to generate the next-round hyperparameter configuration of the data classification model based on the screened parameter adjustment record set; and training the data classification model according to the next-round hyperparameter configuration.
  14. An electronic device, comprising: one or more processors; and a storage device having one or more programs stored thereon, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-10 and/or the method of claim 12.
  15. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by one or more processors, implements the method of any one of claims 1-10 and/or the method of claim 12.
  16. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the method of any one of claims 1-10 and/or the method of claim 12.
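
The representative-record selection recited in claims 3 and 6 to 8 can be illustrated with a minimal Python sketch. The claims do not fix the distance formula, the first, second, and third conditions, or how the initially selected record is chosen, so everything below is an assumption made for illustration: a sensitivity-weighted Euclidean distance over numeric hyperparameters (ideally pre-scaled to comparable ranges), "minimum distance to the selected set" as the first condition, "largest such distance" as the second and third conditions, and the lowest-loss record as the initial selection. The names `feature_distance`, `select_representatives`, and `mean` are hypothetical.

```python
import math


def feature_distance(cfg_a, cfg_b, sensitivity):
    """Sensitivity-weighted Euclidean distance (an assumed form, not from the claims)."""
    return math.sqrt(sum(
        sensitivity[name] * (cfg_a[name] - cfg_b[name]) ** 2
        for name in sensitivity
    ))


def mean(values):
    """Average of an iterable; used for the claim 8 style aggregation."""
    values = list(values)
    return sum(values) / len(values)


def select_representatives(records, sensitivity, k, aggregate=min):
    """Greedily pick up to k records that are spread out in weighted hyperparameter space.

    aggregate=min  -> distance to the nearest selected record (claim 7 style, assumed)
    aggregate=mean -> average distance to the selected records (claim 8 style)
    """
    if not records:
        return []
    # Assumption: the initially selected record is the one with the lowest loss.
    remaining = sorted(records, key=lambda r: r["loss"])
    selected = [remaining.pop(0)]
    while remaining and len(selected) < k:
        def dist_to_selected(rec):
            return aggregate(
                feature_distance(rec["config"], s["config"], sensitivity)
                for s in selected
            )
        # Assumption: the second/third condition is "largest aggregated distance".
        best = max(remaining, key=dist_to_selected)
        remaining.remove(best)
        selected.append(best)
    return selected


records = [
    {"config": {"lr": 0.10, "depth": 0.2}, "loss": 0.30},
    {"config": {"lr": 0.11, "depth": 0.2}, "loss": 0.31},
    {"config": {"lr": 0.90, "depth": 0.8}, "loss": 0.35},
    {"config": {"lr": 0.50, "depth": 0.5}, "loss": 0.33},
]
sensitivity = {"lr": 0.7, "depth": 0.3}
picked = select_representatives(records, sensitivity, k=3)
```

With these sample values the near-duplicate configuration (`lr` 0.11) is skipped in favor of the two configurations that lie farther from the initial record, which is the intended effect of screening by feature distance. Passing `aggregate=mean` switches to the average-distance variant.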

Description

Model training method combining parameter adjustment record screening and large model, and related product

Technical Field

Embodiments of the present disclosure relate to the technical field of machine learning, and in particular to a model training method combining parameter adjustment record screening and a large model, and to related products.

Background

Machine learning model training often uses a Bayesian optimization method to generate the next-round hyperparameter configuration from the hyperparameter configuration of each historical training round and the training evaluation result obtained by evaluating the performance of the corresponding model. However, this process requires an automated search over all historical parameter adjustment records (each comprising a hyperparameter configuration and a training evaluation result). When the sample size is large or the model structure is complex, a single hyperparameter setting therefore takes a long time, which in turn raises the time cost of each training round, comprising parameter optimization and training with the optimized hyperparameters. In addition, although a large model can analyze the historical parameter adjustment records to generate the next-round hyperparameter configuration, if all historical parameter adjustment records are analyzed, the amount of data (i.e., tokens) that the large model must analyze is too large and computation is slow.

Disclosure of Invention

Embodiments of the present disclosure propose model training methods, apparatuses, electronic devices, storage media, and computer program products that combine parameter adjustment record screening with large models.
In a first aspect, embodiments of the present disclosure provide a model training method combining parameter adjustment record screening and a large model, the method comprising: acquiring a historical parameter adjustment record set of a target machine learning model, wherein each historical parameter adjustment record comprises a historical hyperparameter configuration and the training evaluation result corresponding to that historical hyperparameter configuration; performing a screened parameter adjustment record generation operation based on the historical parameter adjustment record set to obtain a screened parameter adjustment record set, wherein the screened parameter adjustment record generation operation comprises determining, based on the historical parameter adjustment record set, the feature sensitivity of each hyperparameter of the target machine learning model to the training evaluation results, and generating the screened parameter adjustment record set based on each feature sensitivity and the historical parameter adjustment record set; generating a prompt according to a preset prompt template, wherein the prompt comprises an input data part indicating the screened parameter adjustment record set as input data and a task target part instructing the large model to generate a next-round hyperparameter configuration based on the input data; inputting the prompt into the large model, wherein the large model is used to generate the next-round hyperparameter configuration of the target machine learning model based on the screened parameter adjustment record set; and training the target machine learning model according to the next-round hyperparameter configuration.
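The sensitivity and prompt-generation steps of the first aspect can be sketched as follows. This is only an illustration under stated assumptions: the text above does not say how feature sensitivity is computed, so the sketch uses the normalized absolute Pearson correlation between each numeric hyperparameter and the loss; the template wording, the record layout, and the names `PROMPT_TEMPLATE`, `feature_sensitivity`, `build_prompt`, `next_round_config`, and `call_large_model` are hypothetical, and the large model is represented by any callable that maps a prompt string to a JSON reply.

```python
import json
import math

# Hypothetical preset prompt template: an input data part and a task target part.
PROMPT_TEMPLATE = (
    "Input data: the screened parameter adjustment records below (JSON).\n"
    "{records}\n"
    "Task: analyze how the evaluation results change with the hyperparameters "
    "and return the next-round hyperparameter configuration as a JSON object."
)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def feature_sensitivity(records):
    """Assumed sensitivity measure: |correlation with loss|, normalized to sum to 1."""
    names = list(records[0]["config"])
    losses = [r["loss"] for r in records]
    raw = {
        name: abs(pearson([r["config"][name] for r in records], losses))
        for name in names
    }
    total = sum(raw.values())
    if total == 0:
        return {name: 1.0 / len(names) for name in names}
    return {name: value / total for name, value in raw.items()}


def build_prompt(screened_records):
    """Fill the preset template with the screened records as input data."""
    return PROMPT_TEMPLATE.format(records=json.dumps(screened_records, indent=2))


def next_round_config(screened_records, call_large_model):
    """Send the prompt to the model (any callable) and parse its JSON reply."""
    reply = call_large_model(build_prompt(screened_records))
    return json.loads(reply)


history = [
    {"config": {"lr": 0.1, "depth": 3}, "loss": 0.5},
    {"config": {"lr": 0.2, "depth": 3}, "loss": 0.4},
    {"config": {"lr": 0.3, "depth": 3}, "loss": 0.3},
]
sens = feature_sensitivity(history)
# Stand-in for a real large model call, used here only so the sketch runs offline.
config = next_round_config(history, lambda prompt: '{"lr": 0.25, "depth": 3}')
```

In this toy history only `lr` varies with the loss, so it receives all of the sensitivity weight while the constant `depth` receives none. In practice the records passed to `build_prompt` would be the screened subset rather than the full history, which is what keeps the token count low.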
In some optional embodiments, determining, based on the historical parameter adjustment record set, the feature sensitivity of each hyperparameter of the target machine learning model to the training evaluation results comprises: determining first parameter adjustment records in the historical parameter adjustment record set according to a first preset rule, wherein the first preset rule is used to determine parameter adjustment records whose training evaluation results are greater than a preset evaluation result index; and determining the feature sensitivity of each hyperparameter of the target machine learning model to the training evaluation results based on each first parameter adjustment record.
In some optional embodiments, generating the screened parameter adjustment record set based on each feature sensitivity and the historical parameter adjustment record set comprises: for the hyperparameter configurations in any two first parameter adjustment records, determining the feature distance between the two hyperparameter configurations based on each feature sensitivity; determining representative first parameter adjustment records among the first parameter adjustment records based on the feature distance between the hyperparameter configurations in any two first parameter adjustment records; and generating the screened parameter adjustment record set based on each representative first parameter adjustment record.
In some optional embodiments, the filtering tuning rec