CN-120600190-B - Model training method and program product
Abstract
The application discloses a model training method and a program product. The method comprises: determining first indication information, wherein the first indication information indicates the complexity of the structure of a first artificial intelligence (AI) model and comprises one or more data metrics of first training data used for training the first AI model; determining, according to the first indication information, a first value of a hyperparameter of the first AI model for the first indication information; setting the hyperparameter of the first AI model to the first value; and training the first AI model using the first training data, wherein the first training data comprises theoretical signals corresponding to N values of parameters of a micro-nano structure, the first AI model is used for calculating the theoretical signals based on different values of the parameters of the micro-nano structure, and N is a positive integer. In this way, the performance of the first AI model is effectively guaranteed, the accuracy of the calculated theoretical signals is improved, the risk of vanishing gradients is effectively reduced, and model training efficiency is improved.
Inventors
- CHEN LU
- MA YANZHONG
- MA XIANGYU
- WEN LANGFENG
Assignees
- 深圳中科飞测科技股份有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20250730
Claims (9)
- 1. A model training method, the method comprising: determining first indication information, wherein the first indication information is used for indicating the complexity of a structure of a first AI model, the first indication information comprises one or more data metrics of first training data for training the first AI model, the one or more data metrics of the first training data are used for indicating the scale of the first training data and the characteristics of the data contained in the first training data, and the first indication information comprises one or more of: the data quantity N of parameters of a micro-nano structure included in the first training data, the value range of the parameters of the micro-nano structure in the first training data, how the theoretical signals in the first training data vary with the values of the parameters of the micro-nano structure, or how the parameter values of each theoretical signal in the first training data vary with wavelength; determining second indication information, wherein the second indication information comprises one or more data metrics of second training data for pre-training the first AI model; pre-training the first AI model by using the second training data, wherein the second training data comprises theoretical signals respectively corresponding to M values of the parameters of the micro-nano structure, and M is a positive integer smaller than N; determining a second value of a hyperparameter of the first AI model for the second indication information according to the training condition of pre-training the first AI model; amplifying the second value into a first value of the hyperparameter of the first AI model for the first indication information according to the difference proportion between the first indication information and the second indication information, wherein the first value is the product of the second value and the difference proportion; setting the hyperparameter of the first AI model to the first value; and training the first AI model by using the first training data, wherein the first training data comprises theoretical signals respectively corresponding to N values of the parameters of the micro-nano structure, the first AI model is used for calculating the theoretical signals based on different values of the parameters of the micro-nano structure, and N is a positive integer.
- 2. The method according to claim 1, wherein the method further comprises: in the process of training the first AI model by using the first training data, when the training condition of the first AI model meets a first condition, adjusting the hyperparameter of the first AI model to a third value according to the training condition of the first AI model, wherein the third value is used for optimizing the training condition of the first AI model; wherein the first condition includes that, during iteration, the performance improvement of the first AI model is below a first threshold, or that a training error of the first AI model exceeds a second threshold.
- 3. The method of claim 1 or 2, wherein the hyperparameters of the first AI model include one or more of a network width, a network depth, a learning rate, a regularization strength, or a batch size of the first AI model.
- 4. The method of claim 2, wherein the second indication information includes one or more of: M, the value range of the parameters of the micro-nano structure in the second training data, how the theoretical signals in the second training data vary with the values of the parameters of the micro-nano structure, or how the parameter values of each theoretical signal in the second training data vary with a measurement condition.
- 5. The method of any one of claims 1, 2, and 4, further comprising: measuring a measurement signal of the micro-nano structure; and executing a fitting algorithm on the measurement signal by using the first AI model to obtain target values of the parameters of the micro-nano structure, wherein the theoretical signals corresponding to the target values match the measurement signal.
- 6. The method of claim 5, wherein prior to fitting the measurement signal, the method further comprises: determining third indication information, wherein the third indication information comprises one or more data metrics of the micro-nano structure; and adjusting the values of the hyperparameters of the fitting algorithm according to the third indication information, wherein the hyperparameters of the fitting algorithm comprise iteration compensation parameters.
- 7. The method of claim 5, wherein executing the fitting algorithm on the measurement signal by using the first AI model to obtain the target values of the parameters of the micro-nano structure comprises: determining a fitting data set by using the first AI model, wherein the fitting data set comprises theoretical signals corresponding to values of the parameters of the micro-nano structure that fall within a fitting numerical range; and executing the fitting algorithm on the measurement signal by searching the fitting data set to obtain the target values within the fitting numerical range.
- 8. A model training apparatus, the apparatus comprising: a determining module, configured to: determine first indication information, wherein the first indication information is used for indicating the complexity of a structure of a first AI model, the first indication information comprises one or more data metrics of first training data for training the first AI model, the one or more data metrics of the first training data are used for indicating the scale of the first training data and the characteristics of the data contained in the first training data, and the first indication information comprises one or more of: the data quantity N of parameters of a micro-nano structure included in the first training data, the value range of the parameters of the micro-nano structure in the first training data, how the theoretical signals in the first training data vary with the values of the parameters of the micro-nano structure, or how the parameter values of each theoretical signal in the first training data vary with wavelength; determine second indication information, wherein the second indication information comprises one or more data metrics of second training data for pre-training the first AI model; pre-train the first AI model by using the second training data, wherein the second training data comprises theoretical signals respectively corresponding to M values of the parameters of the micro-nano structure, and M is a positive integer smaller than N; determine a second value of a hyperparameter of the first AI model for the second indication information according to the training condition of pre-training the first AI model; and amplify the second value into a first value of the hyperparameter of the first AI model for the first indication information according to the difference proportion between the first indication information and the second indication information, wherein the first value is the product of the second value and the difference proportion; a setting module, configured to set the hyperparameter of the first AI model to the first value; and a training module, configured to train the first AI model by using the first training data, wherein the first training data comprises theoretical signals respectively corresponding to N values of the parameters of the micro-nano structure, the first AI model is used for calculating the theoretical signals based on different values of the parameters of the micro-nano structure, and N is a positive integer.
- 9. A computer program product containing instructions that, when run on a computing device, cause the computing device to perform the method of any of claims 1 to 7.
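The first condition recited in claim 2 can be illustrated with a minimal sketch. It assumes the training error is recorded once per iteration and that the adjusted hyperparameter is the learning rate. The function names, the halving factor, and the use of the last two errors to measure improvement are hypothetical choices for illustration and are not taken from the claims.

```python
def first_condition_met(errors, first_threshold, second_threshold):
    """Check the first condition on a history of training errors (latest last).

    Assumption: "performance improvement" is measured as the drop in
    training error between the two most recent iterations.
    """
    if not errors:
        return False
    # Branch 1: the training error exceeds the second threshold.
    if errors[-1] > second_threshold:
        return True
    # Branch 2: the improvement over the last iteration is below the first threshold.
    if len(errors) >= 2 and (errors[-2] - errors[-1]) < first_threshold:
        return True
    return False


def third_value(learning_rate, errors, first_threshold, second_threshold, factor=0.5):
    """Return the adjusted learning rate (hypothetical policy: multiply by `factor`)."""
    if first_condition_met(errors, first_threshold, second_threshold):
        return learning_rate * factor
    return learning_rate


if __name__ == "__main__":
    history = [1.0, 0.999]  # training has almost stopped improving
    print(third_value(0.01, history, first_threshold=0.01, second_threshold=2.0))
```

Any other adjustment policy, such as changing the batch size or the regularization strength, would fit the same trigger.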
Description
Model training method and program product
Technical Field
The application relates to the field of nanotechnology, and in particular to a model training method and a program product.
Background
With the rapid development of technology, nanotechnology is being applied in more and more industries and fields. The micro-nano structure is a core application of nanotechnology and, owing to its unique size effect, has great application potential in fields such as information technology, energy, and biomedicine. For example, semiconductor micro-nano structures have excellent electronic and photonic properties and have become key materials for high-performance electronic and optoelectronic devices.
In practical applications, determining the parameter information of a micro-nano structure is important. For example, the parameter information can be used to describe the performance of the micro-nano structure or for quality inspection of the micro-nano structure. The parameters of the micro-nano structure may be, for example, its critical dimension (CD), period, height, or sidewall angle.
Typically, an artificial intelligence (AI) model may be used to determine the parameter information of the micro-nano structure. For example, the scattering signal of the micro-nano structure can be measured with different measurement technologies or measurement devices (such as scatterometry, an ellipsometer, CD small-angle X-ray scattering, or a CD scanning electron microscope), and the theoretical signals for different values of the parameters of the micro-nano structure are calculated by the AI model, so that the values of the parameters of the micro-nano structure are obtained by inversion, that is, by comparing and matching the different theoretical signals with the measured signal.
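The comparing-and-matching step can be pictured as a nearest-match search over candidate parameter values. The sketch below is illustrative only: `theoretical_signal` is a hypothetical toy stand-in for the trained AI model (it is not a physical scattering model), and the sum of squared differences is an assumed matching criterion.

```python
def theoretical_signal(cd, wavelengths):
    # Hypothetical stand-in for the AI model: maps one parameter value (here a CD)
    # to a signal sampled at the given wavelengths. Not a physical model.
    return [cd / w for w in wavelengths]


def invert(measured, candidates, wavelengths, model=theoretical_signal):
    """Return the candidate value whose theoretical signal best matches `measured`."""
    def mismatch(value):
        theory = model(value, wavelengths)
        # Assumed criterion: sum of squared differences between the two signals.
        return sum((t - m) ** 2 for t, m in zip(theory, measured))
    return min(candidates, key=mismatch)


if __name__ == "__main__":
    wavelengths = [400.0, 500.0, 600.0]
    measured = theoretical_signal(45.0, wavelengths)  # synthetic "measurement"
    print(invert(measured, [40.0, 45.0, 50.0], wavelengths))
```

In practice the candidate grid would be replaced by whatever search or fitting algorithm the measurement application uses. The sketch only shows why a fast and accurate theoretical-signal model matters for inversion.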
However, the AI model may suffer from problems such as low training efficiency and performance that cannot meet the measurement requirements of different micro-nano structures.
Disclosure of Invention
The embodiments of the application provide a model training method for improving the training efficiency and performance of an AI model used to determine parameter information of a micro-nano structure. In addition, the embodiments of the application also provide a corresponding apparatus, a device, a storage medium, and a computer program product.
In a first aspect, an embodiment of the present application provides a model training method, including: determining first indication information, where the first indication information is used to indicate the complexity of a structure of a first artificial intelligence (AI) model and includes one or more data metrics of first training data used to train the first AI model; determining, according to the first indication information, a first value of a hyperparameter of the first AI model for the first indication information; setting the hyperparameter of the first AI model to the first value; and training the first AI model with the first training data, where the first training data includes N theoretical spectra corresponding to N values of parameters of the micro-nano structure, the first AI model is used to calculate the theoretical spectra based on different values of the parameters of the micro-nano structure, and N is a positive integer.
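Claim 1 recites one way of obtaining the first value: the second value found during pre-training is multiplied by the difference proportion between the first and second indication information. The sketch below is a worked instance under a stated assumption: the difference proportion is taken to be N / M, the ratio of the two data quantities. The source does not define the proportion in this excerpt, and the function names are hypothetical.

```python
def difference_proportion(n, m):
    """Assumed definition: ratio of the full data quantity N to the pre-training quantity M."""
    if not 0 < m < n:
        raise ValueError("M must be a positive integer smaller than N")
    return n / m


def first_value(second_value, n, m):
    """First value = second value x difference proportion, as recited in claim 1."""
    return second_value * difference_proportion(n, m)


if __name__ == "__main__":
    # Pre-training on M = 1000 samples suggested a network width of 64;
    # the full training set has N = 4000 samples.
    width = first_value(64, n=4000, m=1000)
    print(round(width))  # integer-valued hyperparameters need rounding
```

A proportion built from other data metrics, such as the value range of the parameters, would plug into `first_value` in the same way.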
In one possible implementation, determining the first value of the hyperparameter of the first AI model according to the first indication information includes: determining second indication information, where the second indication information includes one or more data metrics of second training data for pre-training the first AI model; pre-training the first AI model by using the second training data, where the second training data includes theoretical spectra respectively corresponding to M values of the parameters of the micro-nano structure, and M is a positive integer smaller than N; determining a second value of the hyperparameter of the first AI model for the second indication information according to the training condition of pre-training the first AI model; and determining the first value based on the second value according to the difference between the first indication information and the second indication information.
In one possible implementation, the method further includes: in the process of training the first AI model by using the first training data, when the training condition of the first AI model meets a first condition, adjusting the hyperparameter of the first AI model to a third value according to the training condition of the first AI model, where the third value is used for optimizing the training condition of the first AI model, and the first condition comprises the step of iterating that the perform