
CN-122021746-A - Quantization method, quantization apparatus, computer device, chip and readable storage medium for a generative model


Abstract

The application relates to a quantization method, a quantization apparatus, a computer device, a chip and a readable storage medium for a generative model. The method comprises: obtaining a plurality of candidate sampling coefficients of a generative model and the generated data corresponding to each running step; calculating, for each candidate sampling coefficient, a data acquisition state list based on that coefficient, a preset initial sampling interval and a maximum step-number threshold of the running steps; collecting data from the generated data of each running step according to the data acquisition state list to obtain calibration data; and performing quantization screening on the generative model using each set of calibration data to obtain a target quantization model of the generative model. With this method, the generated data of each running step can be screened and the higher-quality data extracted as calibration data, which further improves the quantization efficiency and accuracy of the model.

Inventors

  • YANG JINXIAO

Assignees

  • 展讯通信(上海)有限公司

Dates

Publication Date
20260512
Application Date
20260128

Claims (10)

  1. A quantization method for a generative model, the method comprising: acquiring a plurality of candidate sampling coefficients of a generative model and the generated data corresponding to each running step; calculating a data acquisition state list corresponding to each candidate sampling coefficient based on that candidate sampling coefficient, a preset initial sampling interval and a maximum step-number threshold of the running steps; collecting data, from the generated data corresponding to each running step, based on the data acquisition state list to obtain calibration data; and performing quantization screening on the generative model using each set of calibration data to obtain a target quantization model of the generative model.
  2. The method according to claim 1, wherein the method further comprises: performing inference on the generative model multiple times, based on the data to be inferred of the generative model and the maximum step-number threshold of the running steps, to obtain the generated data corresponding to each running step, wherein the input data of the initial running step of the generative model comprises the data to be inferred and noise data, and the input data of each other running step comprises the data to be inferred and the generated data of the previous running step.
  3. The method according to claim 2, wherein the method further comprises: within a preset attenuation coefficient interval, calculating a plurality of candidate sampling coefficients of the generative model based on the attenuation coefficient search step corresponding to the generative model and a preset sampling attenuation coefficient.
  4. The method according to claim 1, wherein calculating the data acquisition state list corresponding to each candidate sampling coefficient based on that candidate sampling coefficient, the preset initial sampling interval and the maximum step-number threshold of the running steps comprises: for each candidate sampling coefficient, taking the preset initial sampling interval as the initial sampling interval, and processing the candidate sampling coefficient and the initial sampling interval using a preset rounding function and the maximum step-number threshold to obtain a plurality of sampling intervals following the initial sampling interval; and obtaining the data acquisition state list corresponding to the candidate sampling coefficient based on the data sampling interval list.
  5. The method according to claim 1, wherein performing quantization screening on the generative model using each set of calibration data to obtain the target quantization model of the generative model comprises: performing post-training quantization on the generative model with each set of calibration data to obtain an initial quantization model corresponding to that calibration data; and screening the initial quantization models to obtain the target quantization model.
  6. The method according to claim 5, wherein screening the initial quantization models to obtain the target quantization model comprises: acquiring a verification data set of the generative model, wherein the verification data set comprises verification input data and verification generated data; running inference on the verification input data with each initial quantization model to obtain output generated data, and calculating the similarity between the verification generated data and each output generated data; and determining the initial quantization model corresponding to the output generated data with the highest similarity as the target quantization model of the generative model.
  7. A quantization apparatus for a generative model, the apparatus comprising: a first acquisition module, configured to acquire a plurality of candidate sampling coefficients of the generative model and the generated data corresponding to each running step; a first calculation module, configured to calculate a data acquisition state list corresponding to each candidate sampling coefficient based on that candidate sampling coefficient, a preset initial sampling interval and a maximum step-number threshold of the running steps; a calibration data acquisition module, configured to collect data, from the generated data corresponding to each running step, based on the data acquisition state list to obtain calibration data; and a quantization module, configured to perform quantization screening on the generative model using each set of calibration data to obtain a target quantization model of the generative model.
  8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
  9. A chip comprising a processor and a communication interface, the processor being configured to cause the chip to perform the steps of the method of any one of claims 1 to 6.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 6.
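
As a rough illustration of the screening in claims 5 and 6, the Python sketch below quantizes a toy model once per calibration set and keeps the candidate whose outputs are most similar to the reference outputs. The names `toy_ptq`, `cosine_similarity` and `select_target_model`, the 8-bit symmetric quantizer and the use of cosine similarity are all assumptions made for this example; the claims do not name a quantization scheme or a similarity measure.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Model = Callable[[Vector], List[float]]


def toy_ptq(fn: Model, calibration: Sequence[Vector], bits: int = 8) -> Model:
    """Hypothetical stand-in for post-training quantization.

    The output scale is derived from the largest magnitude the float model
    produces on the calibration data; outputs are then rounded to that grid
    and clipped to the representable range.
    """
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(v) for x in calibration for v in fn(x))
    scale = peak / qmax if peak else 1.0

    def quantized(x: Vector) -> List[float]:
        return [max(-qmax, min(qmax, round(v / scale))) * scale for v in fn(x)]

    return quantized


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Assumed similarity measure between two outputs of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_target_model(
    candidates: Dict[float, Model],
    val_inputs: Sequence[Vector],
    val_outputs: Sequence[Vector],
) -> Tuple[float, float]:
    """Return (candidate key, mean similarity) of the best initial model."""
    if not candidates or not val_inputs:
        raise ValueError("need at least one candidate and one validation sample")
    best_key, best_score = None, -math.inf
    for key, model in candidates.items():
        scores = [cosine_similarity(model(x), y)
                  for x, y in zip(val_inputs, val_outputs)]
        score = sum(scores) / len(scores)
        if score > best_score:
            best_key, best_score = key, score
    return best_key, best_score
```

In this sketch each dictionary key would be a candidate sampling coefficient and each value the initial quantization model calibrated with the data collected for that coefficient. A calibration set that under-represents the output range leads to clipping, a lower similarity score, and therefore rejection of that candidate.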

Description

Quantization method, quantization apparatus, computer device, chip and readable storage medium for a generative model

Technical Field

The present application relates to the field of computer technologies, and in particular to a quantization method, an apparatus, a computer device, a chip and a readable storage medium for a generative model.

Background

In the field of generative modeling, generation can be implemented with a diffusion model. A diffusion model comprises a forward stage and a reverse stage: in the forward stage, noise is added to the data step by step until the data has become pure Gaussian noise; in the reverse stage, the original data is restored from the Gaussian noise. In this way the generative capability is trained. In the practical training of a diffusion model, the model needs to be compressed so that it can run stably on devices of differing capability. In the related art, a diffusion model is generally compressed by quantization. In the actual quantization process, however, the data corresponding to all time steps is collected for quantization, and because the quality of the data at individual time steps is low, the accuracy of the resulting quantization model is low.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a quantization method, an apparatus, a computer device, a chip and a readable storage medium for a generative model that can improve the quantization effect and the quantization accuracy.
In a first aspect, the present application provides a quantization method for a generative model, including: acquiring a plurality of candidate sampling coefficients of a generative model and the generated data corresponding to each running step; calculating a data acquisition state list corresponding to each candidate sampling coefficient based on that candidate sampling coefficient, a preset initial sampling interval and a maximum step-number threshold of the running steps; collecting data, from the generated data corresponding to each running step, based on the data acquisition state list to obtain calibration data; and performing quantization screening on the generative model using each set of calibration data to obtain a target quantization model of the generative model.

In one embodiment, inference is performed on the generative model multiple times, based on the data to be inferred of the generative model and the maximum step-number threshold of the running steps, to obtain the generated data corresponding to each running step. The input data of the initial running step of the generative model comprises the data to be inferred and noise data, and the input data of each other running step comprises the data to be inferred and the generated data of the previous running step.

In one embodiment, the method further comprises: within a preset attenuation coefficient interval, calculating a plurality of candidate sampling coefficients of the generative model based on the attenuation coefficient search step corresponding to the generative model and a preset sampling attenuation coefficient.
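
The candidate coefficients and the data acquisition state list described above can be pictured with a short Python sketch. The concrete rules here are assumptions made for illustration, not the patent's formulas: candidates are taken as an evenly spaced grid over the preset interval, and the k-th sampling interval is assumed to be `round(start_interval * coeff**k)`, clamped to at least 1. The function names `candidate_coefficients`, `acquisition_states` and `collect_calibration` are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def candidate_coefficients(lo: float, hi: float, step: float) -> List[float]:
    """Assumed form: an evenly spaced grid of coefficients within [lo, hi]."""
    if step <= 0 or hi < lo:
        raise ValueError("need step > 0 and lo <= hi")
    count = int((hi - lo) / step + 1e-9) + 1
    return [round(lo + i * step, 10) for i in range(count)]


def acquisition_states(coeff: float, start_interval: int,
                       max_steps: int) -> List[bool]:
    """Return one flag per running step; True means "collect this step".

    Assumed rule: the first sample is taken at step 0, and the gap to the
    k-th following sample is round(start_interval * coeff**k), never below 1.
    Python's built-in round() plays the role of the preset rounding function.
    """
    if start_interval < 1 or max_steps < 1 or coeff <= 0:
        raise ValueError("start_interval and max_steps must be >= 1, coeff > 0")
    states = [False] * max_steps
    step, interval = 0, float(start_interval)
    while step < max_steps:
        states[step] = True
        step += max(1, round(interval))
        interval *= coeff
    return states


def collect_calibration(generated: Sequence[T],
                        states: Sequence[bool]) -> List[T]:
    """Keep the generated data of every running step whose flag is True."""
    if len(generated) != len(states):
        raise ValueError("one state flag is required per running step")
    return [item for item, keep in zip(generated, states) if keep]
```

Under these assumptions, a coefficient of 0.5 with a start interval of 4 and 10 running steps selects steps 0, 4, 6, 7, 8 and 9, so sampling becomes denser toward the later steps, while a coefficient of 1.0 keeps the interval constant. Each candidate coefficient yields its own calibration set, which is what the later screening compares.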
In one embodiment, calculating the data acquisition state list corresponding to each candidate sampling coefficient based on that candidate sampling coefficient, the preset initial sampling interval and the maximum step-number threshold of the running steps includes: for each candidate sampling coefficient, taking the preset initial sampling interval as the initial sampling interval, and processing the candidate sampling coefficient and the initial sampling interval using a preset rounding function and the maximum step-number threshold to obtain a plurality of sampling intervals following the initial sampling interval; and obtaining the data acquisition state list corresponding to the candidate sampling coefficient based on the data sampling interval list.

In one embodiment, performing quantization screening on the generative model using each set of calibration data to obtain the target quantization model of the generative model includes: performing post-training quantization on the generative model with each set of calibration data to obtain an initial quantization model corresponding to that calibration data; and screening the initial quantization models to obtain the target quantization model.

In one embodiment, screening the initial quantization models to obtain the target quantization model includes: acquiring a verification data set of the generative model, wherein the verification data set comprises verification input data and verification generated data; running inference on the verification input data with each initial quantization model to obtain output generated data, and calculating the similarity between the verification generated data and each output generated data; And det