CN-122021772-A - Model training method, electronic equipment, medium and product

Abstract

The application provides a model training method, an electronic device, a medium, and a product. The method comprises: obtaining sample data, wherein the sample data comprises a feature matrix of a sample vulnerability; training a long short-term memory (LSTM) network layer of a network model based on the feature matrix to obtain a first training result; training a positional encoding layer of the network model based on the feature matrix to obtain a second training result; determining a loss function value of the network model based on the first training result and the second training result; and adjusting parameters of the network model based on the loss function value to obtain a trained network model, so that vulnerabilities can be accurately identified using the trained network model.

Inventors

  • PENG YUAN
  • CAI QISHEN
  • GUO YAYUN

Assignees

  • 中移(苏州)软件技术有限公司
  • 中国移动通信集团有限公司

Dates

Publication Date
2026-05-12
Application Date
2026-01-27

Claims (10)

  1. A model training method, the method comprising: acquiring sample data, wherein the sample data comprises a feature matrix of a sample vulnerability; training a long short-term memory (LSTM) network layer of a network model based on the feature matrix to obtain a first training result; training a positional encoding layer of the network model based on the feature matrix to obtain a second training result; determining a loss function value of the network model based on the first training result and the second training result; and adjusting parameters of the network model based on the loss function value to obtain a trained network model.
  2. The method of claim 1, wherein the acquiring sample data comprises: obtaining vulnerability information, wherein the vulnerability information comprises text information of the sample vulnerability; vectorizing the text information to obtain a plurality of word vectors corresponding to the sample vulnerability; and obtaining the feature matrix based on the plurality of word vectors.
  3. The method of claim 1, wherein the training the LSTM network layer of the network model based on the feature matrix to obtain the first training result comprises: inputting the feature matrix into an embedding layer of the network model to perform word embedding processing to obtain an embedding vector sequence of the sample vulnerability; and training the LSTM network layer based on the embedding vector sequence to obtain temporal features at different time steps of the embedding vector sequence, wherein the first training result comprises the temporal features.
  4. The method of claim 1, wherein the training the positional encoding layer of the network model based on the feature matrix to obtain the second training result comprises: inputting the feature matrix into an embedding layer of the network model to perform word embedding processing to obtain an embedding vector sequence of the sample vulnerability; and training the positional encoding layer based on the embedding vector sequence to obtain position features of different positions in the embedding vector sequence, wherein the second training result comprises the position features.
  5. The method of claim 1, wherein the determining the loss function value of the network model based on the first training result and the second training result comprises: inputting the first training result and the second training result into a fusion layer of the network model to obtain a fusion training result; inputting the fusion training result into a multi-head attention layer of the network model to perform weighted fusion to obtain a weighted result, wherein the weighted result indicates the correlation between the first training result and the second training result; and determining the loss function value based on the weighted result.
  6. The method of claim 5, wherein the determining the loss function value based on the weighted result comprises: training a feedforward network layer of the network model based on the weighted result to obtain a third training result; determining predicted values of a plurality of sample vulnerabilities based on the third training result, wherein the predicted values indicate predicted classification results of the plurality of sample vulnerabilities; and determining the loss function value based on an error between the predicted values and target values.
  7. The method of claim 1, wherein the method further comprises: acquiring monitoring text information of a vulnerability monitoring platform; inputting the monitoring text information into the trained network model for recognition, and determining whether the monitoring text information comprises vulnerability information; and if so, generating vulnerability early warning information based on the type of the vulnerability.
  8. An electronic device comprising a processor and a memory for storing a computer program capable of running on the processor, wherein the processor is configured to perform the steps of the method of any one of claims 1 to 7 when running the computer program.
  9. A storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of any one of claims 1 to 7.
  10. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
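
The loss and parameter-adjustment steps recited in claims 1 and 6 can be illustrated with a small sketch. The claims do not name a specific loss function or optimizer, so the softmax cross-entropy loss, the plain gradient-descent update, the linear classification head, and all names and values below (`forward`, `sgd_step`, `features`, `lr`) are assumptions chosen only for illustration.

```python
import math

def softmax(logits):
    """Turn raw class scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(weights, bias, features):
    """Hypothetical linear head: one logit per class from a feature vector."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]

def cross_entropy(probs, target):
    """Error between the predicted distribution and the target class index."""
    return -math.log(probs[target])

def sgd_step(weights, bias, features, target, lr=0.5):
    """One gradient-descent update; returns the loss measured before the update."""
    probs = softmax(forward(weights, bias, features))
    loss = cross_entropy(probs, target)
    for c in range(len(weights)):
        grad = probs[c] - (1.0 if c == target else 0.0)  # d(loss)/d(logit_c)
        for j in range(len(features)):
            weights[c][j] -= lr * grad * features[j]
        bias[c] -= lr * grad
    return loss

# Made-up feature vector standing in for the feedforward layer output.
features = [0.2, -0.4, 0.7]
weights = [[0.0] * 3 for _ in range(2)]  # two classes, assumed
bias = [0.0, 0.0]
losses = [sgd_step(weights, bias, features, target=1) for _ in range(5)]
```

Repeating the step on the same sample lowers the loss each time, which is the behaviour the parameter-adjustment step relies on; a real implementation would iterate over many sample vulnerabilities and update every layer of the network, not only the output head.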

Description

Model training method, electronic equipment, medium and product

Technical Field

The present application relates to the field of computer technologies, and in particular to a model training method, an electronic device, a medium, and a product.

Background

As technical defects, configuration errors, and attack techniques continue to evolve, new vulnerabilities in the Linux operating system keep emerging, which makes rapid vulnerability discovery a core task of system security management. In the related art, vulnerability identification relies mainly on risk scanning and cannot identify vulnerabilities accurately.

Disclosure of Invention

Embodiments of the application provide a model training method, an electronic device, a medium, and a product, which can accurately identify vulnerabilities.

The technical solution of the embodiments of the application is implemented as follows:

An embodiment of the application provides a model training method, comprising: acquiring sample data, wherein the sample data comprises a feature matrix of a sample vulnerability; training a long short-term memory (LSTM) network layer of a network model based on the feature matrix to obtain a first training result; training a positional encoding layer of the network model based on the feature matrix to obtain a second training result; determining a loss function value of the network model based on the first training result and the second training result; and adjusting parameters of the network model based on the loss function value to obtain a trained network model.

In the above solution, the acquiring sample data includes: obtaining vulnerability information, wherein the vulnerability information comprises text information of the sample vulnerability; vectorizing the text information to obtain a plurality of word vectors corresponding to the sample vulnerability; and obtaining the feature matrix based on the plurality of word vectors.

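
The sample-data step (text information, then word vectors, then a feature matrix) can be sketched as follows. The text does not specify the tokenizer, the word-vector model, or the matrix dimensions, so the regular-expression tokenizer, the hash-based stand-in for word vectors, and the names `tokenize`, `word_vector`, `feature_matrix`, `max_len`, and `dim` are all assumptions made for illustration.

```python
import hashlib
import re

def tokenize(text):
    """Lowercase the text and split it into simple word-like tokens."""
    return re.findall(r"[a-z0-9_.\-]+", text.lower())

def word_vector(token, dim=8):
    """Map a token to a deterministic pseudo-vector with values in [-1, 1).

    Hashing stands in for a trained word-vector model (assumption); dim <= 32.
    """
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [digest[i] / 128.0 - 1.0 for i in range(dim)]

def feature_matrix(text, max_len=6, dim=8):
    """Stack word vectors row by row, then zero-pad or truncate to max_len rows."""
    rows = [word_vector(tok, dim) for tok in tokenize(text)[:max_len]]
    rows += [[0.0] * dim for _ in range(max_len - len(rows))]
    return rows

# Made-up vulnerability description used only as sample input.
sample = "Buffer overflow in kernel netfilter module"
matrix = feature_matrix(sample)  # 6 rows (tokens) x 8 columns (vector size)
```

Fixing the number of rows with padding and truncation is one common way to give every sample vulnerability a matrix of the same shape; the source does not say whether it does this.
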
In the above solution, the training the LSTM network layer of the network model based on the feature matrix to obtain a first training result includes: inputting the feature matrix into an embedding layer of the network model to perform word embedding processing to obtain an embedding vector sequence of the sample vulnerability; and training the LSTM network layer based on the embedding vector sequence to obtain temporal features at different time steps of the embedding vector sequence, wherein the first training result comprises the temporal features.

In the above solution, the training the positional encoding layer of the network model based on the feature matrix to obtain a second training result includes: inputting the feature matrix into an embedding layer of the network model to perform word embedding processing to obtain an embedding vector sequence of the sample vulnerability; and training the positional encoding layer based on the embedding vector sequence to obtain position features of different positions in the embedding vector sequence, wherein the second training result comprises the position features.

In the above solution, the determining the loss function value of the network model based on the first training result and the second training result includes: inputting the first training result and the second training result into a fusion layer of the network model to obtain a fusion training result; inputting the fusion training result into a multi-head attention layer of the network model to perform weighted fusion to obtain a weighted result, wherein the weighted result indicates the correlation between the first training result and the second training result; and determining the loss function value based on the weighted result.

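
The fusion and attention steps described above can be sketched in a simplified form. The text does not state how the position features are computed, how the fusion layer combines the two branches, or how many attention heads are used, so this sketch assumes Transformer-style sinusoidal position features, fusion by element-wise addition, and a single attention head with no learned projections. The `temporal` values are placeholders standing in for the LSTM branch output, and all function names are hypothetical.

```python
import math

def positional_encoding(seq_len, dim):
    """Sinusoidal position features: sine on even columns, cosine on odd ones."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

def fuse(temporal, positional):
    """Combine the two branches by element-wise addition (assumed fusion rule)."""
    return [[t + p for t, p in zip(tr, pr)]
            for tr, pr in zip(temporal, positional)]

def softmax(xs):
    """Normalize scores into attention weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Single-head scaled dot-product self-attention with Q = K = V = x."""
    dim = len(x[0])
    out = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x))
                    for j in range(dim)])
    return out

# Placeholder temporal features standing in for the LSTM branch output.
temporal = [[0.1 * (t + 1)] * 4 for t in range(3)]
positional = positional_encoding(seq_len=3, dim=4)
fused = fuse(temporal, positional)
weighted = self_attention(fused)  # same shape as fused: 3 x 4
```

Each output row is a weighted average of the fused rows, which is the sense in which attention performs a weighted fusion; a multi-head layer would run several such heads over learned projections of the input and concatenate their outputs.
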
In the above solution, the determining the loss function value based on the weighted result includes: training a feedforward network layer of the network model based on the weighted result to obtain a third training result; determining predicted values of a plurality of sample vulnerabilities based on the third training result, wherein the predicted values indicate predicted classification results of the plurality of sample vulnerabilities; and determining the loss function value based on an error between the predicted values and target values.

In the above solution, the method further comprises: acquiring monitoring text information of a vulnerability monitoring platform; inputting the monitoring text information into the trained network model for recognition, and determining whether the monitoring text information comprises vulnerability information; and if so, generating vulnerability early warning information based on the type of the vulnerability.

An embodiment of the application provides a model training apparatus, comprising: an acquisition unit configured to acquire sample data, wherein the sample data comprises a feature matrix o