CN-122021843-A - Model editing method and related device

Abstract

The application discloses a model editing method and a related device. The method comprises: acquiring knowledge to be edited of a first knowledge model, that is, the knowledge that the first knowledge model needs to learn through model editing; and judging whether the number of additional neurons already introduced into the first knowledge model is the upper limit number. If so, no new additional neuron is introduced for the knowledge to be edited; instead, one candidate neuron is selected from the plurality of candidate neurons already present in the first knowledge model to serve as the neuron to be multiplexed (reused) for the knowledge to be edited. Thus, even if the quantity of knowledge to be edited increases sharply, excessive additional neurons are not introduced into the first knowledge model, which avoids a sharp increase in its additional parameters. The parameters of the neuron to be multiplexed are then trained based on the knowledge to be edited, so that the training parameters of the first knowledge model do not increase sharply, thereby improving the editing efficiency and the editing effect of the first knowledge model.
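The decision described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: `EditableModel`, `choose_neuron` and `pick_candidate` are hypothetical names, the candidate neurons are taken to be the additional neurons themselves, and introducing a fresh neuron while below the limit is assumed from the related technique described in the Background.

```python
from dataclasses import dataclass, field


@dataclass
class EditableModel:
    """Toy stand-in for the first knowledge model (hypothetical structure)."""
    max_additional: int                              # upper limit number (assumed fixed)
    additional: list = field(default_factory=list)   # ids of introduced additional neurons


def choose_neuron(model, pick_candidate):
    """Return (neuron_id, reused) for one piece of knowledge to be edited."""
    if len(model.additional) >= model.max_additional:
        # Upper limit reached: reuse an existing candidate neuron
        # instead of introducing a new one.
        return pick_candidate(model.additional), True
    # Below the limit (assumed behaviour): introduce a fresh additional neuron.
    new_id = len(model.additional)
    model.additional.append(new_id)
    return new_id, False


if __name__ == "__main__":
    model = EditableModel(max_additional=2)
    for _ in range(5):
        print(choose_neuron(model, lambda candidates: candidates[0]))
```

Any selection rule can be passed as `pick_candidate`; the placeholder above simply returns the first candidate. However many edits arrive, the number of additional neurons in this sketch never exceeds `max_additional`.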

Inventors

  • WU TAIQIANG
  • ZHAO ZHE

Assignees

  • Tencent Technology (Shenzhen) Co., Ltd. (腾讯科技(深圳)有限公司)

Dates

Publication Date
20260512
Application Date
20241111

Claims (15)

  1. A model editing method, the method comprising: acquiring knowledge to be edited of a first knowledge model; if the number of additional neurons introduced into the first knowledge model is the upper limit number, determining a neuron to be multiplexed for the knowledge to be edited from a plurality of candidate neurons of the first knowledge model; and performing parameter training on the neuron to be multiplexed according to the knowledge to be edited, to obtain a second knowledge model based on the first knowledge model.
  2. The method of claim 1, wherein the determining a neuron to be multiplexed for the knowledge to be edited from a plurality of candidate neurons of the first knowledge model comprises: determining a degree of correlation between the knowledge to be edited and each candidate neuron; and determining the neuron to be multiplexed from the plurality of candidate neurons according to the plurality of degrees of correlation between the knowledge to be edited and the plurality of candidate neurons.
  3. The method of claim 2, wherein the determining a degree of correlation between the knowledge to be edited and each candidate neuron comprises: activating each candidate neuron based on the knowledge to be edited to obtain state data of each candidate neuron; and determining the degree of correlation between the knowledge to be edited and each candidate neuron according to the state data of each candidate neuron.
  4. The method of claim 2, wherein the determining the neuron to be multiplexed from the plurality of candidate neurons according to the plurality of degrees of correlation between the knowledge to be edited and the plurality of candidate neurons comprises: determining a maximum degree of correlation among the plurality of degrees of correlation between the knowledge to be edited and the plurality of candidate neurons; and determining the neuron to be multiplexed according to the candidate neuron corresponding to the maximum degree of correlation.
  5. The method of any of claims 1-4, wherein the plurality of candidate neurons comprises the plurality of additional neurons, or the plurality of candidate neurons comprises a plurality of native neurons of the first knowledge model, or the plurality of candidate neurons comprises both the plurality of additional neurons and the plurality of native neurons.
  6. The method of claim 1, further comprising: generating, according to the knowledge to be edited, additional editing knowledge related to the knowledge to be edited; wherein the performing parameter training on the neuron to be multiplexed according to the knowledge to be edited, to obtain a second knowledge model based on the first knowledge model, comprises: performing parameter training on the neuron to be multiplexed according to the knowledge to be edited and an overall training loss of the first knowledge model to obtain the second knowledge model, wherein the overall training loss comprises a first updating loss for the knowledge to be edited, a non-updating loss for other original knowledge, and a second updating loss for the additional editing knowledge.
  7. The method of claim 6, wherein obtaining the overall training loss comprises: determining a first weight corresponding to the first updating loss, a second weight corresponding to the non-updating loss, and a third weight corresponding to the second updating loss; and performing a weighted calculation on the first updating loss, the non-updating loss and the second updating loss according to the first weight, the second weight and the third weight, to obtain the overall training loss.
  8. The method of claim 6, wherein the generating, according to the knowledge to be edited, additional editing knowledge related to the knowledge to be edited comprises: performing, by a knowledge generation model, knowledge generation based on the knowledge to be edited according to an association generation prompt, to obtain a plurality of pieces of candidate editing knowledge; for each piece of candidate editing knowledge, performing, by an association detection model, association detection between the candidate editing knowledge and the knowledge to be edited according to an association detection prompt, to obtain an association result between the candidate editing knowledge and the knowledge to be edited; and if the association result indicates that the candidate editing knowledge is associated with the knowledge to be edited, determining the additional editing knowledge according to the candidate editing knowledge.
  9. The method of claim 8, further comprising: if the association result indicates that the candidate editing knowledge is not associated with the knowledge to be edited, determining negative-sample knowledge according to the candidate editing knowledge; and performing model reinforcement on the knowledge generation model according to the negative-sample knowledge, to obtain a reinforced knowledge generation model.
  10. The method of claim 6, wherein the generating, according to the knowledge to be edited, additional editing knowledge related to the knowledge to be edited comprises: generating the additional editing knowledge according to the knowledge to be edited by means of the knowledge graph to which the knowledge to be edited belongs.
  11. A model editing device, characterized by comprising an acquisition unit, a determining unit and a training unit; the acquisition unit is configured to acquire knowledge to be edited of a first knowledge model; the determining unit is configured to determine a neuron to be multiplexed for the knowledge to be edited from a plurality of candidate neurons of the first knowledge model if the number of additional neurons introduced into the first knowledge model is the upper limit number; and the training unit is configured to perform parameter training on the neuron to be multiplexed according to the knowledge to be edited, to obtain a second knowledge model based on the first knowledge model.
  12. The device according to claim 11, wherein the determining unit is configured to: determine a degree of correlation between the knowledge to be edited and each candidate neuron; and determine the neuron to be multiplexed from the plurality of candidate neurons according to the plurality of degrees of correlation between the knowledge to be edited and the plurality of candidate neurons.
  13. A computer device, the computer device comprising a processor and a memory: the memory is used for storing a computer program and transmitting the computer program to the processor; the processor is configured to perform the method of any of claims 1-10 according to instructions in the computer program.
  14. A computer-readable storage medium for storing a computer program which, when run on a computer device, causes the computer device to perform the method of any one of claims 1-10.
  15. A computer program product comprising a computer program, characterized in that the computer program, when run on a computer device, causes the computer device to perform the method of any of claims 1-10.
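The selection step of claims 2 to 4 and the weighted loss of claim 7 can be illustrated with a short sketch. Several details here are assumptions not fixed by the claims: the state data of a candidate neuron is modelled as a ReLU activation of a dot product between a knowledge vector and a hypothetical per-neuron key vector, that activation is used directly as the degree of correlation, and the "weighted calculation" is taken to be a weighted sum. All function and parameter names are illustrative.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def correlation(knowledge_vec, neuron_key):
    """State data of one candidate neuron when activated by the knowledge
    (assumed form: ReLU of a dot product), used as the degree of correlation."""
    return relu(sum(k * w for k, w in zip(knowledge_vec, neuron_key)))


def select_neuron_to_reuse(knowledge_vec, candidate_keys):
    """Return (index, scores), where index points at the candidate neuron
    with the maximum degree of correlation (first one on ties)."""
    scores = [correlation(knowledge_vec, key) for key in candidate_keys]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


def overall_training_loss(edit_loss, keep_loss, extra_loss,
                          w1=1.0, w2=1.0, w3=1.0):
    """Weighted sum (assumed) of the first updating loss, the non-updating
    loss and the second updating loss."""
    return w1 * edit_loss + w2 * keep_loss + w3 * extra_loss


if __name__ == "__main__":
    knowledge = [1.0, 0.0]
    keys = [[0.2, 0.9], [0.8, -0.1], [-1.0, 0.5]]
    print(select_neuron_to_reuse(knowledge, keys))
    print(overall_training_loss(2.0, 1.0, 4.0, w1=0.5, w2=0.25, w3=0.25))
```

Only the parameters of the selected neuron would then be trained against the overall loss; how the three weights are chosen is left open by the claims.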

Description

Model editing method and related device

Technical Field

The present application relates to the field of computer technologies, and in particular to a model editing method and a related device.

Background

At present, knowledge models based on large language models have learned a large amount of knowledge, but that knowledge may need to be updated. That is, the knowledge model needs to be edited with the knowledge to be edited, so that the original knowledge is updated and an edited knowledge model is obtained.
In the related technology, the model editing method introduces, for each piece of knowledge to be edited, an additional neuron on top of the original neurons in the knowledge model, and trains the parameters of that additional neuron to obtain the edited knowledge model.
However, in this method each piece of knowledge to be edited requires its own additional neuron. As the amount of knowledge to be edited grows, the number of additional neurons grows with it, which causes the additional parameters of the knowledge model to increase sharply and seriously affects the editing efficiency and the editing effect of the knowledge model.

Disclosure of Invention

To solve the above technical problem, the application provides a model editing method and a related device in which excessive additional neurons are not introduced into the knowledge model, so that neither the additional parameters nor the training parameters of the knowledge model increase sharply. This improves the editing efficiency and the editing effect of the knowledge model, so that the knowledge model can be edited quickly and accurately to obtain an edited knowledge model.
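A rough comparison makes the growth argument concrete. The symbols are illustrative notation rather than terms from the application: N is the number of pieces of knowledge edited so far, K is the upper limit number of additional neurons, and p is the number of parameters contributed by one additional neuron. The capped expression further assumes that, below the limit, one additional neuron is introduced per piece of knowledge.

```latex
\[
P_{\text{related}}(N) = N\,p,
\qquad
P_{\text{capped}}(N) = \min(N, K)\,p \;\le\; K\,p .
\]
```

For example, with K = 100 and N = 10,000, the one-neuron-per-edit scheme carries 10,000p additional parameters, while the capped scheme carries at most 100p regardless of how many further edits arrive.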
The embodiments of the application disclose the following technical solutions.
In one aspect, an embodiment of the present application provides a model editing method, including: acquiring knowledge to be edited of a first knowledge model; if the number of additional neurons introduced into the first knowledge model is the upper limit number, determining a neuron to be multiplexed for the knowledge to be edited from a plurality of candidate neurons of the first knowledge model; and performing parameter training on the neuron to be multiplexed according to the knowledge to be edited, to obtain a second knowledge model based on the first knowledge model.
In another aspect, an embodiment of the application provides a model editing device comprising an acquisition unit, a determining unit and a training unit. The acquisition unit is configured to acquire knowledge to be edited of a first knowledge model; the determining unit is configured to determine a neuron to be multiplexed for the knowledge to be edited from a plurality of candidate neurons of the first knowledge model if the number of additional neurons introduced into the first knowledge model is the upper limit number; and the training unit is configured to perform parameter training on the neuron to be multiplexed according to the knowledge to be edited, to obtain a second knowledge model based on the first knowledge model.
In another aspect, an embodiment of the present application provides a computer device including a processor and a memory, wherein the memory is configured to store a computer program and transmit the computer program to the processor, and the processor is configured to perform the method of any of the preceding aspects according to instructions in the computer program.
In another aspect, embodiments of the present application provide a computer-readable storage medium for storing a computer program which, when run on a computer device, causes the computer device to perform the method of any one of the preceding aspects.
In another aspect, embodiments of the present application provide a computer program product comprising a computer program which, when run on a computer device, causes the computer device to perform the method of any of the preceding aspects.
According to the above technical solutions, the knowledge to be edited of the first knowledge model is first acquired, that is, the knowledge that the first knowledge model needs to learn through model editing. It is then judged whether the number of additional neurons introduced into the first knowledge model is the upper limit number. If so, the upper limit number of additional neurons has already been introduced into the first knowledge model; in this case, no new additional neuron is introduced for the knowledge to be edited, but one candidate neuron is selected from the plurality of candidate neurons existing in the first knowledge model to serve as the neuron to be multiplexed of the knowledge to be edited, and even if the quantity of the knowledge to be edited is increased suddenly, introducing excessive additional neurons into the first