CN-121997927-A - Neural network model-based few-sample named entity recognition method and system
Abstract
The invention discloses a neural network model-based method and system for few-sample named entity recognition. The method comprises the following steps: obtaining an original input text to be recognized and retrieving a group of candidate knowledge sentences from an external knowledge base; filtering the candidate knowledge sentences with a dynamic similarity threshold algorithm to screen out enhancement knowledge; inputting the knowledge-enhanced sequence into a first encoder to generate a context text representation matrix with deeply fused knowledge; inputting the matrix into a sequence labeling classification layer and classifying each token to identify the boundary spans of potential entities in the text; inputting the text corresponding to each identified entity boundary span into a second encoder to generate an entity embedding representation; and determining the final category of the entity by computing the distance between the entity embedding representation and predefined category prototypes, thereby completing named entity recognition. The invention alleviates the problems caused by data sparsity and semantic ambiguity in few-sample scenarios and improves the accuracy of named entity recognition.
Inventors
- REN YAFENG
- ZHONG ZHENGSHENG
- PENG QIONG
Assignees
- Guangdong University of Foreign Studies (广东外语外贸大学)
Dates
- Publication Date
- 2026-05-08
- Application Date
- 2025-10-29
Claims (10)
- 1. A method for few-sample named entity recognition based on a neural network model, characterized by comprising the following steps: S1, acquiring an original input text to be recognized, and retrieving a group of candidate knowledge sentences from an external knowledge base based on the original input text; S2, filtering the candidate knowledge sentences with a dynamic similarity threshold algorithm, and screening out enhancement knowledge semantically matched with the original input text; S3, splicing the original input text and the enhancement knowledge in sequence to form a knowledge-enhanced input sequence, inputting the input sequence into a first encoder, and generating a context text representation matrix with deeply fused knowledge; S4, inputting the context text representation matrix into a sequence labeling classification layer, classifying each token in the input sequence, and identifying the boundary spans of one or more potential entities in the text; S5, inputting the text corresponding to each identified entity boundary span into a second encoder to generate an entity embedding representation, and determining the final category of the entity by computing the distance between the entity embedding representation and predefined category prototypes, thereby completing named entity recognition.
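The nearest-prototype decision in step S5 can be sketched as follows; the function names and the toy 2-d embeddings are illustrative assumptions, as the claim does not fix a distance metric beyond "distance to a predefined category prototype" (Euclidean is assumed here):

```python
import math

def euclidean(a, b):
    # Euclidean distance between two embedding vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify_by_prototype(entity_emb, prototypes):
    # Step S5: assign the entity to the class whose prototype is nearest
    # in the embedding space.
    return min(prototypes, key=lambda label: euclidean(entity_emb, prototypes[label]))

# Toy 2-d embeddings; real prototypes would be averaged encoder outputs.
prototypes = {"PER": [1.0, 0.0], "LOC": [0.0, 1.0]}
print(classify_by_prototype([0.9, 0.2], prototypes))  # closest prototype is PER
```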
- 2. The method for few-sample named entity recognition based on a neural network model of claim 1, wherein step S2 specifically comprises: S21, calculating a semantic similarity score between each candidate knowledge sentence and the original input text; S22, determining a length type according to the sequence length of the original input text, wherein the length types comprise short sentences, medium-length sentences, and long sentences; S23, retaining, as enhancement knowledge, the candidate knowledge sentences whose semantic similarity scores fall within the screening interval corresponding to the length type of the original input text.
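Steps S21-S23 can be sketched as below; note that the bucket boundaries and the per-bucket score intervals are placeholder assumptions, since claim 2 specifies only the three length types, not their thresholds:

```python
def length_bucket(n_tokens, short_max=10, medium_max=25):
    # Hypothetical boundaries; the claim only names three length types.
    if n_tokens <= short_max:
        return "short"
    if n_tokens <= medium_max:
        return "medium"
    return "long"

# Hypothetical screening intervals per length type (claim 2 leaves them unspecified).
INTERVALS = {"short": (0.80, 1.00), "medium": (0.70, 0.95), "long": (0.60, 0.90)}

def filter_candidates(text_len, scored_candidates):
    # Keep candidates whose similarity score falls inside the interval
    # selected by the input text's length type (step S23).
    lo, hi = INTERVALS[length_bucket(text_len)]
    return [sent for sent, score in scored_candidates if lo <= score <= hi]

print(filter_candidates(8, [("k1", 0.85), ("k2", 0.50)]))  # only "k1" survives
```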
- 3. The method for few-sample named entity recognition based on a neural network model of claim 2, wherein step S21 specifically comprises: treating the original input text and a candidate knowledge sentence as two token sequences and performing semantic similarity analysis; calculating the cosine similarity between the token embeddings of the two sequences to obtain a precision, a recall, and a semantic similarity score, expressed by the formulas: R = (1/m) Σ_{i=1..m} max_{1≤j≤n} cos(x_i, k_j); P = (1/n) Σ_{j=1..n} max_{1≤i≤m} cos(x_i, k_j); S = 2PR / (P + R); wherein x_i denotes the i-th embedding vector of the original input text, k_j denotes the j-th embedding vector of the candidate knowledge sentence, m denotes the sequence length of the original input text, n denotes the sequence length of the candidate knowledge sentence, R denotes the recall, P denotes the precision, and S denotes the semantic similarity score.
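The greedy-matching similarity of claim 3 (a BERTScore-style measure) can be sketched over plain embedding lists; `similarity_score` is an illustrative name, and real inputs would be contextual token embeddings rather than the toy vectors shown:

```python
import math

def cos(a, b):
    # Cosine similarity between two vectors.
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def similarity_score(X, K):
    # X: token embeddings of the original input text (length m).
    # K: token embeddings of the candidate knowledge sentence (length n).
    recall = sum(max(cos(x, k) for k in K) for x in X) / len(X)
    precision = sum(max(cos(x, k) for x in X) for k in K) / len(K)
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# Identical sequences score a perfect 1.0 on all three measures.
print(similarity_score([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]))
```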
- 4. The method for few-sample named entity recognition based on a neural network model of claim 1, wherein step S3 specifically comprises: S31, fusing the original input text and the enhancement knowledge to obtain a knowledge-enhanced input sequence X', expressed by the formula: X' = X ⊕ [SEP] ⊕ K; wherein ⊕ denotes the feature splicing (concatenation) operation, X denotes the original input text containing the entity spans, K denotes the enhancement knowledge retrieved from the external knowledge base and related to X, and [SEP] denotes a separator; S32, inputting the input sequence into the first encoder, performing context modeling over all tokens in the sequence with a multi-head self-attention mechanism, and integrating the semantic information of the enhancement knowledge into the representation of each token to obtain the context text representation.
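At the token level, the splicing in step S31 is a simple concatenation with a separator marker; a minimal sketch (the function name is illustrative):

```python
def build_input_sequence(text_tokens, knowledge_tokens):
    # Step S31: concatenate the original text and the screened enhancement
    # knowledge, separated by the [SEP] marker, before encoding.
    return text_tokens + ["[SEP]"] + knowledge_tokens

print(build_input_sequence(["Paris", "is", "beautiful"],
                           ["Paris", "is", "the", "capital", "of", "France"]))
```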
- 5. The method for few-sample named entity recognition based on a neural network model of claim 1, wherein step S4 specifically comprises: S41, inputting the context text representation matrix H into the sequence labeling classification layer, performing sequence labeling on the embedding vectors in the matrix, and calculating the probability that each token belongs to an entity, expressed by the formula: p_i = σ(W h_i + b); wherein W and b denote the weight matrix and the bias vector of the linear layer respectively, h_i denotes the i-th embedding vector of the representation matrix H, and σ denotes the sigmoid activation; S42, performing span detection on each token according to its probability with a binary span detection model, and identifying the boundary spans of the potential entities in the text.
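A sketch of steps S41-S42: a per-token sigmoid score followed by greedy recovery of maximal above-threshold runs. The threshold of 0.5 and the run-based span recovery are illustrative assumptions; the claim only states that a binary span detection model operates on the per-token probabilities:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def token_entity_probs(H, W, b):
    # Step S41: p_i = sigmoid(W . h_i + b), the probability that token i
    # lies inside an entity. H is a list of token embedding vectors.
    return [sigmoid(sum(w * h for w, h in zip(W, h_i)) + b) for h_i in H]

def spans_from_probs(probs, threshold=0.5):
    # Step S42 (assumed strategy): maximal runs of tokens whose probability
    # meets the threshold become candidate entity boundary spans.
    spans, start = [], None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(probs) - 1))
    return spans
```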
- 6. The method for few-sample named entity recognition based on a neural network model of claim 5, wherein the binary span detection model performs parameter optimization by minimizing the cross-entropy loss between the predicted probability distribution and the true labels; the cross-entropy loss is expressed as: L_span = -(1/L) Σ_{i=1..L} [y_i log p_i + (1 - y_i) log(1 - p_i)]; where L is the total length of the sequence, y_i is the true label of the i-th token, and p_i is the predicted probability that the i-th token belongs to an entity.
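The binary cross-entropy objective of claim 6 in plain Python (the small `eps` guard against log(0) is an implementation convenience, not part of the claim):

```python
import math

def binary_cross_entropy(probs, labels, eps=1e-12):
    # Mean binary cross-entropy over the sequence: perfect predictions give
    # a loss near zero, uninformative 0.5 predictions give about ln(2).
    L = len(probs)
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for p, y in zip(probs, labels)) / L
```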
- 7. The method for few-sample named entity recognition based on a neural network model of claim 1, wherein in step S5 the second encoder is trained with supervised contrastive learning, specifically comprising: for each entity, splicing the span text and the semantic category name of the entity in two different orders, so as to construct a pair of semantically equivalent positive samples that differ only in order, and adding the pair to the data set; inputting the positive sample pairs into the second encoder to generate entity embedding representations, expressed by the formulas: h_i^(1) = E(x_i ⊕ g(y_i)); h_i^(2) = E(g(y_i) ⊕ x_i); wherein ⊕ denotes the connection operator, h_i^(1) and h_i^(2) denote the constructed pair of aligned positive samples, E denotes the second encoder, x_i denotes the tokens of the i-th entity, y_i denotes the i-th correct entity tag, and g is a conversion function.
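The two-view positive-pair construction of claim 7, at the text level, before encoding. The " [SEP] " joiner is an assumption; the claim specifies only that span text and category name are spliced in two different orders:

```python
def build_positive_pair(span_text, label_name):
    # Claim 7: the same (span, label) content in two orderings yields a
    # semantically equivalent positive pair for contrastive training.
    view_a = span_text + " [SEP] " + label_name   # "entity-tag" order
    view_b = label_name + " [SEP] " + span_text   # "tag-entity" order
    return view_a, view_b

print(build_positive_pair("New York", "location"))
```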
- 8. The method for few-sample named entity recognition based on a neural network model of claim 7, wherein in step S5 the second encoder is optimized with a supervised contrastive learning loss function, expressed by the formulas: L_con = -(1/I) Σ_{i=1..I} (1/|P(i)|) Σ_{z∈P(i)} log( exp(h_i·h_z/τ) / Σ_{a≠i} exp(h_i·h_a/τ) ); P(i) = { z ≠ i | r_z = r_i }; h_i ∈ { h_i^(1), h_i^(2) }; wherein L_con denotes the contrastive loss, I denotes the total number of samples in the training batch, P(i) denotes the set of all positive samples of the same class as sample i, r_z denotes the true class label of the sample with index z, r_i denotes the true class label of sample i, h_i^(1) and h_i^(2) denote the samples constructed in the "entity-tag" and "tag-entity" orders respectively, and τ denotes the temperature hyperparameter.
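A stdlib-only sketch of the supervised contrastive loss in claim 8, assuming L2-normalised embeddings so that the dot product is the similarity; anchors with no same-class positive are skipped, a common convention the claim does not spell out:

```python
import math

def supcon_loss(embeddings, labels, tau=0.1):
    # Supervised contrastive loss: for each anchor, pull same-class samples
    # together and push all other samples in the batch apart.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [z for z in range(n) if z != i and labels[z] == labels[i]]
        if not positives:
            continue  # an anchor with no positive pair contributes nothing
        denom = sum(math.exp(dot(embeddings[i], embeddings[a]) / tau)
                    for a in range(n) if a != i)
        inner = sum(math.log(math.exp(dot(embeddings[i], embeddings[z]) / tau) / denom)
                    for z in positives)
        total += -inner / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

With unit vectors [[1,0],[1,0],[0,1]] and labels [0,0,1] at tau=1.0, each of the two same-class anchors contributes -log(e/(e+1)), about 0.313.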
- 9. The method for few-sample named entity recognition based on a neural network model of claim 1, wherein the external knowledge base is a large-scale structured or semi-structured knowledge source, including Wikipedia, WordNet, and ConceptNet.
- 10. A neural network model-based few-sample named entity recognition system, characterized in that it adopts the neural network model-based few-sample named entity recognition method of any one of claims 1-9 and comprises the following modules: a knowledge acquisition module, configured to acquire an original input text to be recognized and retrieve a group of candidate knowledge sentences from an external knowledge base based on the original input text; a knowledge screening module, configured to filter the candidate knowledge sentences with a dynamic similarity threshold algorithm and screen out enhancement knowledge semantically matched with the original input text; a text representation module, configured to splice the original input text and the enhancement knowledge in sequence to form a knowledge-enhanced input sequence, input the input sequence into the first encoder, and generate a context text representation matrix with deeply fused knowledge; a span detection module, configured to input the context text representation matrix into a sequence labeling classification layer, classify each token in the input sequence, and identify the boundary spans of one or more potential entities in the text; and a type classification module, configured to input the text corresponding to each identified entity boundary span into the second encoder to generate an entity embedding representation, and determine the final category of the entity by computing the distance between the entity embedding representation and the predefined category prototypes to complete named entity recognition.
Description
Neural network model-based few-sample named entity recognition method and system

Technical Field

The invention relates to the technical field of natural language processing, and in particular to a method and a system for few-sample named entity recognition based on a neural network model.

Background

Named Entity Recognition (NER) is a basic task in information extraction and natural language processing. It aims to identify and classify entities in text and is essential for downstream applications such as knowledge graph construction and question-answering systems. Conventional deep learning models typically require extensive labeled data to achieve good results, but in many practical scenarios labeled data is scarce, which motivates the study of few-shot named entity recognition (Few-Shot NER, FS-NER). In few-sample named entity recognition, metric-learning methods, in particular Prototypical Networks, have become the mainstream technical route. Such methods generate a prototype for each entity class from a small number of labeled samples (i.e., the support set) and then classify query samples by computing their distances to each class prototype in the embedding space. Despite the progress made by prototypical networks in few-sample learning, some inherent challenges remain. First, the knowledge acquired by the model depends entirely on the limited support set provided for each task; when support-set samples are scarce or biased, the generated prototypes have insufficient representation capability, the generalization of the model to new entities is limited, and entities that differ substantially in semantics from the support-set samples are difficult to recognize.
Second, while some studies have attempted to introduce external knowledge bases to enhance model performance, these approaches mostly retrieve and fuse knowledge for each new task in real time during the inference phase. This "online" processing mechanism inevitably introduces a heavy computational burden and significantly increases inference latency, contrary to the low overhead and rapid adaptation required for few-sample learning, which greatly limits its practical value. In addition, directly introducing unscreened external knowledge may bring in noise or redundant information that interferes with efficient learning by the model. Therefore, how to integrate high-quality external knowledge into few-sample named entity recognition so as to enhance model generalization, while avoiding high computational cost in the inference stage, is a problem to be solved by those skilled in the art.

Disclosure of Invention

In view of the above, the invention provides a neural network model-based method and system for few-sample named entity recognition. By moving knowledge integration forward into the pre-training stage and designing an efficient knowledge screening strategy, the accuracy and robustness of the model in data-sparse scenarios are significantly improved without sacrificing inference efficiency.
To achieve the above purpose, the present invention adopts the following technical scheme. In a first aspect, the present invention provides a neural network model-based method for few-sample named entity recognition, comprising the following steps: S1, acquiring an original input text to be recognized, and retrieving a group of candidate knowledge sentences from an external knowledge base based on the original input text; S2, filtering the candidate knowledge sentences with a dynamic similarity threshold algorithm, and screening out enhancement knowledge semantically matched with the original input text; S3, splicing the original input text and the enhancement knowledge in sequence to form a knowledge-enhanced input sequence, inputting the input sequence into a first encoder, and generating a context text representation matrix with deeply fused knowledge; S4, inputting the context text representation matrix into a sequence labeling classification layer, classifying each token in the input sequence, and identifying the boundary spans of one or more potential entities in the text; S5, inputting the text corresponding to each identified entity boundary span into a second encoder to generate an entity embedding representation, and determining the final category of the entity by computing the distance between the entity embedding representation and predefined category prototypes to complete named entity recognition. Further, the external knowledge base is a large-scale structured or semi-structured knowledge source, including Wikipedia, WordNet, and ConceptNet. Further, the step S2 specifically includes: S21, calculatin