
CN-115905865-B - Training method of text merging judgment model and text merging judgment method


Abstract

Embodiments of this specification disclose a training method, apparatus, storage medium, and electronic device for a text merging judgment model. At least one non-mergeable positive sample group and at least one mergeable negative sample group are constructed, and the text merging judgment model is trained on these positive and negative sample groups until it converges, so that the model can be used in the task of judging whether two texts should be merged.

Inventors

  • JING ZHIGANG

Assignees

  • 蚂蚁财富(上海)金融信息服务有限公司

Dates

Publication Date
2026-05-05
Application Date
2022-11-22

Claims (12)

  1. A training method for a text merging judgment model, the method comprising: acquiring at least one positive sample group and at least one negative sample group, wherein the positive sample group comprises two texts that cannot be merged, and the negative sample group comprises two texts that can be merged; and training the text merging judgment model with the at least one positive sample group and the at least one negative sample group until the text merging judgment model converges; wherein acquiring the at least one negative sample group comprises: acquiring at least one sample text to be segmented; and segmenting each sample text to be segmented at preset symbols in the at least one sample text to be segmented to obtain the at least one negative sample group; wherein the segmenting comprises: determining, for each sample text to be segmented, the character located at the middle position of the text as a target character; detecting whether a preset symbol exists among the N characters to the left of the target character and among the N characters to the right of the target character, where N is an integer greater than 1; and, if a preset symbol exists among the N characters to the left of the target character or among the N characters to the right of the target character, segmenting the sample text to be segmented with that preset symbol as the boundary to obtain at least one negative sample group.
  2. The method according to claim 1, wherein segmenting the sample text to be segmented at preset symbols to obtain at least one negative sample group comprises: segmenting the sample text to be segmented at the position corresponding to each preset symbol, with that preset symbol as the boundary, to obtain a negative sample group corresponding to each preset symbol.
  3. The method of claim 1, wherein the text merging judgment model comprises a plurality of encoders, at least one fully connected layer, and a judger; the plurality of encoders are configured to encode a text to obtain a plurality of feature vectors corresponding to the text; the at least one fully connected layer is configured to perform full-connection processing on the feature vectors corresponding to the two texts to obtain at least one connection result; and the judger is configured to judge, from the at least one connection result, whether the two texts can be merged.
  4. The method of claim 3, wherein the at least one fully connected layer comprises one or more of: a fully connected layer that connects all feature vectors in sequence; a fully connected layer that connects the feature vectors corresponding to the head characters of the two texts; and a fully connected layer that connects the feature vector corresponding to the head character of one text with the feature vector corresponding to the tail character of the other text.
  5. The method according to claim 3, wherein the judger is specifically configured to: perform constraint processing on the at least one connection result to obtain a probability that the two texts can be merged; and judge, from that probability, whether the two texts can be merged.
  6. The method of claim 3, wherein the plurality of encoders are one or more of: an encoder of a bidirectional encoder representations from transformers (BERT) model, an encoder of a recurrent neural network, and an encoder of a convolutional neural network.
  7. A text merging judgment method, the method comprising: acquiring two texts to be detected; and inputting the two texts to be detected into a text merging judgment model to obtain a judgment result of whether the two texts can be merged, wherein the text merging judgment model is trained by the training method of any one of claims 1 to 6.
  8. A training device for a text merging judgment model, the device comprising: a sample acquisition module configured to acquire at least one positive sample group and at least one negative sample group, wherein the positive sample group comprises two texts that cannot be merged, and the negative sample group comprises two texts that can be merged; and a model training module configured to train the text merging judgment model with the at least one positive sample group and the at least one negative sample group until the text merging judgment model converges; wherein the sample acquisition module comprises: a sample acquisition unit configured to acquire at least one sample text to be segmented; and a sample segmentation unit configured to segment each sample text to be segmented at preset symbols in the at least one sample text to be segmented to obtain the at least one negative sample group; wherein the sample segmentation unit comprises: a target determination subunit configured to determine, for each sample text to be segmented, the character located at the middle position of the text as a target character; a symbol detection subunit configured to detect whether a preset symbol exists among the N characters to the left of the target character and among the N characters to the right of the target character, where N is an integer greater than 1; and a target segmentation subunit configured to, if a preset symbol exists among the N characters to the left of the target character or among the N characters to the right of the target character, segment the sample text to be segmented with that preset symbol as the boundary to obtain at least one negative sample group.
  9. An apparatus for text merging judgment, the apparatus comprising: a text acquisition module configured to acquire two texts to be detected; and a result acquisition module configured to input the two texts to be detected into a text merging judgment model to obtain a judgment result of whether the two texts can be merged, wherein the text merging judgment model is trained by the training method of any one of claims 1 to 6.
  10. A computer storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the method steps of any one of claims 1 to 7.
  11. A computer program product storing a plurality of instructions adapted to be loaded by a processor to perform the method steps of any one of claims 1 to 7.
  12. An electronic device comprising a processor and a memory, wherein the memory stores a computer program adapted to be loaded by the processor to perform the method steps of any one of claims 1 to 7.

Description

Training method of text merging judgment model and text merging judgment method

Technical Field

The present disclosure relates to the field of natural language processing, and in particular to a training method for a text merging judgment model and a text merging judgment method.

Background

Typically, a long text is split into multiple sentences at delimiter symbols such as ",", "|", or ".". However, because the contexts in which text is generated are very complex, input text may be segmented incorrectly. For example, when a user enters text through the touch screen of a mobile terminal, the user may use a dividing symbol by mistake, insert a large number of spaces, or break a line incorrectly; when a user enters text by voice, poor recording conditions or an abnormal interruption of the recording may likewise cause the voice-input text to be segmented incorrectly. Judging whether two sentences, i.e. two short texts, can be merged has therefore long been one of the basic tasks in natural language processing, and is a supporting technology for upper-layer applications such as text retrieval and intelligent question answering.

Disclosure of Invention

Embodiments of this specification provide a text merging judgment method, apparatus, storage medium, and electronic device that can train a text merging judgment model, improve the robustness of the model, and improve the accuracy with which the model judges whether two texts can be merged.
The technical scheme is as follows.

In a first aspect, an embodiment of the present disclosure provides a training method for a text merging judgment model, the method comprising: acquiring at least one positive sample group and at least one negative sample group, wherein the positive sample group comprises two texts that cannot be merged, and the negative sample group comprises two texts that can be merged; and training the text merging judgment model with the at least one positive sample group and the at least one negative sample group until the text merging judgment model converges.

In a second aspect, embodiments of the present disclosure provide a text merging judgment method, the method comprising: acquiring two texts to be detected; and inputting the two texts to be detected into a text merging judgment model to obtain a judgment result of whether the two texts can be merged, wherein the text merging judgment model is trained by the training method of the first aspect.

In a third aspect, an embodiment of the present disclosure provides a training device for a text merging judgment model, the device comprising: a sample acquisition module configured to acquire at least one positive sample group and at least one negative sample group, wherein the positive sample group comprises two texts that cannot be merged, and the negative sample group comprises two texts that can be merged; and a model training module configured to train the text merging judgment model with the at least one positive sample group and the at least one negative sample group until the text merging judgment model converges.
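The training protocol of the first aspect (fit the model on positive and negative sample groups until it converges) might be sketched with a toy stand-in for the model. The character-overlap feature, learning rate, and loss-change convergence test below are illustrative assumptions; the disclosed model uses encoders and fully connected layers, not this logistic scorer.

```python
import math

# Toy stand-in for the text merging judgment model: a logistic scorer
# over a bias term and a character-overlap feature. Labels follow the
# patent's convention: negative groups (mergeable) -> 1,
# positive groups (non-mergeable) -> 0.

def features(a: str, b: str) -> list[float]:
    shared = len(set(a) & set(b)) / max(len(set(a) | set(b)), 1)
    return [1.0, shared]  # [bias, Jaccard overlap of character sets]

def mergeable_prob(w: list[float], a: str, b: str) -> float:
    z = sum(wi * xi for wi, xi in zip(w, features(a, b)))
    return 1 / (1 + math.exp(-z))

def train(groups, labels, lr=0.5, eps=1e-6, max_steps=10_000):
    """Gradient descent on cross-entropy loss, run 'until convergence'
    (loss change below eps), mirroring the claimed stopping condition."""
    w = [0.0, 0.0]
    prev = float("inf")
    for _ in range(max_steps):
        loss, grad = 0.0, [0.0, 0.0]
        for (a, b), y in zip(groups, labels):
            x = features(a, b)
            p = 1 / (1 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
            loss -= y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
            for i in range(2):
                grad[i] += (p - y) * x[i]
        w = [wi - lr * gi / len(groups) for wi, gi in zip(w, grad)]
        if abs(prev - loss) < eps:  # "until the model converges"
            break
        prev = loss
    return w

# Two mergeable pairs (label 1) and two non-mergeable pairs (label 0).
weights = train(
    [("abcdef", "abcdeg"), ("abcabc", "bcabca"), ("xyz", "qrt"), ("mnop", "qrsu")],
    [1, 1, 0, 0],
)
```

The point is only the protocol: both kinds of sample groups are fed to the same model, and training stops when the loss no longer changes.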
In a fourth aspect, embodiments of the present disclosure provide an apparatus for text merging judgment, the apparatus comprising: a text acquisition module configured to acquire two texts to be detected; and a result acquisition module configured to input the two texts to be detected into a text merging judgment model to obtain a judgment result of whether the two texts can be merged, wherein the text merging judgment model is trained by the training method of the first aspect.

In a fifth aspect, the present specification provides a computer storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the above method steps.

In a sixth aspect, the present specification provides a computer program product storing a plurality of instructions adapted to be loaded by a processor to perform the above method steps.

In a seventh aspect, the present specification provides an electronic device, which may comprise a processor and a memory, wherein the memory stores a computer program adapted to be loaded by the processor to perform the above method steps.

The technical scheme provided by some embodiments of the present specification has the following beneficial effects. According to the embodiments of this specification, at least one positive sample group and at least one negative sample group are reasonably constructed; the positive sample group comprises texts that cannot be merged, and the negative sample group comprises texts that can be merged, so that the text merging judgment model can learn whether two texts have a mergeable relation in a self-supervised manner through the
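The encoder / fully-connected-layer / judger structure described in claims 3 to 5 might be sketched as follows. The placeholder per-character vectors, the single shared linear layer, and the sigmoid used as the "constraint processing" are assumptions for illustration; real implementations would use BERT, RNN, or CNN encoders as listed in claim 6.

```python
import math

# Sketch of the three connection strategies in claim 4, applied to
# per-character feature vectors that an encoder is assumed to have
# already produced (here they are hand-written placeholder lists):
#   (a) all feature vectors of both texts, concatenated in sequence;
#   (b) the head-character vectors of the two texts;
#   (c) the head vector of one text with the tail vector of the other.

def concat(vectors):
    out = []
    for v in vectors:
        out.extend(v)
    return out

def connection_inputs(vecs_a, vecs_b):
    all_vecs = concat(vecs_a + vecs_b)           # strategy (a)
    heads = concat([vecs_a[0], vecs_b[0]])       # strategy (b)
    head_tail = concat([vecs_a[0], vecs_b[-1]])  # strategy (c)
    return all_vecs, heads, head_tail

def merge_probability(vecs_a, vecs_b, weights, bias=0.0):
    # Toy judger (claim 5): one linear layer over the concatenated
    # connection results, then a sigmoid to constrain the score to a
    # probability that the two texts can be merged.
    x = concat(connection_inputs(vecs_a, vecs_b))
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 / (1 + math.exp(-z))

vecs_a = [[1.0, 0.0], [0.0, 1.0]]  # text A: two characters, 2-dim features
vecs_b = [[0.5, 0.5]]              # text B: one character
inputs = connection_inputs(vecs_a, vecs_b)
```

With all weights at zero the sigmoid yields 0.5, i.e. maximum uncertainty; a trained judger would compare the resulting probability against a threshold to decide whether the two texts can be merged.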