CN-115455939-B - Chapter-level event extraction method, apparatus, device and storage medium
Abstract
The application discloses a chapter-level event extraction method, apparatus, device and storage medium. For a target chapter carrying title information, event parameters are extracted directly, without recognizing trigger words, and the event extraction result is composed of the parameter values of the extracted event parameters, the event name and the event type. This better fits the characteristics of chapters, and the overall processing flow is simpler because trigger-word recognition is omitted. During event parameter extraction, the target chapter is segmented into sentences, and the parameter values of each type of event parameter in each sentence are extracted in turn according to an event parameter type template. For each type of event parameter, the parameter values extracted from the individual sentences are integrated to obtain an integrated parameter value. This ensures the completeness of the parameter values extracted at chapter level and avoids confusion among the labeled event parameters.
Inventors
- DIAO YONGXIANG
- WANG YUJIE
- WU FEI
- FANG SIAN
Assignees
- 合肥讯飞数码科技有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20220921
Claims (11)
- 1. A chapter-level event extraction method, comprising: performing sentence segmentation on a target chapter to obtain a sentence sequence, wherein the target chapter carries title information; sequentially extracting parameter values of each type of event parameter in each sentence in the sentence sequence according to a set event parameter type template, wherein the event parameter type template comprises a plurality of types of event parameters; for each type of event parameter in the event parameter type template, sequentially combining the parameter values of that type of event parameter extracted from each sentence, according to the order of the sentences in the sentence sequence, to obtain an integrated parameter value of that type of event parameter, wherein if the parameter value of that type of event parameter extracted from a sentence is null, a preset character is used in place of the parameter value; determining the event name of the target chapter based on the title information of the target chapter; acquiring the domain information of the target chapter, and determining the event type of the target chapter based on the domain information; and forming an event extraction result of the target chapter from the event type, the event name and the integrated parameter values of the various types of event parameters.
- 2. The method of claim 1, wherein the set event parameter type template is a preset event parameter type template corresponding to the event type of the target chapter; or, the set event parameter type template is a preset unified event parameter type template corresponding to each event type.
- 3. The method according to claim 1, wherein sequentially extracting parameter values of each type of event parameter in each sentence in the sentence sequence comprises: labeling each sentence in the sentence sequence with a pre-trained sequence labeling model to obtain the parameter values of each type of event parameter; wherein the sequence labeling model is trained on training sentences in which each word is labeled with the event parameter type label to which it belongs.
- 4. The method of claim 1, wherein determining the event name of the target chapter based on the title information of the target chapter comprises: taking the title information of the target chapter as the event name of the target chapter; or, extracting the subject of the title information of the target chapter as the event name of the target chapter.
- 5. The method of claim 1, wherein determining the event type of the target chapter based on the domain information to which the target chapter belongs comprises: selecting, from a set event type template, the target event type closest to the domain information of the target chapter as the event type of the target chapter; wherein the event type template comprises a plurality of set event types.
- 6. The method according to any one of claims 1-5, wherein there are a plurality of target chapters, and the method further comprises, after obtaining the event extraction result for each target chapter: performing event association analysis from the semantic dimension and/or the statistical dimension based on the event extraction results of the plurality of target chapters.
- 7. The method of claim 6, wherein performing event association analysis from the semantic dimension based on the event extraction results of the plurality of target chapters comprises: taking the event type as the classification condition, and dividing the plurality of target chapters into at least one set of events of the same type based on the event extraction results of the target chapters; and/or, for each type of event parameter, respectively calculating the semantic similarity between the integrated parameter values of that type of event parameter of the target chapters, and determining that a semantic association relationship exists, on that type of event parameter, between two target chapters whose semantic similarity exceeds a set similarity threshold.
- 8. The method of claim 6, wherein performing event association analysis from the statistical dimension based on the event extraction results of the plurality of target chapters comprises: for the event extraction results of any two target chapters, respectively comparing whether the integrated parameter values of each type of event parameter of the two are the same; and, based on the comparison result, determining that a co-occurrence relationship exists on the event between two target chapters that satisfy a first condition, and determining that a compliance relationship exists on the event between two target chapters that satisfy a second condition, wherein: the first condition is that the integrated parameter values of the time parameter and of the place parameter in the event extraction results of the two target chapters are respectively the same; and the second condition is that the integrated parameter values of the personal parameter and of the place parameter in the event extraction results of the two target chapters are respectively the same, and the integrated parameter values of the time parameter are different.
- 9. A chapter-level event extraction apparatus, comprising: a sentence segmentation unit, configured to perform sentence segmentation on a target chapter to obtain a sentence sequence; an event parameter extraction unit, configured to sequentially extract parameter values of each type of event parameter in each sentence in the sentence sequence according to a set event parameter type template, wherein the event parameter type template comprises a plurality of set types of event parameters; an event parameter integration unit, configured to sequentially combine, for each type of event parameter, the parameter values of that type of event parameter extracted from each sentence, according to the order of the sentences in the sentence sequence, to obtain an integrated parameter value of that type of event parameter, wherein if the parameter value of that type of event parameter extracted from a sentence is null, the parameter value is replaced by a preset character; an event name determining unit, configured to determine the event name of the target chapter based on the title of the target chapter; an event type determining unit, configured to acquire the domain information of the target chapter and determine the event type of the target chapter based on the domain information; and an event extraction result determining unit, configured to form the event extraction result of the target chapter from the event type, the event name and the integrated parameter value of each type of event parameter.
- 10. A chapter-level event extraction device is characterized by comprising a memory and a processor; the memory is used for storing programs; the processor is configured to execute the program to implement the steps of the chapter level event extraction method according to any one of claims 1-7.
- 11. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the steps of the chapter-level event extraction method of any one of claims 1-7.
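
The statistical-dimension comparison recited in claim 8 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the dictionary keys `name`, `time`, `place` and `person`, and the function names `relate` and `pairwise_relations`, are hypothetical, and each value is assumed to be the already integrated parameter value of one target chapter.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple

# One event extraction result per target chapter. The key names are
# hypothetical; the values stand for integrated parameter values.
Result = Dict[str, str]


def relate(a: Result, b: Result) -> Optional[str]:
    """Compare two extraction results using the two conditions of claim 8."""
    same_time = a["time"] == b["time"]
    same_place = a["place"] == b["place"]
    same_person = a["person"] == b["person"]
    # First condition: time and place are both the same.
    if same_time and same_place:
        return "co-occurrence"
    # Second condition: person and place are the same, time differs.
    if same_person and same_place and not same_time:
        return "compliance"
    return None


def pairwise_relations(results: List[Result]) -> List[Tuple[str, str, str]]:
    """Return (name_a, name_b, relation) for every related pair of chapters."""
    found = []
    for a, b in combinations(results, 2):
        relation = relate(a, b)
        if relation is not None:
            found.append((a["name"], b["name"], relation))
    return found


if __name__ == "__main__":
    demo = [
        {"name": "A", "time": "Monday", "place": "Hefei", "person": "Alice"},
        {"name": "B", "time": "Monday", "place": "Hefei", "person": "Bob"},
        {"name": "C", "time": "Tuesday", "place": "Hefei", "person": "Alice"},
    ]
    for row in pairwise_relations(demo):
        print(row)
```

Because the second condition requires the time values to differ, the two relations are mutually exclusive for any given pair. Exact string equality is used here because the claim compares whether the integrated values are the same; a similarity threshold would belong to the semantic dimension of claim 7 instead.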
Description
Chapter-level event extraction method, apparatus, device and storage medium
Technical Field
The present application relates to the field of natural language processing, and in particular to a chapter-level event extraction method, apparatus, device and storage medium.
Background
An event is the occurrence of a particular thing, usually described as a change of state, at a particular time and place and involving one or more participants. Event extraction is a common task in natural language processing; it converts unstructured text describing event information into a structured form. It is widely applied in fields such as online public opinion monitoring, emergency alerting and information collection.
Existing event extraction methods generally target sentence-level extraction, and few schemes address chapter-level extraction. The few chapter-level schemes that exist still follow the sentence-level approach: they first extract the trigger words of the whole chapter, then feed the whole chapter, together with the extracted trigger-word information, into a sequence labeling model to label the event parameters (also called arguments), and finally form the event extraction result of the chapter from the trigger words and the event parameters.
Such schemes do not take chapter characteristics into account, and reusing the sentence-level approach tends to make the processing flow cumbersome. Moreover, obtaining event parameters by sequence-labeling the whole chapter at once easily leads to confusion among the labeled event parameters.
Disclosure of Invention
In view of the above problems, the present application provides a chapter-level event extraction method, apparatus, device and storage medium, so as to solve the problems of existing chapter-level event extraction schemes, namely a cumbersome processing flow and a tendency toward confusion among the labeled event parameters. The specific scheme is as follows:
In a first aspect, a chapter-level event extraction method is provided, including:
- performing sentence segmentation on a target chapter to obtain a sentence sequence, wherein the target chapter carries title information;
- sequentially extracting parameter values of each type of event parameter in each sentence in the sentence sequence according to a set event parameter type template, wherein the event parameter type template comprises a plurality of types of event parameters;
- for each type of event parameter, integrating the parameter values of that type of event parameter extracted from each sentence in the sentence sequence to obtain an integrated parameter value of that type of event parameter;
- determining the event name of the target chapter based on the title information of the target chapter;
- acquiring the domain information of the target chapter, and determining the event type of the target chapter based on the domain information;
- forming an event extraction result of the target chapter from the event type, the event name and the integrated parameter values of the various types of event parameters.
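
The steps of the first aspect can be sketched end to end in Python. This is a minimal illustration under stated assumptions rather than the disclosed implementation. The template `PARAM_TYPES`, the list `EVENT_TYPES`, the placeholder `#`, the `|` separator and all function names are hypothetical. The keyword-based `toy_extract` merely stands in for the pre-trained sequence labeling model of claim 3, and `difflib` string similarity is a swapped-in stand-in for selecting the "closest" event type of claim 5. The placeholder substitution follows claim 1, which replaces a null per-sentence value with a preset character.

```python
import difflib
import re
from typing import Callable, Dict, List

PARAM_TYPES = ["time", "place", "person"]            # hypothetical template
EVENT_TYPES = ["technology", "finance", "military"]  # hypothetical event types
PLACEHOLDER = "#"                                    # hypothetical preset character


def split_sentences(text: str) -> List[str]:
    """Split on common Chinese and English sentence terminators (assumed rule)."""
    pieces = re.findall(r"[^。！？.!?]+[。！？.!?]?", text)
    return [p.strip() for p in pieces if p.strip()]


def toy_extract(sentence: str) -> Dict[str, str]:
    """Keyword lookup standing in for a trained sequence labeling model."""
    lexicon = {"time": ["Monday", "Tuesday"], "place": ["Hefei"], "person": ["Alice"]}
    found = {}
    for ptype, words in lexicon.items():
        hits = [w for w in words if w in sentence]
        if hits:
            found[ptype] = hits[0]
    return found


def integrate(per_sentence: List[Dict[str, str]], sep: str = "|") -> Dict[str, str]:
    """Join each type's values in sentence order, using the placeholder for nulls."""
    return {
        ptype: sep.join(found.get(ptype) or PLACEHOLDER for found in per_sentence)
        for ptype in PARAM_TYPES
    }


def closest_event_type(domain: str) -> str:
    """Pick the event type most similar to the domain string (stand-in metric)."""
    return difflib.get_close_matches(domain, EVENT_TYPES, n=1, cutoff=0.0)[0]


def extract_chapter_event(
    title: str,
    body: str,
    domain: str,
    extract: Callable[[str], Dict[str, str]] = toy_extract,
) -> Dict[str, object]:
    """Run the steps in order and assemble the event extraction result."""
    per_sentence = [extract(s) for s in split_sentences(body)]
    return {
        "event_type": closest_event_type(domain),
        "event_name": title,  # the title is used directly as the event name
        "parameters": integrate(per_sentence),
    }


if __name__ == "__main__":
    print(extract_chapter_event(
        "Alice visits Hefei",
        "Alice arrived in Hefei on Monday. She gave a talk.",
        "technology",
    ))
```

Keeping one slot per sentence, with the placeholder marking sentences that yield no value, preserves the alignment between each integrated value and the sentence it came from.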
In a second aspect, a chapter-level event extraction apparatus is provided, comprising:
- a sentence segmentation unit, configured to perform sentence segmentation on the target chapter to obtain a sentence sequence;
- an event parameter extraction unit, configured to sequentially extract parameter values of each type of event parameter in each sentence in the sentence sequence according to a set event parameter type template, wherein the event parameter type template comprises a plurality of set types of event parameters;
- an event parameter integration unit, configured to integrate, for each type of event parameter, the parameter values of that type of event parameter extracted from each sentence in the sentence sequence to obtain an integrated parameter value of that type of event parameter;
- an event name determining unit, configured to determine the event name of the target chapter based on the title of the target chapter;
- an event type determining unit, configured to acquire the domain information of the target chapter and determine the event type of the target chapter based on the domain information;
- an event extraction result determining unit, configured to form the event extraction result of the target chapter from the event type, the event name and the integrated parameter value of each type of event parameter.
In a third aspect, a chapter-level event extraction device is provided, comprising a memory and a processor; the memory is used for storing a program; the processor is configured to execute the program to implement the steps of the chapter-level event extraction method described above.
In a fourth aspect,