CN-121997162-A - AI-based event intelligent analysis method and system
Abstract
The invention relates to an AI-based event intelligent analysis method and system. The method comprises the following steps: acquiring multi-modal raw data related to an event and extracting a feature representation of each modality's data; constructing a hierarchical multi-modal feature graph comprising intra-modality subgraphs and a cross-modality association graph based on the extracted features; sequentially performing intra-modality GAT aggregation and modality-aware cross-modality attention calculation using a graph attention network to obtain cross-modality aggregated global features; outputting an event-type probability distribution through a classification layer to identify the event type; constructing a hierarchical multi-modal event knowledge graph based on the identified event type and the multi-modal feature graph; and analyzing the logical relationships and evolution paths between events using temporal modeling and causal reasoning models, and outputting the analysis results. Compared with the prior art, the method improves the accuracy and precision of multi-modal event-type identification and improves analysis efficiency.
Inventors
- YANG ZHOUBIN
- YIN CHENGLIANG
- HUANG HAOFENG
- GAO LIANGQUAN
Assignees
- 上海智能网联汽车技术中心有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20251208
Claims (10)
- 1. An AI-based event intelligent analysis method, characterized by comprising the following steps: acquiring event-related multi-modal raw data, and extracting a feature representation of each modality's data; constructing a hierarchical multi-modal feature graph comprising intra-modality subgraphs and a cross-modality association graph based on the extracted features, sequentially performing intra-modality GAT aggregation and modality-aware cross-modality attention calculation by using a graph attention network to obtain cross-modality aggregated global features, and finally outputting an event-type probability distribution through a classification layer to identify the event type; and, based on the identified event type and the multi-modal feature graph, constructing a hierarchical multi-modal event knowledge graph, analyzing the logical relationships and evolution paths between events by using temporal modeling and causal reasoning models, and outputting an analysis result.
- 2. The AI-based event intelligent analysis method according to claim 1, wherein the intra-modality subgraphs are determined according to the number of modalities in the acquired multi-modal raw data, each modality corresponding to one intra-modality subgraph, wherein: for the text modality, nodes of the corresponding subgraph are defined as semantic units of event-related text, characterized by semantic vectors extracted through a pre-trained language model; for the image modality, nodes are defined as key visual objects in an image, characterized by visual vectors extracted through a pre-trained visual model; for the audio modality, nodes are defined as acoustic events, characterized by acoustic vectors extracted through MFCC features; for the video modality, nodes are defined as video key frames, characterized by temporal visual vectors extracted through a pre-trained visual feature extraction model; and within the same intra-modality subgraph, inter-node edge weights are calculated based on feature similarity.
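As an illustration of how the intra-modality edge weights described above might be computed, the sketch below builds an adjacency matrix from pairwise cosine similarity of node feature vectors. The cosine metric and the threshold value are assumptions, since the claim only states that edge weights are "calculated based on feature similarity".

```python
import numpy as np

def intra_modality_edges(node_features: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Build the edge-weight matrix of one intra-modality subgraph from
    pairwise cosine similarity of node features, keeping only edges whose
    similarity reaches a threshold (threshold value is illustrative)."""
    norms = np.linalg.norm(node_features, axis=1, keepdims=True)
    unit = node_features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T                 # cosine similarity in [-1, 1]
    np.fill_diagonal(sim, 0.0)          # no self-loops
    return np.where(sim >= threshold, sim, 0.0)

# e.g. four text-modality nodes with 3-dim semantic vectors
feats = np.array([[1.0, 0.0, 0.0],
                  [0.9, 0.1, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.95, 0.1]])
W = intra_modality_edges(feats)
```

Nodes 0/1 and 2/3 end up connected (similar vectors), while dissimilar pairs get zero weight.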
- 3. The AI-based event intelligent analysis method according to claim 1, wherein nodes of the cross-modality association graph are defined as core nodes connecting the different modality subgraphs, namely alignment units of corresponding semantic units, key visual objects, acoustic events and video key frames, and are characterized by mean-value concatenation of the node features of the corresponding intra-modality subgraphs; edge weights between intra-modality subgraph nodes and cross-modality association graph nodes are calculated based on cross-modal feature alignment scores.
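A minimal sketch of one reading of the "mean-value concatenation" above: each core node's feature is the per-modality mean of its aligned nodes' features, concatenated across modalities. The function name and the mean-then-concatenate interpretation are assumptions, not taken from the patent.

```python
import numpy as np

def core_node_feature(aligned_feats):
    """Feature of a cross-modal core node: for each modality, average the
    features of the nodes aligned to this core node, then concatenate the
    per-modality means (one reading of 'mean-value concatenation')."""
    return np.concatenate([np.mean(np.asarray(F), axis=0) for F in aligned_feats])

# two aligned text nodes (2-dim) and one aligned image node (3-dim)
text_nodes = np.array([[1.0, 3.0], [3.0, 1.0]])
image_nodes = np.array([[2.0, 2.0, 2.0]])
f = core_node_feature([text_nodes, image_nodes])
```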
- 4. The AI-based event intelligent analysis method according to claim 1, wherein the intra-modality GAT aggregation comprises independently performing GAT computation on each intra-modality subgraph, aggregating intra-modality local association features, and preserving the intrinsic characteristics of each modality, comprising the following steps: (1) intra-modality attention coefficient calculation: for the intra-modality subgraph of modality $m$, calculate the attention coefficient of node $i$ towards neighborhood node $j$: $\alpha_{ij}^{m} = \dfrac{\exp\big(\mathrm{LeakyReLU}\big(a_m^{\top}[W_m h_i^m \,\|\, W_m h_j^m]\big)\big)}{\sum_{l \in \mathcal{N}_i^m} \exp\big(\mathrm{LeakyReLU}\big(a_m^{\top}[W_m h_i^m \,\|\, W_m h_l^m]\big)\big)}$, wherein $a_m$ is the learnable attention vector of modality $m$, $W_m$ is the feature projection matrix of modality $m$, $h_i^m$ is the feature of node $i$ under modality $m$, $\|$ is the feature concatenation operation, and $\mathcal{N}_i^m$ is the neighborhood node set of node $i$ under modality $m$; (2) intra-modality aggregated feature output: based on the intra-modality attention coefficients, compute the aggregated feature $\tilde{h}_i^m$ of node $i$ of the intra-modality subgraph for each modality $m$: $\tilde{h}_i^m = \sigma\big(\sum_{j \in \mathcal{N}_i^m} \alpha_{ij}^{m} W_m h_j^m\big)$, wherein $\sigma$ is the activation function; (3) the aggregated features of all nodes within the modality are combined to obtain the global aggregated feature of each modality: $g^m = \frac{1}{N_m}\sum_{i=1}^{N_m} \tilde{h}_i^m$, wherein $N_m$ is the number of nodes of modality $m$ and $g^m$ is the global aggregated feature of modality $m$.
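The intra-modality GAT step above can be sketched in plain NumPy as follows. The choice of tanh as the activation σ, the single attention head, and all parameter shapes are illustrative assumptions; the masked softmax and weighted aggregation follow the standard GAT formulation.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(H, A, W, a):
    """One intra-modality GAT aggregation step (single attention head).
    H: (N, d) node features of one modality's subgraph;
    A: (N, N) adjacency, nonzero entries mark neighbors;
    W: (d, dp) feature projection matrix; a: (2*dp,) attention vector.
    sigma is taken to be tanh here (the patent leaves it generic)."""
    Z = H @ W
    dp = Z.shape[1]
    left, right = Z @ a[:dp], Z @ a[dp:]
    # e_ij = LeakyReLU(a^T [W h_i || W h_j]), masked to the neighborhood
    logits = leaky_relu(left[:, None] + right[None, :])
    logits = np.where(A > 0, logits, -1e9)
    alpha = np.exp(logits - logits.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)   # softmax over neighbors
    out = np.tanh(alpha @ Z)                    # aggregated node features
    return out, out.mean(axis=0)                # per-node features, global g_m

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))
A = (rng.random((5, 5)) > 0.5).astype(float)
np.fill_diagonal(A, 1.0)                        # include self in the neighborhood
W = rng.normal(size=(8, 4)); a = rng.normal(size=8)
nodes, g = gat_layer(H, A, W, a)
```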
- 5. The AI-based event intelligent analysis method according to claim 1, wherein the modality-aware cross-modality attention calculation specifically comprises the following steps: (1) modality importance weight calculation: based on prior knowledge of the current event scene and the real-time feature distribution, calculate the importance weight of each modality: $w_m = \dfrac{\exp(s_m)}{\sum_{k \in M}\exp(s_k)}$, $s_m = \mathrm{MLP}(g^m)$, wherein $w_m$ is the importance weight of modality $m$, $M$ is the modality set, $s_m$ is the saliency score of modality $m$, $\mathrm{MLP}(\cdot)$ is a single-layer perceptron, and $g^m$ is the global feature of modality $m$ obtained by intra-modality GAT aggregation; (2) cross-modality attention coefficient calculation: based on the cross-modality association graph, fuse the modality importance weights to calculate the cross-modality attention coefficients. For the cross-modality attention coefficient $\alpha_{ij}^{m\to k}$ between node $i$ in the intra-modality subgraph of modality $m$ and node $j$ in the intra-modality subgraph of modality $k$, first calculate the first attention coefficient $\beta_{ic}$ between node $i$ in the intra-modality subgraph of modality $m$ and node $c$ in the cross-modality association graph: $\beta_{ic} = \dfrac{\exp\big(\mathrm{LeakyReLU}\big(a^{\top}[U_m \tilde{h}_i^m \,\|\, U_c f_c]\big)\, e_{ic}\big)}{\sum_{c' \in \mathcal{C}_i^m} \exp\big(\mathrm{LeakyReLU}\big(a^{\top}[U_m \tilde{h}_i^m \,\|\, U_c f_{c'}]\big)\, e_{ic'}\big)}$, wherein $\tilde{h}_i^m$ is the aggregated feature of node $i$ in the intra-modality subgraph of modality $m$, $f_c$ is the feature of node $c$, $U_m$ and $U_c$ denote cross-modal projection matrices, $e_{ic}$ is the edge weight between node $i$ of modality $m$ and node $c$ in the cross-modality association graph, $\mathcal{C}_i^m$ denotes the set of core nodes of the cross-modality association graph associated with node $i$ of modality $m$, and $\|$ is the feature concatenation operation; thereafter, calculate the second attention coefficient $\gamma_{cj}$ between node $c$ in the cross-modality association graph and node $j$ in the intra-modality subgraph of modality $k$: $\gamma_{cj} = \dfrac{w_k \exp\big(\mathrm{LeakyReLU}\big(a^{\top}[U_c f_c \,\|\, U_k \tilde{h}_j^k]\big)\, e_{cj}\big)}{\sum_{j' \in \mathcal{V}_c^k} \exp\big(\mathrm{LeakyReLU}\big(a^{\top}[U_c f_c \,\|\, U_k \tilde{h}_{j'}^k]\big)\, e_{cj'}\big)}$, wherein $w_k$ is the importance weight of modality $k$, $\tilde{h}_j^k$ is the aggregated feature of node $j$ in the intra-modality subgraph of modality $k$, $U_k$ is the cross-modal projection matrix of modality $k$, $\mathcal{V}_c^k$ is the set of modality-$k$ nodes associated with node $c$ in the cross-modality association graph, and $e_{cj}$ is the edge weight between node $c$ in the cross-modality association graph and node $j$ in the intra-modality subgraph of modality $k$; then compute the cross-modality attention coefficient between node $i$ in modality $m$ and node $j$ in modality $k$ based on the first and second attention coefficients: $\alpha_{ij}^{m\to k} = \sum_{c} \beta_{ic}\,\gamma_{cj}$; (3) cross-modality aggregated feature output: based on the cross-modality attention coefficients, aggregate to obtain the cross-modality aggregated global feature $z_i$: $z_i = \sigma\big(\sum_{k \in M}\sum_{j \in \tilde{\mathcal{N}}_i^k} \alpha_{ij}^{m\to k}\, U_k \tilde{h}_j^k\big)$, wherein $\sigma$ denotes the activation function and $\tilde{\mathcal{N}}_i^k$ denotes the set of modality-$k$ nodes indirectly associated with node $i$ through the cross-modality association graph.
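The modality importance weights of claim 5 (a softmax over single-layer-perceptron saliency scores) and the composition of the two attention stages can be sketched as follows. The tanh nonlinearity inside the perceptron and the sum-of-products composition over shared core nodes are assumptions; the patent's dropped formulas leave these details open.

```python
import numpy as np

def modality_importance(global_feats, V, b):
    """Saliency score s_m = single-layer perceptron of the modality's
    global feature g_m, normalized by softmax across modalities.
    global_feats: dict modality -> (d,) vector; V: (d,) weights, b: bias
    (V, b, and the tanh nonlinearity are illustrative choices)."""
    scores = {m: float(np.tanh(g @ V + b)) for m, g in global_feats.items()}
    mx = max(scores.values())                       # stable softmax
    exps = {m: np.exp(s - mx) for m, s in scores.items()}
    z = sum(exps.values())
    return {m: e / z for m, e in exps.items()}

def cross_modal_coefficient(beta_ic, gamma_cj):
    """Combine the node->core (beta) and core->node (gamma) attention
    coefficients over shared core nodes c, as a sum of products."""
    return float(np.sum(np.asarray(beta_ic) * np.asarray(gamma_cj)))

gfeats = {"text": np.ones(4), "image": np.zeros(4), "audio": -np.ones(4)}
V = np.full(4, 0.5); b = 0.0
w = modality_importance(gfeats, V, b)
```

With these toy features the text modality receives the largest weight, and the weights sum to one as a valid distribution.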
- 6. The AI-based event intelligent analysis method according to claim 1, wherein the event types form a predefined closed set including lifecycle events, interaction events, transaction events, business events, judicial events, military events and political events.
- 7. The AI-based event intelligent analysis method according to claim 1, wherein the loss function of the graph attention network is defined as: $\mathcal{L} = \mathcal{L}_{CE}(\hat{y}, y) + \lambda \sum_{m,k \in M} \sum_{i \in V_m} \sum_{j \in V_k} \big\| \tilde{h}_i^m - \tilde{h}_j^k \big\|^2$, wherein $\mathcal{L}_{CE}$ is the cross-entropy loss, $\hat{y}$ is the predicted event type output by the graph attention network, $y$ is the ground-truth event type, $\lambda$ is the regularization coefficient, $V_m$ and $V_k$ are the node sets corresponding to modalities $m$ and $k$ respectively, $M$ is the modality set, $\tilde{h}_i^m$ is the aggregated feature of node $i$ in the intra-modality subgraph of modality $m$, and $\tilde{h}_j^k$ is the aggregated feature of node $j$ in the intra-modality subgraph of modality $k$.
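A minimal sketch of a loss with the shape described in claim 7: cross-entropy on the predicted event-type distribution plus a cross-modal alignment term over pairs of modalities. The squared-distance form of the regularizer and the value of λ are assumptions, since the patent's formula image was not preserved.

```python
import numpy as np

def event_loss(probs, y, feats_by_modality, lam=0.01):
    """Cross-entropy on the event-type distribution plus a cross-modal
    alignment regularizer penalizing squared distances between aggregated
    node features of different modalities (regularizer form and lam are
    illustrative assumptions)."""
    ce = -np.log(probs[y] + 1e-12)
    reg = 0.0
    mods = list(feats_by_modality)
    for a in range(len(mods)):
        for b in range(a + 1, len(mods)):
            Ha = np.asarray(feats_by_modality[mods[a]])
            Hb = np.asarray(feats_by_modality[mods[b]])
            # sum_{i in V_m} sum_{j in V_k} ||h_i^m - h_j^k||^2
            diff = Ha[:, None, :] - Hb[None, :, :]
            reg += float(np.sum(diff ** 2))
    return ce + lam * reg

probs = np.array([0.1, 0.7, 0.2])
feats = {"text": np.zeros((2, 3)), "image": np.zeros((2, 3))}
loss = event_loss(probs, 1, feats)
```

When the modality features coincide, the regularizer vanishes and the loss reduces to the cross-entropy term alone.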
- 8. The AI-based event intelligent analysis method according to claim 1, wherein the hierarchical multi-modal event knowledge graph adopts a three-layer architecture aligned with the hierarchical multi-modal feature graph, wherein the event core layer corresponds to the event-type identification result; the modality entity layer corresponds to the intra-modality subgraphs and stores single-modality entities and intra-modality associations; and the cross-modal association layer corresponds to the cross-modality association graph, connects entities of different modalities, and stores cross-modal alignment relationships.
- 9. The AI-based event intelligent analysis method according to claim 8, wherein constructing the hierarchical multi-modal event knowledge graph comprises the following steps: defining the entity types, relation types and attributes of the hierarchical multi-modal event knowledge graph, wherein the entity types are event entities for the event core layer, modality entities for the modality entity layer (including text entities, image entities, audio entities and video entities), and cross-modal association entities for the cross-modal association layer, and the relation types comprise intra-modality relations, cross-modal relations and event-entity association relations; and, based on the intermediate processing and outputs of the hierarchical multi-modal feature graph and the graph attention network, extracting the modality entities and cross-modal association entities from the hierarchical multi-modal feature graph, performing relation extraction in combination with semantic rules, and obtaining attribute information through feature-sequence mapping.
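The three-layer knowledge graph described in claims 8 and 9 could be represented by a data structure along these lines; all field and method names here are hypothetical illustrations, not taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class EventKnowledgeGraph:
    """Minimal three-layer event knowledge graph sketch:
    event core layer (event_type), modality entity layer
    (modality_entities), cross-modal association layer
    (cross_modal_links). Names are illustrative."""
    event_type: str
    modality_entities: dict = field(default_factory=dict)   # modality -> [entity]
    cross_modal_links: list = field(default_factory=list)   # (entity_a, entity_b, score)

    def add_entity(self, modality: str, entity: str) -> None:
        """Store a single-modality entity in the modality entity layer."""
        self.modality_entities.setdefault(modality, []).append(entity)

    def align(self, a: str, b: str, score: float) -> None:
        """Record a cross-modal alignment relationship with its score."""
        self.cross_modal_links.append((a, b, score))

kg = EventKnowledgeGraph(event_type="transaction")
kg.add_entity("text", "wire transfer mention")
kg.add_entity("image", "receipt photo")
kg.align("wire transfer mention", "receipt photo", 0.87)
```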
- 10. An AI-based event intelligent analysis system for implementing the method of any one of claims 1-9, the system comprising: a data acquisition and feature extraction module, used for acquiring event-related multi-modal raw data and extracting a feature representation of each modality's data; an event type identification module, used for constructing a hierarchical multi-modal feature graph comprising intra-modality subgraphs and a cross-modality association graph based on the extracted features, sequentially performing intra-modality GAT aggregation and modality-aware cross-modality attention calculation by using a graph attention network to obtain cross-modality aggregated global features, and finally outputting an event-type probability distribution through a classification layer to identify the event type; and an event analysis module, used for constructing a hierarchical multi-modal event knowledge graph based on the identified event type and the multi-modal feature graph, analyzing the logical relationships and evolution paths between events by using temporal modeling and causal reasoning models, and outputting an analysis result.
Description
AI-based event intelligent analysis method and system Technical Field The invention relates to the technical field of big data analysis, and in particular to an AI-based event intelligent analysis method and system. Background At present, some event analysis systems based on rules or traditional machine learning methods exist at home and abroad; for example, event detection and classification are performed on text, image or video data through keyword matching, pattern recognition, statistical analysis and other means. Some systems employ deep learning models (e.g., CNN, RNN, Transformer) for preliminary feature extraction and event recognition. Chinese patent CN119557603A discloses an event analysis method, system and device based on generation tasks and multiple modalities, in which multi-modal data (text, image and audio) is acquired, structured event information is generated using a large generative model, an event graph is constructed, and feature fusion and optimization are carried out through a cross-modal attention mechanism and a graph neural network, so as to generate emotion feature vectors and an event prediction graph for rapidly and accurately grasping the development trend of an event. However, the cross-modal attention mechanism of that method adopts single-level interaction logic, making it difficult to reconcile the inherent differences between the features of different modalities, and modality-exclusive information is easily lost during interaction; moreover, its event graph construction relies on structural information generated by a large model, so consistency with the information produced by the earlier feature-processing stage is difficult to guarantee. Disclosure of Invention The invention aims to overcome the defects of the prior art and provide an AI-based event intelligent analysis method and system.
The aim of the invention can be achieved by the following technical scheme. According to a first aspect of the present invention, there is provided an AI-based event intelligent analysis method, the method comprising the following steps: acquiring event-related multi-modal raw data, and extracting a feature representation of each modality's data; constructing a hierarchical multi-modal feature graph comprising intra-modality subgraphs and a cross-modality association graph based on the extracted features, sequentially performing intra-modality GAT aggregation and modality-aware cross-modality attention calculation by using a graph attention network to obtain cross-modality aggregated global features, and finally outputting an event-type probability distribution through a classification layer to identify the event type; and, based on the identified event type and the multi-modal feature graph, constructing a hierarchical multi-modal event knowledge graph, analyzing the logical relationships and evolution paths between events by using temporal modeling and causal reasoning models, and outputting an analysis result.
The intra-modality subgraphs are determined according to the number of modalities in the acquired multi-modal raw data, each modality corresponding to one intra-modality subgraph: for the text modality, nodes of the corresponding subgraph are defined as semantic units of event-related text, characterized by semantic vectors extracted through a pre-trained language model; for the image modality, nodes are defined as key visual objects in images, characterized by visual vectors extracted through a pre-trained visual model; for the audio modality, nodes are defined as acoustic events, characterized by acoustic vectors extracted through MFCCs; for the video modality, nodes are defined as video key frames, characterized by temporal visual vectors extracted through a pre-trained visual feature extraction model; and edge weights among nodes in the same intra-modality subgraph are calculated based on feature similarity. The nodes of the cross-modality association graph are defined as core nodes connecting the different modality subgraphs, namely alignment units of corresponding semantic units, key visual objects, acoustic events and video key frames, and are characterized by mean-value concatenation of the node features of the corresponding intra-modality subgraphs; edge weights between intra-modality subgraph nodes and cross-modality association graph nodes are calculated based on cross-modal feature alignment scores. The intra-modality GAT aggregation specifically consists of independently performing GAT computation on each intra-modality subgraph, aggregating intra-modality local association features, and preserving the inherent characteristics of each modality, comprising the following steps: (1) calculating the intra-modality attention coefficient