
CN-121981844-A - Enterprise tax risk path tracking method based on multi-mode time sequence knowledge graph


Abstract

The invention provides an enterprise tax risk path tracking method based on a multi-modal time sequence knowledge graph. The method comprises: collecting multi-modal data of an enterprise from distributed data sources, the multi-modal data comprising structured tax data, semi-structured text data, and unstructured audio/video data; constructing a time sequence knowledge graph from the multi-modal data, in which each entity-relationship edge carries a timestamp attribute representing how the relationships between enterprises change over time; and identifying risk events using a multi-modal fusion model.

Inventors

  • CHENG JIANGYAN
  • LI WEISHI
  • LI YING

Assignees

  • 福建信息职业技术学院

Dates

Publication Date
20260505
Application Date
20260407

Claims (10)

  1. An enterprise tax risk path tracking method based on a multi-modal time sequence knowledge graph, characterized by comprising a multi-modal data acquisition step, a time sequence graph construction step, a risk event identification step, a conduction path tracking step, and a visual display step, wherein: the multi-modal data acquisition step acquires multi-modal data of enterprises from distributed data sources, the multi-modal data comprising structured tax data, semi-structured text data, and unstructured audio and video data; the time sequence graph construction step constructs a multi-modal time sequence knowledge graph from the multi-modal data, in which the entity-relationship edges carry timestamp attributes representing how the relationships among enterprises change over time; the risk event identification step analyzes real-time data with a multi-modal fusion model, identifies potential tax risk events, and generates risk feature tensors; the conduction path tracking step performs a reverse search in the multi-modal time sequence knowledge graph under time sequence constraints, and mines and outputs conduction path subgraphs among related enterprises; and the visual display step visualizes the conduction path subgraphs and generates inspection reports containing risk tracing information.
  2. The enterprise tax risk path tracking method based on the multi-modal time sequence knowledge graph of claim 1, wherein the multi-modal data acquisition step further comprises: accessing distributed data sources; acquiring multi-modal heterogeneous data of a target enterprise group, the multi-modal heterogeneous data comprising structured tax data, semi-structured text data, and unstructured audio/video data; and extracting time sequence feature vectors from the unstructured audio/video data using a pre-trained feature extraction model.
  3. The enterprise tax risk path tracking method based on the multi-modal time sequence knowledge graph of claim 1, wherein the time sequence graph construction step further comprises: constructing a knowledge graph comprising enterprise entities, personnel entities, article entities, and event entities; and storing the transaction relationships in the structured tax data, the dynamic events corresponding to the time sequence feature vectors, and the entity attribute changes as time sequence edges in a graph database, each time sequence edge comprising at least an effective time interval and a relationship type label, thereby forming a multi-modal time sequence knowledge graph that supports time-slice queries; wherein the multi-modal time sequence knowledge graph comprises: an entity set comprising enterprise entities $e$, personnel entities $p$, article entities $g$, and event entities $ev$; and a set of time sequence edges, each edge being defined as a four-tuple $edge = (h, r, T, attrs)$, where $h$ denotes the subject and object of the relationship, $r$ is the relationship type, $T$ is the effective time interval, and $attrs$ is the set of dynamic attributes.
  4. The enterprise tax risk path tracking method based on the multi-modal time sequence knowledge graph of claim 1, wherein the risk event identification step further comprises: identifying risk signals with the multi-modal fusion model from incremental data collected in real time; and aligning and fusing the risk features of different modalities to generate a standardized risk event feature tensor comprising a risk subject identifier, a risk type, a confidence level, and an occurrence timestamp; wherein the multi-modal fusion model: extracts multi-modal features $X$; aligns and fuses the features through cross-attention layers to generate a risk feature tensor $R = \mathrm{MultiHeadAttention}(X, X)$; and maps $R$ to the risk category space through a multi-layer perceptron (MLP) to generate the risk event feature $\mathit{Risk\_event} = \mathrm{softmax}(\mathrm{MLP}(R))$.
  5. The enterprise tax risk path tracking method based on the multi-modal time sequence knowledge graph of claim 4, wherein the conduction path tracking step further comprises: taking the enterprise entity pointed to by the risk event feature tensor as a starting point, setting a time backtracking window and a conduction depth threshold, and executing a reverse time sequence walk algorithm in the multi-modal time sequence knowledge graph, wherein, when traversing a path, the reverse time sequence walk algorithm requires the timestamps of all time sequence edges on the path to be monotonically decreasing and to fall within the time backtracking window, thereby mining a complete conduction link from the risk source to the current risk outbreak point and outputting a risk conduction subgraph comprising the intermediate conduction nodes; wherein the reverse time sequence walk algorithm starts from the risk subject and performs a reverse walk within the time backtracking window, the edges of the walk path must satisfy a strict monotonically decreasing constraint on their timestamps, and all timestamps must fall within the time backtracking window.
  6. The enterprise tax risk path tracking method based on the multi-modal time sequence knowledge graph of claim 5, wherein the visual display step further comprises: visually rendering the risk conduction subgraph; automatically generating, from the path characteristics of the risk conduction subgraph, a structured audit task instruction set containing evidence collection guidance; coloring nodes and edges differently according to their risk attenuation coefficients; and marking the positions and time points of key abnormal documents in the risk conduction subgraph; wherein the structured audit task instruction set comprises an on-site inspection checklist for enterprises marked as "risk source", a bank statement look-through query instruction for "intermediary" enterprises, and an assisted investigation notice template for "risk receiving end" enterprises.
  7. The enterprise tax risk path tracking method based on the multi-modal time sequence knowledge graph of claim 2, wherein the multi-modal heterogeneous data acquisition and feature extraction specifically comprise: deploying the feature extraction model locally at each data holder under a federated learning framework, and exchanging only encrypted model gradients or feature vectors, so that the original tax data never leaves its domain and multi-source data features are aligned under privacy protection.
  8. The enterprise tax risk path tracking method based on the multi-modal time sequence knowledge graph of claim 3, wherein the multi-modal time sequence knowledge graph construction step further comprises, for the unstructured audio and video data: extracting time sequence feature vectors comprising a vehicle entry/exit frequency curve and operating-rate/shutdown event identifiers for the enterprise's production and business premises, obtained from video stream analysis, and meeting summary keywords and sentiment scores, obtained from audio analysis; and attaching the vehicle entry/exit frequency curve, the shutdown event identifiers, and the sentiment scores to the corresponding enterprise entities or event entities as dynamic attributes in the form of time sequence edges.
  9. The enterprise tax risk path tracking method based on the multi-modal time sequence knowledge graph of claim 4, wherein the multi-modal fusion model in the risk event tensor generation step is an attention-based Transformer architecture, and the model performs a weighted fusion of the statistical outliers of the structured data, the negative sentiment scores of the text data, and the abnormal fluctuation features of the audio and video data to generate a unified risk event representation vector, and maps that vector to a preset risk category label space.
  10. The enterprise tax risk path tracking method based on the multi-modal time sequence knowledge graph of claim 5, wherein, in the conduction path tracking step under time sequence constraints, the reverse time sequence walk algorithm specifically comprises: evaluating the probability of candidate conduction paths with a path scoring function based on a temporal graph neural network; when the fund flow, invoice flow, and goods flow are detected to form a closed loop in time order, marking the closed-loop path as a ring conduction risk pattern and including the loop in the output risk conduction subgraph; introducing a Monte Carlo tree search mechanism during the walk to balance the breadth and depth of path exploration, preferentially mining multi-level bridge conduction paths whose concealment depth exceeds a preset threshold; and introducing a path scoring function based on UCT (Upper Confidence Bounds for Trees) to guide the reverse walk, wherein, for the current node $s$ and a candidate predecessor node $s'$, the UCT value is calculated as $UCT(s, s') = Q(s, s') + C\sqrt{\ln N(s) / N(s, s')}$, where $Q(s, s')$ is the path cost function, representing the accumulated confidence of backtracking from the current node to the risk source; $N(s)$ is the total number of times node $s$ has been visited; $N(s, s')$ is the number of times the edge $(s, s')$ has been selected; and $C$ is a search constant controlling the balance between exploration and exploitation.
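
The time-constrained reverse walk of claim 5 and the UCT score of claim 10 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `TemporalEdge`, `reverse_temporal_walk`, and `uct_score` are hypothetical, as is the sample data. The sketch assumes that the subject/object element of the edge tuple is split into separate `head` and `tail` fields, that timestamps are integers, and that the start of the effective interval serves as the edge timestamp. It enumerates paths exhaustively rather than running a Monte Carlo tree search, and the UCT helper uses the standard UCT formula.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TemporalEdge:
    """Time sequence edge; head/tail split is an assumption of this sketch."""
    head: str                    # entity the relationship originates from
    tail: str                    # entity the relationship points to
    relation: str                # relationship type label
    interval: Tuple[int, int]    # effective time interval (start, end)
    attrs: Dict[str, object] = field(default_factory=dict)  # dynamic attributes

    @property
    def timestamp(self) -> int:
        # Assumption: the interval start is used to order edges in time.
        return self.interval[0]


def reverse_temporal_walk(edges, start, window, max_depth):
    """Enumerate backward paths from `start` whose edge timestamps strictly
    decrease and stay inside `window`, up to `max_depth` edges per path."""
    lo, hi = window
    incoming: Dict[str, List[TemporalEdge]] = {}
    for e in edges:
        incoming.setdefault(e.tail, []).append(e)

    paths: List[List[TemporalEdge]] = []

    def dfs(node, bound, path):
        extended = False
        if len(path) < max_depth:
            for e in incoming.get(node, []):
                # Strict decrease guarantees termination even on cyclic graphs.
                if lo <= e.timestamp < bound:
                    extended = True
                    dfs(e.head, e.timestamp, path + [e])
        if not extended and path:
            paths.append(path)  # keep only maximal paths

    dfs(start, hi + 1, [])
    return paths


def uct_score(q, n_parent, n_edge, c=1.4):
    """Standard UCT value: q + c * sqrt(ln N(s) / N(s, s'))."""
    if n_edge == 0:
        return math.inf  # unvisited candidates are explored first
    return q + c * math.sqrt(math.log(n_parent) / n_edge)


# Hypothetical data: enterprise C shows a risk event; trace it backward.
edges = [
    TemporalEdge("A", "B", "supplies", (20250101, 20250131)),
    TemporalEdge("B", "C", "invoices", (20250301, 20250331)),
    TemporalEdge("D", "C", "lends_to", (20250601, 20250630)),
    TemporalEdge("X", "A", "supplies", (20250201, 20250228)),
]
paths = reverse_temporal_walk(edges, "C", (20250101, 20250630), max_depth=3)
chains = [["C"] + [e.head for e in p] for p in paths]
print(chains)  # [['C', 'B', 'A'], ['C', 'D']]
```

In this example the edge from X to A is rejected because its timestamp is later than that of the edge from A to B, so it cannot precede it on a backward path. A search-based variant could rank the candidate predecessors at each node with `uct_score` instead of visiting all of them.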

Description

Enterprise tax risk path tracking method based on multi-mode time sequence knowledge graph

Technical Field

The invention relates to tax risk analysis technology, and in particular to an enterprise tax risk path tracking method based on a multi-modal time sequence knowledge graph.

Background

In today's business ecosystem, enterprises form a highly complex system of interactions through supply chain networks, equity association structures, and fund lending channels, and tax risks show pronounced cross-enterprise conduction characteristics. Risk events such as a strained capital chain, abnormal invoice voiding, or a business shutdown rarely occur in isolation; they spread step by step along association paths and form multi-level chain reactions. Traditional tax auditing relies mainly on comparing static financial indicators against thresholds, or analyzes a single enterprise in isolation. It can capture only momentary risk points and cannot effectively monitor how risk evolves over time in a dynamic association network. The prior art has attempted to introduce enterprise relationship graphs for risk modeling, for example by building knowledge graphs from multi-source financial data to identify abnormal transaction nodes, or by locating risk links through a four-flow consistency comparison of the goods flow, contract flow, and fund flow against the invoice flow, but these approaches still suffer from systematic drawbacks.
First, most of the knowledge graphs adopted are static snapshots. Lacking any modeling of the time dimension, they cannot describe how key risk factors such as capital chain breakage or red-letter invoice reversal are conducted over time between upstream and downstream enterprises; for example, they cannot distinguish the time order of a risk source enterprise and the subsequent conduction nodes.

Second, data processing is limited to structured tabular data. Unstructured multi-modal information sources such as enterprise announcement texts, public opinion monitoring data, production and operation video streams, and conference recordings cannot be integrated, so risk signals are covered incompletely. Key clues, such as a sharp drop in vehicle entry/exit frequency seen in video surveillance or negative management sentiment detected in audio analysis, are ignored, and risk identification lags noticeably.

Finally, in path mining over large-scale enterprise networks, existing graph algorithms, such as distributed processing models based on the MapReduce framework, are computationally inefficient when handling long conduction chains or loop structures such as fund backflow. They have difficulty automatically extracting a complete conduction link with sequential logic, and the resulting paths lack interpretability and give inspectors no intuitive basis for tracing the source of a risk.

Disclosure of Invention

To overcome the above problems, the invention aims to provide an enterprise tax risk path tracking method based on a multi-modal time sequence knowledge graph that can effectively track the dynamic conduction paths of tax risk among related enterprises and improve the accuracy and efficiency of risk identification.
The enterprise tax risk path tracking method based on the multi-modal time sequence knowledge graph comprises a multi-modal data acquisition step, a time sequence graph construction step, a risk event identification step, a conduction path tracking step, and a visual display step. The multi-modal data acquisition step acquires multi-modal data of enterprises from distributed data sources, the multi-modal data comprising structured tax data, semi-structured text data, and unstructured audio and video data. The time sequence graph construction step constructs a multi-modal time sequence knowledge graph from the multi-modal data, in which the entity-relationship edges carry timestamp attributes representing how the relationships among enterprises change over time. The risk event identification step analyzes real-time data with a multi-modal fusion model, identifies potential tax risk events, and generates risk feature tensors. The conduction path tracking step performs a reverse search in the multi-modal time sequence knowledge graph under time sequence constraints, and mines and outputs conduction path subgraphs among related enterprises. The visual display step visualizes the conduction path subgraphs and generates inspection reports containing risk tracing information. The multi-modal data acquisition step is further embodied in the steps of accessing a distributed data source