CN-121981828-A - Abnormal transaction behavior analysis and judgment method and system based on multi-source data

Abstract

The invention provides a multisource data-based abnormal transaction behavior analysis and judgment method and system, and relates to the technical field of abnormal transaction data analysis, wherein the method comprises the steps of constructing multisource account association topology and generating an acquisition instruction set; the method comprises the steps of obtaining multi-source heterogeneous funds transaction data and unifying formats, extracting characteristics to construct data quality assessment vectors, calculating credibility weights to carry out hierarchical labeling, constructing a weighted funds flow network structure, identifying abnormal transaction paths and scoring, screening abnormal transaction subgraphs with highest credibility, and generating evidence analysis results. The invention realizes intelligent collection, quality evaluation and credible association analysis of multi-source data, and improves collection efficiency and accuracy of transaction crime evidence.

Inventors

  • SU WEN
  • XIE ZUOQI

Assignees

  • 北京金安创世科技有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-01-23

Claims (10)

  1. An abnormal transaction behavior analysis and judgment method based on multi-source data, characterized by comprising the following steps:
     constructing a multi-source account association topology based on a target account identifier set of a transaction to be analyzed, and generating collection instruction sets for different data sources according to the multi-source account association topology;
     executing the collection instruction sets, obtaining multi-source heterogeneous fund transaction data, and performing format conversion and field alignment to generate a unified intermediate data set;
     extracting transaction time sequence features and transaction subject behavior features from the unified intermediate data set, and constructing a data quality assessment vector;
     calculating the credibility weight of each data record based on the data quality assessment vector, and hierarchically labeling the unified intermediate data set according to the credibility weights to obtain a hierarchical annotation data set;
     constructing a multi-dimensional fund flow direction association graph based on the hierarchical annotation data set, and embedding the credibility weights into the topological relation of the multi-dimensional fund flow direction association graph as edge weight attributes to obtain a weighted fund flow direction network structure;
     identifying abnormal transaction paths based on the weighted fund flow direction network structure, scoring the confidence of the abnormal transaction paths using the edge weight attributes, screening out a highest-confidence abnormal transaction subgraph, and structurally encoding the evidence nodes and transaction link information in the highest-confidence abnormal transaction subgraph to generate an evidence analysis result.
  2. The method of claim 1, wherein constructing a multi-source account association topology based on a target account identifier set of a transaction to be analyzed, and generating collection instruction sets for different data sources according to the multi-source account association topology, comprises:
     analyzing the account type attribute and the account-opening institution attribute of each account identifier in the target account identifier set, and establishing a mapping relation table between account identifiers and data source types;
     determining an initial data source set based on the mapping relation table, extracting historical transaction records from the initial data source set, identifying in the historical transaction records the associated account identifiers that have a fund exchange relation with the target account identifier set, and incorporating the associated account identifiers into an expanded account identifier set;
     establishing an account association strength matrix based on the transaction frequency and the accumulated transaction amount between each account identifier in the expanded account identifier set and each account identifier in the target account identifier set;
     traversing each account node in the multi-source account association topology, and determining the data source interface type corresponding to each account node according to its account type attribute and account-opening institution attribute;
     grouping the account nodes in the multi-source account association topology by data source interface type, and generating the collection instruction sets for the different groups.
  3. The method of claim 1, wherein executing the collection instruction sets, obtaining multi-source heterogeneous fund transaction data, and performing format conversion and field alignment to generate the unified intermediate data set comprises:
     sending a data request to the data source interface corresponding to each collection instruction in the collection instruction sets, and receiving the fund transaction data returned by each data source interface, wherein the fund transaction data comprises a structured data format and a semi-structured data format;
     identifying the data format type and the data encoding of the fund transaction data, and parsing the structured data format and the semi-structured data format into an accessible field set;
     extracting all field names from the field set, performing semantic analysis, identifying the target fields that represent transaction time, transaction amount, transaction account and transaction flow direction, and constructing a mapping rule set between source fields and the target fields based on the data content characteristics of each field in the field set;
     performing format conversion on the fund transaction data according to the mapping rule set, uniformly converting the structured data format and the semi-structured data format into a standardized data format containing the target fields;
     performing field alignment on the standardized data format, identifying the target fields missing from the standardized data format, and inserting a null value identifier for each missing target field to form the unified intermediate data set with complete fields.
  4. The method of claim 1, wherein calculating the credibility weight of each data record based on the data quality assessment vector, and hierarchically labeling the unified intermediate data set according to the credibility weights, comprises:
     obtaining, based on the data quality assessment vector, the transaction time sequence feature vector and the transaction subject behavior feature vector corresponding to each data record;
     grouping all data records in the unified intermediate data set by transaction account identifier, calculating, within each group, a similarity matrix between the transaction time sequence feature vectors and between the transaction subject behavior feature vectors of the data records, identifying the number of consistent data records in the group based on the similarity matrix, and taking the ratio of the number of consistent data records to the total number of data records in the group as the credibility weight of each data record;
     constructing a data record co-occurrence network based on the transaction account co-occurrence relations among the data records in the unified intermediate data set, calculating the clustering coefficient of each data record node in the data record co-occurrence network, and dividing the unified intermediate data set into a plurality of clustered data subsets based on the clustering coefficients;
     adding a corresponding credibility level label to each data record in the plurality of clustered data subsets, and merging the clustered data subsets based on the credibility level labels to form the hierarchical annotation data set.
  5. The method of claim 1, wherein constructing a multi-dimensional fund flow direction association graph based on the hierarchical annotation data set, and embedding the credibility weights into the topological relation of the multi-dimensional fund flow direction association graph as edge weight attributes to obtain a weighted fund flow direction network structure, comprises:
     extracting the transaction account identifier and the counterparty account identifier of each data record from the hierarchical annotation data set as nodes, extracting the transaction flow direction identifiers as directed edges, and constructing a basic fund flow direction association graph;
     extracting the transaction timestamp and the transaction amount of each data record from the hierarchical annotation data set, adding a time dimension attribute to each directed edge in the basic fund flow direction association graph based on the transaction timestamps, and adding an amount dimension attribute to each directed edge based on the transaction amounts, to form the multi-dimensional fund flow direction association graph;
     extracting the credibility weight value corresponding to each data record from the hierarchical annotation data set, identifying the source data record corresponding to each directed edge in the multi-dimensional fund flow direction association graph, and binding the credibility weight value of the source data record to the directed edge as its edge weight attribute;
     when a plurality of directed edges are detected to connect the same pair of nodes in the multi-dimensional fund flow direction association graph, extracting the edge weight attributes of those directed edges and performing an aggregation operation to obtain the edge weight attribute of a merged edge connecting the pair of nodes, finally obtaining the weighted fund flow direction network structure.
  6. The method of claim 1, wherein identifying abnormal transaction paths based on the weighted fund flow direction network structure, scoring the confidence of the abnormal transaction paths using the edge weight attributes, and screening out a highest-confidence abnormal transaction subgraph comprises:
     extracting the in-degree and the out-degree of each node from the weighted fund flow direction network structure, calculating the absolute value of the difference between the in-degree and the out-degree of each node as the degree deviation value of that node, and identifying abnormal nodes whose degree deviation value exceeds a preset deviation threshold;
     performing a depth-first traversal of the weighted fund flow direction network structure starting from each abnormal node, expanding paths along the direction of the directed edges, stopping expansion when the path length reaches a preset path length limit or a leaf node is reached, and recording all nodes and directed edges passed during the traversal to form a plurality of abnormal transaction paths;
     extracting the edge weight attribute values of all directed edges in each abnormal transaction path, performing a weighted product operation on the edge weight attribute values, and taking the ratio of the resulting product to the path length of the abnormal transaction path to obtain the confidence score of that path;
     sorting the abnormal transaction paths in descending order of confidence score, and screening out a plurality of high-confidence abnormal transaction paths whose confidence scores are higher than a preset confidence threshold;
     merging the topological structures of the high-confidence abnormal transaction paths that share nodes to form the highest-confidence abnormal transaction subgraph.
  7. The method of claim 1, wherein structurally encoding the evidence nodes and transaction link information in the highest-confidence abnormal transaction subgraph to generate the evidence analysis result comprises:
     identifying account nodes among the nodes of the highest-confidence abnormal transaction subgraph based on node type identifiers, and marking as evidence nodes the account nodes whose number of connected edges in the subgraph exceeds a preset connection threshold;
     extracting from the highest-confidence abnormal transaction subgraph the set of directed edges that have each evidence node as an endpoint, traversing the directed edge set, extracting the time dimension attribute, the amount dimension attribute and the edge weight attribute of each directed edge, and combining them to form transaction link information;
     assigning a unique identification code to each evidence node, constructing an evidence node coding table based on the unique identification codes, and mapping each unique identification code in the evidence node coding table to the transaction link information of the corresponding evidence node to form evidence node structured data;
     extracting from the highest-confidence abnormal transaction subgraph the directed edge sequences that connect different evidence nodes, constructing the topological connection relation between the evidence nodes based on the directed edge sequences, path-encoding each directed edge sequence in the topological connection relation, and structurally associating each path code with the transaction link information of every directed edge on the corresponding sequence to form transaction link structured data;
     fusing the evidence node structured data with the transaction link structured data to generate the evidence analysis result.
  8. An abnormal transaction behavior analysis and judgment system based on multi-source data, for implementing the method of any one of claims 1 to 7, comprising:
     a first unit, configured to construct a multi-source account association topology based on a target account identifier set of a transaction to be analyzed, and to generate collection instruction sets for different data sources according to the multi-source account association topology;
     a second unit, configured to execute the collection instruction sets, obtain multi-source heterogeneous fund transaction data, and perform format conversion and field alignment to generate a unified intermediate data set;
     a third unit, configured to extract transaction time sequence features and transaction subject behavior features from the unified intermediate data set and construct a data quality assessment vector, to calculate the credibility weight of each data record based on the data quality assessment vector, and to hierarchically label the unified intermediate data set according to the credibility weights to obtain a hierarchical annotation data set;
     a fourth unit, configured to construct a multi-dimensional fund flow direction association graph based on the hierarchical annotation data set, and to embed the credibility weights into the topological relation of the multi-dimensional fund flow direction association graph as edge weight attributes, so as to obtain a weighted fund flow direction network structure;
     a fifth unit, configured to identify abnormal transaction paths based on the weighted fund flow direction network structure, score the confidence of the abnormal transaction paths using the edge weight attributes, screen out a highest-confidence abnormal transaction subgraph, and structurally encode the evidence nodes and transaction link information in the highest-confidence abnormal transaction subgraph to generate an evidence analysis result.
  9. An electronic device, comprising:
     a processor;
     a memory for storing processor-executable instructions;
     wherein the processor is configured to invoke the instructions stored in the memory to perform the method of any one of claims 1 to 7.
  10. A computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method of any one of claims 1 to 7.
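
The path identification and scoring steps of claim 6 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the sample edges, the thresholds (deviation 2, path length 3, confidence 0.3) and the cycle-avoidance rule are assumptions made for illustration, and the "weighted product operation" is read here as a plain product of edge weights, which the claim does not specify.

```python
from collections import defaultdict
from math import prod


def degree_deviation(edges):
    """Return |in-degree - out-degree| per node for (src, dst, weight) edges."""
    indeg, outdeg = defaultdict(int), defaultdict(int)
    for src, dst, _ in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    nodes = set(indeg) | set(outdeg)
    return {n: abs(indeg[n] - outdeg[n]) for n in nodes}


def enumerate_paths(edges, start, max_len):
    """Depth-first expansion along edge direction.

    Expansion stops when a path has max_len edges or reaches a node with
    no unvisited successor. Skipping already-visited nodes is an assumption
    added here so that cycles terminate.
    """
    adj = defaultdict(list)
    for src, dst, w in edges:
        adj[src].append((dst, w))
    paths = []

    def dfs(node, nodes, weights):
        successors = [(d, w) for d, w in adj[node] if d not in nodes]
        if len(weights) == max_len or not successors:
            if weights:
                paths.append((nodes, weights))
            return
        for d, w in successors:
            dfs(d, nodes + [d], weights + [w])

    dfs(start, [start], [])
    return paths


def path_confidence(weights):
    """Product of edge weights divided by path length (number of edges)."""
    return prod(weights) / len(weights)


# Made-up sample network: (payer, payee, merged edge weight).
edges = [("A", "B", 0.9), ("A", "C", 0.8), ("A", "D", 0.5),
         ("B", "E", 0.9), ("C", "E", 0.4)]

deviation = degree_deviation(edges)
abnormal = [n for n, d in deviation.items() if d > 2]   # hypothetical threshold

scored = sorted(
    ((path_confidence(w), nodes)
     for start in abnormal
     for nodes, w in enumerate_paths(edges, start, 3)),
    reverse=True,
)
kept = [(s, nodes) for s, nodes in scored if s > 0.3]   # hypothetical threshold

# Simplification: with a single start node every kept path shares that node,
# so merging paths with shared nodes reduces to a union of their node sets.
subgraph_nodes = set().union(*(nodes for _, nodes in kept))
```

Because the score divides by path length, a short path over a moderately credible edge can outrank a longer path over highly credible edges, so the preset thresholds strongly shape which paths survive screening.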

Description

Abnormal transaction behavior analysis and judgment method and system based on multi-source data

Technical Field

The invention relates to the technical field of abnormal transaction data analysis, and in particular to an abnormal transaction behavior analysis and judgment method and system based on multi-source data.

Background

As transaction-related criminal activity becomes more complex and better concealed, data analysis and evidence collection play a vital role in case investigation. Traditional investigation relies mainly on manual analysis and a single data source, and cannot cope effectively with large-scale, cross-platform, multi-party fund flow networks. An intelligent system for collecting and analyzing abnormal transaction behavior needs to integrate multi-source heterogeneous data from bank transaction records, third-party payment platforms, virtual currency exchanges and the like, and to track fund flows comprehensively and construct evidence chains by intelligent methods.

Multi-source heterogeneous data collection lacks unified standards and automated mechanisms, and data formats differ markedly between institutions and platforms, so data integration is inefficient and a complete fund flow graph is difficult to form. Traditional analysis methods lack a systematic evaluation of data credibility and cannot effectively distinguish the authenticity and importance of data from different sources, so analysis results are biased or key evidence clues are missed. Existing fund flow analysis systems mostly use static graph construction and lack a dynamic weight adjustment mechanism; they cannot adaptively optimize abnormal transaction path identification according to data credibility, which reduces the accuracy and interpretability of evidence chain construction.
Disclosure of Invention

The embodiments of the invention provide an abnormal transaction behavior analysis and judgment method and system based on multi-source data, which can solve the above problems in the prior art.

In a first aspect of the embodiments of the invention, an abnormal transaction behavior analysis and judgment method based on multi-source data is provided, including:

constructing a multi-source account association topology based on a target account identifier set of a transaction to be analyzed, and generating collection instruction sets for different data sources according to the multi-source account association topology;

executing the collection instruction sets, obtaining multi-source heterogeneous fund transaction data, and performing format conversion and field alignment to generate a unified intermediate data set;

extracting transaction time sequence features and transaction subject behavior features from the unified intermediate data set, and constructing a data quality assessment vector;

calculating the credibility weight of each data record based on the data quality assessment vector, and hierarchically labeling the unified intermediate data set according to the credibility weights to obtain a hierarchical annotation data set;

constructing a multi-dimensional fund flow direction association graph based on the hierarchical annotation data set, and embedding the credibility weights into the topological relation of the multi-dimensional fund flow direction association graph as edge weight attributes to obtain a weighted fund flow direction network structure;

identifying abnormal transaction paths based on the weighted fund flow direction network structure, scoring the confidence of the abnormal transaction paths using the edge weight attributes, screening out a highest-confidence abnormal transaction subgraph, and structurally encoding the evidence nodes and transaction link information in the highest-confidence abnormal transaction subgraph to generate an evidence analysis result.
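
The step that embeds credibility weights into the fund flow graph as edge weights can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the invention: the `Record` fields, the sample records, and the choice of an amount-weighted mean as the aggregation for parallel edges are all hypothetical, since the text only says that an aggregation operation is performed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Record:
    account: str        # paying account identifier
    counterparty: str   # receiving account identifier
    timestamp: str      # ISO 8601 string (hypothetical format)
    amount: float
    credibility: float  # credibility weight in [0, 1]


def build_weighted_graph(records):
    """Build a directed graph keyed by (payer, payee), merging parallel edges.

    Each merged edge carries a time span, a total amount, and an aggregated
    credibility weight.
    """
    grouped = defaultdict(list)
    for r in records:
        grouped[(r.account, r.counterparty)].append(r)

    graph = {}
    for pair, recs in grouped.items():
        total = sum(r.amount for r in recs)
        if total > 0:
            # Assumption: amount-weighted mean of the record credibilities.
            weight = sum(r.credibility * r.amount for r in recs) / total
        else:
            # Fallback for zero-amount groups: plain mean.
            weight = sum(r.credibility for r in recs) / len(recs)
        graph[pair] = {
            "first_seen": min(r.timestamp for r in recs),
            "last_seen": max(r.timestamp for r in recs),
            "amount": total,
            "weight": weight,
        }
    return graph


# Made-up sample records for illustration only.
records = [
    Record("A", "B", "2026-01-02T10:00:00", 100.0, 0.9),
    Record("A", "B", "2026-01-03T09:30:00", 300.0, 0.5),
    Record("B", "C", "2026-01-04T12:00:00", 250.0, 0.8),
]
graph = build_weighted_graph(records)
```

An amount-weighted mean lets large transfers dominate the merged weight; a plain mean or a minimum would be equally valid readings of "aggregation operation" and would yield more conservative edge weights.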
Constructing a multi-source account association topology based on a target account identifier set of a transaction to be analyzed, and generating collection instruction sets for different data sources according to the multi-source account association topology, comprises:

analyzing the account type attribute and the account-opening institution attribute of each account identifier in the target account identifier set, and establishing a mapping relation table between account identifiers and data source types;

determining an initial data source set based on the mapping relation table, extracting historical transaction records from the initial data source set, identifying in the historical transaction records the associated account identifiers that have a fund exchange relation with the target account identifier set, and incorporating the associated account identifiers into an expanded account identifier set;

establishing an account association strength matrix based on transaction frequency and accumulated transaction amount between each account identifier in the expanded