CN-121998756-A - Co-transaction behavior recognition method and system based on dynamic association strength

Abstract

The invention discloses a method and a system for identifying common transaction behavior based on dynamic association strength, belonging to the fields of big data analysis and financial technology. To address the single-dimensional analysis, low performance and lack of timeliness of conventional methods for identifying common transactions in finance, the method first extracts and normalizes multi-dimensional frequency, amount and time features of transaction entity pairs based on a sliding time window. It then coarsely screens the data with improved DBSCAN clustering and mines frequent item sets within each cluster using the FP-Growth algorithm. Next, it constructs an association strength scoring function to screen out significant common transaction groups. Finally, it updates the model dynamically through incremental updating, strength decay judgment and triggered local recalculation. The invention enables efficient, accurate and real-time identification of common transaction behavior in massive financial transaction flow data, and improves the intelligence and timeliness of financial risk control.
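The first stage of the abstract (sliding-window feature extraction followed by normalization) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names `extract_features` and `z_score`, the tuple layout of a transaction, the use of seconds as the time unit, and the choice of population standard deviation are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

DAY = 86_400  # seconds per day (assumed time unit)


def extract_features(transactions, now, window=30 * DAY):
    """Return {pair: [F, M_total, M_avg, M_std, dT, T_std]} for one window.

    transactions: iterable of (entity_i, entity_j, amount, timestamp);
    this record layout is hypothetical.
    """
    by_pair = defaultdict(list)
    for i, j, amount, ts in transactions:
        if now - window <= ts <= now:  # keep only transactions inside the window
            by_pair[(i, j)].append((ts, amount))

    features = {}
    for pair, rows in by_pair.items():
        rows.sort()
        times = [ts for ts, _ in rows]
        amounts = [a for _, a in rows]
        gaps = [b - a for a, b in zip(times, times[1:])]
        features[pair] = [
            len(rows),                      # F: transaction count in the window
            sum(amounts),                   # M_total
            mean(amounts),                  # M_avg
            pstdev(amounts),                # M_std
            now - times[-1],                # dT: time since the last transaction
            pstdev(gaps) if gaps else 0.0,  # T_std: interval stability
        ]
    return features


def z_score(features):
    """Standardize every feature column to zero mean and unit variance."""
    pairs = list(features)
    if not pairs:
        return {}
    columns = list(zip(*(features[p] for p in pairs)))
    stats = [(mean(c), pstdev(c)) for c in columns]
    return {
        p: [(x - m) / s if s > 0 else 0.0 for x, (m, s) in zip(features[p], stats)]
        for p in pairs
    }
```

Columns with zero variance are mapped to 0.0 here to avoid division by zero. The source does not say how such columns are handled, so that is a design choice of this sketch.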

Inventors

  • WU SONGSONG
  • ZHANG LEI
  • YAO ZHIQIANG
  • CHEN TING
  • SHI JIANZHEN

Assignees

  • 厦门市美亚柏科信息安全研究所有限公司

Dates

Publication Date
20260508
Application Date
20251216

Claims (12)

  1. A method for identifying common transaction behavior based on dynamic association strength, characterized in that the method is applied to identifying common transaction behavior of entity groups sharing common transaction counterparties in massive transaction flow data in the financial field, and comprises the following steps: S1, receiving historical transaction flow data, extracting multi-dimensional features for each pair of transaction entities (i, j) based on a preset sliding time window, wherein the multi-dimensional features comprise a frequency feature F, an amount feature M and a time feature T, performing Z-Score standardization on the multi-dimensional features, and outputting a feature vector set {$V_{i,j}$}, wherein $V_{i,j}$ is the feature vector of the transaction entity pair (i, j); S2, improved DBSCAN cluster analysis: taking the output feature vector set as input, performing cluster analysis with an improved DBSCAN clustering algorithm, and outputting a cluster set {$C_1, C_2, \ldots, C_k$}; S3, intra-cluster FP-Growth association rule mining: treating each output cluster $C_i$ as an independent transaction database, setting a minimum support threshold min_sup, mining the frequent item sets in each cluster with the FP-Growth algorithm, and outputting the collection of frequent item sets; S4, comprehensive association strength scoring: constructing a comprehensive association strength scoring function for each output frequent item set to calculate its association strength $S_{group}$, setting an association strength threshold $S_{threshold}$, and screening out the frequent item sets with $S_{group} \geq S_{threshold}$ as significant common transaction groups; and S5, dynamic update mechanism: monitoring new transaction data $D_{new}$ in real time; when new transaction data flows in, incrementally updating only the feature vectors $V_{i,j}$ of the transaction entity pairs affected by $D_{new}$; periodically scanning the identified significant common transaction groups and recalculating their association strength; if the recalculated association strength is lower than the failure threshold $S_{inactive}$, marking the group as failed and removing it; meanwhile, calculating the change magnitude between the incrementally updated feature vector and the original feature vector, namely $\lVert V_{new} - V_{old} \rVert$; if the change magnitude exceeds a preset threshold, placing the transaction entity pair in a buffer pool as a significant change point; and when the number of significant change points in the buffer pool reaches a preset scale, or a preset trigger time is reached, starting local re-clustering and local association rule mining for the transaction entity pairs in the buffer pool and the clusters to which they belong, and updating the significant common transaction groups.
  2. The method as claimed in claim 1, wherein in step S1, the frequency feature F is the total number of transactions of the transaction entity pair (i, j) within the sliding time window, calculated by summing an indicator function I(·) over the transactions of the pair, wherein t is a transaction timestamp, k is the current time, and I(·) takes the value 1 when the transaction falls within the window and 0 otherwise.
  3. The method of claim 1, wherein in step S1, the time feature T includes a time decay factor $\Delta T = k - T_{last}$, where $T_{last}$ is the timestamp of the last transaction of the transaction entity pair (i, j), and a transaction interval stability $T_{std}$, which is the standard deviation of the time intervals between consecutive transactions; and the feature vector is $V_{i,j} = (F, M_{total}, M_{avg}, M_{std}, \Delta T, T_{std})$.
  4. The method as claimed in claim 1, wherein in step S2, the improved DBSCAN clustering algorithm uses a weighted Mahalanobis distance as the distance measure, in which the amount feature and the frequency feature are weighted more heavily than the time feature; the neighborhood radius $\varepsilon$ and the minimum number of points MinPts are dynamically adjusted by means of a k-distance graph and evaluation of the clustering results; and the weighted Mahalanobis distance is calculated using S, the covariance matrix of the feature vectors, and its inverse $S^{-1}$.
  5. The method of claim 1, wherein in step S4, the comprehensive association strength scoring function is a weighted combination of three terms, wherein $w_f$ is the frequency weight, $w_m$ is the amount weight, $w_t$ is the time weight, and $w_f + w_m + w_t = 1$; $\log(F_{sum} + 1)$ is the logarithm of the total transaction frequency of the group; the normalized total amount is processed by a hyperbolic tangent function; and $\exp(-\lambda \cdot \Delta T_{avg})$ is an exponential decay function applied to the average time difference, where $\lambda$ is the decay coefficient.
  6. The method for identifying common transaction behavior according to claim 1, wherein in step S1, the parameters of the sliding time window are a window size $T_{window}$ = 30 days and a sliding step $T_{slide}$ = 1 day.
  7. A co-transaction behavior recognition system based on dynamic association strength, characterized in that the system is used for implementing the method of any of claims 1-6 and comprises: a feature extraction module, used for receiving historical transaction flow data, extracting multi-dimensional features for each pair of transaction entities (i, j) based on a preset sliding time window, wherein the multi-dimensional features comprise a frequency feature F, an amount feature M and a time feature T, performing Z-Score standardization on the multi-dimensional features, and outputting a feature vector set {$V_{i,j}$}, wherein $V_{i,j}$ is the feature vector of the transaction entity pair (i, j); a cluster analysis module, used for receiving the feature vector set output by the feature extraction module as input, performing cluster analysis with an improved DBSCAN clustering algorithm, and outputting a cluster set {$C_1, C_2, \ldots, C_k$}; an association mining module, used for receiving the output of the cluster analysis module, treating each output cluster $C_i$ as an independent transaction database, setting a minimum support threshold min_sup, mining the frequent item sets in each cluster with the FP-Growth algorithm, and outputting the collection of frequent item sets; a strength scoring module, used for receiving the frequent item sets output by the association mining module, constructing a comprehensive association strength scoring function for each output frequent item set to calculate its association strength $S_{group}$, setting an association strength threshold $S_{threshold}$, and screening out the frequent item sets with $S_{group} \geq S_{threshold}$ as significant common transaction groups; and a dynamic updating module, used for monitoring new transaction data $D_{new}$ in real time; when new transaction data flows in, incrementally updating only the feature vectors $V_{i,j}$ of the transaction entity pairs affected by $D_{new}$; periodically scanning the identified significant common transaction groups and recalculating their association strength; if the recalculated association strength is lower than the failure threshold $S_{inactive}$, marking the group as failed and removing it; meanwhile, calculating the change magnitude between the incrementally updated feature vector and the original feature vector, namely $\lVert V_{new} - V_{old} \rVert$; if the change magnitude exceeds a preset threshold, placing the transaction entity pair in a buffer pool as a significant change point; and when the number of significant change points in the buffer pool reaches a preset scale, or a preset trigger time is reached, starting local re-clustering and local association rule mining for the transaction entity pairs in the buffer pool and the clusters to which they belong, and updating the significant common transaction groups.
  8. The system for identifying co-transaction behavior according to claim 7, wherein, in extracting the multi-dimensional features, the feature extraction module is configured to calculate the frequency feature F, an amount feature M comprising a total amount, an average amount and an amount fluctuation rate, and a time feature T comprising a time decay factor and a transaction interval stability.
  9. The system of claim 7, wherein in the strength scoring module, the weights $w_f$, $w_m$ and $w_t$ of the comprehensive association strength scoring function are obtained by training with a logistic regression algorithm or by the entropy weight method, and $w_f + w_m + w_t = 1$.
  10. The co-transaction behavior recognition system of claim 7, wherein the sliding time window parameters of the feature extraction module are $T_{window}$ = 30 days and $T_{slide}$ = 1 day.
  11. An electronic device, comprising: one or more processors; and a storage means for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1 to 6.
  12. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 6.
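As an illustration of the scoring and dynamic-update steps recited in claims 1 and 5, the following Python sketch may help. It makes several assumptions that the source does not confirm: the three terms are combined as a plain weighted sum, the change magnitude is a Euclidean norm, and the default weights, the decay coefficient `lam`, and the names `association_strength`, `screen_groups`, `prune_inactive` and `ChangeBuffer` are all hypothetical. The time-based trigger of step S5 is omitted for brevity.

```python
import math


def association_strength(f_sum, m_norm, dt_avg, w_f=0.4, w_m=0.4, w_t=0.2, lam=0.1):
    """Score one group; the weighted-sum form and all defaults are assumptions."""
    if abs(w_f + w_m + w_t - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return (
        w_f * math.log(f_sum + 1)        # log of total transaction frequency
        + w_m * math.tanh(m_norm)        # tanh of normalized total amount
        + w_t * math.exp(-lam * dt_avg)  # exponential decay of average time gap
    )


def screen_groups(groups, s_threshold):
    """Keep the groups whose score is at least s_threshold (step S4)."""
    scored = {name: association_strength(*stats) for name, stats in groups.items()}
    return {name: s for name, s in scored.items() if s >= s_threshold}


def prune_inactive(active, rescored, s_inactive):
    """Drop groups whose recalculated strength falls below s_inactive (step S5)."""
    return {
        name: rescored[name]
        for name in active
        if rescored.get(name, 0.0) >= s_inactive
    }


class ChangeBuffer:
    """Collect significantly changed pairs until local recalculation is due."""

    def __init__(self, delta, max_size):
        self.delta = delta        # change-magnitude threshold (hypothetical name)
        self.max_size = max_size  # buffer size that triggers recalculation
        self.pairs = set()

    def observe(self, pair, v_new, v_old):
        """Buffer the pair if ||v_new - v_old|| > delta; return True when full."""
        if math.dist(v_new, v_old) > self.delta:
            self.pairs.add(pair)
        return len(self.pairs) >= self.max_size
```

When `observe` returns True, a caller would run local re-clustering and local rule mining on the buffered pairs and then clear `pairs`. That orchestration is left out because the source gives no detail on it.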

Description

Co-transaction behavior recognition method and system based on dynamic association strength

Technical Field

The invention belongs to the technical field of big data analysis and data mining, relates in particular to common transaction behavior recognition technology in the field of financial technology, and specifically relates to a common transaction behavior recognition method, system, electronic device and computer-readable storage medium based on dynamic association strength.

Background

With the rapid development of electronic payment and digital finance, the transaction flow data generated every day by financial institutions such as banks and payment institutions is growing exponentially. This data contains rich transaction behavior features and is an important basis for identifying financial risks and preventing financial fraud. In many financial risk control scenarios, identifying "common transaction behavior", that is, frequent transactions of a specific pattern that multiple transaction entities conduct with one or more common transaction counterparties, is key to discovering group-based financial risk.

At present, the methods used in the industry to identify common transaction behavior have clear technical defects. First, identification methods based on a rule engine depend on static preset rules and can only recognize simple transaction patterns. When faced with complex fraudulent transactions that deliberately evade detection, their false-negative and false-positive rates are high, and the rules are costly to maintain. Second, traditional frequent item set mining methods focus only on whether transactions co-occur and ignore key dimensions such as transaction frequency, amount and time, so the mining results are of little practical use. They also suffer from serious computational performance bottlenecks on massive data.

The common problems of the prior art can be summarized as single-dimensional analysis, low performance and lack of timeliness. The field therefore needs a common transaction behavior recognition scheme that integrates multi-dimensional transaction features, processes massive data efficiently and supports dynamic updating, so as to solve the technical pain points of common transaction recognition in financial risk control.

Disclosure of Invention

In view of the defects of the prior art, the invention provides a method and a system for identifying common transaction behavior based on dynamic association strength. Through multi-dimensional feature fusion, improved clustering and association mining algorithms, and a dynamic updating mechanism, the invention aims to achieve efficient, accurate and real-time identification of common transaction behavior in massive transaction flow data in the financial field, and to improve the intelligence and timeliness of financial risk control.

In a first aspect, the present invention provides a method for identifying common transaction behavior based on dynamic association strength, which is applied to identifying common transaction behavior of entity groups sharing common transaction counterparties in massive transaction flow data in the financial field, and comprises the following steps:

S1, receiving historical transaction flow data, extracting multi-dimensional features for each pair of transaction entities (i, j) based on a preset sliding time window, wherein the multi-dimensional features comprise a frequency feature F, an amount feature M and a time feature T, performing Z-Score standardization on the multi-dimensional features, and outputting a feature vector set {$V_{i,j}$}, wherein $V_{i,j}$ is the feature vector of the transaction entity pair (i, j);

S2, improved DBSCAN cluster analysis: taking the output feature vector set as input, performing cluster analysis with an improved DBSCAN clustering algorithm, and outputting a cluster set {$C_1, C_2, \ldots, C_k$};

S3, intra-cluster FP-Growth association rule mining: treating each output cluster $C_i$ as an independent transaction database, setting a minimum support threshold min_sup, mining the frequent item sets in each cluster with the FP-Growth algorithm, and outputting the collection of frequent item sets;

S4, comprehensive association strength scoring: constructing a comprehensive association strength scoring function for each output frequent item set to calculate its association strength $S_{group}$, setting an association strength threshold $S_{threshold}$, and screening out the frequent item sets with $S_{group} \geq S_{threshold}$ as significant common transaction groups; and

S5, dynamic update mechanism: monitoring new transaction data $D_{new}$ in real time, and incrementally updating only the feature vectors $V_{i,j}$ of the transaction entity pairs affected by $D_{new}$ when the new transactio