CN-122020479-A - Financial abnormality accurate positioning method and system based on abnormality propagation and multidimensional numerical fingerprint
Abstract
The invention discloses a method and system for precisely locating financial anomalies based on anomaly propagation and multidimensional numerical fingerprints. The method adopts a two-stage strategy: it first identifies anomalous digit categories and then propagates the result to specific records. A binomial test determines whether each digit category deviates significantly from its theoretical distribution (Benford's law or the uniform distribution); a composite anomaly score is computed by combining rarity weighting and rounding-sensitivity weighting; and the score is propagated to each financial record. Compared with the traditional Leave-One-Out method, the invention reduces computational complexity while providing a complete attribution chain from record to digit category, thereby meeting audit traceability requirements.
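The two-stage strategy summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the formulas for the deviation degree, the rarity weight, and the significance factor do not survive in this text, so the forms used below are assumptions, as are all function names. Only the one-tailed binomial test, the Benford and uniform expectations, the normalization step, and the doubled weight for last digits 0 and 5 come from the source.

```python
import math
from collections import Counter

# Benford's law: expected probability of leading digit d (1..9).
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
# Uniform expectation for the last digit (0..9).
UNIFORM = {d: 0.1 for d in range(10)}


def first_digit(amount):
    """First non-zero digit of |amount|, read at two-decimal precision."""
    for ch in f"{abs(amount):.2f}":
        if ch in "123456789":
            return int(ch)
    raise ValueError("amount rounds to zero")


def binom_tail(k, n, p):
    """One-tailed p-value P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    log_p, log_q = math.log(p), math.log(1 - p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(1.0, total)


def rarity_weight(d):
    """ASSUMED form: leading digits that are rarer under Benford weigh more."""
    return -math.log10(BENFORD[d])


def rounding_weight(d):
    """Last digits 0 and 5 receive twice the weight, as stated in the claims."""
    return 2.0 if d in (0, 5) else 1.0


def propagate(digits, expected, weight):
    """Stage 1: score each digit category. Stage 2: copy the score to each record."""
    n = len(digits)  # the record list must be non-empty
    counts = Counter(digits)
    raw = {}
    for d, p_exp in expected.items():
        k = counts.get(d, 0)
        p_value = binom_tail(k, n, p_exp)               # over-occurrence only
        deviation = max(0.0, (k / n - p_exp) / p_exp)   # ASSUMED deviation formula
        significance = 0.5 + 0.5 * (1.0 - p_value)      # ASSUMED range [0.5, 1]
        raw[d] = deviation * weight(d) * significance
    top = max(raw.values())
    norm = {d: (s / top if top > 0 else 0.0) for d, s in raw.items()}
    return [norm[d] for d in digits]                    # one score per record


# Hypothetical ledger with an excess of amounts starting with 9.
amounts = [100, 120, 150, 210, 230, 310, 420, 560, 640, 710,
           9100, 9200, 9300, 9400, 9500, 9600, 9700, 9800]
first_scores = propagate([first_digit(a) for a in amounts], BENFORD, rarity_weight)
for amount, score in zip(amounts, first_scores):
    print(f"{amount:>6} {score:.2f}")
```

The same `propagate` function can be reused for last digits by passing `UNIFORM` and `rounding_weight`. In this sketch the work is one counting pass over the records plus one tail sum per digit category, so the cost grows linearly with the number of records rather than requiring a re-test for every record left out.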
Inventors
- CHEN KE
- XU MENGRU
- XIANG JUAN
Assignees
- 湖南涉外经济学院
Dates
- Publication Date
- 20260512
- Application Date
- 20260203
Claims (10)
- 1. A method for precisely locating financial anomalies based on anomaly propagation and multidimensional numerical fingerprints, characterized by comprising the following steps: S1, data preprocessing: cleaning the financial data set and extracting the first digit and the last digit of each record; S2, binomial-test significance analysis: performing a binomial test on each digit category to calculate its significance score; S3, first-digit anomaly propagation: (a) calculating the degree of deviation of each leading digit from its Benford expectation; (b) calculating the rarity weight; (c) calculating the composite anomaly score; (d) after normalization, propagating the score to the corresponding records; S4, last-digit (mantissa) anomaly propagation: (a) calculating the degree of deviation of each last digit from the uniform distribution; (b) applying the rounding sensitivity weight to last digits 0 and 5; (c) calculating the composite anomaly score and propagating it to the corresponding records; S5, risk score fusion: calculating a final risk score by combining the first-digit score, the last-digit score, the time-series score, the monetary weight, and the context factor; S6, outputting the ranked list of anomalous vouchers together with attribution labels.
- 2. The method for accurately locating financial anomalies based on anomaly propagation and multidimensional numerical fingerprinting as set forth in claim 1, wherein the binomial test in step S2 is a one-tailed test that considers only the over-occurrence of digits.
- 3. The method for accurately locating financial anomalies based on anomaly propagation and multi-dimensional numerical fingerprinting of claim 1, wherein the rarity weight in step S3 (b) is such that abnormal over-occurrence of rare digits (e.g., 8, 9) is weighted more heavily than that of high-frequency digits (e.g., 1, 2).
- 4. The method for accurately locating a financial anomaly based on anomaly propagation and multi-dimensional numerical fingerprinting of claim 1, wherein the significance factor in step S3 (c) is confined to a range that ensures a base score is retained even when the statistical test is not significant.
- 5. The method for accurately locating financial anomalies based on anomaly propagation and multidimensional numerical fingerprints as set forth in claim 1, wherein the rounding sensitivity weight in step S4 (b) targets the rounding behavior common in financial data by giving last digits 0 and 5 twice the weight.
- 6. The method for accurately locating a financial anomaly based on anomaly propagation and multi-dimensional numerical fingerprinting of claim 1, wherein the context factor in step S5 assigns a weight greater than 1 to transactions at irregular times, such as weekend transactions, late-night transactions, and end-of-month or end-of-quarter transactions.
- 7. The method for accurately locating financial anomalies based on anomaly propagation and multidimensional numerical fingerprinting of claim 1, wherein when a record triggers a plurality of anomalies simultaneously, its risk score is increased according to the number of anomalies triggered.
- 8. The method for accurately locating financial anomalies based on anomaly propagation and multidimensional numerical fingerprints as set forth in claim 1, wherein the computational complexity of the method supports real-time processing of large-scale financial datasets.
- 9. A system for precisely locating financial anomalies based on anomaly propagation and multidimensional numerical fingerprints, characterized by comprising: a data preprocessing module for cleaning the financial data and extracting the first-digit, last-digit, and timestamp features; a binomial test module for performing a binomial test on each digit category and outputting statistical significance scores; an anomaly propagation engine, comprising a deviation calculation unit, a weight calculation unit, a significance fusion unit, and a score propagation unit, for propagating anomaly scores from the digit-category level to specific records; a time-series analysis module for performing time-series decomposition and residual anomaly detection; a risk scoring module for fusing the multidimensional constraint scores, the monetary weight, and the context factor to calculate a final risk score; and a visualization output module for generating the anomalous voucher list, attribution labels, and analysis reports.
- 10. The financial anomaly accurate positioning system based on anomaly propagation and multi-dimensional numerical fingerprinting of claim 9, wherein the weight calculation unit in the anomaly propagation engine comprises: a rarity weight calculation subunit that calculates the rarity weight from the Benford expected probability; and a rounding sensitivity weight calculation subunit that gives twice the weight to last digits 0 and 5.
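The risk score fusion of step S5, together with the context factor of claim 6 and the multi-anomaly increase of claim 7, might be sketched as follows. The fusion formula itself is missing from this text, so every numeric constant, the equal weighting of the three scores, the logarithmic monetary weight, and the names `Scores`, `context_factor`, `amount_weight`, and `risk_score` are assumptions chosen only for illustration. The claims require no more than a context weight greater than 1 for irregular times and a higher score when several anomalies are triggered.

```python
import calendar
import math
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Scores:
    first_digit: float   # normalized first-digit score in [0, 1]
    last_digit: float    # normalized last-digit score in [0, 1]
    time_series: float   # normalized time-series residual score in [0, 1]


def context_factor(ts):
    """ASSUMED multipliers; each irregular-time condition pushes the factor above 1."""
    factor = 1.0
    if ts.weekday() >= 5:                                      # Saturday or Sunday
        factor *= 1.2
    if ts.hour >= 22 or ts.hour < 6:                           # late night
        factor *= 1.2
    if ts.day == calendar.monthrange(ts.year, ts.month)[1]:    # last day of month
        factor *= 1.1
    return factor


def amount_weight(amount):
    """ASSUMED monetary weight that grows slowly with the absolute amount."""
    return 1.0 + math.log10(1.0 + abs(amount))


def risk_score(scores, amount, ts, threshold=0.5):
    parts = (scores.first_digit, scores.last_digit, scores.time_series)
    base = sum(parts) / len(parts)                       # ASSUMED equal weighting
    triggered = sum(p >= threshold for p in parts)       # anomalies triggered
    multi_boost = 1.0 + 0.25 * max(0, triggered - 1)     # ASSUMED increase per extra anomaly
    return base * multi_boost * amount_weight(amount) * context_factor(ts)


# Hypothetical vouchers: (id, scores, amount, timestamp).
vouchers = [
    ("V001", Scores(0.9, 0.0, 0.0), 1000.0, datetime(2026, 1, 14, 10, 0)),
    ("V002", Scores(0.9, 0.9, 0.0), 1000.0, datetime(2026, 1, 14, 10, 0)),
    ("V003", Scores(0.9, 0.0, 0.0), 1000.0, datetime(2026, 1, 31, 23, 0)),
]
ranked = sorted(vouchers, key=lambda v: risk_score(v[1], v[2], v[3]), reverse=True)
for vid, s, amount, ts in ranked:
    print(vid, round(risk_score(s, amount, ts), 3))
```

Keeping the context factor and monetary weight as separate multipliers makes each contribution to the final score easy to report as an attribution label, which is the kind of traceability the claims call for.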
Description
Financial abnormality accurate positioning method and system based on abnormality propagation and multidimensional numerical fingerprint
Technical Field
The invention relates to the technical field of financial technology and big-data auditing, and in particular to a computer-implemented method and system that uses an anomaly propagation algorithm combined with multidimensional statistical constraints to precisely locate and attribute anomalous financial data, from the macroscopic distribution level down to the individual voucher level.
Background
In financial auditing and risk control, traditional anomaly detection methods rely mainly on rule engines (for example, raising an alarm when an amount exceeds a threshold) or on single-dimension statistical analysis (for example, applying Benford's law to test a one-dimensional digit distribution). These methods generally have the following drawbacks:
- Coarse detection granularity: existing distribution tests (such as the chi-square test) can only judge whether the data set as a whole is anomalous and cannot pinpoint the specific records that cause the anomalous distribution.
- Low computational efficiency: the traditional Leave-One-Out method can locate single-point anomalies, but its computational complexity makes real-time analysis of large-scale financial data difficult.
- Lack of statistical rigor: most methods rely on empirical thresholds, perform no significance analysis based on statistical tests, and are prone to false alarms.
- Digit characteristics ignored: the inherent characteristics of different digit categories are not considered, for example that anomalies in rare digits are more suspicious and that last digits 0 and 5 are more likely to result from manual rounding.
- Lack of interpretability: only a risk score is output, and the anomaly cannot be traced back to the digit pattern whose deviation produced it.
Therefore, there is a need for a financial anomaly detection method and system that can locate anomalies precisely, from the anomalous digit category down to the specific data record, while preserving statistical rigor and high computational efficiency.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a method and system for precisely locating financial anomalies based on anomaly propagation and multidimensional numerical fingerprints, so as to solve the problems of coarse detection granularity, high computational complexity, and lack of statistical rigor and interpretability in the prior art.
The invention provides a method for precisely locating financial anomalies based on anomaly propagation and multidimensional numerical fingerprints, comprising the following steps:
- S1, multidimensional data slicing and reorganization: slicing and reorganizing the raw financial data along multiple dimensions such as accounting subject, accounting period, and business type; constructing a multidimensional data tensor; and performing standardized cleaning;
- S2, establishing a joint statistical detection model comprising a high-order digit natural-law constraint (Benford's law), a low-order digit uniformity constraint (last-digit entropy detection), and a numerical weight density constraint;
- S3, constructing a time-series consistency constraint model: for the numerical sequence of each accounting subject, establishing a time-series mutation detection model based on time-series decomposition (e.g., STL decomposition) and residual analysis;
- S4, significance analysis and anomaly propagation based on the binomial test: achieving precise localization and risk scoring from the "anomalous digit category" to the "specific financial record";
- S5, generating a risk profile: outputting a high-risk voucher list ranked by final risk score and generating an interpretable attribution label for each anomalous record.
Further, step S1 specifically includes: cleaning the raw financial transaction journal and defining the set to be detected. Multi-scale mapping is then performed: negative numbers are replaced by their absolute values (for distribution analysis); predefined fixed values (e.g., tax rates 0.06 and 0.13) are removed so that they do not distort the distribution; and the first digit and the last digit are extracted.
Further, the multidimensional numerical distribution constraint of step S2 adopts a joint constraint system:
(1) High-order digit natural-law constraint (Benford's Law): the first digit and the first two digits are tested. Benford's law gives the expected distribution $P(d) = \log_{10}(1 + 1/d)$, where $d$ is the leading digit (or the leading two-digit number). The discrimination criterion is the Kullback-Leibler divergence between the actual distribution and the expected distribution; if the divergence exceeds a dynamic threshold, the set is marked as a "high-order anomaly".
(2) Low-order digit uniformity constraint (Tail Uniformity): For the last or angular position of the v