CN-121304358-B - Accounting data management method, system, device and medium based on edge computing

Abstract

The application provides an edge-computing-based accounting data management method, system, device, and medium. The method extracts a structured feature vector from newly added accounting document data; obtains historical accounting document data and, according to a data partition granularity, extracts a set of similar documents from both the high-frequency subject area and the sparse subject area of the historical accounting document data; determines the feature vector distance between the structured feature vector of the newly added accounting document data and each accounting document in the similar document set; generates a risk confidence value for the newly added accounting document data based on the occurrence frequency of the similar accounting documents and the feature vector distances; and, when the risk confidence value exceeds a set threshold, updates the compliance verification result and the structured feature vector corresponding to the newly added accounting document data to an accounting document database. The technical scheme provided by the application can overcome the interference that the naturally long-tailed distribution of accounting subjects causes in similar-document retrieval during accounting data management.

Inventors

TANG MENGFEI
XU XIAOYAN
LU MEILING
LI GANG
OU QUNFANG
ZHOU JIAN
CAO ZIXUAN

Assignees

Guangzhou Civil Aviation College (广州民航职业技术学院)

Dates

Publication Date: 2026-05-08
Application Date: 2025-09-04

Claims (10)

1. An accounting data management method based on edge computing, characterized by comprising the following steps: constructing a lightweight accounting document database at an edge computing node, wherein the lightweight accounting document database is used for storing key-value pairs formed by accounting document data and the corresponding compliance verification results; when newly added accounting document data is received, extracting a structured feature vector of the newly added accounting document data; determining a data partition granularity based on the distribution density of historical accounting document data, and extracting, according to the data partition granularity, a similar document set similar to the newly added accounting document data from a high-frequency subject area and a sparse subject area in the historical accounting document data; determining the feature vector distance between the structured feature vector of the newly added accounting document data and each similar document in the similar document set, and further generating a risk confidence value of the newly added accounting document data based on the occurrence frequency and the feature vector distances of similar accounting documents, wherein the similar accounting documents are accounting documents screened from the similar document set whose feature vector distance is less than or equal to a preset distance threshold and which are similar to the newly added accounting document; and, when the risk confidence value exceeds a set threshold, identifying a compliance verification result corresponding to the newly added accounting document data, and updating the compliance verification result and the structured feature vector corresponding to the newly added accounting document data to the accounting document database at the edge computing node.
2. The method of claim 1, wherein extracting the structured feature vector of the newly added accounting document data when the newly added accounting document data is received comprises: performing format analysis on the newly added accounting document data to extract a standardized data field set; dividing the standardized set of data fields into a qualitative feature subset and a quantitative feature subset based on accounting element attributes; converting the qualitative feature subset and the quantitative feature subset into a qualitative feature vector and a quantitative feature vector, respectively; determining an initial structured feature vector from the qualitative feature vector and the quantitative feature vector; and performing dimension reduction processing on the initial structured feature vector to obtain the structured feature vector of the newly added accounting document data.
3. The method of claim 1, wherein determining the data partition granularity based on the distribution density of the historical accounting document data comprises: performing density space estimation on the structured feature vectors of the historical accounting document data to obtain local density values of all sample points in a density space; extracting density clustering centers of the historical accounting document data based on all the local density values; calculating the distribution entropy of the neighborhood samples of the density clustering centers; and determining the data partition granularity according to the result of comparing the distribution entropy of the neighborhood samples with a preset entropy threshold.
4. The method of claim 1, wherein extracting, according to the data partition granularity, a similar document set similar to the newly added accounting document data from the high-frequency subject area and the sparse subject area in the historical accounting document data comprises: partitioning the high-frequency subject area and the sparse subject area in the historical accounting document data respectively, based on the data partition granularity, to obtain a high-frequency subject partition data set and a sparse subject partition data set; determining a target high-frequency subject partition and a target sparse subject partition according to the high-frequency subject partition data set and the sparse subject partition data set; in the target high-frequency subject partition and the target sparse subject partition, respectively calculating the cosine similarity between the structured feature vector of each item of historical accounting document data and the structured feature vector of the newly added accounting document data; and identifying the similar document set similar to the newly added accounting document data according to all the cosine similarities.
5. The method of claim 1, wherein determining the feature vector distance between the structured feature vector of the newly added accounting document data and each similar document in the similar document set comprises: performing dimension matching and standardization processing on the structured feature vector of the newly added accounting document data and the structured feature vector of each accounting document in the similar document set to obtain vector pairs to be calculated; and solving the distance of each vector pair to be calculated to obtain the feature vector distance between the structured feature vector of the newly added accounting document data and each similar document in the similar document set.
6. The method of claim 1, wherein identifying the compliance verification result corresponding to the newly added accounting document data when the risk confidence value exceeds the set threshold specifically comprises: when the risk confidence value exceeds the set threshold, acquiring a target verification rule set related to the newly added accounting document data from a preset accounting compliance verification rule base; and performing item-by-item matching verification of the core elements of the newly added accounting document data against the rule entries in the target verification rule set to obtain the compliance verification result corresponding to the newly added accounting document data.
7. The method of claim 1, wherein the key-value pairs refer to a data storage form formed by taking a structured feature vector of accounting document data as a key and taking a compliance verification result corresponding to the accounting document data as a value.
8. An edge-computing-based accounting data management system for accounting data management using the method of any one of claims 1-7, the system comprising: a construction module, used for constructing a lightweight accounting document database at an edge computing node and storing key-value pairs formed by accounting document data and the corresponding compliance verification results; a processing module, used for extracting a structured feature vector of newly added accounting document data when the newly added accounting document data is received; the processing module being further used for determining a data partition granularity based on the distribution density of historical accounting document data, and extracting, according to the data partition granularity, a similar document set similar to the newly added accounting document data from a high-frequency subject area and a sparse subject area in the historical accounting document data; the processing module being further used for determining the feature vector distance between the structured feature vector of the newly added accounting document data and each similar document in the similar document set, and further generating a risk confidence value of the newly added accounting document data based on the occurrence frequency of the similar accounting documents and the feature vector distances; and an execution module, used for identifying a compliance verification result corresponding to the newly added accounting document data when the risk confidence value exceeds a set threshold, and updating the compliance verification result and the structured feature vector corresponding to the newly added accounting document data to the accounting document database at the edge computing node.
9. A computer device comprising a memory storing code and a processor configured to obtain the code and perform the edge-computing-based accounting data management method of any one of claims 1-7.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the edge-computing-based accounting data management method of any one of claims 1-7.
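
The retrieval step of claim 4 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the rule for choosing a target partition (the partition whose centroid has the highest cosine similarity to the query), and the `min_similarity` cutoff are all assumptions introduced for illustration, since the claims do not specify them. Partitions are assumed to have already been produced from the data partition granularity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def pick_target_partition(partitions, query):
    """Assumption: the target partition is the one whose centroid is most similar to the query."""
    return max(partitions, key=lambda part: cosine_similarity(centroid(part), query))


def similar_document_set(high_freq_partitions, sparse_partitions, query, min_similarity=0.9):
    """Collect similar historical vectors from one high-frequency and one sparse partition.

    Searching both areas separately keeps rare-subject documents from being
    crowded out by the much larger high-frequency area.
    """
    similar = []
    for partitions in (high_freq_partitions, sparse_partitions):
        candidates = [part for part in partitions if part]  # skip empty partitions
        if not candidates:
            continue
        target = pick_target_partition(candidates, query)
        similar.extend(v for v in target if cosine_similarity(v, query) >= min_similarity)
    return similar
```

In this sketch each area contributes at most one target partition, so the cost of the cosine comparison depends on the partition size rather than on the full historical data set, which is one plausible reason to partition before searching on a resource-limited edge node.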

Description

Accounting data management method, system, device and medium based on edge computing

Technical Field

The present application relates to the field of data management technologies, and in particular to an edge-computing-based accounting data management method, system, device, and medium.

Background

With the advance of digitalization, data has become a core production factor, and its explosive growth has driven the continuous evolution of data management technology. Early file systems and database technologies solved the storage and querying of structured data, but traditional architectures have gradually become inadequate for the processing demands of massive unstructured data (such as videos and logs) and real-time streaming data (such as Internet of Things sensor data). The spread of cloud computing gave rise to distributed storage and computing frameworks that provide elastic scaling and parallel processing, while edge computing pushes data processing toward the terminal, reducing latency and bandwidth consumption.
In existing practice, data management takes full-life-cycle control as its core principle and runs through the whole process of data generation, storage, processing, transmission, application, and destruction. Differentiated strategies are formulated according to data sensitivity and business value through a classification and grading mechanism; a data governance framework is established with a clear division of authority and responsibility; and automated tools are used to standardize workflows, so that data can serve business decisions on the premise of compliance. However, in accounting data management, traditional edge-side accounting risk identification methods struggle to overcome the interference that the naturally long-tailed distribution of accounting subjects causes in similar-document retrieval. Because subjects are used with unbalanced frequency, the accuracy of risk assessment at accounting edge nodes is reduced. This interference therefore remains a problem to be solved.

Disclosure of Invention

The application provides an edge-computing-based accounting data management method, system, device, and medium, which can overcome the interference that the naturally long-tailed distribution of accounting subjects causes in similar-document retrieval during accounting data management.
In a first aspect, the present application provides an edge-computing-based accounting data management method, including the steps of: constructing a lightweight accounting document database at an edge computing node, wherein the lightweight accounting document database is used for storing key-value pairs formed by accounting document data and the corresponding compliance verification results; when newly added accounting document data is received, extracting a structured feature vector of the newly added accounting document data; determining a data partition granularity based on the distribution density of historical accounting document data, and extracting, according to the data partition granularity, a similar document set similar to the newly added accounting document data from a high-frequency subject area and a sparse subject area in the historical accounting document data; determining the feature vector distance between the structured feature vector of the newly added accounting document data and each similar document in the similar document set, and further generating a risk confidence value of the newly added accounting document data based on the occurrence frequency of the similar accounting documents and the feature vector distances; and, when the risk confidence value exceeds a set threshold, identifying a compliance verification result corresponding to the newly added accounting document data, and updating the compliance verification result and the structured feature vector corresponding to the newly added accounting document data to the accounting document database at the edge computing node.
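
The last two steps of the first aspect can be sketched in Python as follows. This is an illustration under stated assumptions, not the formula used by the application: the text only says that the risk confidence value is based on the occurrence frequency of the similar accounting documents and the feature vector distances, so the scoring rule below (share of documents within the distance threshold, multiplied by their mean closeness) is hypothetical. Euclidean distance, the function names, and the `check` callback standing in for the compliance verification are likewise assumptions. The database is modeled as a dictionary keyed by the structured feature vector, following the key-value form described in claim 7.

```python
import math


def risk_confidence(query, similar_set, distance_threshold):
    """Hypothetical score in [0, 1]: frequency share times mean closeness.

    frequency: fraction of the similar document set within the distance threshold.
    closeness: mean of (1 - distance / threshold) over those documents.
    """
    if not similar_set or distance_threshold <= 0:
        return 0.0
    distances = [math.dist(query, v) for v in similar_set]  # Euclidean distance (assumed)
    close = [d for d in distances if d <= distance_threshold]
    if not close:
        return 0.0
    frequency = len(close) / len(distances)
    closeness = sum(1.0 - d / distance_threshold for d in close) / len(close)
    return frequency * closeness


def process_new_document(db, query, similar_set, distance_threshold, risk_threshold, check):
    """Run the compliance check and update the store only when the score exceeds the threshold."""
    confidence = risk_confidence(query, similar_set, distance_threshold)
    if confidence > risk_threshold:
        # Key: structured feature vector (as a hashable tuple); value: verification result.
        db[tuple(query)] = check(query)
    return confidence
```

For example, with a query of `[0.0, 0.0]`, a similar document set of `[[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]`, and a distance threshold of 2.0, two of the three documents fall within the threshold, so this sketch yields a frequency of 2/3, a mean closeness of 0.75, and a score of 0.5.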
In some embodiments, when the newly added accounting document data is received, extracting the structured feature vector of the newly added accounting document data specifically includes: performing format analysis on the newly added accounting document data to extract a standardized data field set; dividing the standardized set of data fields into a qualitative feature subset and a quantitative feature subset based on accounting element attributes; converting the qualitative feature subset and the quantitative feature subset into a qualitative feature vector and a qua