
US-12620036-B2 - Automated system and method for analyzing and auditing financial data


Abstract

A financial report analysis system having an intelligent automation unit for selecting from stored financial data the financial data to be processed and automatically scheduling for processing the selected financial data, a conversion unit for converting the selected financial data stored in a storage unit into a selected processing format to form converted financial data, a financial data processing unit for automatically processing the converted financial data to form processed financial data and for generating marked financial data from the processed financial data that includes one or more markings, and a verification unit for verifying an accuracy of the processed financial data by comparing the processed financial data to other financial data, and then highlighting or marking portions of the processed data if the data is inaccurate.
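
The verification step described in the abstract (compare processed values to other financial data, then mark what looks inaccurate) can be illustrated with a minimal sketch. The function name `verify`, the dictionary layout, and the tolerance value are assumptions made for illustration; the patent does not specify them.

```python
from decimal import Decimal


def verify(processed, reference, tolerance=Decimal("0.01")):
    """Return {item: reason} for items that look inaccurate.

    `processed` and `reference` map a line-item name to a Decimal amount.
    Both the layout and the default tolerance are hypothetical.
    """
    marks = {}
    for item, value in processed.items():
        expected = reference.get(item)
        if expected is None:
            # Nothing to compare against, so flag the item for review.
            marks[item] = "no reference value"
        elif abs(value - expected) > tolerance:
            marks[item] = f"differs from reference by {value - expected}"
    return marks


processed = {
    "revenue": Decimal("1000.00"),
    "cogs": Decimal("400.00"),
    "tax": Decimal("55.10"),
}
reference = {"revenue": Decimal("1000.00"), "cogs": Decimal("410.00")}
marks = verify(processed, reference)
```

Here `marks` flags `cogs` (off by 10.00) and `tax` (no reference value), while `revenue` passes unmarked. `Decimal` is used rather than binary floats so that monetary comparisons are exact.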

Inventors

  • Partho Bandopadhyay
  • Cameron Gwinn
  • Mayank Dhanwantri
  • Gayathri KRISHNAN
  • Chatla Prashanth
  • Sahil Kalra

Assignees

  • KPMG LLP

Dates

Publication Date
2026-05-05
Application Date
2023-05-03

Claims (9)

  1. A financial report analysis system, comprising
     a hardware processor;
     a database hardware component comprising a storage device coupled to the hardware processor; and
     a non-transitory computer-readable storage medium storing instructions, which, when executed by the hardware processor, are configured to implement:
     an aggregation unit for aggregating financial data comprising numerical values, text labels, date stamps, and metadata from multiple source documents and for storing the financial data in the database hardware component to form stored financial data,
     an intelligent automation unit for selecting from the stored financial data the financial data to be processed and automatically scheduling for processing the selected financial data,
     a storage unit comprising allocated memory space within the non-transitory computer-readable storage medium for storing the selected financial data,
     a conversion unit for converting the selected financial data stored in the storage unit into a selected processing format comprising a machine-readable structured data format to form converted financial data,
     a financial data processing unit for automatically processing the converted financial data to form processed financial data and for generating marked financial data, the marked financial data comprising the processed financial data with embedded marking data, from the processed financial data that includes one or more markings, the financial data processing unit comprises:
     an extraction unit for automatically parsing the converted financial data by identifying and extracting individual data elements including account numbers, monetary amounts, transaction descriptions, and date fields to form parsed financial data,
     a mapping unit for mapping the parsed financial data with other parsed financial data by establishing relationships between corresponding data elements across different financial documents or reporting periods to form mapped financial data,
     a machine learning unit, implemented by the hardware processor and configured to:
     access training data comprising historical financial data sets that include verified accurate financial data labeled as compliant and verified inaccurate financial data labeled with specific error types, wherein the training data is retrieved from the database hardware component,
     train a plurality of distinct machine learning models using training data specific to an auditing operation type of each machine learning model, wherein training comprises iteratively adjusting model parameters to minimize prediction error between model outputs and labeled training data outcomes,
     receive the mapped financial data and auditing operation type data indicating a selected auditing operation to be performed,
     automatically select a machine learning model from the plurality of distinct machine learning models based on the auditing operation type data, wherein:
     when the auditing operation type data indicates a tie-out operation, select and apply a cosine similarity model from the plurality of distinct machine learning models that computes vector similarity scores between current year financial data elements and prior year financial data elements to identify matching entries across financial reporting periods,
     when the auditing operation type data indicates a recalculation operation, select and apply a natural language processing model from the plurality of distinct machine learning models comprising tokenization and named entity recognition algorithms to extract and validate numerical relationships within financial statement text,
     when the auditing operation type data indicates an internal consistency operation, select and apply a string distance measurement model from the plurality of distinct machine learning models that calculates edit distance scores between financial data entries to identify inconsistent reporting of identical financial items, and
     generate model data comprising an output of the selected machine learning model from the plurality of distinct machine learning models including confidence scores and identified data relationships specific to the selected auditing operation;
     a filtering unit for filtering the model data by applying thereto one or more business rules to form filtered financial data, and
     a marking unit for marking one or more portions of the filtered financial data by generating and associating the marking data with specific data elements that fail to satisfy the business rules or exceed the predefined thresholds,
     the hardware processor configured to execute the financial data processing unit operations in parallel processing threads and the non-transitory computer-readable storage medium configured to store intermediate processing results including parsed financial data, mapped financial data, and model data in structured database format on the database hardware component for retrieval by subsequent processing units, and
     a verification unit for verifying an accuracy of the processed financial data by comparing the processed financial data to other financial data comprising reference data from prior financial periods, external data sources, or predefined accounting standards, and then marking portions of the processed data if the data is inaccurate wherein the verification unit marks portions of the filtered financial data if the data is inaccurate by generating marking data that identify the inaccurate data, wherein the verification unit is configured for determining an accuracy of the filtered financial data based on a type of auditing procedure being performed.
  2. The system of claim 1, wherein the selected processing format comprising the machine readable structured data format is selected from JSON, XML, or tabular database format with defined field types, wherein numerical financial values are stored as floating-point or decimal data types, text strings are stored as alphanumeric character sequences with defined encoding, and temporal data is stored in standardized date-time format to form the converted financial data.
  3. The system of claim 1, wherein the embedded marking data includes one or more markings comprising computer-readable annotations embedded within or linked to specific data elements, wherein each marking comprises a data structure including: (i) a marking type identifier indicating whether the marking identifies an inconsistency, discrepancy, or item requiring verification, (ii) a location identifier specifying the marked data element, (iii) a confidence score indicating a likelihood of error or inconsistency, and (iv) descriptive text explaining the reason for the marking.
  4. The system of claim 1, wherein the specific error types of the verified inaccurate data include calculation errors, inconsistencies, and misstatements.
  5. The system of claim 1, wherein the machine learning unit is further configured to receive feedback data comprising user verification input indicating whether markings generated by the machine learning unit correctly identified actual errors or inaccuracies in the filtered financial data, and automatically re-train the selected machine learning model by incorporating the feedback data as additional training examples, wherein re-training comprises updating the model parameters based on discrepancies between the model predictions and the user verification input to improve future prediction accuracy.
  6. The system of claim 1, wherein the business rules applied by the filtering unit include predefined thresholds, tolerance ranges, and logical conditions specific to financial auditing standards.
  7. The system of claim 6, wherein the marking unit is configured to generate and associate the marking data structures with specific data elements that fail to satisfy the business rules or exceed the predefined thresholds.
  8. A computer-implemented method for marking financial data, comprising
     a hardware processor;
     a database hardware component comprising a storage device coupled to the hardware processor; and
     a non-transitory computer-readable storage medium storing instructions, which, when executed by the hardware processor, are configured for:
     aggregating financial data comprising numerical values, text labels, date stamps, and metadata from multiple source documents and storing the financial data in the database hardware component to form stored financial data,
     selecting from the stored financial data the financial data to be processed and automatically scheduling for processing the selected financial data,
     storing the selected financial data in allocated memory space within the non-transitory computer-readable storage medium,
     converting the selected financial data into a selected processing format comprising a machine-readable structured data format to form converted financial data,
     automatically processing with a financial data processing unit the converted financial data to form processed financial data and for generating marked financial data, the marked financial data comprising the processed financial data with embedded marking data, from the processed financial data that includes one or more markings, the financial data processing unit comprises one or more processors programmed for:
     automatically parsing the converted financial data by identifying and extracting individual data elements including account numbers, monetary amounts, transaction descriptions, and date fields to form parsed financial data,
     mapping the parsed financial data with other parsed financial data by establishing relationships between corresponding data elements across different financial documents or reporting periods to form mapped financial data,
     access training data comprising historical financial data sets that include verified accurate financial data labeled as compliant and verified inaccurate financial data labeled with specific error types, wherein the training data is retrieved from the database hardware component,
     train a plurality of distinct machine learning models using training data specific to an auditing operation type of each machine learning model, wherein training comprises iteratively adjusting model parameters to minimize prediction error between model outputs and labeled training data outcomes,
     receive the mapped financial data and auditing operation type data indicating a selected auditing operation to be performed,
     automatically select a machine learning model from the plurality of distinct machine learning models based on the auditing operation type data, wherein:
     when the auditing operation type data indicates a tie-out operation, select and apply a cosine similarity model from the plurality of distinct machine learning models that computes vector similarity scores between current year financial data elements and prior year financial data elements to identify matching entries across financial reporting periods,
     when the auditing operation type data indicates a recalculation operation, select and apply a natural language processing model from the plurality of distinct machine learning models comprising tokenization and named entity recognition algorithms to extract and validate numerical relationships within financial statement text,
     when the auditing operation type data indicates an internal consistency operation, select and apply a string distance measurement model from the plurality of distinct machine learning models that calculates edit distance scores between financial data entries to identify inconsistent reporting of identical financial items, and
     generate model data comprising an output of the selected machine learning model from the plurality of distinct machine learning models including confidence scores and identified data relationships specific to the selected auditing operation;
     filtering the model data by applying thereto one or more business rules to form filtered financial data, and
     marking one or more portions of the filtered financial data by generating and associating the marking data with specific data elements that fail to satisfy the business rules or exceed the predefined thresholds,
     the hardware processor configured to execute the financial data processing unit operations in parallel processing threads and the non-transitory computer-readable storage medium configured to store intermediate processing results including parsed financial data, mapped financial data, and model data in structured database format on the database hardware component for retrieval by subsequent processing units, and
     verifying an accuracy of the processed financial data by comparing the processed financial data to other financial data comprising reference data from prior financial periods, external data sources, or predefined accounting standards, and then marking portions of the processed data if the data is inaccurate, and highlighting portions of the filtered financial data if the data is inaccurate by generating marking data that identify the inaccurate data, wherein the verification unit is configured for determining an accuracy of the filtered financial data based on a type of auditing procedure being performed.
  9. A non-transitory, computer readable medium comprising computer program instructions tangibly stored on the computer readable medium, wherein the computer program instructions are executable by at least one computer processor to perform a method, the method comprising:
     aggregating financial data comprising numerical values, text labels, date stamps, and metadata from multiple source documents and storing the financial data to form stored financial data,
     selecting from the stored financial data the financial data to be processed and automatically scheduling for processing the selected financial data,
     storing the selected financial data,
     converting the selected financial data into a selected processing format comprising a machine-readable structured data format to form converted financial data,
     automatically processing with a financial data processing unit the converted financial data to form processed financial data and for generating marked financial data, the marked financial data comprising the processed financial data with embedded marking data, from the processed financial data that includes one or more markings, the financial data processing unit comprises one or more processors programmed for:
     automatically parsing the converted financial data by identifying and extracting individual data elements including account numbers, monetary amounts, transaction descriptions, and date fields to form parsed financial data,
     mapping the parsed financial data with other parsed financial data by establishing relationships between corresponding data elements across different financial documents or reporting periods to form mapped financial data,
     access training data comprising historical financial data sets that include verified accurate financial data labeled as compliant and verified inaccurate financial data labeled with specific error types, wherein the training data is retrieved from the database hardware component,
     train a plurality of distinct machine learning models using training data specific to an auditing operation type of each machine learning model, wherein training comprises iteratively adjusting model parameters to minimize prediction error between model outputs and labeled training data outcomes,
     receive the mapped financial data and auditing operation type data indicating a selected auditing operation to be performed,
     automatically select a machine learning model from the plurality of distinct machine learning models based on the auditing operation type data, wherein:
     when the auditing operation type data indicates a tie-out operation, select and apply a cosine similarity model from the plurality of distinct machine learning models that computes vector similarity scores between current year financial data elements and prior year financial data elements to identify matching entries across financial reporting periods,
     when the auditing operation type data indicates a recalculation operation, select and apply a natural language processing model from the plurality of distinct machine learning models comprising tokenization and named entity recognition algorithms to extract and validate numerical relationships within financial statement text,
     when the auditing operation type data indicates an internal consistency operation, select and apply a string distance measurement model from the plurality of distinct machine learning models that calculates edit distance scores between financial data entries to identify inconsistent reporting of identical financial items, and
     generate model data comprising an output of the selected machine learning model from the plurality of distinct machine learning models including confidence scores and identified data relationships specific to the selected auditing operation;
     filtering the model data by applying thereto one or more business rules to form filtered financial data wherein the filtering includes calculating confidence scores for machine learning model outputs using similarity score metrics, comparing calculated confidence scores against predetermined threshold values specific to each auditing operation type, automatically filtering out data having confidence scores below the predetermined threshold values to generate the filtered financial data, and
     marking one or more portions of the filtered financial data by generating and associating the marking data with specific data elements that fail to satisfy the business rules or exceed the predefined thresholds,
     verifying an accuracy of the processed financial data by comparing the processed financial data to other financial data comprising reference data from prior financial periods, external data sources, or predefined accounting standards, and then marking portions of the processed data if the data is inaccurate, and highlighting portions of the filtered financial data if the data is inaccurate by generating marking data that identify the inaccurate data, wherein the verification unit is configured for determining an accuracy of the filtered financial data based on a type of auditing procedure being performed.
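
The independent claims select a model according to the auditing operation type: cosine similarity for tie-outs and an edit distance measure for internal consistency. The sketch below illustrates those two measures and the dispatch step only. It assumes plain-text line-item labels and bag-of-words count vectors, which the claims do not specify. All function names, the `MODELS` table, and the `max_distance` default are hypothetical, and the natural language processing model for recalculation is omitted.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between bag-of-words count vectors of two labels."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def edit_distance(a, b):
    """Levenshtein distance using the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def tie_out(current, prior):
    """Pair each current-year label with its most similar prior-year label."""
    results = []
    for cur in current:
        best = max(prior, key=lambda p: cosine_similarity(cur, p))
        results.append((cur, best, cosine_similarity(cur, best)))
    return results


def internal_consistency(entries, max_distance=2):
    """Return pairs of entries that are nearly, but not exactly, identical."""
    pairs = []
    for i, a in enumerate(entries):
        for b in entries[i + 1:]:
            d = edit_distance(a, b)
            if 0 < d <= max_distance:
                pairs.append((a, b, d))
    return pairs


# Hypothetical dispatch table keyed by auditing operation type.
MODELS = {"tie_out": tie_out, "internal_consistency": internal_consistency}


def select_model(operation_type):
    """Look up the model function for the requested auditing operation."""
    if operation_type not in MODELS:
        raise ValueError(f"unsupported auditing operation: {operation_type}")
    return MODELS[operation_type]


matches = select_model("tie_out")(
    ["Total net revenue"], ["Net revenue total", "Total assets"]
)
suspects = select_model("internal_consistency")(
    ["Accounts receivable", "Accounts recievable", "Inventory"]
)
```

In this example `matches` pairs "Total net revenue" with "Net revenue total", since word order does not affect a bag-of-words vector, and `suspects` contains the misspelled receivable pair at distance 2. A production system would presumably use richer vectors and learned thresholds.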

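Claim 3 lists four fields for each marking, and claim 9 filters model outputs by comparing confidence scores against thresholds specific to each auditing operation type. A minimal sketch of both follows. The class and field names, the `ModelOutput` record, and the threshold values are assumptions chosen for illustration and do not come from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class MarkingType(Enum):
    INCONSISTENCY = "inconsistency"
    DISCREPANCY = "discrepancy"
    NEEDS_VERIFICATION = "needs_verification"


@dataclass(frozen=True)
class Marking:
    marking_type: MarkingType  # (i) marking type identifier
    location: str              # (ii) identifier of the marked data element
    confidence: float          # (iii) likelihood of error or inconsistency
    description: str           # (iv) reason for the marking


@dataclass(frozen=True)
class ModelOutput:
    location: str
    operation_type: str
    confidence: float


# Hypothetical thresholds, one per auditing operation type.
THRESHOLDS = {"tie_out": 0.80, "internal_consistency": 0.60}


def filter_by_confidence(outputs):
    """Drop outputs whose confidence is below their operation's threshold."""
    return [o for o in outputs if o.confidence >= THRESHOLDS[o.operation_type]]


def mark(output, marking_type, description):
    """Attach a marking to the data element a model output refers to."""
    return Marking(marking_type, output.location, output.confidence, description)


outputs = [
    ModelOutput("balance_sheet/row12", "tie_out", 0.9),
    ModelOutput("balance_sheet/row13", "tie_out", 0.75),
    ModelOutput("notes/para4", "internal_consistency", 0.5),
]
kept = filter_by_confidence(outputs)
marking = mark(kept[0], MarkingType.DISCREPANCY, "prior-year amount differs")
```

Only the first output survives filtering here, because the other two fall below their respective thresholds. Keeping the thresholds in a table keyed by operation type mirrors the per-operation comparison that claim 9 describes.
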
Description

BACKGROUND OF THE INVENTION

The present invention is related to systems and methods for processing financial related data, and more specifically is related to automated systems and methods for processing and analyzing financial information.

In order to comply with current tax and finance related laws, enterprises, such as large companies, typically need to file selected financial related documents with the government at selected times during the year. Many companies engage external tax and audit finance experts (e.g., tax and audit professionals) to handle the processing of financial related data from the company and to prepare and file the various necessary filings with local, state and federal governments and agencies.

With regard to large companies, the internal collation, handling, and processing of financial related data can be a monumental task. As such, companies expend significant resources tracking and collating the financial data. Further, large companies oftentimes have many disparate systems disposed at different locations, all of which generate financial related information. Conventional systems exist that allow the companies to collate and store the financial related information, including for example enterprise resource planning (ERP) systems. The companies typically deploy the ERP systems at many different locations to aggregate and store financial related information that is eventually needed by the external finance experts. The large company needs to work closely with the finance experts to allow them access to all of the various ERP systems and the financial data stored therein. Conventionally, the financial data is downloaded or transferred from each of the ERP systems to the external finance expert. The financial related information is oftentimes shared with the finance experts to perform an audit and to prepare necessary tax filings and financial reports.
Tax and financial report preparation, for example, is a necessary but time-consuming and laborious process. It is estimated that individuals and companies spend around 6.1 billion hours per year complying with the filing requirements of various tax and government authorities. Conventional systems and methods exist that allow entities, such as companies, to aggregate financial data, and then process and utilize the financial data as part of various state and federal tax returns and filings and to prepare required financial reports. Conventional systems and methods oftentimes rely on the experience of the finance experts to understand the financial related data and to properly and accurately organize, audit, analyze, reconcile, and report the data. The experts also need to understand the importance of selected types of financial data and how the data affects certain portions of the related financial reports. More specifically, the audit professional oftentimes needs to review voluminous amounts of financial related data, and then needs to determine that the information is accurate within the same document or report, as well as between different documents and reports.

Conventional financial data processing systems and methods, however, have significant drawbacks. For example, the process associated with the preparation of financial reports, including tax returns, is oftentimes highly dependent upon the skill and experience of the financial expert. Thus, the quality of the financial reports tends to vary between financial experts. Also, conventional systems do not necessarily identify areas of the reports that need to be reviewed for further verification by the financial expert. Still further, the conventional systems do not sufficiently flag aberrant financial data.

SUMMARY OF THE INVENTION

The present invention is directed to a system and method for receiving and processing financial data and then marking and verifying the processed financial data.
Specifically, the system of the present invention can be configured to convert the financial data into a suitable format, extract relevant portions of the financial data, and then map the data to selected other types of financial data. The system can then apply one or more types of machine learning models to the mapped data in order to perform selected financial operations on the data. The model data can then be marked in a selected manner so as to indicate that the data was processed. Further, the system can be configured to mark and highlight data as part of a verification process so as to indicate that the financial data needs to be further reviewed and processed by the system or a financial expert.

The financial report analysis system of the present invention is configured to ease the burden and simplify the process of reviewing and auditing financial reports, such as financial statements. The audit and review process can include, for example, performing selected audit procedures such as prior year tie outs, footings, cross-footings, internal consisten