
US-20260127672-A1 - System and Method for Using AI/ML to Categorize Financial Transaction Data for Hot Storage, Cold Storage, or Deletion Based on Data Significance


Abstract

The present invention relates to a system and method for the automated categorization, management, and storage of financial transaction data using artificial intelligence (AI) and machine learning (ML) algorithms. The system processes large volumes of financial data, such as tax records, transaction histories, and receipts, by assigning a significance score to each data item based on its relevance, legal obligations, and usage frequency. Data is categorized into hot storage for critical, frequently accessed data, cold storage for infrequently accessed but legally significant data, or flagged for deletion when deemed redundant or obsolete. The system includes a user interface allowing customization of significance thresholds and storage preferences, ensuring compliance with retention policies and auditing standards. By automating the categorization process, the invention optimizes storage resources, reduces costs, enhances data accessibility, and provides robust security measures, including encryption. This invention continuously improves through user feedback and retraining of the AI/ML models.

Inventors

  • Alexander Davis

Assignees

  • Alexander Davis

Dates

Publication Date: 2026-05-07
Application Date: 2024-11-03

Claims (10)

  1. A system for categorizing financial transaction data for storage, comprising:
     a. A computer memory device storing financial transaction data from a plurality of sources, comprising one or more keyboard input/output devices and one or more banking APIs;
     b. An AI/ML categorization engine comprising an application-specific integrated circuit (ASIC) for an artificial neural network connected to the computer memory device, the ASIC comprising: a plurality of neurons organized in an array, wherein each neuron comprises a register, a processing element and at least one input, and a plurality of synaptic circuits, each synaptic circuit including a memory for storing a synaptic weight, wherein each neuron is connected to at least one other neuron via one of the plurality of synaptic circuits, configured to analyze said financial transaction data using machine learning algorithms trained on historical datasets, wherein the AI/ML categorization engine identifies transaction attributes, assigns significance scores, and classifies said data into categories;
     c. A significance scoring module operatively coupled to the AI/ML categorization engine, configured to assign significance scores to said financial transaction data based on pre-defined parameters including, but not limited to, transaction amounts, transaction types, legal obligations, tax relevance, and audit risk;
     d. A storage decision module configured to allocate said financial transaction data into one of the following storage tiers based on said significance score:
        i. Hot storage for financial transaction data classified as high-priority, frequently accessed, or subject to immediate regulatory or audit requirements;
        ii. Cold storage for financial transaction data classified as low-priority, infrequently accessed, or retained for legal or historical record-keeping purposes;
        iii. Deletion for financial transaction data classified as redundant, obsolete, or insignificant for legal, financial, or business purposes;
     e. A user interface configured to allow users to modify storage preferences, significance thresholds, and categorization parameters;
     f. A compliance check module, configured to ensure said system adheres to regulatory retention policies and auditing standards applicable to financial data.
  2. The system of claim 1, wherein the AI/ML categorization engine utilizes natural language processing (NLP) to identify keywords, patterns, and attributes within said financial transaction data.
  3. The system of claim 1, further comprising a training model configured to update said AI/ML categorization engine based on user feedback, thereby improving the accuracy and relevance of future data categorization.
  4. The system of claim 1, wherein the data ingestion module is further configured to normalize and validate said financial transaction data prior to categorization, ensuring consistency in data format across various input sources.
  5. A method for categorizing and storing financial transaction data, comprising the steps of:
     a. Receiving financial transaction data through a data ingestion module from a plurality of sources, including manual input, financial software, and banking APIs;
     b. Analyzing said financial transaction data through an AI/ML categorization engine, wherein said AI/ML categorization engine is trained to identify transaction attributes and assign significance scores based on parameters such as transaction amount, transaction type, tax relevance, and audit risk;
     c. Assigning a significance score to each transaction record within said financial transaction data;
     d. Allocating said financial transaction data into one of the following storage tiers based on said significance score:
        i. Hot storage for high-priority, frequently accessed, or critical data;
        ii. Cold storage for low-priority, infrequently accessed, or archival data;
        iii. Deletion for redundant, obsolete, or insignificant data;
     e. Allowing user customization of significance thresholds and storage preferences via a user interface;
     f. Performing automated compliance checks to ensure the retention of financial transaction data adheres to applicable legal and regulatory standards.
  6. The method of claim 5, further comprising the step of encrypting said financial transaction data both at rest and in transit to ensure data security.
  7. The method of claim 5, wherein the significance score is continuously updated based on new transaction data and evolving legal or audit requirements.
  8. The method of claim 5, wherein redundant or obsolete financial transaction data is automatically flagged for deletion, thereby optimizing storage resources.
  9. The method of claim 5, further comprising the step of receiving user feedback on data categorization accuracy, wherein said user feedback is incorporated into retraining the AI/ML categorization engine.
  10. A computer-readable medium containing instructions that, when executed by a processor, cause a system to:
     a. Receive financial transaction data from a plurality of sources;
     b. Analyze said financial transaction data using AI/ML algorithms to categorize said data into predetermined storage categories;
     c. Assign significance scores based on parameters including, but not limited to, transaction size, transaction type, tax relevance, and audit risk;
     d. Allocate said financial transaction data to hot storage, cold storage, or deletion based on said significance score;
     e. Perform automated compliance checks to ensure adherence to data retention policies;
     f. Allow users to customize categorization preferences and significance thresholds via a user interface.

Description

BACKGROUND OF THE INVENTION

Field of Invention

This invention relates to the field of data management and storage, specifically to the automated classification, categorization, and handling of large volumes of financial transaction data using artificial intelligence (AI) and machine learning (ML). It focuses on the efficient allocation of such data into appropriate storage systems, such as hot storage for frequently accessed information, cold storage for rarely accessed but still significant data, and deletion for unnecessary or redundant records. This system and method are particularly applicable in industries handling large volumes of sensitive financial information, such as accounting, banking, auditing, and tax management.

BRIEF SUMMARY OF THE INVENTION

The present invention provides a system and method that leverages artificial intelligence (AI) and machine learning (ML) for the automated classification and storage of large volumes of data, particularly financial transaction data. The system analyzes the data and categorizes it based on its significance and usage frequency, with the capability to allocate the data into appropriate storage tiers. “Hot storage” is used for frequently accessed or critical data, “cold storage” for data that is rarely needed but still significant for record-keeping, and non-essential data is flagged for deletion to optimize storage resources. This invention enhances data management efficiency, minimizes costs, and improves accessibility for professionals in fields such as finance, accounting, tax management, and auditing.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1: Step-by-step view of the invention's process.

DETAILED DESCRIPTION

The detailed description of this invention outlines the technical framework and operational process for an AI/ML-driven system that categorizes and manages data based on its significance, relevance, and future utility, with a specific emphasis on financial transaction data such as tax records.
System Architecture

The system architecture comprises multiple integrated components designed to analyze, categorize, and store data. It includes:

Data Ingestion Module: This module handles the intake of raw financial data, such as tax records, transaction histories, invoices, and receipts. Data can be imported from various sources, including manual input, financial software, and banking APIs. The system validates and normalizes its input, ensuring consistency in data formats before further processing.

AI/ML Categorization Engine: The core of the system, this engine is powered by machine learning algorithms trained to classify data based on predetermined categories of significance. The engine utilizes natural language processing (NLP) and pattern recognition to identify key attributes in financial data, such as transaction amounts, types of transactions (income, expense, capital gain), tax codes, and audit significance.

Training Model: The machine learning models are continuously trained using historical financial data and patterns. The model improves its accuracy over time by learning from new datasets and user feedback. The training includes supervised learning models that classify data based on specific characteristics, with periodic updates and retraining.

Significance Scoring: The categorization engine assigns a significance score to each data item. The significance score is based on parameters such as transaction size, legal obligations, tax relevance, and audit risk. For example, transactions that impact tax filings or large transactions with potential audit risks will receive a higher significance score.

Storage Decision Module: Based on the significance score, the system determines the appropriate storage solution for each data item.

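
For illustration only, the scoring and tier decision described above can be sketched in Python. The specification does not disclose a scoring formula, so the hand-written weighted sum here merely stands in for the trained AI/ML engine. Every name, weight, and threshold in it (`Transaction`, `WEIGHTS`, `LARGE_AMOUNT`, `HOT_THRESHOLD`, `COLD_THRESHOLD`) is a hypothetical assumption, not a disclosed value. Access frequency, which the description also treats as a factor, is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Hypothetical record; field names are assumptions for this sketch."""
    amount: float              # transaction size in account currency
    affects_tax_filing: bool   # tax relevance
    legal_obligation: bool     # subject to a legal retention obligation
    audit_risk: float          # 0.0 (none) to 1.0 (high)


# Assumed weights and thresholds; a deployed system would learn or
# configure these rather than hard-code them.
WEIGHTS = {"size": 0.30, "legal": 0.25, "tax": 0.25, "audit": 0.20}
LARGE_AMOUNT = 10_000.0   # amount at which the size term saturates
HOT_THRESHOLD = 0.6
COLD_THRESHOLD = 0.2


def significance_score(tx: Transaction) -> float:
    """Combine the four named parameters into a score in [0, 1]."""
    size = min(abs(tx.amount) / LARGE_AMOUNT, 1.0)
    audit = min(max(tx.audit_risk, 0.0), 1.0)
    return (
        WEIGHTS["size"] * size
        + WEIGHTS["legal"] * float(tx.legal_obligation)
        + WEIGHTS["tax"] * float(tx.affects_tax_filing)
        + WEIGHTS["audit"] * audit
    )


def storage_tier(score: float) -> str:
    """Map a significance score to hot storage, cold storage, or a deletion flag."""
    if score >= HOT_THRESHOLD:
        return "hot"
    if score >= COLD_THRESHOLD:
        return "cold"
    return "flag_for_deletion"


if __name__ == "__main__":
    samples = [
        Transaction(25_000.0, True, False, 0.8),   # large, tax-relevant, risky
        Transaction(500.0, False, True, 0.0),      # small but legally retained
        Transaction(12.5, False, False, 0.0),      # trivial receipt
    ]
    for tx in samples:
        score = significance_score(tx)
        print(f"{tx.amount:>10.2f}  score={score:.3f}  tier={storage_tier(score)}")
```

In this sketch the low tier only flags a record for deletion. It does not delete anything, which leaves room for the compliance checks and user-defined thresholds recited in the claims to override the decision.
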
The categorization engine passes the data to the storage decision module, which is responsible for managing the following storage options:

Hot Storage: Critical data that is frequently accessed or required for immediate processing is stored in hot storage. This type of storage is optimized for speed and availability, using high-performance infrastructure. Examples of data stored here include ongoing tax records, recurring transactions, and flagged transactions with audit risks.

Cold Storage: Data that is infrequently accessed but still necessary for legal or historical purposes is stored in cold storage. This storage type is more cost-effective and slower in access speed but ideal for archival purposes. Data stored in cold storage includes historical tax records, infrequently used transaction records, or data needed for long-term audits.

Deletion Module: Data that is deemed insignificant or obsolete is flagged for deletion. The system identifies redundant, outdated, or unnecessary data that no longer holds relevance for legal, tax, or business purposes, reducing storage costs and improving overall system efficiency.

User Interface & Customization: Users interact with the system through a user-frie