US-20260127905-A1 - ARTIFICIAL INTELLIGENCE BASED (AI-BASED) SYSTEMS AND METHODS TO MANAGE ELECTRONIC DOCUMENTS FOR AUTOMATED UNDERWRITING AND PRICING
Abstract
An AI-based system and method for managing electronic documents comprising closing packages, is disclosed. The AI-based method includes receiving the electronic documents comprising closing packages from electronic devices associated with first users; automatically categorizing the electronic documents comprising closing packages, by applying tags on electronic documents, using AI model; splitting each electronic document comprising the closing packages, based on tags, using an AI-based document splitting model; extracting information from types of electronic documents, using the AI model; determining eligibility of loans and base rate settings, upon validation of each electronic document, using AI-based guideline validation model; predicting risk assessment on loans based on market data and internal loan performance metrics, using AI-based risk model; and dynamically adjusting loan pricing and terms in response to market conditions based on combination of eligibility of loans, base rate settings, and risk assessment on the loans, using AI-powered pricing and terms engine.
Inventors
- John Beacham
- Sachin Venugopal
- Chakkrapani Grandhi
- Abhishek Jain
- Daniel Robin K
- Mounika Pinnamaneni
Assignees
- Toorak Capital Partners
Dates
- Publication Date: 20260507
- Application Date: 20241105
Claims (20)
- 1. An artificial intelligence based (AI-based) method for managing one or more electronic documents comprising closing packages, the AI-based method comprising: receiving, by one or more hardware processors, the one or more electronic documents comprising the closing packages from one or more electronic devices associated with one or more first users, wherein the closing packages comprise a set of the one or more electronic documents associated with one or more financial transactions, wherein the one or more electronic documents are corresponding to a form of a portable document format (PDF); automatically categorizing, by the one or more hardware processors, the one or more electronic documents comprising the closing packages associated with one or more financial transactions, by applying one or more tags on the one or more electronic documents, using an artificial intelligence (AI) model; splitting, by the one or more hardware processors, each electronic document of the one or more electronic documents comprising the closing packages, based on the one or more tags applied on the one or more electronic documents, using an AI-based document splitting model; extracting, by the one or more hardware processors, one or more information from one or more types of the one or more electronic documents, using the AI model; validating, by the one or more hardware processors, each electronic document of the one or more electronic documents to determine whether each electronic document of the one or more electronic documents are required to a process of one or more loans; determining, by the one or more hardware processors, at least one of: eligibility of the one or more loans and base rate settings, upon validation of each electronic document of the one or more electronic documents, using an AI-based guideline validation model; predicting, by the one or more hardware processors, risk assessment on the one or more loans for one or more second users based on at least one of: one or more market data and one or more internal loan performance metrics, using an AI-based risk model; and dynamically adjusting, by the one or more hardware processors, loan pricing and terms in response to market conditions based on a combination of at least one of: the eligibility of the one or more loans, the base rate settings, and the risk assessment on the one or more loans for one or more second users, using an AI-powered pricing and terms engine.
- 2. The AI-based method of claim 1, further comprising training the AI model to automatically categorize the one or more electronic documents comprising the closing packages, wherein training the AI model comprises: obtaining, by the one or more hardware processors, one or more first training datasets associated with the one or more text formats of the one or more electronic documents corresponding to one or more predefined tags; generating, by the one or more hardware processors, one or more feature vectors by processing the one or more first training datasets using at least one of: optical character recognition (OCR) engine and natural language processing (NLP) model; correlating, by the one or more hardware processors, the generated one or more feature vectors with one or more respective tags being assigned for the one or more electronic documents; training, by the one or more hardware processors, the AI model based on the correlation between the generated one or more feature vectors and the one or more respective tags, wherein the AI model comprises a Stochastic Gradient Descent (SGD) classification model; and determining, by the one or more hardware processors, one or more tags being applied on the one or more electronic documents to automatically categorize the one or more electronic documents, based on the trained AI model.
- 3. The AI-based method of claim 1, wherein splitting each electronic document of the one or more electronic documents comprising the closing packages, using the AI-based document splitting model, comprises: converting, by the one or more hardware processors, the one or more electronic documents from the portable document format (PDF) to one or more text formats using the OCR engine; processing, by the one or more hardware processors, the converted one or more text formats of the one or more electronic documents to extract a list of page numbers for each electronic document of the one or more electronic documents, using a Spacy named entity recognition (NER) model; converting, by the one or more hardware processors, the one or more electronic documents from the portable document format (PDF) to one or more image formats; predicting, by the one or more hardware processors, a type of one or more pages of the one or more electronic documents based on the converted one or more image formats of the one or more electronic documents, using a convolutional neural network (CNN) model, wherein the type of one or more pages of the one or more electronic documents comprise at least one of: start page, middle page, end page, filler page, and single page, of the one or more electronic documents; determining, by the one or more hardware processors, a boundary of each electronic document of the one or more electronic documents by combining the extracted list of page numbers for each electronic document of the one or more electronic documents and predicted type of the one or more pages of the one or more electronic documents, using the AI-based document splitting model; tagging, by the one or more hardware processors, each electronic documents of the one or more electronic documents based on the one or more types of the one or more electronic documents, using a document tag classifier, wherein the document tag classifier comprises a Stochastic Gradient Descent (SGD) classifier; and splitting, by the one or more hardware processors, each electronic document of the one or more electronic documents based on the one or more tags applied on the one or more electronic documents.
- 4. The AI-based method of claim 1, further comprising training, by the one or more hardware processors, the AI model to extract the one or more information from the one or more types of the one or more electronic documents, wherein training the AI model comprises: obtaining, by the one or more hardware processors, a set of electronic documents comprising the one or more types of the one or more electronic documents, wherein the one or more types of the one or more electronic documents comprise at least one of: one or more notes, housing and urban development document, social security number (SSN), and driving licenses; annotating, by the one or more hardware processors, the one or more types of the one or more electronic documents to indicate one or more key details comprising at least one of: one or more names of the one or more second users, one or more SSN values, and one or more loan values; extracting, by the one or more hardware processors, one or more features from the annotated one or more electronic documents using the NLP model, wherein the NLP model comprises a SpaCy library model; and training, by the one or more hardware processors, the AI model with the extracted one or more features and the annotated one or more electronic documents, to analyze the one or more information from each type of the one or more electronic documents.
- 5. The AI-based method of claim 1, further comprising sending, by the one or more hardware processors, one or more notifications to the one or more electronic devices associated with the one or more first users when the one or more electronic documents are at least one of: missing and incomplete, wherein the one or more notifications comprise a request of submission of the one or more electronic documents being missed during the process of the one or more loans.
- 6. The AI-based method of claim 1, wherein determining at least one of: eligibility of the one or more loans and the base rate settings using the AI-based guideline validation model, comprises at least one of: determining, by the one or more hardware processors, at least one of: eligibility of the one or more loans and the base rate settings, based on one or more factors comprising at least one of: evaluations of credit score, loan-to-value ratio defining property value corresponding to a loan amount, eligibility of the one or more second users and one or more properties; determining, by the one or more hardware processors, whether the one or more loans meet one or more minimum standards indicating an acceptation of the one or more first users on the one or more loans; determining, by the one or more hardware processors, base pricings for the one or more loans based on one or more first fields comprising at least one of: information associated with a loan amount, experience of the one or more second users, one or more credit scores, and demography; and determining, by the one or more hardware processors, optimized pricings for the one or more loans based on one or more second fields obtained from the one or more second users, wherein the one or more second fields comprise one or more properties belonging to the one or more second users.
- 7. The AI-based method of claim 1, further comprising training, by the one or more hardware processors, the AI-based risk model to predict the risk assessment on the one or more loans for the one or more second users, wherein training the AI-based risk model comprises: obtaining, by the one or more hardware processors, one or more second training datasets comprising one or more data from one or more data sources, wherein the one or more data comprise at least one of: the one or more market data, one or more user geographical and financial data, one or more loan performance data, and one or more social media data; and training, by the one or more hardware processors, the AI-based risk model based on the one or more second training datasets using a grid search approach, wherein the AI-based risk model comprises an extreme gradient boosting (XGBoost) model, wherein training the AI-based risk model comprises: defining, by the one or more hardware processors, a range of values for each hyperparameter to be tuned in the AI-based risk model, wherein defining the range of values for each hyperparameter comprises assigning maximum, minimum, and step size for each hyperparameter of one or more hyperparameters, wherein the one or more hyperparameters comprise at least one of: learning rate, tree depth, and number of trees; generating, by the one or more hardware processors, a grid search space by combining the range of values from the one or more hyperparameters; generating, by the one or more hardware processors, an optimized grid of configurations for the AI-based risk model based on the combination of the range of values from the one or more hyperparameters; and training, by the one or more hardware processors, the XGBoost model on the one or more second trained datasets using k-fold cross-validation to determine robustness and prevent overfitting.
- 8. The AI-based method of claim 7, further comprising evaluating, by the one or more hardware processors, performance of the trained AI-based risk model using one or more metrics comprising root mean squared error (RMSE), wherein the RMSE indicates close matching of the prediction of the AI-based risk model with one or more actual values.
- 9. The AI-based method of claim 8, further comprising adjusting, by the one or more hardware processors, the one or more hyperparameters to fine-tune the AI-based risk model with minimum RMSE for dynamically predicting the risk assessment with optimized accuracy.
- 10. An artificial intelligence based (AI-based) system for managing one or more electronic documents comprising closing packages, the AI-based system comprising: one or more hardware processors; a memory coupled to the one or more hardware processors, wherein the memory comprises a plurality of subsystems in form of programmable instructions executable by the one or more hardware processors, and wherein the plurality of subsystems comprises: a document receiving subsystem configured to receive the one or more electronic documents comprising the closing packages from one or more electronic devices associated with one or more first users, wherein the closing packages comprise a set of the one or more electronic documents associated with one or more financial transactions, wherein the one or more electronic documents are corresponding to a form of a portable document format (PDF); a document categorizing subsystem configured to automatically categorize the one or more electronic documents comprising the closing packages associated with one or more financial transactions, by applying one or more tags on the one or more electronic documents, using an artificial intelligence (AI) model; a document splitting subsystem configured to split each electronic document of the one or more electronic documents comprising the closing packages, based on the one or more tags applied on the one or more electronic documents, using an AI-based document splitting model; an information extraction subsystem configured to extract one or more information from one or more types of the one or more electronic documents, using the AI model; a document validation subsystem configured to validate each electronic document of the one or more electronic documents to determine whether each electronic document of the one or more electronic documents are required to a process of one or more loans; a loan eligibility determining subsystem configured to determine at least one of: eligibility of the one or more loans and base rate settings, upon validation of each electronic document of the one or more electronic documents, using an AI-based guideline validation model; a risk assessment prediction subsystem configured to predict risk assessment on the one or more loans for one or more second users based on at least one of: one or more market data and one or more internal loan performance metrics, using an AI-based risk model; and a loan price adjusting subsystem configured to dynamically adjust loan pricing and terms in response to market conditions based on a combination of at least one of: the eligibility of the one or more loans, the base rate settings, and the risk assessment on the one or more loans for one or more second users, using an AI-powered pricing and terms engine.
- 11. The AI-based system of claim 10, further comprising a training system configured to train the AI model for automatically categorizing the one or more electronic documents comprising the closing packages, wherein in training the AI model, the training subsystem is configured to: obtain one or more first training datasets associated with one or more text formats of the one or more electronic documents corresponding to one or more predefined tags; generate one or more feature vectors by processing the one or more first training datasets using at least one of: optical character recognition (OCR) engine and natural language processing (NLP) model; correlate the generated one or more feature vectors with one or more respective tags being assigned for the one or more electronic documents; train the AI model based on the correlation between the generated one or more feature vectors and the one or more respective tags, wherein the AI model comprises a Stochastic Gradient Descent (SGD) classification model; and determine one or more tags being applied on the one or more electronic documents to automatically categorize the one or more electronic documents, based on the trained AI model.
- 12. The AI-based system of claim 10, wherein in splitting each electronic document of the one or more electronic documents comprising the closing packages, using the AI-based document splitting model, the document splitting subsystem is further configured to: convert the one or more electronic documents from the portable document format (PDF) to the one or more text formats using the OCR engine; process the converted one or more text formats of the one or more electronic documents to extract a list of page numbers for each electronic document of the one or more electronic documents, using a Spacy named entity recognition (NER) model; convert the one or more electronic documents from the portable document format (PDF) to one or more image formats; predict a type of one or more pages of the one or more electronic documents based on the converted one or more image formats of the one or more electronic documents, using a convolutional neural network (CNN) model, wherein the type of one or more pages of the one or more electronic documents comprise at least one of: start page, middle page, end page, filler page, and single page, of the one or more electronic documents; determine a boundary of each electronic document of the one or more electronic documents by combining the extracted list of page numbers for each electronic document of the one or more electronic documents and predicted type of the one or more pages of the one or more electronic documents, using the AI-based document splitting model; and tag each electronic documents of the one or more electronic documents based on the one or more types of the one or more electronic documents, using a document tag classifier, wherein the document tag classifier comprises a Stochastic Gradient Descent (SGD) classifier; and split each electronic document of the one or more electronic documents based on the one or more tags applied on the one or more electronic documents.
- 13. The AI-based system of claim 10, wherein the training subsystem is configured to train the AI model for extracting the one or more information from the one or more types of the one or more electronic documents, wherein in training the AI model, the training subsystem is configured to: obtain a set of electronic documents comprising the one or more types of the one or more electronic documents, wherein the one or more types of the one or more electronic documents comprise at least one of: one or more notes, housing and urban development document, social security number (SSN), and driving licenses; annotate the one or more types of the one or more electronic documents to indicate one or more key details comprising at least one of: one or more names of the one or more second users, one or more SSN values, and one or more loan values; extract one or more features from the annotated one or more electronic documents using the NLP model, wherein the NLP model comprises a SpaCy library model; and train the AI model with the extracted one or more features and the annotated one or more electronic documents, to analyze the one or more information from each type of the one or more electronic documents.
- 14. The AI-based system of claim 10, wherein the document validation subsystem is further configured to send one or more notifications to the one or more electronic devices associated with the one or more first users when the one or more electronic documents are at least one of: missing and incomplete, wherein the one or more notifications comprise a request of submission of the one or more electronic documents being missed during the process of the one or more loans.
- 15. The AI-based system of claim 10, wherein in determining at least one of: eligibility of the one or more loans and the base rate settings using the AI-based guideline validation model, the loan eligibility determining subsystem is configured to: determine at least one of: eligibility of the one or more loans and the base rate settings, based on one or more factors comprising at least one of: evaluations of credit score, loan-to-value ratio defining property value corresponding to a loan amount, eligibility of the one or more second users and one or more properties; determine whether the one or more loans meet one or more minimum standards indicating an acceptation of the one or more first users on the one or more loans; determine base pricings for the one or more loans based on one or more first fields comprising at least one of: information associated with a loan amount, experience of the one or more second users, one or more credit scores, and demography; and determine optimized pricings for the one or more loans based on one or more second fields obtained from the one or more second users, wherein the one or more second fields comprise one or more properties belonging to the one or more second users.
- 16. The AI-based system of claim 10, wherein the training subsystem is configured to train the AI-based risk model for predicting the risk assessment on the one or more loans for the one or more second users, wherein in training the AI-based risk model, the training subsystem is configured to: obtain one or more second training datasets comprising one or more data from one or more data sources, wherein the one or more data comprise at least one of: the one or more market data, one or more user geographical and financial data, one or more loan performance data, and one or more social media data; and train the AI-based risk model based on the one or more second training datasets using a grid search approach, wherein the AI-based risk model comprises an extreme gradient boosting (XGBoost) model, wherein training the AI-based risk model comprises: defining a range of values for each hyperparameter to be tuned in the AI-based risk model, wherein defining the range of values for each hyperparameter comprises assigning maximum, minimum, and step size for each hyperparameter of one or more hyperparameters, wherein the one or more hyperparameters comprise at least one of: learning rate, tree depth, and number of trees; generating a grid search space by combining the range of values from the one or more hyperparameters; generating an optimized grid of configurations for the AI-based risk model based on the combination of the range of values from the one or more hyperparameters; and training the XGBoost model on the one or more second trained datasets using k-fold cross-validation to determine robustness and prevent overfitting.
- 17. The AI-based system of claim 16, wherein the training subsystem is further configured to evaluate performance of the trained AI-based risk model using one or more metrics comprising root mean squared error (RMSE), wherein the RMSE indicates close matching of the prediction of the AI-based risk model with one or more actual values.
- 18. The AI-based system of claim 17, wherein the training subsystem is further configured to adjust the one or more hyperparameters to fine-tune the AI-based risk model with minimum RMSE for dynamically predicting the risk assessment with optimized accuracy.
- 19. A non-transitory computer-readable storage medium having instructions stored therein that when executed by a hardware processor, cause the processor to execute operations of: receiving the one or more electronic documents comprising closing packages from one or more electronic devices associated with one or more first users, wherein the closing packages comprise a set of the one or more electronic documents associated with one or more financial transactions, wherein the one or more electronic documents are corresponding to a form of a portable document format (PDF); automatically categorizing the one or more electronic documents comprising the closing packages associated with one or more financial transactions, by applying one or more tags on the one or more electronic documents, using an artificial intelligence (AI) model; splitting each electronic document of the one or more electronic documents comprising the closing packages, based on the one or more tags applied on the one or more electronic documents, using the AI-based document splitting model; extracting one or more information from one or more types of the one or more electronic documents, using the AI model; validating each electronic document of the one or more electronic documents to determine whether each electronic document of the one or more electronic documents are required to a process of one or more loans; determining at least one of: eligibility of the one or more loans and base rate settings, upon validation of each electronic document of the one or more electronic documents, using an AI-based guideline validation model; predicting risk assessment on the one or more loans for one or more second users based on at least one of: one or more market data and one or more internal loan performance metrics, using an AI-based risk model; and dynamically adjusting loan pricing and terms in response to market conditions based on a combination of at least one of: the eligibility of the one or more loans, the base rate settings, and the risk assessment on the one or more loans for one or more second users, using an AI-powered pricing and terms engine.
- 20. The non-transitory computer-readable storage medium of claim 19, wherein splitting each electronic document of the one or more electronic documents comprising the closing packages, using the AI-based document splitting model, comprises: converting the one or more electronic documents from the portable document format (PDF) to one or more text formats using the OCR engine; processing the converted one or more text formats of the one or more electronic documents to extract a list of page numbers for each electronic document of the one or more electronic documents, using a Spacy named entity recognition (NER) model; converting the one or more electronic documents from the portable document format (PDF) to one or more image formats; predicting a type of one or more pages of the one or more electronic documents based on the converted one or more image formats of the one or more electronic documents, using a convolutional neural network (CNN) model, wherein the type of one or more pages of the one or more electronic documents comprise at least one of: start page, middle page, end page, filler page, and single page, of the one or more electronic documents; determining a boundary of each electronic document of the one or more electronic documents by combining the extracted list of page numbers for each electronic document of the one or more electronic documents and predicted type of the one or more pages of the one or more electronic documents, using AI-based document splitting model; and tagging each electronic documents of the one or more electronic documents based on the one or more types of the one or more electronic documents, using a document tag classifier, wherein the document tag classifier comprises a Stochastic Gradient Descent (SGD) classifier; and splitting each electronic document of the one or more electronic documents based on the one or more tags applied on the one or more electronic documents.
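Editorial illustration (not part of the claims): claims 7 to 9 and 16 to 18 recite defining a minimum, maximum, and step size per hyperparameter, combining the ranges into a grid search space, scoring each configuration with k-fold cross-validation, and selecting the configuration with minimum RMSE. The following is a minimal Python sketch of that tuning loop under stated assumptions. The helper names (`value_range`, `build_grid`, `k_fold_indices`, `grid_search`) and the sample data are hypothetical, and a small gradient-descent linear model stands in for the XGBoost model so that the sketch needs only the standard library; its `n_steps` parameter is only a loose analogue of the number of trees.

```python
import itertools
import math


def value_range(minimum, maximum, step):
    """Values from minimum to maximum (inclusive) in increments of step."""
    count = int(round((maximum - minimum) / step)) + 1
    return [round(minimum + i * step, 10) for i in range(count)]


def build_grid(ranges):
    """Grid search space: Cartesian product of the per-hyperparameter values."""
    names = list(ranges)
    return [dict(zip(names, combo))
            for combo in itertools.product(*(ranges[n] for n in names))]


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        valid = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, valid
        start = stop


def fit_linear(xs, ys, learning_rate, n_steps):
    """Stand-in model (NOT XGBoost): fit y = w*x + b by gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(n_steps):
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) * 2 / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) * 2 / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def cross_validated_rmse(xs, ys, params, k):
    """Mean validation RMSE of one configuration across k folds."""
    scores = []
    for train, valid in k_fold_indices(len(xs), k):
        w, b = fit_linear([xs[i] for i in train], [ys[i] for i in train], **params)
        predictions = [w * xs[i] + b for i in valid]
        scores.append(rmse([ys[i] for i in valid], predictions))
    return sum(scores) / len(scores)


def grid_search(xs, ys, ranges, k):
    """Return (best_rmse, best_params) over every configuration in the grid."""
    scored = [(cross_validated_rmse(xs, ys, params, k), params)
              for params in build_grid(ranges)]
    return min(scored, key=lambda item: item[0])


# Hypothetical ranges, each built from (minimum, maximum, step size).
ranges = {
    "learning_rate": value_range(0.1, 0.3, 0.1),
    "n_steps": value_range(50, 150, 50),
}
xs = [i / 8 for i in range(9)]          # made-up feature values
ys = [2.0 * x + 1.0 for x in xs]        # made-up target values
best_rmse, best_params = grid_search(xs, ys, ranges, k=3)
print(best_params, round(best_rmse, 4))
```

In a fuller implementation the stand-in `fit_linear` would presumably be replaced by a gradient-boosted tree regressor, with tree depth added as a third range; the grid construction, fold handling, and RMSE selection would stay the same.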
Description
FIELD OF INVENTION
Embodiments of the present disclosure relate to artificial intelligence driven (AI-based) systems, and more particularly relate to an AI-based method and system to manage one or more electronic documents including closing packages for providing dynamic risk assessment and loan pricing based on real-time data.
BACKGROUND
The current process for managing mortgage loans is hindered by substantial challenges in document management and risk assessment. Classifying electronic documents and handling closing packages, which often exceed a thousand pages, remain arduous and time-consuming tasks. These closing packages include a variety of documents, such as notes, appraisals, social security numbers, driver's licenses, loan agreements, and housing and urban development (HUD) documents. Manually sorting these documents may require considerable human effort and time. Document processing in loan closings has always been a bottleneck for users, and it may take more than a week to complete because of the amount of paperwork involved. Additionally, assessing the risk of a loan necessitates accurate and efficient management of diverse data types and sources. The existing methods rely heavily on manual processes and lack the ability to dynamically integrate the various data sources that influence loan decisions, leading to inefficiencies and potential inaccuracies in determining loan pricing and terms. Hence, there is a need for an improved artificial intelligence based (AI-based) system and method for managing one or more electronic documents including closing packages for providing dynamic risk assessment and loan pricing based on real-time data, in order to address the aforementioned issues.
SUMMARY
This summary is provided to introduce, in a simple manner, a selection of concepts that are further described in the detailed description of the disclosure.
This summary is intended neither to identify key or essential inventive concepts of the subject matter nor to determine the scope of the disclosure. In accordance with an embodiment of the present disclosure, an artificial intelligence based (AI-based) method for managing one or more electronic documents comprising closing packages is disclosed. The AI-based method comprises receiving, by one or more hardware processors, the one or more electronic documents including the closing packages from one or more electronic devices associated with one or more first users. The closing packages comprise a set of the one or more electronic documents associated with one or more financial transactions. The one or more electronic documents are in a portable document format (PDF). The AI-based method further comprises automatically categorizing, by the one or more hardware processors, the one or more electronic documents including the closing packages associated with one or more financial transactions, by applying one or more tags on the one or more electronic documents, using an artificial intelligence (AI) model. The AI-based method further comprises splitting, by the one or more hardware processors, each electronic document of the one or more electronic documents including the closing packages, based on the one or more tags applied on the one or more electronic documents, using an AI-based document splitting model. The AI-based method further comprises extracting, by the one or more hardware processors, one or more items of information from one or more types of the one or more electronic documents, using the AI model. The AI-based method further comprises validating, by the one or more hardware processors, each electronic document of the one or more electronic documents to determine whether each electronic document is required for processing one or more loans.
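Editorial illustration (not part of the disclosure): the categorizing step applies tags using an AI model, which claim 2 states comprises a Stochastic Gradient Descent (SGD) classification model trained on feature vectors derived from document text. The following is a minimal Python sketch of that idea under stated assumptions. The bag-of-words `featurize` function stands in for the OCR and NLP feature step, the classifier is a one-vs-rest linear model trained by SGD on the hinge loss, and the sample texts, tag names, and function names are all hypothetical.

```python
import random
from collections import Counter


def featurize(text, vocabulary):
    """Bag-of-words count vector over a fixed vocabulary (unseen words are ignored)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def train_sgd_tagger(texts, tags, epochs=30, learning_rate=0.5, seed=0):
    """Train one linear scorer per tag with SGD on the hinge loss (one-vs-rest)."""
    vocabulary = sorted({word for text in texts for word in text.lower().split()})
    labels = sorted(set(tags))
    weights = {label: [0.0] * len(vocabulary) for label in labels}
    biases = {label: 0.0 for label in labels}
    samples = [(featurize(text, vocabulary), tag) for text, tag in zip(texts, tags)]
    rng = random.Random(seed)
    for _ in range(epochs):
        rng.shuffle(samples)  # stochastic: visit samples in random order
        for features, tag in samples:
            for label in labels:
                target = 1.0 if label == tag else -1.0
                score = sum(w * x for w, x in zip(weights[label], features))
                score += biases[label]
                if target * score < 1.0:  # hinge loss active: take a gradient step
                    weights[label] = [w + learning_rate * target * x
                                      for w, x in zip(weights[label], features)]
                    biases[label] += learning_rate * target
    return vocabulary, labels, weights, biases


def predict_tag(text, model):
    """Return the tag whose linear scorer gives the highest score."""
    vocabulary, labels, weights, biases = model
    features = featurize(text, vocabulary)

    def score(label):
        return sum(w * x for w, x in zip(weights[label], features)) + biases[label]

    return max(labels, key=score)


# Hypothetical OCR text snippets and their predefined tags.
SAMPLE_TEXTS = [
    "promissory note borrower promises to pay principal",
    "note principal interest borrower lender",
    "settlement statement hud closing costs",
    "hud settlement charges closing statement",
    "driver license date of birth",
    "license driver class expiration",
]
SAMPLE_TAGS = ["note", "note", "hud", "hud", "driver_license", "driver_license"]

MODEL = train_sgd_tagger(SAMPLE_TEXTS, SAMPLE_TAGS)
print(predict_tag("hud settlement statement", MODEL))
```

The sketch omits regularization and any real text preprocessing; a production system would more likely use a library classifier and richer features than raw word counts.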
The AI-based method further comprises determining, by the one or more hardware processors, at least one of: eligibility of the one or more loans and base rate settings, upon validation of each electronic document of the one or more electronic documents, using an AI-based guideline validation model. The AI-based method further comprises predicting, by the one or more hardware processors, risk assessment on the one or more loans for one or more second users based on at least one of: one or more market data and one or more internal loan performance metrics, using an AI-based risk model. The AI-based method further comprises dynamically adjusting, by the one or more hardware processors, loan pricing and terms in response to market conditions based on a combination of at least one of: the eligibility of the one or more loans, the base rate settings, and the risk assessment on the one or more loans for one or more second users, using an AI-powered pricing and terms engine. In an embodiment, the AI-based method further comprises training the AI model to automatically categorize the one or mo