CN-122022981-A - Loan interest rate pricing method based on large language model
Abstract
The invention discloses a loan interest rate pricing method and system based on a large language model. The method comprises: obtaining and preprocessing a borrower's structured financial data and unstructured behavior data; processing the data with a jointly fine-tuned loan risk assessment model to generate a comprehensive credit assessment result; inputting that result into a reinforcement-learning dynamic pricing model, which outputs a differentiated interest rate scheme; and monitoring credit and market interest rate changes in real time, with re-pricing triggered when a threshold is exceeded. The system comprises a data acquisition module, a risk assessment module, a dynamic pricing module, and a monitoring and re-pricing module. The invention improves risk assessment precision through multi-source data fusion, realizes personalized dynamic pricing through reinforcement learning, keeps interest rates matched to risk through real-time monitoring, and ensures data security through privacy-preserving computation and encryption, thereby addressing the insufficient data utilization, personalization, and timeliness of traditional pricing.
Inventors
- ZHANG RAN
Assignees
- China CITIC Bank Corporation Limited (中信银行股份有限公司)
Dates
- Publication Date
- 20260512
- Application Date
- 20251211
Claims (10)
- 1. A loan interest rate pricing method based on a large language model, characterized by comprising the following steps: S1, obtaining structured financial data and unstructured behavior data of a borrower, and performing standardized processing on the data, wherein the standardized processing comprises sensitive data desensitization, outlier and missing value handling, unstructured data augmentation, and data standardization; S2, processing the structured financial data and the unstructured behavior data through a loan risk assessment model to generate a comprehensive credit assessment result, wherein the loan risk assessment model adopts a multi-model collaborative processing mode and comprises feature extraction and multi-model fusion assessment; S3, inputting the comprehensive credit assessment result and the current market benchmark interest rate into a dynamic pricing model, and outputting a differentiated interest rate quotation scheme, wherein the dynamic pricing model is based on a reinforcement learning architecture and comprises a state space, an action space, and a reward function configuration; and S4, monitoring in real time and triggering an interest rate re-pricing mechanism, wherein the interest rate re-pricing mechanism starts a re-pricing process and records an adjustment log based on a credit score drift index and a preset threshold on the market interest rate fluctuation amplitude.
- 2. The method according to claim 1, wherein the sensitive data processing in step S1 comprises: hashing identification card numbers and mobile phone numbers with SHA-256; extracting key entities from text data through a named entity recognition model and then desensitizing the text data by replacing them with substitution symbols; and adding random perturbation to summary statistics using the Laplace mechanism of differential privacy, with the privacy budget controlled within ε = 1.0.
- 3. The method according to claim 1, wherein the feature extraction in step S2 comprises: semantic feature extraction, namely performing sentiment identification, semantic slot extraction, and text summarization on unstructured text with a BERT-based multi-task learning model, and mining risk keywords in combination with the TextRank algorithm; topological relation feature extraction, namely constructing a borrower association network and generating node topological feature vectors with a Graph Attention Network; and structured data feature extraction, namely processing the structured financial data with an XGBoost model and outputting numerical feature vectors.
- 4. The method according to claim 1, wherein the multi-model fusion assessment in step S2 employs a Stacking ensemble framework, specifically comprising: primary models including BERT, Graph Attention Network, and XGBoost; and a secondary model adopting LightGBM, which takes the prediction probabilities and intermediate-layer features of the primary models as input and, through a multi-source feature fusion network, outputs a comprehensive credit assessment result comprising a risk level and a default probability value.
- 5. The method according to claim 1, wherein the dynamic pricing model configuration in step S3 comprises: a model structure adopting a combined DQN + Double Q-learning + Prioritized Experience Replay architecture; a state space comprising the change in the borrower's credit rating, default probability, market benchmark interest rate, transaction frequency, and repayment history; an action space that is a discrete set of interest rate adjustment values; and a reward function defined as actual return minus penalty minus customer churn penalty, with a piecewise incentive mechanism set for customers of different risk levels.
- 6. The method according to claim 1, wherein the monitoring index calculation and re-pricing triggering in step S4 comprises: credit score drift monitoring, namely calculating a distribution drift index (PSI) of borrowers' credit scores, and triggering an assessment update when PSI > 0.2; and market interest rate fluctuation monitoring, namely tracking the fluctuation amplitude of the market benchmark interest rate in real time, and starting a re-pricing process when the fluctuation exceeds ±0.5%.
- 7. The method of claim 1, further comprising a data security and compliance mechanism: adopting a federated learning framework to train a distributed model across institutions, such that raw data does not leave the local site; handling cross-institution risk assessment tasks with secure multi-party computation; and embedding a SHAP value analysis tool in the risk assessment output to visualize the scoring result by feature contribution.
- 8. A loan interest rate pricing system based on a large language model, employing the method of any one of claims 1-7, comprising: a data acquisition module configured to perform standardized processing on the data, wherein the standardized processing comprises sensitive data desensitization, outlier and missing value handling, unstructured data augmentation, and data standardization; a risk assessment module configured to process the structured financial data and the unstructured behavior data through a loan risk assessment model to generate a comprehensive credit assessment result, wherein the loan risk assessment model adopts a multi-model collaborative processing mode and comprises feature extraction and multi-model fusion assessment; a dynamic pricing module configured to input the comprehensive credit assessment result and the current market benchmark interest rate into a dynamic pricing model and output a differentiated interest rate quotation scheme, wherein the dynamic pricing model is based on a reinforcement learning architecture and comprises a state space, an action space, and a reward function configuration; and a monitoring and re-pricing module configured to monitor in real time and trigger an interest rate re-pricing mechanism, wherein the interest rate re-pricing mechanism starts a re-pricing process and records an adjustment log based on a credit score drift index and a preset threshold on the market interest rate fluctuation amplitude.
- 9. An electronic device comprising a processor and a memory; the memory is used for storing operation instructions; and the processor is configured to execute the method of any one of claims 1-7 by invoking the operation instructions.
- 10. A computer readable storage medium, characterized in that the storage medium has stored thereon a computer program which, when executed by a processor, implements the method of any of claims 1-7.
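The monitoring thresholds in claim 6 and the adjustment log in step S4 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `psi` and `RepricingMonitor`, the equal-width binning, the bin count, and the `eps` floor are all hypothetical choices made for illustration, and the ±0.5% fluctuation is assumed here to mean an absolute change of 0.5 percentage points in the benchmark rate, which the claims do not specify.

```python
import math
from dataclasses import dataclass, field
from datetime import datetime, timezone

PSI_THRESHOLD = 0.2        # from claim 6: update triggered when PSI > 0.2
RATE_MOVE_THRESHOLD = 0.5  # from claim 6; assumed to be percentage points


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a baseline and a current sample.

    Bins are equal-width over the baseline range (an assumption); values
    outside that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant samples

    def shares(scores):
        counts = [0] * bins
        for s in scores:
            idx = int((s - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(scores), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


@dataclass
class RepricingMonitor:
    """Hypothetical monitor: checks both triggers and keeps an adjustment log."""
    baseline_scores: list
    baseline_rate: float  # market benchmark rate, in percent
    log: list = field(default_factory=list)

    def check(self, current_scores, current_rate):
        drift = psi(self.baseline_scores, current_scores)
        move = current_rate - self.baseline_rate
        reasons = []
        if drift > PSI_THRESHOLD:
            reasons.append("credit_score_drift")
        if abs(move) > RATE_MOVE_THRESHOLD:
            reasons.append("market_rate_move")
        if reasons:
            # Record why re-pricing was started, as step S4 requires a log.
            self.log.append({
                "time": datetime.now(timezone.utc).isoformat(),
                "psi": drift,
                "rate_move": move,
                "reasons": reasons,
            })
        return reasons
```

A caller would build the monitor from the score distribution and benchmark rate in force at the last pricing, call `check` on each new snapshot, and start the re-pricing process whenever the returned list is non-empty.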
Description
Loan interest rate pricing method based on large language model
Technical Field
The invention relates to the intersection of artificial intelligence and financial technology, in particular to a loan interest rate pricing method based on a large language model. It is particularly suited to risk assessment, dynamic interest rate generation, and real-time pricing adjustment that fuse structured financial data with unstructured behavior data, and belongs to the technical fields of natural language processing, deep learning, and financial risk control.
Background
In bank lending, loan interest rate pricing is a core step in achieving risk control and return optimization. The current mainstream pricing methods are cost-plus pricing, market-competition pricing, and risk-based pricing. Cost-plus pricing adds a fixed profit margin on top of the bank's funding cost, operating cost, and risk cost; market-competition pricing adjusts the pricing strategy by tracking the interest rates of industry peers; and risk-based pricing evaluates the risk level from factors such as the borrower's credit score. Together, these three form the technical framework of traditional loan interest rate pricing.
However, the prior art has obvious limitations. First, traditional models handle unstructured data poorly and cannot effectively extract sentiment and willingness-to-repay features from texts such as customer service dialogues and loan application statements, so the risk assessment is one-dimensional. Second, pricing strategies rely on standardized formulas or rules, which makes it difficult to generate differentiated schemes that reflect a borrower's credit dynamics and behavioral preferences, hurting customer experience and market competitiveness. Third, dynamic factors such as market interest rate fluctuations and changes in borrower credit condition cannot be monitored in real time and fed back to the pricing model, so the interest rate may be mismatched with the actual risk, aggravating credit risk or loss of income. Fourth, traditional statistical models have limited ability to capture complex feature interactions and generalize poorly in big-data scenarios. With the development of financial technology, a single data type and static pricing logic can no longer meet risk pricing requirements in a complex economic environment. An intelligent loan interest rate pricing method is therefore needed that fuses multi-source heterogeneous data, responds dynamically to market and customer changes, and has deep learning capability, so as to solve the insufficient data utilization, personalization, and timeliness of the prior art and to improve the precision and market adaptability of bank risk pricing.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a loan interest rate pricing method and system based on a large language model.
The technical scheme adopted by the invention is as follows. In a first aspect, the invention provides a loan interest rate pricing method based on a large language model, the method comprising:
S1, obtaining structured financial data and unstructured behavior data of a borrower, and performing standardized processing on the data, wherein the standardized processing comprises sensitive data desensitization, outlier and missing value handling, unstructured data augmentation, and data standardization;
S2, processing the structured financial data and the unstructured behavior data through a loan risk assessment model to generate a comprehensive credit assessment result, wherein the loan risk assessment model adopts a multi-model collaborative processing mode and comprises feature extraction and multi-model fusion assessment;
S3, inputting the comprehensive credit assessment result and the current market benchmark interest rate into a dynamic pricing model, and outputting a differentiated interest rate quotation scheme, wherein the dynamic pricing model is based on a reinforcement learning architecture and comprises a state space, an action space, and a reward function configuration; and
S4, monitoring in real time and triggering an interest rate re-pricing mechanism, wherein the interest rate re-pricing mechanism starts a re-pricing process and records an adjustment log based on a credit score drift index and a preset threshold on the market interest rate fluctuation amplitude.
Further, the sensitive data processing in step S1 includes: hashing identification card numbers and mobile phone numbers with SHA-256; After extracting key ent