CN-122019740-A - Customer service intelligent response method and system based on semantic understanding
Abstract
The application discloses a customer service intelligent response method and system based on semantic understanding. The method comprises: processing an original large language model with a mixed-precision quantization strategy; pruning the quantized model with a structured pruning algorithm to obtain a lightweight model; performing fine-tuning training on the lightweight model with a labeled domain data set to construct an accuracy compensation model; constructing a three-level cache architecture, defining a cache data classification standard, configuring cache storage rules and establishing a cache refreshing mechanism; processing a user request, extracting key information and inputting it into the lightweight model for inference; filling a response template according to the cached data and returning the response to the user after compliance verification; and dynamically iterating the model and the caching strategy. The application improves inference speed while restoring semantic understanding accuracy to a high level close to that of the original model, thereby meeting the requirements of financial customer service for accuracy in interpreting business.
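The pruning step named in the abstract is specified in claims 1 and 4 as ranking parameters by gradient contribution, locking the key ones, and removing the rest by low contribution. The following is a minimal sketch of that idea only, not the applicant's implementation. It assumes one scalar weight per prunable channel, scores each channel by the accumulated |weight × gradient| over warm-up samples (a common first-order importance estimate, swapped in because the source gives no formula), and uses hypothetical `lock_fraction` and `prune_fraction` parameters to decide what is locked and what is removed.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Row(NamedTuple):
    """One row of the parameter importance ranking table."""
    layer: str
    channel: int
    contribution: float


def rank_channels(
    weights: Dict[str, List[float]],
    grad_samples: List[Dict[str, List[float]]],
) -> List[Row]:
    """Accumulate |weight * gradient| per channel over all warm-up samples
    and return the rows sorted from highest to lowest contribution."""
    rows: List[Row] = []
    for layer, layer_weights in weights.items():
        for channel, weight in enumerate(layer_weights):
            score = sum(abs(weight * sample[layer][channel]) for sample in grad_samples)
            rows.append(Row(layer, channel, score))
    return sorted(rows, key=lambda row: row.contribution, reverse=True)


def select_pruned(
    rows: List[Row],
    lock_fraction: float,
    prune_fraction: float,
) -> Tuple[Set[Tuple[str, int]], Set[Tuple[str, int]]]:
    """Lock the top-ranked channels, then mark the lowest-ranked share of the
    remaining (unlocked) channels for removal. Both fractions are hypothetical."""
    lock_count = int(len(rows) * lock_fraction)
    locked = {(row.layer, row.channel) for row in rows[:lock_count]}
    unlocked = rows[lock_count:]  # still sorted, highest contribution first
    prune_count = int(len(unlocked) * prune_fraction)
    pruned = {
        (row.layer, row.channel)
        for row in unlocked[len(unlocked) - prune_count:]
    }
    return locked, pruned
```

In a real model the per-channel scores would come from back-propagated gradients collected during warm-up inference, and locking would also take the layer's role (semantic understanding core layer or non-core computing layer) into account, as claim 4 requires.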
Inventors
- LI JINGFENG
- LI ZHIBIN
Assignees
- Industrial and Commercial Bank of China Limited, Anyang Branch (中国工商银行股份有限公司安阳分行)
Dates
- Publication Date
- 20260512
- Application Date
- 20251202
Claims (10)
- 1. A customer service intelligent response method based on semantic understanding, characterized by comprising the following steps: S1, screening high-frequency customer service dialogue data from a database and performing de-duplication, desensitization and intention labeling; S2, processing the original large language model with a mixed-precision quantization strategy; S3, pruning the quantized model with a structured pruning algorithm to obtain a lightweight model, wherein key parameters are identified by calculating gradient contribution and locked during pruning; S4, performing fine-tuning training on the lightweight model with the labeled domain data set to construct an accuracy compensation model; S5, constructing a three-level cache architecture comprising a memory cache L1, a distributed cache L2 and a database cache L3, defining a cache data classification standard, configuring cache storage rules, and establishing a cache refreshing mechanism; S6, processing a user request, extracting key information and inputting it into the lightweight model for inference; S7, filling a response template according to the cached data and returning the response to the user after compliance verification; S8, dynamically iterating the model and the caching strategy.
- 2. The customer service intelligent response method based on semantic understanding according to claim 1, characterized in that screening the high-frequency customer service dialogue data from the database and performing de-duplication, desensitization and intention labeling specifically comprises: screening high-frequency dialogue data from a financial customer service database; taking the core semantics and business information of each dialogue as the reference, generating data feature fingerprints and comparing them to eliminate redundant data that is duplicated or whose core content is highly similar; desensitizing customer-sensitive information by deleting private content including names and identity card numbers, while fully retaining the business consultation points and interaction logic of the dialogue; and finally labeling the data according to a preset intention classification system to form a domain labeled data set that meets the model training requirements.
- 3. The customer service intelligent response method based on semantic understanding according to claim 1, characterized in that processing the original large language model with a mixed-precision quantization strategy specifically comprises: selecting, as the quantization object, a large language model that is adapted to the financial customer service scenario and pre-trained on a basic corpus of financial business; adopting a mixed FP16 and INT8 precision strategy during quantization; retaining FP16 precision for the semantic understanding core layers, which handle semantic analysis of customer consultations, understanding of financial business terminology and recognition of the customer's core intention; and quantizing and compressing to INT8 precision the non-core computing layers, which are responsible for general text cleaning, sentence standardization and redundant information filtering.
- 4. The customer service intelligent response method based on semantic understanding according to claim 1, characterized in that pruning the quantized model with the structured pruning algorithm to obtain the lightweight model specifically comprises: performing warm-up inference on the mixed-precision quantized model with the labeled financial customer service domain data set; calculating the gradient contribution of each parameter layer by layer through the gradient back-propagation mechanism; generating, according to the cumulative effect of each parameter during inference on multi-turn dialogue samples, a parameter importance ranking table containing the parameter position, the network layer to which it belongs and its contribution; locking the parameters in the semantic understanding core layers that are directly related to semantic feature extraction for financial business, as well as the key parameters in the non-core computing layers that support basic semantic processing logic; and, in the structured pruning operation, retaining the locked parameters and removing parameters with low contribution and high redundancy.
- 5. The customer service intelligent response method based on semantic understanding according to claim 1, characterized in that performing fine-tuning training on the lightweight model with the labeled domain data set to construct the accuracy compensation model specifically comprises: performing fine-tuning training on the lightweight model with the previously screened high-frequency customer service dialogue data, training with an optimizer and a loss function, and ensuring training quality through validation-set monitoring and early stopping, so that the fine-tuned model restores semantic understanding accuracy to the high level of the original model while maintaining the lightweight inference speed, thereby forming the accuracy compensation model.
- 6. The customer service intelligent response method based on semantic understanding according to claim 1, characterized in that constructing the three-level cache architecture comprising the memory cache L1, the distributed cache L2 and the database cache L3 specifically comprises: designing the levels differentially according to data access frequency, real-time requirements and storage cost; deploying L1 in the memory of the core server for high-speed access to and storage of high-frequency data; implementing L2 as a Redis cluster that guarantees high availability and handles requests that miss L1; linking L3 with the business database to preload low-frequency data and reduce direct queries; circulating data between the three cache levels through a data synchronization protocol to ensure consistency; and configuring an independent fault isolation mechanism for each level.
- 7. The customer service intelligent response method based on semantic understanding according to claim 1, characterized in that defining the cache data classification standard specifically comprises dividing customer-service-related data into three types: high-frequency fixed data, comprising basic business process descriptions, standard answers to common questions and standardized service scripts; medium-frequency dynamic data, comprising historical consultation records, business handling progress and personalized service preferences; and low-frequency real-time data, comprising real-time financial data, temporary business announcements and real-time data related to special business consultations.
- 8. The customer service intelligent response method based on semantic understanding according to claim 1, characterized in that configuring the cache storage rules specifically comprises: the L1 cache stores high-frequency fixed data, which is automatically invalidated when its validity period is exceeded, triggering a synchronous update with the L2 cache; the L2 cache serves as the core transfer level, stores all high-frequency fixed data and all medium-frequency dynamic data, and improves concurrent access capacity by balancing node load through distributed storage; the L3 cache handles low-frequency real-time data storage and backup of all data, synchronizes the latest data from the business database according to the rules, and enables the backup data only when an upper-level cache fails or misses, so as to guarantee access continuity.
- 9. The customer service intelligent response method based on semantic understanding according to claim 1, characterized in that establishing the cache refreshing mechanism specifically comprises: the system receives, in real time, service requests initiated by users through multiple channels and standardizes the format of the request content; a lightweight natural language processing tool is called to perform word segmentation, part-of-speech tagging and entity recognition on the standardized request text according to a semantic dictionary of the financial customer service domain, extracting intention keywords and auxiliary information; after extraction, a structured information matrix containing the user intention category, the core entity content and the key elements of the request is constructed, and the core entities are associated with the customer identity identifier; and the information extraction operation is re-executed after the user's core request has been clarified.
- 10. A system using the customer service intelligent response method based on semantic understanding according to any one of claims 1-9, comprising: a data preprocessing module for screening high-frequency customer service dialogue data from the database and completing de-duplication, desensitization and intention labeling; a model lightweighting module for processing the original large language model with a mixed-precision quantization strategy, pruning the quantized model with a structured pruning algorithm, and identifying and locking key parameters by calculating gradient contribution; an accuracy compensation module for performing fine-tuning training on the lightweight model with the labeled domain data set to construct an accuracy compensation model; a cache management module for constructing the three-level cache architecture, defining the cache data classification standard, configuring the cache storage rules and establishing the cache refreshing mechanism; a request inference module for processing a user request, extracting key information and inputting it into the lightweight model to complete inference; a response generation module for filling a response template according to the cached data and returning the response content to the user after compliance verification; and an iterative optimization module for dynamically and iteratively optimizing the model and the caching strategy.
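The data classes of claim 7 and the storage rules of claims 6 and 8 can be illustrated with a small in-process model. This is a sketch under stated assumptions, not the claimed system: plain dictionaries stand in for the core-server memory, the Redis cluster and the database cache; the class name `ThreeLevelCache`, the data-class labels and the validity periods in `TTL_SECONDS` are hypothetical, since the source specifies none of them.

```python
from typing import Dict, Optional, Tuple

# Which levels hold each data class, following the storage rules of claims 7 and 8.
PLACEMENT: Dict[str, Tuple[str, ...]] = {
    "high_fixed": ("L1", "L2", "L3"),   # L3 keeps a backup of all data
    "medium_dynamic": ("L2", "L3"),
    "low_realtime": ("L3",),
}
# Hypothetical validity periods in seconds; the source gives no numbers.
TTL_SECONDS: Dict[str, float] = {"L1": 60.0, "L2": 600.0, "L3": 3600.0}
LEVEL_ORDER: Tuple[str, ...] = ("L1", "L2", "L3")

Entry = Tuple[str, str, float]  # (value, data class, expiry time)


class ThreeLevelCache:
    """In-process stand-in for the L1/L2/L3 hierarchy."""

    def __init__(self) -> None:
        self._levels: Dict[str, Dict[str, Entry]] = {name: {} for name in LEVEL_ORDER}

    def _store(self, level: str, key: str, value: str, data_class: str, now: float) -> None:
        self._levels[level][key] = (value, data_class, now + TTL_SECONDS[level])

    def put(self, key: str, value: str, data_class: str, now: float) -> None:
        """Write the value to every level allowed for its data class."""
        for level in PLACEMENT[data_class]:
            self._store(level, key, value, data_class, now)

    def get(self, key: str, now: float) -> Tuple[Optional[str], str]:
        """Look the key up in L1, then L2, then L3; return (value, level hit)."""
        for index, level in enumerate(LEVEL_ORDER):
            entry = self._levels[level].get(key)
            if entry is None:
                continue
            value, data_class, expires_at = entry
            if now >= expires_at:
                # Past its validity period: invalidate and fall through.
                del self._levels[level][key]
                continue
            # Refresh the faster levels that are allowed to hold this data class.
            for upper in LEVEL_ORDER[:index]:
                if upper in PLACEMENT[data_class]:
                    self._store(upper, key, value, data_class, now)
            return value, level
        return None, "miss"
```

With these assumed periods, a high-frequency fixed entry written at time 0 is served from L1 at time 1; at time 100 the L1 copy has expired, so the request is served from L2 and L1 is refreshed. A medium-frequency dynamic entry is never promoted to L1, which matches the rule that L1 holds only high-frequency fixed data. A miss at all three levels would, in the claimed system, fall back to the business database, which this sketch leaves out.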
Description
Customer service intelligent response method and system based on semantic understanding
Technical Field
The invention relates to the field of intelligent customer service, in particular to a customer service intelligent response method and system based on semantic understanding.
Background
In the digital transformation of finance, the intelligent upgrading of customer service has become a core direction for improving service quality and efficiency. Semantic understanding technology is the core support of an intelligent response system, and its performance directly determines the service experience. Large language models are now widely applied in financial customer service scenarios because of their strong semantic processing capability, but their huge parameter counts make it difficult for inference speed to meet the real-time response requirements of customer service. To improve response efficiency, the industry often compresses models with lightweight techniques such as quantization and pruning, but these generally cause a marked drop in semantic understanding accuracy, so the compressed models cannot meet the high requirements of financial customer service for accuracy in interpreting business.
The patent document with publication number CN119474280A discloses an intelligent customer service system based on an AI large model, belonging to the technical field of intelligent customer service systems.
That invention improves the flexibility and user experience of the system through multi-modal interaction technology: it intelligently identifies data types and processes them accordingly, and automatically adjusts the language style and emotional tone of the response according to the user's mood and intention. It keeps the knowledge content up to date and comprehensive by integrating static data with dynamic interaction data, performs fast retrieval over the knowledge base, and replies accurately in combination with a large language model, generating high-quality replies and solutions through rule reasoning and a generative model. By constructing a dialogue state tracker, it records and manages the state information of each round of user interaction, achieving accurate tracking and updating of context. For complex tasks, the system decomposes the user request into several subtasks and executes them in order of priority and dependency, ensuring efficient and continuous task processing.
The patent document with publication number CN120450843A discloses a financial business processing method, device and system based on a large language model. The method comprises: when it is detected that a customer initiates a financial business consultation or handling request, collecting the various data generated during the consultation or handling process; preprocessing the collected data and extracting key information; inputting the preprocessed data and the extracted key information into a pre-trained large language model, which identifies the customer's intention, extracts key business elements, and performs risk assessment and emotion analysis on the customer; and performing business processing and decision making based on the customer intention, the key business elements, the risk assessment result and the emotion analysis result, in combination with the business rules and risk strategies preset by the financial institution.
Meanwhile, data loading in financial customer service has a prominent problem: customer consultations mostly involve business process and historical record data, and existing systems largely obtain this information by querying the database directly. The frequent repeated access increases the database load and causes data loading delay, which in turn slows the overall response speed. How to make the model lightweight to improve inference efficiency while ensuring semantic understanding accuracy, and at the same time achieve real-time data loading and cache optimization, has become a key problem to be solved in the practical deployment of intelligent response systems for financial customer service.
Disclosure of Invention
The invention aims to provide a customer service intelligent response method and system based on semantic understanding, so as to solve the problems described in the background.
To solve these technical problems, the invention provides the following technical solution:
In a first aspect, a customer service intelligent response method based on semantic understanding includes the following steps: S1, screening high-frequency dialogue data of customer servi