
CN-121999772-A - User portrait memory and voice skill scenario-based linkage method and system based on large model

CN121999772A

Abstract

The invention discloses a large-model-based user portrait memory and voice-skill scenario-based linkage method and system. The method comprises: obtaining multi-dimensional portrait tags of a user and iterating them in real time based on update trigger conditions; constructing a scene feature vector and obtaining the user's current potential intention through an intention inference algorithm; generating a demand list from a user voice command and the current potential intention; generating a voice-skill linkage sequence according to the demand list and the degree of association between voice skills; obtaining an execution result according to the linkage sequence, the voice-skill execution order and the data interaction rules and transmitting it to the user; and obtaining user feedback and adjusting the multi-dimensional portrait tags and the linkage logic of the voice-skill linkage sequence according to the feedback.
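The processing chain in the abstract (portrait tags + scene vector → confidence-ranked intents → demand list → association-ordered skill linkage) can be sketched as a minimal, hypothetical pipeline. Every class, function, tag name and score below is an illustrative assumption, not anything disclosed by the patent:

```python
from dataclasses import dataclass

@dataclass
class PortraitTag:
    name: str
    weight: float  # strength of the preference, assumed in [0, 1]

def infer_intents(tags, scene_vec, intent_rules):
    """Score each candidate intent from portrait-tag weights plus one scene
    feature, then rank by confidence (a confidence-ordered intent list)."""
    scores = {}
    for intent, (relevant_tags, scene_idx) in intent_rules.items():
        tag_score = sum(t.weight for t in tags if t.name in relevant_tags)
        scores[intent] = tag_score + scene_vec[scene_idx]
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def link_skills(matched_skills, association):
    """Greedily order matched skills so that adjacent skills in the linkage
    sequence have the highest pairwise association degree."""
    if not matched_skills:
        return []
    remaining = list(matched_skills)
    sequence = [remaining.pop(0)]
    while remaining:
        nxt = max(remaining, key=lambda s: association.get((sequence[-1], s), 0.0))
        sequence.append(nxt)
        remaining.remove(nxt)
    return sequence

# Illustrative data: a commuter portrait in a morning-drive scene.
tags = [PortraitTag("frequent_flyer", 0.9), PortraitTag("coffee_lover", 0.4)]
scene = [0.7, 0.1]                        # [morning_commute, evening_home]
rules = {"book_flight": ({"frequent_flyer"}, 0),
         "order_coffee": ({"coffee_lover"}, 1)}
ranked = infer_intents(tags, scene, rules)    # "book_flight" ranked first

assoc = {("book_flight", "reserve_transfer"): 0.8,
         ("book_flight", "order_coffee"): 0.2}
sequence = link_skills(["book_flight", "order_coffee", "reserve_transfer"], assoc)
# → ["book_flight", "reserve_transfer", "order_coffee"]
```

The greedy neighbor choice is only one plausible reading of "generating a linkage sequence according to the degree of association"; the patent does not specify the ordering algorithm.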

Inventors

  • WANG ZIMIAN
  • SONG YUECHEN
  • GAO ZELEI
  • WU WEIWEI

Assignees

  • FAW Bestune Automobile Co., Ltd. (一汽奔腾汽车股份有限公司)

Dates

Publication Date
2026-05-08
Application Date
2026-01-13

Claims (10)

  1. A large-model-based user portrait memory and voice-skill scenario-based linkage method, characterized by comprising the following steps: acquiring multi-source data and processing it to obtain multi-dimensional portrait tags of a user, setting update trigger conditions for the multi-dimensional portrait tags, and iterating the tags in real time based on the update trigger conditions, wherein the multi-source data comprises active behavior data of the user and scene data of the scene where the user is located; constructing a scene feature vector from the scene data, and combining the multi-dimensional portrait tags with the scene feature vector to obtain the user's current potential intention through an intention inference algorithm; receiving a user voice command, and resolving the user's demand from the voice command and the current potential intention to generate a demand list; setting a skill library, matching voice skills from the skill library according to the demand list, generating a voice-skill linkage sequence according to the degree of association between the voice skills, and determining the voice-skill execution order and the data interaction rules; obtaining an execution result according to the linkage sequence, the voice-skill execution order and the data interaction rules, integrating the execution result into a unified response, and transmitting it to the user; and obtaining the user's feedback on the unified response, and adjusting the multi-dimensional portrait tags and the linkage logic of the voice-skill linkage sequence according to the feedback.
  2. The large-model-based user portrait memory and voice-skill scenario-based linkage method according to claim 1, wherein acquiring the multi-source data and processing it to obtain the multi-dimensional portrait tags of the user comprises: collecting the multi-source data, wherein the active behavior data of the user comprises dialogue text data, operation behavior data and feedback data of the user; and performing semantic analysis, preference extraction and behavior attribution on the multi-source data to complete the data processing and obtain the multi-dimensional portrait tags of the user.
  3. The large-model-based user portrait memory and voice-skill scenario-based linkage method according to claim 2, wherein setting the update trigger conditions of the multi-dimensional portrait tags and iterating the tags in real time based on the update trigger conditions comprises: taking changes in the user's dialogue text data, operation behavior data and scene data as the update trigger conditions, iterating the multi-dimensional portrait tags in real time, and retaining preference data in the operation behavior data, wherein the preference data comprises long-term stable preferences and short-term dynamic preferences.
  4. The large-model-based user portrait memory and voice-skill scenario-based linkage method according to claim 1, wherein constructing the scene feature vector from the scene data and obtaining the user's current potential intention through the intention inference algorithm by combining the multi-dimensional portrait tags and the scene feature vector comprises: obtaining a plurality of current potential intentions of the user through the intention inference algorithm according to the scene feature vector and the multi-dimensional portrait tags, and ranking the plurality of potential intentions by confidence.
  5. The large-model-based user portrait memory and voice-skill scenario-based linkage method according to claim 1, wherein receiving the user voice command and resolving the user's demand from the voice command and the current potential intention to generate the demand list further comprises: acquiring frequently occurring scene data, analyzing it to obtain the user's high-frequency scene feature vectors, and generating a high-frequency scene demand list.
  6. The large-model-based user portrait memory and voice-skill scenario-based linkage method according to claim 5, comprising: triggering an active linkage decision based on the high-frequency scene demand list, generating a push linkage sequence, pushing a linkage suggestion to the user, and, after the user confirms, obtaining an execution result according to the push linkage sequence, the voice-skill execution order and the data interaction rules, integrating the execution result into a unified response and transmitting the unified response to the user.
  7. The large-model-based user portrait memory and voice-skill scenario-based linkage method according to claim 1, wherein acquiring the multi-source data, processing it to obtain the multi-dimensional portrait tags of the user, setting the update trigger conditions and iterating the tags in real time based on the update trigger conditions further comprises: acquiring basic user information, and initializing the multi-dimensional portrait tags with it to obtain initialized multi-dimensional portrait tags; and setting the update trigger conditions of the multi-dimensional portrait tags, calculating the weights of the multi-dimensional portrait tags based on the update trigger conditions, iterating the tags in real time, and storing them locally and/or in the cloud.
  8. The large-model-based user portrait memory and voice-skill scenario-based linkage method according to claim 1, wherein setting the skill library, matching voice skills from the skill library according to the demand list, generating the voice-skill linkage sequence according to the degree of association between the voice skills, and determining the voice-skill execution order and the data interaction rules comprises: calling the corresponding voice-skill interfaces according to the voice-skill linkage sequence, and transmitting data between them; and integrating the execution results of the skills into a unified response fed back to the user by voice or text.
  9. The large-model-based user portrait memory and voice-skill scenario-based linkage method according to claim 1, wherein obtaining the user's feedback on the unified response and adjusting the multi-dimensional portrait tags and the linkage logic of the voice-skill linkage sequence according to the feedback comprises: acquiring the user's feedback data on the unified response; and adjusting the multi-dimensional portrait tags according to the feedback data, and optimizing the inference precision of the user's current potential intention and the voice-skill linkage sequence, so as to form a self-optimizing closed loop.
  10. A large-model-based user portrait memory and voice-skill scenario-based linkage system, applying the large-model-based user portrait memory and voice-skill scenario-based linkage method of any one of claims 1-9, comprising: a user portrait dynamic construction module, configured to acquire multi-source data, process it to obtain the multi-dimensional portrait tags of a user, set the update trigger conditions of the tags, and iterate the tags in real time based on the update trigger conditions; a scene intelligent recognition module, configured to construct scene feature vectors from the scene data, and to obtain the user's current potential intention through the intention inference algorithm by combining the multi-dimensional portrait tags and the scene feature vectors; a large-model inference linkage module, configured to set the skill library, receive the user voice command, resolve the user's demand from the voice command and the current potential intention to form the demand list, match a plurality of suitable voice skills from the skill library according to the demand list, generate the voice-skill linkage sequence according to the degree of association among the voice skills, and determine the voice-skill execution order and the data interaction rules; and a skill scheduling and feedback module, configured to call the corresponding voice-skill parameters according to the linkage sequence, obtain an execution result based on the voice-skill execution order and the data interaction rules, integrate the execution result into a unified response transmitted to the user, obtain the user's feedback on the unified response, and adjust the multi-dimensional portrait tags and the linkage logic of the voice-skill linkage sequence according to the feedback.
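Claim 9's self-optimizing closed loop, in which user feedback adjusts the portrait tags, can be illustrated with a toy weight update. The learning rate, the ±1 feedback encoding and the clipping bounds are assumptions made for this sketch, not values from the patent:

```python
def apply_feedback(tag_weights, feedback, rate=0.1):
    """Adjust multi-dimensional portrait-tag weights from user feedback on the
    unified response: positive feedback reinforces a tag, negative feedback
    decays it, and weights are clipped to [0, 1]."""
    updated = {}
    for name, weight in tag_weights.items():
        signal = feedback.get(name, 0.0)   # +1 accepted, -1 rejected, 0 silent
        updated[name] = min(1.0, max(0.0, weight + rate * signal))
    return updated

# One feedback round on two hypothetical tags.
weights = {"coffee_lover": 0.5, "frequent_flyer": 0.9}
weights = apply_feedback(weights, {"coffee_lover": 1, "frequent_flyer": -1})
# coffee_lover is reinforced, frequent_flyer is decayed
```

In a full system the same feedback signal would also re-weight the skill association degrees used to build the linkage sequence; the patent leaves both update rules unspecified.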

Description

User portrait memory and voice skill scenario-based linkage method and system based on large model

Technical Field

The invention relates to the technical field of intelligent terminal applications, and in particular to a large-model-based user portrait memory and voice-skill scenario-based linkage method and system.

Background

Existing voice assistant products rely on skills triggered by fixed rules; their user portraits are static, their memory capacity is limited, and they suffer from poor scene adaptability, lack of skill linkage and insufficient personalized service. Specifically, in the prior art: the user portrait is static and lacks dynamic updating capability, relying on initial settings or fixed labels and unable to iterate with the user's real-time behavior and conversation context, so service matching is poor; voice-skill triggering is passive and scene linkage is weak, because related skills cannot be actively associated from the user's explicit instructions combined with the current scene (such as time, place and user state), leading to the inefficiency of requiring multiple instructions from the user; semantic understanding is shallow and personalized response is insufficient, since only the surface meaning of an instruction is recognized, the user's latent needs and personal preferences are not understood, and different users receive identical responses to the same instruction, lacking a customized experience; and multi-skill cooperation is inefficient and functionally fragmented, with multiple voice skills running independently without cooperative logic, so the user must issue several instructions to complete a complex request (for example, "book a flight + reserve an airport transfer" must be split into two instructions). These problems need to be solved.
Disclosure of Invention

The invention aims to provide a large-model-based user portrait memory and voice-skill scenario-based linkage method and system which exploit the deep semantic understanding and long-term memory capability of a large model, construct a dynamically updated user portrait, and combine the scene context to achieve active and accurate linkage of voice skills, so as to solve the problems of scene fragmentation and missing personalization of the conventional schemes described in the background. To achieve this purpose, the invention provides the following technical scheme. A large-model-based user portrait memory and voice-skill scenario-based linkage method comprises the following steps: acquiring multi-source data and processing it to obtain multi-dimensional portrait tags of a user, setting update trigger conditions for the tags, and iterating the tags in real time based on the update trigger conditions, wherein the multi-source data comprises active behavior data of the user and scene data of the scene where the user is located; constructing a scene feature vector from the scene data, and combining the multi-dimensional portrait tags with the scene feature vector to obtain the user's current potential intention through an intention inference algorithm; receiving a user voice command, and resolving the user's demand from the voice command and the current potential intention to generate a demand list; setting a skill library, matching voice skills from the skill library according to the demand list, generating a voice-skill linkage sequence according to the degree of association between the voice skills, and determining the voice-skill execution order and the data interaction rules; obtaining an execution result according to the linkage sequence, the voice-skill execution order and the data interaction rules, integrating the execution result into a unified response, and transmitting it to the user; and obtaining the user's feedback on the unified response, and adjusting the multi-dimensional portrait tags and the linkage logic of the voice-skill linkage sequence according to the feedback. Further, collecting the multi-source data and processing it to obtain the multi-dimensional portrait tags of the user comprises: collecting the multi-source data, wherein the active behavior data of the user comprises dialogue text data, operation behavior data and feedback data of the user; and performing semantic analysis, preference extraction and behavior attribution on the multi-source data to complete the data processing and obtain the multi-dimensional portrait tags of the user. Further, setting the update trigger conditions of the multi-dimensional portrait tags, and iterating the tags in real time based on the update trigger conditions,
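The update-triggered tag iteration described here, together with claim 3's split between long-term stable and short-term dynamic preferences, might look like the following sketch. The half-life, the per-event weight bump and the long-term promotion threshold are invented for illustration; the patent does not give concrete values:

```python
HALF_LIFE_S = 7 * 24 * 3600   # assumed: a short-term preference halves weekly
LONG_TERM_HITS = 5            # assumed: stable after 5 observations
BUMP = 0.2                    # assumed: weight added per triggering event

def update_tag(store, name, now):
    """Fired when an update trigger condition is met (a change in dialogue
    text, operation behavior or scene data). Short-term preferences decay
    with elapsed time; tags seen often enough are retained as long-term
    stable preferences and no longer decay."""
    hits, weight, last = store.get(name, (0, 0.0, now))
    if hits < LONG_TERM_HITS:                 # short-term: decay since last event
        weight *= 0.5 ** ((now - last) / HALF_LIFE_S)
    store[name] = (hits + 1, min(1.0, weight + BUMP), now)
    return store[name]

store = {}
update_tag(store, "likes_jazz", now=0.0)          # first sighting: weight 0.2
update_tag(store, "likes_jazz", now=HALF_LIFE_S)  # decayed to 0.1, bumped to ~0.3
```

Storing the tuple locally or mirroring it to the cloud, as claim 7 requires, would sit on top of this structure; only the in-memory update is sketched here.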