CN-122024720-A - Intelligent assistant multi-user interaction method and system based on voiceprint recognition

Abstract

The invention discloses an intelligent assistant multi-user interaction method and system based on voiceprint recognition, relating to the technical field of data interaction. The method comprises: step 1, establishing a voiceprint model and extracting discriminative voiceprint features from voice signals, providing basic data for subsequent user identification; step 2, identifying the current interacting user through similarity calculation and intelligent threshold adjustment, providing a basis for subsequent personalized services; step 3, establishing a preference model with a basic preference layer, a function preference layer, a behavior habit layer and a context awareness layer, so that each user obtains an interactive experience matching that user's habits and preferences; step 4, establishing an arbitration model that intelligently perceives the interaction environment and makes reasonable arbitration decisions in multi-user scenarios; and step 5, continuously optimizing each model according to user feedback and performance monitoring feedback.

Inventors

XU HUI
GENG FEI

Assignees

浪潮通信技术有限公司 (Inspur Communication Technology Co., Ltd.)

Dates

Publication Date: 2026-05-12
Application Date: 2026-02-03

Claims (8)

1. An intelligent assistant multi-user interaction method based on voiceprint recognition, characterized by comprising the following steps:
Step 1, establishing a voiceprint model, extracting discriminative voiceprint features from voice signals, and providing basic data for subsequent user identification;
Step 2, identifying the current interacting user through similarity calculation and intelligent threshold adjustment, providing a basis for subsequent personalized services: calculating the cosine similarity between the input voiceprint features and the voiceprint features stored in a database; setting a dynamic similarity threshold and adaptively adjusting it between 0.75 and 0.95 according to the environmental noise level; and, when the similarity exceeds the threshold, confirming the user's identity and outputting the user ID;
Step 3, establishing a preference model comprising a basic preference layer, a function preference layer, a behavior habit layer and a context awareness layer, so that each user obtains an interactive experience matching that user's habits and preferences;
Step 4, establishing an arbitration model that intelligently perceives the interaction environment and makes reasonable arbitration decisions in multi-user scenarios: acquiring noise level and light intensity information through environment sensors, and automatically adjusting voice recognition sensitivity according to the environment parameters; detecting instruction conflicts in real time, and performing multi-user arbitration by applying predefined arbitration rules in combination with a priority scoring algorithm: detecting the conflict type, wherein the conflict types comprise resource conflicts, authority conflicts and logic conflicts; computing the priority, where priority score = 0.4 × device ownership + 0.3 × frequency of use + 0.2 × temporal context + 0.1 × location advantage; configuring arbitration rules, including a general rule under which the higher priority score takes precedence, a security rule under which security instructions take precedence, and custom rules; and performing multi-user arbitration according to the arbitration rules;
Step 5, continuously optimizing each model according to user feedback and performance monitoring feedback.
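The identification logic of step 2 can be illustrated with a minimal Python sketch. The claim fixes only the cosine similarity measure and the 0.75 to 0.95 threshold range; the linear mapping from noise level to threshold, its direction (stricter in louder rooms), the 30 dB and 70 dB reference points, and all function names are assumptions made purely for illustration.

```python
import math

THRESHOLD_MIN = 0.75  # lower bound of the dynamic threshold (from the claim)
THRESHOLD_MAX = 0.95  # upper bound of the dynamic threshold (from the claim)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dynamic_threshold(noise_db, quiet_db=30.0, loud_db=70.0):
    """Map a noise level to a threshold in [0.75, 0.95].

    Assumption: linear interpolation that becomes stricter as noise rises.
    The claim states only the range, not the mapping.
    """
    ratio = (noise_db - quiet_db) / (loud_db - quiet_db)
    ratio = min(1.0, max(0.0, ratio))
    return THRESHOLD_MIN + ratio * (THRESHOLD_MAX - THRESHOLD_MIN)


def identify_user(features, enrolled, noise_db):
    """Return the best-matching user ID, or None if no score exceeds the threshold."""
    threshold = dynamic_threshold(noise_db)
    best_id, best_score = None, -1.0
    for user_id, stored in enrolled.items():
        score = cosine_similarity(features, stored)
        if score > best_score:
            best_id, best_score = user_id, score
    return best_id if best_score > threshold else None
```

Here `enrolled` stands in for the voiceprint database as a plain dictionary from user ID to stored feature vector; a real deployment would hold 256-dimensional vectors rather than the short vectors convenient for testing.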
2. The intelligent assistant multi-user interaction method based on voiceprint recognition according to claim 1, wherein in step 1 a deep convolutional neural network structure is adopted, which receives a voice signal with a sampling rate of 16 kHz, a frame length of 25 ms and a frame shift of 10 ms; comprises 5 convolution layers, each using 3×3 convolution kernels; extracts Mel-frequency cepstral coefficients and fundamental frequency features; concatenates features from different layers into a 256-dimensional feature vector; and outputs a normalized voiceprint feature representation through a fully connected layer.
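The framing parameters in this claim translate into concrete sample counts, which the following Python sketch works out. It covers only the framing and a final normalization step, not the convolutional network itself. The function names are hypothetical, and L2 normalization is an assumption, since the claim says only that the output is normalized.

```python
import math

SAMPLE_RATE_HZ = 16_000  # sampling rate from the claim
FRAME_LENGTH_MS = 25     # frame length from the claim
FRAME_SHIFT_MS = 10      # frame shift from the claim


def frame_signal(samples, sample_rate=SAMPLE_RATE_HZ,
                 frame_ms=FRAME_LENGTH_MS, shift_ms=FRAME_SHIFT_MS):
    """Split a sample sequence into overlapping frames; a trailing partial frame is dropped."""
    frame_len = sample_rate * frame_ms // 1000  # 400 samples at 16 kHz
    hop = sample_rate * shift_ms // 1000        # 160 samples at 16 kHz
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames


def l2_normalize(vector):
    """Scale a feature vector to unit Euclidean length (assumed normalization)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]
```

With these settings, one second of audio (16,000 samples) yields 98 full frames of 400 samples each, with consecutive frames overlapping by 240 samples.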
3. The intelligent assistant multi-user interaction method based on voiceprint recognition according to claim 1, wherein in step 3: the basic preference layer stores the user's basic setup preferences for the voice assistant, including voice assistant timbre, response speed and interaction volume; the function preference layer records the user's preferences for specific functions, including music style tag weights, news content category preferences and device control authority levels; the behavior habit layer learns and stores the user's behavior patterns, including high-frequency instruction statistics, time-of-use pattern analysis and interaction style preference identification; and the context awareness layer dynamically adjusts the configuration according to the environmental context, including environment-adaptive parameters, multi-user arbitration strategies and privacy protection levels.
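One way to picture the four-layer preference model is as a per-user record with one nested structure per layer. The Python sketch below is illustrative only: every class name, field name, type and default value is an assumption, chosen to mirror the items the claim lists for each layer.

```python
from dataclasses import dataclass, field


@dataclass
class BasicPreferences:
    """Basic preference layer: assistant timbre, response speed, interaction volume."""
    timbre: str = "default"
    response_speed: float = 1.0
    volume: int = 50


@dataclass
class FunctionPreferences:
    """Function preference layer: music tag weights, news categories, device authority."""
    music_style_weights: dict = field(default_factory=dict)
    news_categories: list = field(default_factory=list)
    device_control_level: int = 0


@dataclass
class BehaviorHabits:
    """Behavior habit layer: instruction counts, hourly usage, interaction style."""
    command_counts: dict = field(default_factory=dict)
    hourly_usage: list = field(default_factory=lambda: [0] * 24)
    interaction_style: str = "neutral"

    def record(self, command, hour):
        """Update usage statistics for one executed instruction."""
        self.command_counts[command] = self.command_counts.get(command, 0) + 1
        self.hourly_usage[hour] += 1


@dataclass
class ContextSettings:
    """Context awareness layer: environment parameters, arbitration strategy, privacy level."""
    environment_params: dict = field(default_factory=dict)
    arbitration_strategy: str = "priority_score"
    privacy_level: int = 1


@dataclass
class UserProfile:
    """All four layers for one user, keyed by the ID output by identification."""
    user_id: str
    basic: BasicPreferences = field(default_factory=BasicPreferences)
    function: FunctionPreferences = field(default_factory=FunctionPreferences)
    behavior: BehaviorHabits = field(default_factory=BehaviorHabits)
    context: ContextSettings = field(default_factory=ContextSettings)
```

Using `default_factory` for every mutable field keeps each user's dictionaries and lists separate, so that one user's recorded habits never leak into another profile.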
4. The intelligent assistant multi-user interaction method based on voiceprint recognition according to claim 1, wherein step 5 comprises: receiving user feedback, including user scores, instruction corrections and preference setting adjustments; receiving performance monitoring feedback, including identification accuracy, instruction execution success rate, interaction completion time and function usage frequency, for continuous model optimization; updating the voiceprint model by fine-tuning the feature extraction network on new voice samples, using a sliding-window weighted average with a new-feature weight of 0.3 and a historical-feature weight of 0.7; updating the preference model by recalculating the user interest vector every 7 days using a collaborative filtering algorithm; and updating the arbitration model by dynamically adjusting each weight parameter according to the conflict resolution effect.
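The 0.3 and 0.7 weights in the voiceprint update can be read as an element-wise blend of the stored feature vector with a newly extracted one. The Python sketch below shows that reading only; it does not cover fine-tuning of the feature extraction network or the sliding-window bookkeeping, and the function name is hypothetical.

```python
NEW_WEIGHT = 0.3      # weight of the newly extracted features (from the claim)
HISTORY_WEIGHT = 0.7  # weight of the stored historical features (from the claim)


def update_voiceprint(stored, new):
    """Blend a stored voiceprint with a new one, element by element."""
    if len(stored) != len(new):
        raise ValueError("feature vectors must have the same length")
    return [HISTORY_WEIGHT * s + NEW_WEIGHT * n for s, n in zip(stored, new)]
```

Because the historical weight dominates, a single atypical recording (a cold, a noisy room) shifts the stored template by at most 30 percent of the difference, while repeated consistent samples still move it steadily toward the user's current voice.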
5. An intelligent assistant multi-user interaction system based on voiceprint recognition, characterized by comprising a voiceprint feature extraction module, a user identity recognition module, a personalized configuration management module, a context-aware interaction module and an incremental learning optimization module, wherein:
the voiceprint feature extraction module establishes a voiceprint model, extracts discriminative voiceprint features from voice signals, and provides basic data for subsequent user identification;
the user identity recognition module identifies the current interacting user through similarity calculation and intelligent threshold adjustment, providing a basis for subsequent personalized services: calculating the cosine similarity between the input voiceprint features and the voiceprint features stored in a database; setting a dynamic similarity threshold and adaptively adjusting it between 0.75 and 0.95 according to the environmental noise level; and, when the similarity exceeds the threshold, confirming the user's identity and outputting the user ID;
the personalized configuration management module establishes a preference model comprising a basic preference layer, a function preference layer, a behavior habit layer and a context awareness layer, so that each user obtains an interactive experience matching that user's habits and preferences;
the context-aware interaction module establishes an arbitration model that intelligently perceives the interaction environment and makes reasonable arbitration decisions in multi-user scenarios: acquiring noise level and light intensity information through environment sensors, and automatically adjusting voice recognition sensitivity according to the environment parameters; detecting instruction conflicts in real time, and performing multi-user arbitration by applying predefined arbitration rules in combination with a priority scoring algorithm: detecting the conflict type, wherein the conflict types comprise resource conflicts, authority conflicts and logic conflicts; computing the priority, where priority score = 0.4 × device ownership + 0.3 × frequency of use + 0.2 × temporal context + 0.1 × location advantage; configuring arbitration rules, including a general rule under which the higher priority score takes precedence, a security rule under which security instructions take precedence, and custom rules; and performing multi-user arbitration according to the arbitration rules;
the incremental learning optimization module continuously optimizes each model according to user feedback and performance monitoring feedback.
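The priority scoring formula and the first two arbitration rules can be sketched in Python as follows. The weights come from the claim; everything else is an assumption for illustration, including the `Request` structure, the normalization of each factor to the range 0 to 1, and the choice to let any security instruction outrank all non-security ones. Conflict type detection and custom rules are left out.

```python
from dataclasses import dataclass

# Weights from the claim's priority score formula.
WEIGHTS = {
    "device_ownership": 0.4,
    "usage_frequency": 0.3,
    "temporal_context": 0.2,
    "location_advantage": 0.1,
}


@dataclass
class Request:
    """One user's conflicting instruction; each factor is assumed to lie in [0, 1]."""
    user_id: str
    command: str
    device_ownership: float
    usage_frequency: float
    temporal_context: float
    location_advantage: float
    is_security: bool = False


def priority_score(request):
    """Weighted sum of the four factors named in the claim."""
    return sum(weight * getattr(request, name) for name, weight in WEIGHTS.items())


def arbitrate(requests):
    """Pick the winning request: security instructions first, then highest score."""
    if not requests:
        raise ValueError("no requests to arbitrate")
    security = [r for r in requests if r.is_security]
    candidates = security or list(requests)
    return max(candidates, key=priority_score)
```

For example, a device owner with moderate usage scores 0.4 × 1.0 + 0.3 × 0.5 + 0.2 × 0.5 + 0.1 × 0.5 = 0.7 and beats a frequent guest who scores 0.6, unless the guest's instruction is flagged as a security instruction.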
6. The intelligent assistant multi-user interaction system based on voiceprint recognition according to claim 5, wherein the voiceprint feature extraction module adopts a deep convolutional neural network structure, which receives a voice signal with a sampling rate of 16 kHz, a frame length of 25 ms and a frame shift of 10 ms; comprises 5 convolution layers, each using 3×3 convolution kernels; extracts Mel-frequency cepstral coefficients and fundamental frequency features; concatenates features from different layers into a 256-dimensional feature vector; and outputs a normalized voiceprint feature representation through a fully connected layer.
7. The intelligent assistant multi-user interaction system based on voiceprint recognition according to claim 5, wherein the personalized configuration management module: stores, at the basic preference layer, the user's basic setup preferences for the voice assistant, including voice assistant timbre, response speed and interaction volume; records, at the function preference layer, the user's preferences for specific functions, including music style tag weights, news content category preferences and device control authority levels; learns and stores, at the behavior habit layer, the user's behavior patterns, including high-frequency instruction statistics, time-of-use pattern analysis and interaction style preference identification; and dynamically adjusts, at the context awareness layer, the configuration according to the environmental context, including environment-adaptive parameters, multi-user arbitration strategies and privacy protection levels.
8. The intelligent assistant multi-user interaction system based on voiceprint recognition according to claim 5, wherein the incremental learning optimization module: receives user feedback, including user scores, instruction corrections and preference setting adjustments; receives performance monitoring feedback, including identification accuracy, instruction execution success rate, interaction completion time and function usage frequency, for continuous model optimization; updates the voiceprint model by fine-tuning the feature extraction network on new voice samples, using a sliding-window weighted average with a new-feature weight of 0.3 and a historical-feature weight of 0.7; updates the preference model by recalculating the user interest vector every 7 days using a collaborative filtering algorithm; and updates the arbitration model by dynamically adjusting each weight parameter according to the conflict resolution effect.

Description

Intelligent assistant multi-user interaction method and system based on voiceprint recognition

Technical Field

The invention discloses an intelligent assistant multi-user interaction method and system based on voiceprint recognition, and relates to the technical field of data interaction.

Background

A number of interaction solutions based on voiceprint recognition exist in the field of intelligent voice assistants. In the prior art, voiceprint recognition is mainly applied to user authentication and basic personalization settings. Although the prior art achieves voiceprint-based personalized interaction to a certain extent, several problems remain:

Identity recognition and personalized service are not deeply integrated. In the prior art, voiceprint recognition serves mainly as an identity verification means and is clearly disconnected from deep personalized service. Only a limited set of preset parameters can be loaded, and dynamic adjustment and deep customization according to a user's actual usage habits and behavior patterns are not possible.

User preference configuration is static and lacks adaptive capability. The user preference settings of existing systems are mostly fixed and cannot adapt to changes in user behavior or in the environment.

The conflict resolution mechanism in multi-user scenarios is imperfect. The prior art lacks effective conflict detection and arbitration strategies for scenarios in which multiple users interact with the device at the same time. Instruction priority cannot be judged intelligently, which degrades the user experience and can even cause the wrong operation to be executed.

The privacy protection mechanism is not sound enough. The prior art lacks a complete graded privacy protection mechanism for processing sensitive information of multiple users. User data of different authority levels cannot be effectively isolated and protected, creating a risk of privacy leakage.

Disclosure of Invention

The invention provides an intelligent assistant multi-user interaction method and system based on voiceprint recognition, which address the prior-art problems of disconnection between voiceprint recognition and personalized service, static configuration, multi-user conflicts and insufficient privacy protection.

The specific scheme provided by the invention is as follows:

The invention provides an intelligent assistant multi-user interaction method based on voiceprint recognition, which comprises the following steps:
Step 1, establishing a voiceprint model, extracting discriminative voiceprint features from voice signals, and providing basic data for subsequent user identification;
Step 2, identifying the current interacting user through similarity calculation and intelligent threshold adjustment, providing a basis for subsequent personalized services: calculating the cosine similarity between the input voiceprint features and the voiceprint features stored in a database; setting a dynamic similarity threshold and adaptively adjusting it between 0.75 and 0.95 according to the environmental noise level; and, when the similarity exceeds the threshold, confirming the user's identity and outputting the user ID;
Step 3, establishing a preference model comprising a basic preference layer, a function preference layer, a behavior habit layer and a context awareness layer, so that each user obtains an interactive experience matching that user's habits and preferences;
Step 4, establishing an arbitration model that intelligently perceives the interaction environment and makes reasonable arbitration decisions in multi-user scenarios: acquiring noise level and light intensity information through environment sensors, and automatically adjusting voice recognition sensitivity according to the environment parameters; detecting instruction conflicts in real time, and performing multi-user arbitration by applying predefined arbitration rules in combination with a priority scoring algorithm: detecting the conflict type, wherein the conflict types comprise resource conflicts, authority conflicts and logic conflicts; computing the priority, where priority score = 0.4 × device ownership + 0.3 × frequency of use + 0.2 × temporal context + 0.1 × location advantage; configuring arbitration rules, including a general rule under which the higher priority score takes precedence, a security rule under which security instructions take precedence, and custom rules; and performing multi-user arbitration according to the arbitration rules;
Step 5, continuously optimizing each model according to user feedback and performance monitoring feedback.

Further, in step 1 of the intelligent assistant multi-user interaction method based on voiceprint recognition, a deep convolutional neural network structure is adopted, a voi