CN-122027743-A - Intelligent customer service method, electronic device, storage medium and program product


Abstract

The invention discloses an intelligent customer service method, an electronic device, a storage medium, and a program product. It relates to the field of artificial intelligence and can be used in the field of financial technology. The method comprises: obtaining, according to a received voice instruction of a user, a target audio sequence corresponding to the user's voice when an abrupt change in sound intensity occurs; dividing the target audio sequence according to neural oscillation frequency bands to obtain a plurality of sub-audio sequences, and extracting the Mel-frequency cepstral coefficients (MFCCs) of the plurality of sub-audio sequences; inputting the plurality of sub-audio sequences and their MFCCs into a spiking neural network to obtain the current emotion of the user and a confidence corresponding to the current emotion; determining the intensity of the current emotion according to that confidence; determining a reply strategy for the voice instruction based on the intensity of the current emotion; and responding to the voice instruction using the reply strategy. The reply strategy is thereby adaptively adjusted based on the intensity of the user's emotion change, improving the efficiency of intelligent customer service.

Inventors

LIU YAO

Assignees

Industrial and Commercial Bank of China Limited (中国工商银行股份有限公司)

Dates

Publication Date: 2026-05-12
Application Date: 2026-01-26

Claims (10)

1. An intelligent customer service method, characterized in that the method comprises the following steps: acquiring a target audio sequence of a user according to a received voice instruction of the user, wherein the target audio sequence is the audio sequence corresponding to the user's voice when an abrupt change in sound intensity occurs; dividing the target audio sequence according to neural oscillation frequency bands to obtain a plurality of sub-audio sequences, and extracting the Mel-frequency cepstral coefficients (MFCCs) of the plurality of sub-audio sequences; inputting the plurality of sub-audio sequences and the MFCCs respectively corresponding to the plurality of sub-audio sequences into a spiking neural network to obtain the current emotion of the user and a confidence corresponding to the current emotion; determining the intensity of the user's current emotion according to the confidence corresponding to the current emotion; and determining a reply strategy for the voice instruction based on the intensity of the user's current emotion, and responding to the voice instruction using the reply strategy.
2. The method of claim 1, wherein acquiring the target audio sequence of the user comprises: acquiring the sound intensity corresponding to the audio sequence of the user at the current moment; comparing the sound intensity difference between the audio sequence at the current moment and the audio sequence at the previous moment with a sound intensity threshold; and if the sound intensity difference is greater than the sound intensity threshold, determining that an abrupt change in sound intensity has occurred in the user's voice, and determining the audio sequence at the current moment as the target audio sequence.
3. The method of claim 1, wherein the spiking neural network comprises an input layer, a hidden layer, and an output layer, the hidden layer comprising a plurality of first neurons, the output layer comprising a plurality of second neurons, each first neuron establishing a synaptic connection with each of the plurality of second neurons, and the plurality of second neurons corresponding to different emotions respectively; and wherein inputting the plurality of sub-audio sequences and the MFCCs respectively corresponding to the plurality of sub-audio sequences into the spiking neural network to obtain the current emotion of the user and the confidence corresponding to the current emotion comprises: inputting the plurality of sub-audio sequences and the MFCCs respectively corresponding to the plurality of sub-audio sequences into the hidden layer through the input layer; performing, by the plurality of first neurons in the hidden layer using a leaky integrate-and-fire (LIF) model, feature extraction on the plurality of sub-audio sequences and their respective MFCCs to obtain a plurality of spatiotemporal spike features, and transmitting the plurality of spatiotemporal spike features to the corresponding second neurons of the output layer through the respective synaptic connections; and after each second neuron in the output layer converts the spatiotemporal spike features into a spike frequency, determining the emotion corresponding to a second target neuron as the current emotion, and determining the spike frequency corresponding to the second target neuron as the confidence of the current emotion, wherein the second target neuron is the second neuron with the largest spike frequency.
4. The method of claim 3, wherein the intelligent customer service method is applied to a federated server cluster comprising a plurality of first servers and a second server, the method is executed by any one of the first servers, and a spiking neural network is built in each first server; after determining the intensity of the user's current emotion according to the confidence corresponding to the current emotion, the method further comprises: determining a weight update amount according to the intensity of the user's current emotion, wherein the weight update amount is used for updating the weight of each synapse in the spiking neural network; adding random noise to the weight update amount to obtain a first intermediate weight update amount; encrypting the first intermediate weight update amount using a preset public key to obtain an encrypted first intermediate weight update amount, wherein the second server is configured to aggregate the encrypted first intermediate weight update amounts sent by the plurality of first servers to obtain an encrypted target weight update amount for the synapses, and to send the encrypted target weight update amount to each first server in the federated server cluster; and after receiving the encrypted target weight update amount sent by the second server, decrypting the encrypted target weight update amount using a private key stored by the second server, and updating the current weight of each synapse in the spiking neural network based on the encrypted target weight update amount.
5. The method of claim 4, wherein updating the weight of each synapse in the spiking neural network based on the encrypted target weight update amount comprises: multiplying the encrypted target weight update amount by a preset federated learning rate to obtain an intermediate value; and, for each synapse, adding the intermediate value to the current weight of the synapse to obtain the updated weight of the synapse.
6. The method of claim 4, wherein determining the weight update amount according to the intensity of the user's current emotion comprises: multiplying the intensity of the current emotion, a synaptic time difference, and a preset base learning rate to obtain the weight update amount; wherein the synaptic time difference is the difference between the firing time of the first neuron of a target synapse and the firing time of the second neuron of the target synapse, and the target synapse is a synapse connected to the second neuron with the largest spike frequency.
7. The method of any of claims 1-6, wherein determining the reply strategy for the voice instruction based on the intensity of the user's current emotion comprises: if the intensity of the current emotion is greater than a first specified threshold, determining that the reply strategy is to transfer the voice instruction to manual processing; if the intensity of the current emotion is not greater than the first specified threshold and is greater than a second specified threshold, determining that the reply strategy is a simplified script strategy, wherein the first specified threshold is greater than the second specified threshold; and if the intensity of the current emotion is not greater than the second specified threshold, determining that the reply strategy is a standard script strategy.
8. An electronic device, the electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the intelligent customer service method of any of claims 1-7.
9. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, the computer instructions for causing a processor to perform the intelligent customer service method of any one of claims 1-7 when executed.
10. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the intelligent customer service method according to any of claims 1-7.
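
The arithmetic recited in claims 5 to 7 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The function names, the enum, and all numeric defaults (thresholds 0.8 and 0.5, learning rates 0.01 and 0.1) are hypothetical values chosen for illustration. The sign convention of the synaptic time difference (first-neuron firing time minus second-neuron firing time) is an assumption, and the noise addition, encryption, and aggregation of claim 4 are deliberately omitted, so the update is applied to a plaintext value.

```python
from enum import Enum


class ReplyStrategy(Enum):
    """Hypothetical labels for the three reply strategies of claim 7."""
    MANUAL_HANDOFF = "manual"
    SIMPLIFIED_SCRIPT = "simplified"
    STANDARD_SCRIPT = "standard"


def select_reply_strategy(intensity: float,
                          first_threshold: float = 0.8,
                          second_threshold: float = 0.5) -> ReplyStrategy:
    """Claim 7: pick a strategy from the emotion intensity.

    The threshold defaults are illustrative assumptions; the claim only
    requires first_threshold > second_threshold.
    """
    if not first_threshold > second_threshold:
        raise ValueError("first_threshold must exceed second_threshold")
    if intensity > first_threshold:
        return ReplyStrategy.MANUAL_HANDOFF
    if intensity > second_threshold:
        return ReplyStrategy.SIMPLIFIED_SCRIPT
    return ReplyStrategy.STANDARD_SCRIPT


def local_weight_update(intensity: float, t_first: float, t_second: float,
                        base_lr: float = 0.01) -> float:
    """Claim 6: intensity * synaptic time difference * base learning rate.

    Assumes the time difference is t_first - t_second (firing time of the
    first neuron minus firing time of the second neuron).
    """
    return intensity * (t_first - t_second) * base_lr


def apply_federated_update(weight: float, target_update: float,
                           federated_lr: float = 0.1) -> float:
    """Claim 5: new weight = current weight + federated_lr * target update.

    Operates on a plaintext update; the encryption of claim 4 is omitted.
    """
    return weight + federated_lr * target_update


if __name__ == "__main__":
    print(select_reply_strategy(0.9))
    delta = local_weight_update(0.5, t_first=12.0, t_second=10.0)
    print(apply_federated_update(1.0, delta))
```

Because each comparison in claim 7 is "greater than", an intensity exactly equal to a threshold falls into the lower tier in this sketch.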

Description

Intelligent customer service method, electronic device, storage medium and program product

Technical Field

The present invention relates to the field of artificial intelligence, and in particular to an intelligent customer service method, an electronic device, a storage medium, and a program product.

Background

With the rapid development of artificial intelligence technology, intelligent customer service has become an important link between users and services, and, thanks to its strong automated processing capability and fast service response, it is gradually replacing the traditional human customer service model. Using natural language processing and machine learning algorithms, an intelligent customer service system can receive user inquiries in real time, quickly analyze the user's intent, and provide corresponding solutions or services.

In the prior art, intelligent customer service systems mainly rely on centralized cloud processing. The workflow mainly comprises collecting speech at the client, transmitting it to a cloud server, performing emotion analysis based on a machine learning model, and returning the processing result. However, this approach can only perform simple, basic emotion recognition, responds slowly to changes such as a sudden increase in the customer's volume, and cannot tailor its replies to the individual customer's reaction.

Disclosure of Invention

Embodiments of the invention provide an intelligent customer service method, an electronic device, a storage medium, and a program product, which are used to identify subtle changes in a user's emotion during a conversation, adaptively adjust the reply strategy based on the intensity of the user's emotion change, and improve the efficiency of intelligent customer service.
According to an aspect of the embodiments of the present invention, there is provided an intelligent customer service method, including: acquiring a target audio sequence of a user according to a received voice instruction of the user, wherein the target audio sequence is the audio sequence corresponding to the user's voice when an abrupt change in sound intensity occurs; dividing the target audio sequence according to neural oscillation frequency bands to obtain a plurality of sub-audio sequences, and extracting the MFCCs (Mel-frequency cepstral coefficients) of the plurality of sub-audio sequences; inputting the plurality of sub-audio sequences and the MFCCs respectively corresponding to the plurality of sub-audio sequences into a spiking neural network to obtain the current emotion of the user and a confidence corresponding to the current emotion; determining the intensity of the user's current emotion according to the confidence corresponding to the current emotion; and determining a reply strategy for the voice instruction based on the intensity of the user's current emotion, and responding to the voice instruction using the reply strategy.

In a possible embodiment, acquiring the target audio sequence of the user includes: acquiring the sound intensity corresponding to the audio sequence of the user at the current moment; comparing the sound intensity difference between the audio sequence at the current moment and the audio sequence at the previous moment with a sound intensity threshold; and if the sound intensity difference is greater than the sound intensity threshold, determining that an abrupt change in sound intensity has occurred in the user's voice, and determining the audio sequence at the current moment as the target audio sequence.
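
The detection of an abrupt change in sound intensity described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the source does not say how sound intensity is measured, so the sketch assumes the RMS level of a frame in decibels. The names `rms_db` and `find_target_frame` and the 6 dB default threshold are hypothetical. It also assumes the difference is taken as current minus previous, so only increases in intensity trigger detection.

```python
import math
from typing import Optional, Sequence


def rms_db(frame: Sequence[float], eps: float = 1e-12) -> float:
    """Sound intensity of one non-empty frame as an RMS level in dB.

    Assumed measure; the source only speaks of "sound intensity".
    """
    mean_square = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(mean_square + eps)


def find_target_frame(frames: Sequence[Sequence[float]],
                      threshold_db: float = 6.0) -> Optional[int]:
    """Return the index of the first frame whose intensity exceeds the
    previous frame's intensity by more than threshold_db, or None if no
    abrupt change is found. The frame at that index plays the role of the
    target audio sequence.
    """
    previous = None
    for index, frame in enumerate(frames):
        current = rms_db(frame)
        if previous is not None and current - previous > threshold_db:
            return index
        previous = current
    return None


if __name__ == "__main__":
    quiet = [0.01] * 160   # low-amplitude frame
    loud = [0.5] * 160     # high-amplitude frame
    print(find_target_frame([quiet, quiet, loud]))
```

Comparing consecutive frames rather than comparing each frame against a fixed level means the check reacts to a sudden jump relative to the speaker's own baseline, which matches the "current moment versus previous moment" wording of the embodiment.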
In a possible embodiment, the spiking neural network includes an input layer, a hidden layer, and an output layer, the hidden layer including a plurality of first neurons, the output layer including a plurality of second neurons, each first neuron establishing synaptic connections with the plurality of second neurons, and the plurality of second neurons corresponding to different emotions respectively; inputting the plurality of sub-audio sequences and the MFCCs respectively corresponding to the plurality of sub-audio sequences into the spiking neural network to obtain the current emotion of the user and the confidence corresponding to the current emotion includes: inputting the plurality of sub-audio sequences and the MFCCs respectively corresponding to the plurality of sub-audio sequences into the hidden layer through the input layer; performing, by the plurality of first neurons in the hidden layer using a leaky integrate-and-fire (LIF) model, feature extraction on the plurality of sub-audio sequences and their respective MFCCs to obtain a plurality of spatiotemporal spike features, and transmitting the plurality of spatiotemporal spike features to second neurons correspond