CN-121999767-A - Iterative updating method, device, system, vehicle and server for voice application


Abstract

The invention discloses an iterative updating method, device, system, vehicle and server for a voice application, relating to the technical field of voice recognition. The method is applied to rail transit vehicles and comprises: after the current voice application is started, acquiring an original voice data stream captured by a vehicle-mounted sound pickup; performing voice recognition on the original voice data stream using the voice recognition model of the current voice application to obtain voice recognition results; generating and storing a model iteration data set according to the voice recognition results and the recognized voice data corresponding to each voice recognition result; and sending the model iteration data set to a cloud server.

Inventors

HU YUNQING
LIU REN
LIU YUE
LUO XIAO

Assignees

中车株洲电力机车研究所有限公司 (CRRC Zhuzhou Electric Locomotive Research Institute Co., Ltd.)

Dates

Publication Date: 20260508
Application Date: 20241106

Claims (15)

1. An iterative updating method for a voice application, applied to a rail transit vehicle, comprising: after a current voice application is started, acquiring an original voice data stream captured by a vehicle-mounted sound pickup, wherein the current voice application is a voice application deployed in the rail transit vehicle; performing voice recognition on the original voice data stream using the voice recognition model of the current voice application to obtain voice recognition results; generating and storing a model iteration data set according to the voice recognition results and the recognized voice data corresponding to each voice recognition result; and sending the model iteration data set to a cloud server, so that the cloud server uses the model iteration data set to perform transfer learning on the voice recognition model of the current voice application and obtains an iterative voice recognition model.
2. The iterative updating method for a voice application according to claim 1, wherein performing voice recognition on the original voice data stream using the voice recognition model of the current voice application to obtain voice recognition results comprises: converting the data format of the original voice data stream into a preset data format to obtain a preprocessed voice data stream; and performing voice recognition on the preprocessed voice data stream using the voice recognition model of the current voice application to obtain the voice recognition results.
3. The iterative updating method for a voice application according to claim 2, wherein generating and storing the model iteration data set according to the voice recognition results and the recognized voice data corresponding to each voice recognition result comprises: acquiring current recognized voice data corresponding to a current voice recognition result, wherein the current voice recognition result is any one of the voice recognition results, and the recognized voice data is original voice data in the original voice data stream and/or preprocessed voice data in the preprocessed voice data stream; and labeling the current recognized voice data with the current voice recognition result to obtain labeled voice data, and storing the labeled voice data in the model iteration data set.
4. The iterative updating method for a voice application according to claim 1, wherein generating and storing the model iteration data set according to the voice recognition results and the recognized voice data corresponding to each voice recognition result comprises: generating and storing the model iteration data set according to the voice recognition results and the recognized voice data and recognition error information corresponding to each voice recognition result.
5. The iterative updating method for a voice application according to claim 4, further comprising: executing the vehicle control operation corresponding to an acquired current voice recognition result; detecting whether a preset cancel operation corresponding to the vehicle control operation is executed within a preset time period; if so, determining that the recognition error information corresponding to the current voice recognition result indicates a voice recognition error; and if not, determining that the recognition error information corresponding to the current voice recognition result indicates correct voice recognition.
6. The iterative updating method for a voice application according to claim 1, further comprising, after sending the model iteration data set to the cloud server: deploying and updating the current voice application according to an application package, received from the cloud server, that corresponds to the iterative voice recognition model.
7. An iterative updating device for a voice application, applied to a rail transit vehicle, comprising: a data acquisition module, configured to acquire, after a current voice application is started, an original voice data stream captured by a vehicle-mounted sound pickup; a model inference module, configured to perform voice recognition on the original voice data stream using the voice recognition model of the current voice application to obtain voice recognition results; a data storage module, configured to generate and store a model iteration data set according to the voice recognition results and the recognized voice data corresponding to each voice recognition result; and a data uploading module, configured to send the model iteration data set to a cloud server, so that the cloud server uses the model iteration data set to perform transfer learning on the voice recognition model of the current voice application and obtains an iterative voice recognition model.
8. A rail transit vehicle, comprising: a memory for storing a computer program; and a processor for implementing, when executing the computer program, the steps of the iterative updating method for a voice application according to any one of claims 1 to 6.
9. An iterative updating method for a voice application, applied to a cloud server, comprising: receiving a model iteration data set sent by each rail transit vehicle that uses a current voice application, wherein the model iteration data set comprises pieces of recognized voice data and the voice recognition result corresponding to each piece of recognized voice data, and the current voice application is a current-version voice application deployed in the rail transit vehicle; performing transfer learning on a target voice recognition model using the model iteration data set to obtain an iterative voice recognition model, wherein the target voice recognition model is the voice recognition model corresponding to the current voice application; and generating an updated-version application package corresponding to the iterative voice recognition model, so that the rail transit vehicle uses the application package to deploy the voice application corresponding to the iterative voice recognition model.
10. The iterative updating method for a voice application according to claim 9, further comprising: acquiring an initial voice keyword data set, wherein the initial voice keyword data set comprises pieces of Chinese keyword voice data and the voice recognition keyword corresponding to each piece of Chinese keyword voice data; training an initial Chinese voice keyword recognition model using the initial voice keyword data set to obtain the voice recognition model; and generating an initial-version application package corresponding to the voice recognition model.
11. The iterative updating method for a voice application according to claim 10, wherein acquiring the initial voice keyword data set comprises: acquiring original Chinese voice data; performing data cleaning on the original Chinese voice data to obtain cleaned voice data; generating a spectrogram corresponding to the cleaned voice data; obtaining, according to an acquired annotation instruction corresponding to the spectrogram, annotated voice data in the cleaned voice data and the voice recognition keyword corresponding to each piece of annotated voice data; and generating the initial voice keyword data set according to the annotated voice data and the voice recognition keywords corresponding to the annotated voice data.
12. The iterative updating method for a voice application according to claim 9, further comprising: pushing, according to an acquired voice application deployment instruction, an application package of the specified version corresponding to the voice application deployment instruction to a target rail transit vehicle, so that the target rail transit vehicle uses the application package of the specified version to deploy and install the voice application of the specified version, wherein the target rail transit vehicle is the rail transit vehicle corresponding to the voice application deployment instruction.
13. An iterative updating device for a voice application, applied to a cloud server, comprising: a data receiving module, configured to receive a model iteration data set sent by each rail transit vehicle that uses a current voice application, wherein the model iteration data set comprises pieces of recognized voice data and the voice recognition result corresponding to each piece of recognized voice data, and the current voice application is a current-version voice application deployed in the rail transit vehicle; a model iteration module, configured to perform transfer learning on a target voice recognition model using the model iteration data set to obtain an iterative voice recognition model, wherein the target voice recognition model is the voice recognition model corresponding to the current voice application; and an application package updating module, configured to generate an updated-version application package corresponding to the iterative voice recognition model, so that the rail transit vehicle uses the application package to deploy the voice application corresponding to the iterative voice recognition model.
14. A cloud server, comprising: a memory for storing a computer program; and a processor for implementing, when executing the computer program, the steps of the iterative updating method for a voice application according to any one of claims 9 to 12.
15. An iterative update system for a voice application comprising the rail transit vehicle of claim 8 and the cloud server of claim 14.
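
For illustration only, the cancel-window check described in claim 5 might be sketched as follows. This is a minimal sketch and not the claimed implementation: the names `ControlEvent` and `label_recognition`, the timestamp representation, and the 5-second default window are all assumptions, since the claims specify only "a preset time period".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlEvent:
    """A vehicle control operation triggered by one voice recognition result.

    All fields are hypothetical; the claims do not define a data layout.
    """
    recognition_result: str               # e.g. the recognized keyword
    executed_at: float                    # time the control operation ran, in seconds
    cancelled_at: Optional[float] = None  # time a matching cancel operation was seen, if any


def label_recognition(event: ControlEvent, cancel_window_s: float = 5.0) -> str:
    """Return "error" if the operation was cancelled inside the window, else "correct"."""
    if event.cancelled_at is None:
        return "correct"
    elapsed = event.cancelled_at - event.executed_at
    # A cancel inside the preset period is treated as evidence of misrecognition.
    return "error" if 0.0 <= elapsed <= cancel_window_s else "correct"
```

Under these assumptions, an operation cancelled two seconds after execution is labeled "error", while one that is never cancelled, or cancelled only after the window has elapsed, is labeled "correct".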

Description

Iterative updating method, device, system, vehicle and server for voice application

Technical Field

The invention relates to the technical field of voice recognition, and in particular to an iterative updating method, device and system for a voice application, a rail transit vehicle, and a cloud server.

Background

AI (artificial intelligence) voice applications are among the most representative intelligent technologies today. They are widely used in public security, transportation, manufacturing, finance, healthcare, education and other fields, and strongly support the upgrading of industries across society. However, constrained by several factors, voice applications are difficult to develop and update quickly.

At present, because AI voice application scenarios are diverse, requirements are highly fragmented, and practitioners' programming skills are limited, AI applications are developed and iterated slowly and their potential cannot be fully exploited. In response, over the past few years many internet technology companies at home and abroad have launched mature intelligent voice application services, and the results have been applied in engineering at scale. In the rail transit field, however, problems such as insufficient service data, slow integration of upstream and downstream resources, and weak terminal computing power seriously restrict the rapid engineering deployment of intelligent voice products for rail transit.

Therefore, how to achieve rapid iterative updating of voice applications in rail transit scenarios, improve their applicability in those scenarios, and accelerate the engineering deployment of intelligent rail transit voice products is a problem that urgently needs to be solved.

Disclosure of Invention

The invention aims to provide an iterative updating method, device and system for a voice application, a rail transit vehicle, and a cloud server, so as to achieve rapid iterative updating of voice applications in rail transit scenarios and improve their applicability in those scenarios.

To solve the above technical problem, the invention provides an iterative updating method for a voice application, applied to a rail transit vehicle and comprising: after a current voice application is started, acquiring an original voice data stream captured by a vehicle-mounted sound pickup, wherein the current voice application is a voice application deployed in the rail transit vehicle; performing voice recognition on the original voice data stream using the voice recognition model of the current voice application to obtain voice recognition results; generating and storing a model iteration data set according to the voice recognition results and the recognized voice data corresponding to each voice recognition result; and sending the model iteration data set to a cloud server, so that the cloud server uses the model iteration data set to perform transfer learning on the voice recognition model of the current voice application and obtains an iterative voice recognition model.

In another aspect, performing voice recognition on the original voice data stream using the voice recognition model of the current voice application to obtain voice recognition results includes: converting the data format of the original voice data stream into a preset data format to obtain a preprocessed voice data stream; and performing voice recognition on the preprocessed voice data stream using the voice recognition model of the current voice application to obtain the voice recognition results.
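
As a rough illustration of the vehicle-side steps above (format conversion, recognition, and building the model iteration data set), consider the following minimal sketch. It rests on several assumptions that the text does not state: the "preset data format" is taken to be normalized floating-point samples converted from 16-bit little-endian PCM, the recognition model is represented by an arbitrary callable, and the names `to_preset_format`, `LabeledSample`, `ModelIterationDataset` and `run_pipeline` are hypothetical. Uploading to the cloud server is omitted.

```python
import struct
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


def to_preset_format(raw_chunk: bytes) -> List[float]:
    """Convert 16-bit little-endian PCM bytes to floats in [-1.0, 1.0).

    The target format is an assumption; the source only says "preset data format".
    """
    n = len(raw_chunk) // 2  # number of complete 16-bit samples
    samples = struct.unpack("<%dh" % n, raw_chunk[: 2 * n])
    return [s / 32768.0 for s in samples]


@dataclass
class LabeledSample:
    audio: List[float]  # preprocessed voice data
    label: str          # voice recognition result used as the label


@dataclass
class ModelIterationDataset:
    samples: List[LabeledSample] = field(default_factory=list)


def run_pipeline(
    stream: Iterable[bytes],
    recognize: Callable[[List[float]], str],
) -> ModelIterationDataset:
    """Preprocess each chunk, recognize it, and store the labeled pair."""
    dataset = ModelIterationDataset()
    for chunk in stream:
        audio = to_preset_format(chunk)
        result = recognize(audio)  # stand-in for the deployed recognition model
        dataset.samples.append(LabeledSample(audio=audio, label=result))
    return dataset
```

In this sketch each chunk of the stream yields exactly one recognition result; a real keyword recognizer would more likely segment the stream and emit results only when a keyword is detected.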

In another aspect, generating and storing the model iteration data set according to the voice recognition results and the recognized voice data corresponding to each voice recognition result includes: acquiring current recognized voice data corresponding to a current voice recognition result, wherein the current voice recognition result is any one of the voice recognition results, and the recognized voice data is original voice data in the original voice data stream and/or preprocessed voice data in the preprocessed voice data stream; and labeling the current recognized voice data with the current voice recognition result to obtain labeled voice data, and storing the labeled voice data in the model iteration data set.

In another aspect, generating and storing the model iteration data set according to the voice recognition results and the recognized voice data corresponding to each voice recognition result includes: generating and storing the model iteration data set according to the voice recognition results and the recognized voice data and recognition error information corresponding to each voice recogniti