CN-122024715-A - Voice control method, system, device, storage medium and program product
Abstract
The application discloses a voice control method, system, device, storage medium and program product, relating to the technical field at the intersection of industrial equipment control and artificial intelligence. The method comprises: acquiring voice data; performing text conversion and semantic analysis on the voice data through a pre-trained NLP semantic understanding model to obtain a semantic analysis result; generating a device control instruction by combining the semantic analysis result with a preset verification mechanism, and sending the device control instruction to a target device; and receiving the state or data returned after the target device executes the device control instruction, and generating result feedback based on the state or data. The method overcomes the limitation of traditional fixed-format instructions and lets users interact with industrial equipment in natural language; the preset verification mechanism, combined with semantic-to-instruction mapping, ensures operational safety; and the interaction covers multiple industrial scenarios.
Inventors
- DENG CHENDONG
- Wu Meipeng
- LI DONG
Assignees
- 荟普智能装备(深圳)有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2025-12-24
Claims (10)
- 1. A voice control method, characterized in that the voice control method comprises: acquiring voice data; performing text conversion and semantic analysis on the voice data through a pre-trained NLP semantic understanding model to obtain a semantic analysis result; generating a device control instruction by combining the semantic analysis result with a preset verification mechanism, and sending the device control instruction to a target device; and receiving the state or data returned after the target device executes the device control instruction, and generating result feedback based on the state or data.
- 2. The voice control method of claim 1, wherein the pre-trained NLP semantic understanding model comprises a pre-trained BERT model and a pre-trained BiLSTM-CRF network; and the step of performing text conversion and semantic analysis on the voice data through the pre-trained NLP semantic understanding model to obtain a semantic analysis result comprises: converting the voice data into text information through the pre-trained BERT model, and extracting semantic features from the text information; and inputting the semantic features into the pre-trained BiLSTM-CRF network, which performs intent classification and entity recognition to determine the user operation intention and the corresponding device control parameters and to generate the semantic analysis result.
- 3. The voice control method of claim 2, wherein the step of converting the voice data into text information through the pre-trained BERT model and extracting semantic features from the text information is preceded by the steps of: acquiring an industrial equipment interaction corpus, and jointly annotating intents and entities in the industrial equipment interaction corpus; constructing a corpus based on the annotated industrial equipment interaction corpus; and using the corpus to fine-tune the BERT model on industrial terminology and to train the BiLSTM-CRF network.
- 4. The voice control method according to claim 1, wherein the step of generating the device control instruction by combining the semantic analysis result with a preset verification mechanism comprises: performing voiceprint identity verification and key operation confirmation on the voice data according to the preset verification mechanism, and, after the verification is passed, mapping the semantic analysis result into the device operation instruction through a preset semantic mapping dictionary.
- 5. The voice control method according to claim 4, wherein the step of performing voiceprint identity verification and key operation confirmation on the voice data according to the preset verification mechanism, and mapping the semantic analysis result into the device operation instruction through a preset semantic mapping dictionary after the verification is passed, comprises: performing voiceprint comparison between the voice data and the voice features of a pre-registered user; if the voiceprint comparison result meets a preset matching threshold, judging, according to the semantic analysis result, whether the user operation intention in the semantic analysis result belongs to a key operation type; and if the user operation intention belongs to the key operation type, generating a voice prompt based on the semantic analysis result and, after receiving affirmative voice feedback returned by the user in response to the voice prompt, mapping synonyms and/or fuzzy expressions in the semantic analysis result into the device operation instruction through the preset semantic mapping dictionary.
- 6. The voice control method of claim 2, wherein, after the step of converting the voice data into text information through the pre-trained BERT model and extracting semantic features from the text information, the method further comprises: collecting user interaction data and labeling error cases in the user interaction data; and, based on the user interaction data and the corresponding error cases, performing incremental training on the BERT model with a low learning rate.
- 7. A voice control system, the voice control system comprising: a data acquisition module for acquiring voice data; a voice processing module for performing text conversion and semantic analysis on the voice data through a pre-trained NLP semantic understanding model to obtain a semantic analysis result; an instruction generation module for generating a device control instruction by combining the semantic analysis result with the voiceprint verification mechanism and sending the device control instruction to a target device; and a feedback output module for receiving the state or data returned after the target device executes the device control instruction and generating result feedback based on the state or data.
- 8. A voice control device, characterized in that the device comprises a memory, a processor and a computer program stored on the memory and executable on the processor, the computer program being configured to implement the steps of the voice control method according to any one of claims 1 to 6.
- 9. A storage medium, characterized in that the storage medium is a computer-readable storage medium on which a computer program is stored, which computer program, when executed by a processor, implements the steps of the voice control method according to any one of claims 1 to 6.
- 10. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the steps of the voice control method according to any one of claims 1 to 6.
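The verification flow recited in claims 4 and 5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The threshold value, the set of key intents, the dictionary entries, the use of cosine similarity for voiceprint comparison, and every identifier (`build_instruction`, `SemanticResult`, `confirm`) are assumptions made for illustration. The claims also do not state how non-key operations are handled; the sketch assumes they are mapped directly, without a confirmation prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Sequence

# Illustrative values only; none of these appear in the patent text.
MATCH_THRESHOLD = 0.8                      # assumed voiceprint matching threshold
KEY_INTENTS = {"stop", "set_parameter"}    # assumed "key operation" types
SEMANTIC_MAP = {                           # assumed semantic mapping dictionary
    "turn off": "STOP",
    "shut down": "STOP",
    "slow down": "SPEED_DOWN",
    "reduce the speed": "SPEED_DOWN",
}


@dataclass
class SemanticResult:
    intent: str        # classified user operation intention
    expression: str    # the user's wording (synonym or fuzzy expression)
    params: dict = field(default_factory=dict)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_instruction(
    voice_features: Sequence[float],
    enrolled_features: Sequence[float],
    result: SemanticResult,
    confirm: Callable[[str], bool],
) -> Optional[dict]:
    """Return a device operation instruction, or None if any check fails."""
    # 1. Voiceprint comparison against the pre-registered user's features.
    if cosine_similarity(voice_features, enrolled_features) < MATCH_THRESHOLD:
        return None
    # 2. Key operations need an affirmative answer to a prompt.
    if result.intent in KEY_INTENTS:
        if not confirm(f"Please confirm: {result.expression}?"):
            return None
    # 3. Map the user's wording to an operation through the dictionary.
    operation = SEMANTIC_MAP.get(result.expression.lower())
    if operation is None:
        return None
    return {"operation": operation, "params": result.params}
```

In this sketch `confirm` stands in for the voice prompt and the user's spoken reply; a real system would play the prompt and run the reply through the same recognition path.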
Description
Voice control method, system, device, storage medium and program product
Technical Field
The present application relates to the technical field at the intersection of industrial equipment control and artificial intelligence, and in particular to a voice control method, system, device, storage medium and program product.
Background
In the current context of industrial intelligent transformation, voice control is gradually being introduced into industrial equipment operation as an important means of human-machine interaction. However, existing voice control techniques for industrial equipment have significant limitations.
First, the command format is highly structured and fixed, requiring the user to strictly follow a preset grammar (such as 'device a_start_mode 1' or 'parameter_temperature_set_80 ℃'), which results in a high learning cost and makes it difficult for non-professional staff to get up to speed quickly.
Second, the semantic understanding capability of such systems is weak: relying only on keyword matching, they cannot effectively parse synonymous and fuzzy expressions in natural language (such as 'slow down the speed' or 'how much was produced today'), which limits the flexibility and practicality of interaction.
Third, the coverage of application scenarios is narrow: support is concentrated on basic operations such as starting and stopping equipment, and is lacking for complex tasks such as fault diagnosis, parameter calibration, and production data query and reporting.
Fourth, there are obvious shortcomings in security mechanisms and device compatibility: such systems lack a security policy based on identity authentication and secondary confirmation of key operations, and they adopt a closed architecture that is difficult to adapt to industrial devices of different brands and different communication protocols.
Although Natural Language Processing (NLP) technology has developed in recent years (for example, the pre-trained language model BERT and intent recognition algorithms) and offers a way to address the above problems, existing NLP technology still lacks targeted optimization for industrial equipment control scenarios: it does not account for the characteristics of industrial corpora, device protocol adaptation or operational safety requirements, and is therefore difficult to deploy directly.
The foregoing is provided merely to facilitate understanding of the technical solutions of the present application and is not an admission that the foregoing is prior art.
Disclosure of Invention
The main purpose of the application is to provide a voice control method, system, device, storage medium and program product, aiming to solve the technical problem of how to achieve interaction with industrial equipment that is natural, safe and covers multiple scenarios.
In order to achieve the above object, the present application provides a voice control method, characterized in that the voice control method includes: acquiring voice data; performing text conversion and semantic analysis on the voice data through a pre-trained NLP semantic understanding model to obtain a semantic analysis result; generating a device control instruction by combining the semantic analysis result with a preset verification mechanism, and sending the device control instruction to a target device; and receiving the state or data returned after the target device executes the device control instruction, and generating result feedback based on the state or data.
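The order of the four steps above can be sketched in Python as a single function that takes each stage as a callable. This is a hedged illustration only: the function name `run_voice_control`, the dictionary shapes, the feedback wording, and the three `fake_*` stand-ins are hypothetical and do not come from the application, which does not specify data formats or a device protocol.

```python
from typing import Callable, Optional


def run_voice_control(
    voice_data: bytes,
    parse: Callable[[bytes], dict],
    verify_and_build: Callable[[bytes, dict], Optional[dict]],
    send: Callable[[dict], dict],
) -> str:
    """Run one pass of the method and return feedback text for the user."""
    # Text conversion and semantic analysis (the NLP model is a stub here).
    semantic_result = parse(voice_data)
    # Combine the semantic result with the verification mechanism.
    instruction = verify_and_build(voice_data, semantic_result)
    if instruction is None:
        return "Request rejected by the verification mechanism."
    # Send the instruction and receive the returned state or data.
    returned = send(instruction)
    # Generate result feedback from the returned state or data.
    status = returned.get("status", "unknown")
    data = returned.get("data")
    return f"{instruction['operation']}: status={status}, data={data}"


# Hypothetical stand-ins for the model, the verifier, and the target device.
def fake_parse(voice_data: bytes) -> dict:
    return {"intent": "query_output", "params": {"period": "today"}}


def fake_verify(voice_data: bytes, result: dict) -> Optional[dict]:
    return {"operation": result["intent"], "params": result["params"]}


def fake_send(instruction: dict) -> dict:
    return {"status": "ok", "data": 1200}


if __name__ == "__main__":
    print(run_voice_control(b"", fake_parse, fake_verify, fake_send))
```

Passing the stages as callables keeps the sketch independent of any particular speech model or communication protocol, which matches the application's stated goal of adapting to devices from different vendors.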
In one embodiment, the pre-trained NLP semantic understanding model includes a pre-trained BERT model and a pre-trained BiLSTM-CRF network; and the step of performing text conversion and semantic analysis on the voice data through the pre-trained NLP semantic understanding model to obtain a semantic analysis result comprises: converting the voice data into text information through the pre-trained BERT model, and extracting semantic features from the text information; and inputting the semantic features into the pre-trained BiLSTM-CRF network, which performs intent classification and entity recognition to determine the user operation intention and the corresponding device control parameters and to generate the semantic analysis result.
In an embodiment, before the step of converting the voice data into text information through the pre-trained BERT model and extracting semantic features from the text information, the method further includes: acquiring an industrial equipment interaction corpus, and jointly annotating intents and entities in the industrial equipment interaction corpus; constructing a corpus based on the annotated industrial equipment interaction corpus; and using the corpus to fine-tune the BERT model on industrial terminology and to train the BiLSTM-CRF network.
In an embodiment