CN-121983049-A - Man-machine interaction method of intelligent robot

Abstract

The invention provides a man-machine interaction method of an intelligent robot and belongs to the technical field of train maintenance. The method comprises: constructing a corpus database of man-machine interaction instructions for an intelligent cleaning robot in a subway train cleaning scene, and fine-tuning a pre-trained language model with the corpus database; and building a cloud-edge-end collaborative architecture in which an intelligent control terminal serves as the cloud end, receives input voice instructions, converts them into text instruction data through a speech recognition model, and inputs the text into the fine-tuned pre-trained language model to generate executable structured control instructions, while the cleaning robot serves as an edge computing node that executes the corresponding tasks and feeds the task execution state back to the intelligent control terminal. By combining cloud-edge-end collaborative distribution of intelligence, end-to-end direct semantic mapping and data-driven closed-loop evolution, the invention provides a highly intelligent, highly reliable and highly available voice interaction solution for the cleaning robot.

Inventors

LIU SHAOYUAN
ZHANG XING
ZHANG JIAZHI
DU YONGJUN
TAN ZHONGBIN
DENG NA

Assignees

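Guangdong Huaneng Electromechanical Group Co., Ltd. (广东华能机电集团有限公司)
Hunan University Guangdong-Hong Kong-Macao Greater Bay Area Innovation Research Institute, Guangzhou Zengcheng (湖大粤港澳大湾区创新研究院(广州增城))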

Dates

Publication Date: 2026-05-05
Application Date: 2026-02-04

Claims (10)

1. A man-machine interaction method of an intelligent robot, characterized by comprising the following steps: Step S1, constructing a corpus database of man-machine interaction instructions for an intelligent cleaning robot in a subway train cleaning scene, and fine-tuning a pre-trained language model with the corpus database; Step S2, building a cloud-edge-end collaborative architecture, wherein an intelligent control terminal serves as the cloud end, receives an input voice instruction, converts the voice instruction into text instruction data through a speech recognition model, and inputs the text instruction data into the fine-tuned pre-trained language model to generate an executable structured control instruction, and a cleaning robot serves as an edge computing node, receives the control instruction, executes the corresponding task, and feeds the task execution state back to the intelligent control terminal.
2. The man-machine interaction method of an intelligent robot according to claim 1, wherein the construction process of the corpus database comprises the following steps: Step S11, collecting corpus data in a subway train cleaning scene, wherein the corpus data comprises actual operation instructions from historical cleaning operations, typical phrasing used in equipment fault handling, and diversified instructions generated by manual simulation; Step S12, cleaning the collected corpus data by combining rule-based filtering with manual review; Step S13, labeling the cleaned corpus data, wherein the labeled attributes comprise two dimensions, intention and entity; Step S14, augmenting the corpus data by means of synonym substitution, sentence pattern conversion and noise addition.
3. The man-machine interaction method of an intelligent robot according to claim 2, wherein in step S13 the labeling tool is Label Studio and the labeling accuracy requires an entity identification deviation of less than or equal to 1, wherein the intention is divided into task assignment, parameter adjustment, state inquiry and fault processing, and the entity comprises carriage number, operation part, cleaning grade and parameter threshold.
4. The human-machine interaction method of an intelligent robot according to claim 1, wherein the pre-training language model is a BERT-base model.
5. The man-machine interaction method of an intelligent robot according to claim 3, wherein in the fine-tuning process a cross-entropy loss function is used to calculate the deviation between the predicted intention and the labeled intention, AdamW is selected as the optimizer, the learning rate is set to 1e-5, the number of training epochs is 30, the batch size is 64, and an early stopping strategy is adopted, so that the model is adapted to the instruction parsing requirements of the subway cleaning scene.
6. The man-machine interaction method of an intelligent robot according to claim 1, wherein the intelligent control terminal and the cleaning robot communicate over dual links, 5G and Wi-Fi 6.
7. The man-machine interaction method of an intelligent robot according to claim 1, wherein the cleaning robot is configured with a PPO multi-task reinforcement learning model, and the behavior strategy of the cleaning robot is dynamically adjusted according to a reward signal.
8. The man-machine interaction method of an intelligent robot according to claim 7, wherein the structure of the PPO multi-task reinforcement learning model is as follows: the state space is a state vector […], wherein D is a quantified value of the dirt degree of the area to be cleaned, T is the operation duration of the cleaning robot, E is the accumulated energy consumption of the cleaning robot, and […] is an encoding of the running state of the cleaning robot; the action space is an action vector […], wherein […] is the joint angle of the mechanical arm of the cleaning robot, […] is the cleaning pressure of the cleaning robot, and […] is the moving speed of the cleaning robot; the policy network adopts a lightweight architecture of "3 fully connected layers and 1 output layer", wherein the input layer takes an 8-dimensional state vector, the hidden layers have 64, 128 and 64 neurons in sequence with a ReLU activation function, and the output layer produces a probability distribution over the action space with a Softmax activation function; a multi-level reward mechanism is set, and the mathematical expression and weight-setting logic of the reward function are as follows: […]; in the formula, […] denotes the base reward, which is 1 when cleaning of the designated area is completed and 0 otherwise; […] denotes the efficiency reward, […], wherein […] is the cleaning time of the area and […] is the preset standard operation time, a higher value representing higher operating efficiency; […] denotes the quality reward, […], wherein […] is the area cleaning coverage rate and […] is the dirt residual rate, a higher value representing better cleaning quality; […] denotes the energy-saving reward, […], wherein […] is the actual energy consumption and […] is the standard energy consumption, a higher value representing better energy consumption control; and the four weight coefficients […] are dynamically adjusted according to the job priority.
9. The man-machine interaction method of an intelligent robot, characterized in that a federated learning model is deployed at the intelligent control terminal, and data sharing and collaborative model optimization among a plurality of cleaning robots are achieved based on federated learning; after each edge node completes model training locally, only the model parameters are uploaded to the intelligent control terminal, and the intelligent control terminal generates a global optimized model as an average of the parameters weighted by the data volume of each node; for data-heterogeneous scenarios, the FedProx algorithm is introduced, which mitigates model drift by adding a proximal-term constraint and improves model adaptability.
10. The man-machine interaction method of an intelligent robot according to claim 1, wherein in the execution state feedback stage the execution state is announced by voice by means of a speech synthesis model.
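
The aggregation described in claim 9 can be illustrated with a minimal sketch. It is not the applicant's implementation: the function names `fedavg` and `fedprox_step`, the toy quadratic loss, the sample counts, the learning rate and the proximal coefficient `mu` are all assumptions chosen only to make the example runnable. The sketch shows the two ideas named in the claim: a global model formed as a data-volume-weighted average of client parameters, and a proximal term that pulls each local model toward the global model when local data differ.

```python
def fedavg(client_params, client_sizes):
    """Average client parameter vectors, weighted by local data volume."""
    total = float(sum(client_sizes))
    dim = len(client_params[0])
    return [
        sum(n * params[i] for params, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


def fedprox_step(w, w_global, local_grad, lr=0.1, mu=0.5):
    """One local gradient step on F_k(w) + (mu / 2) * ||w - w_global||^2.

    The proximal term contributes mu * (w - w_global) to the gradient, which
    pulls the local model back toward the current global model.
    """
    return [
        wi - lr * (gi + mu * (wi - wg))
        for wi, wg, gi in zip(w, w_global, local_grad)
    ]


# Toy round with two robots whose local optima differ (heterogeneous data).
w_global = [0.0, 0.0]
targets = [[1.0, 0.0], [0.0, 3.0]]  # hypothetical local optima
sizes = [100, 300]                  # hypothetical local sample counts

local_models = []
for target in targets:
    w = list(w_global)
    for _ in range(50):
        # Gradient of the assumed local loss 0.5 * ||w - target||^2.
        grad = [wi - ti for wi, ti in zip(w, target)]
        w = fedprox_step(w, w_global, grad)
    local_models.append(w)

# Only parameters are sent to the terminal, which aggregates them.
new_global = fedavg(local_models, sizes)
print(new_global)
```

With `mu = 0` the local step reduces to plain gradient descent, so the same code also covers the unconstrained weighted-average case; a larger `mu` keeps local models closer to the global one at the cost of slower local adaptation.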

Description

Man-machine interaction method of intelligent robot

Technical Field

The invention belongs to the technical field of train maintenance, and particularly relates to a man-machine interaction method of an intelligent robot.

Background

In the daily maintenance of subway trains, cleaning the exterior and the underside of the train is an important step. The conventional approach relies mainly on workers manually operating equipment such as high-pressure water guns in the maintenance depot. This approach suffers from high labor intensity, a cleaning result that depends on worker experience, low operating efficiency, and safety hazards in the narrow space under the vehicle. In recent years, rail-mounted or mobile cleaning robots have begun to replace manual work and perform automated cleaning of the train exterior. However, in practical maintenance scenarios, the man-machine interaction of existing robots has obvious limitations. On the one hand, the operation process depends heavily on preset programs; when facing stubborn stains, complex vehicle body structures, or local areas that need focused cleaning, the robot lacks an interaction means that lets operators intervene conveniently and exercise accurate, flexible control. On the other hand, within a tight maintenance window, operators find it difficult to grasp the robot's real-time operating state, cleaning coverage rate and abnormal conditions quickly and intuitively, which affects overall operating efficiency and coordination. Therefore, it is necessary to provide a man-machine interaction method of an intelligent robot to solve the above problems.
Disclosure of Invention

The invention provides a man-machine interaction method of an intelligent robot which, through the organic combination of cloud-edge-end collaborative distribution of intelligence, end-to-end direct semantic mapping and data-driven closed-loop evolution, provides a highly intelligent, highly reliable and highly available voice interaction solution for a cleaning robot, thereby effectively solving at least one of the technical problems described in the background.

To solve the above technical problems, the invention is realized as follows:

A man-machine interaction method of an intelligent robot comprises the following steps:

Step S1, constructing a corpus database of man-machine interaction instructions for an intelligent cleaning robot in a subway train cleaning scene, and fine-tuning a pre-trained language model with the corpus database;

Step S2, building a cloud-edge-end collaborative architecture, wherein the intelligent control terminal serves as the cloud end, receives an input voice instruction, converts the voice instruction into text instruction data through a speech recognition model, and inputs the text instruction data into the fine-tuned pre-trained language model to generate an executable structured control instruction, and the cleaning robot serves as an edge computing node that receives the control instruction, executes the corresponding task, and feeds the task execution state back to the intelligent control terminal.
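
The mapping in Step S2 from a text instruction to a structured control instruction can be sketched as follows. This is only an illustration under stated assumptions: the source does not specify the format of the control instruction, so the `ControlCommand` class, its JSON layout, the field names and the example vocabulary (for instance "bogie") are hypothetical, and the keyword rules in `parse_instruction` merely stand in for the fine-tuned language model so that the example runs without one. The intent labels follow the four intention categories named in the claims.

```python
import json
import re
from dataclasses import asdict, dataclass, field

# Intent labels follow the four intention categories in the labeling scheme.
INTENTS = ("task_assignment", "parameter_adjustment", "state_inquiry", "fault_processing")


@dataclass
class ControlCommand:
    """Hypothetical structured control instruction sent to the robot."""
    intent: str
    entities: dict = field(default_factory=dict)

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


def parse_instruction(text: str) -> ControlCommand:
    """Stand-in for the fine-tuned language model (assumption: keyword rules).

    A real system would run the fine-tuned model here; these rules only
    make the example self-contained.
    """
    lowered = text.lower()
    if re.search(r"\b(status|progress)\b", lowered):
        intent = "state_inquiry"
    elif re.search(r"\b(set|increase|reduce)\b", lowered):
        intent = "parameter_adjustment"
    elif re.search(r"\b(fault|alarm)\b", lowered):
        intent = "fault_processing"
    else:
        intent = "task_assignment"

    entities = {}
    carriage = re.search(r"carriage\s+(\d+)", lowered)
    if carriage:
        entities["carriage_number"] = int(carriage.group(1))
    part = re.search(r"\b(bogie|underframe|side wall|roof)\b", lowered)
    if part:
        entities["operation_part"] = part.group(1)
    grade = re.search(r"\b(light|standard|deep)\s+clean", lowered)
    if grade:
        entities["cleaning_grade"] = grade.group(1)
    return ControlCommand(intent=intent, entities=entities)


# Text as it might arrive from the speech recognition model.
cmd = parse_instruction("Deep clean the bogie of carriage 3")
print(cmd.to_json())
```

Keeping the command as a small typed record that serializes to JSON means the terminal and the robot only need to agree on field names, whichever model produces the fields.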
As a preferred improvement, the construction process of the corpus database comprises the following steps:

Step S11, collecting corpus data in a subway train cleaning scene, wherein the corpus data comprises actual operation instructions from historical cleaning operations, typical phrasing used in equipment fault handling, and diversified instructions generated by manual simulation;

Step S12, cleaning the collected corpus data by combining rule-based filtering with manual review;

Step S13, labeling the cleaned corpus data, wherein the labeled attributes comprise two dimensions, intention and entity;

Step S14, augmenting the corpus data by means of synonym substitution, sentence pattern conversion and noise addition.

As a preferred improvement, in step S13, the labeling tool is Label Studio and the labeling accuracy requires an entity identification deviation of less than or equal to 1, wherein the intention is divided into task assignment, parameter adjustment, state inquiry and fault processing, and the entity comprises carriage number, operation part, cleaning grade and parameter threshold.

As a preferred improvement, the pre-trained language model is a BERT-base model.

As a preferred improvement, in the fine-tuning process a cross-entropy loss function is used to calculate the deviation between the predicted intention and the labeled intention, AdamW is selected as the optimizer, the learning rate is set to 1e-5, the number of training epochs is 30 and the batch size is 64, and an early stopping strategy is adopted, so that the model is adapted to the instruction parsing requirements of the subway cleaning scene.

As a preferred improvement, the intelligent control terminal and the cleaning robot communicate over dual links, 5G and Wi-Fi 6.

As a