CN-121989236-A - Remote operation method and system for man-machine collaborative robot based on visual interaction
Abstract
The invention discloses a teleoperation method and system for a human-machine collaborative robot based on visual interaction, belonging to the technical field of robot control and interaction. Aiming at the problems that traditional teleoperation instructions are ambiguous, execution lacks verification, and network dependence is strong, the method comprises: displaying the robot's video stream on a client; determining a target's two-dimensional coordinates in response to the user's click operation on the video picture; mapping the coordinates into the three-dimensional environment data space; and resolving and verifying the robot's execution pose. The system allows the user to intuitively designate a target by frame selection or point selection, and the background automatically converts the two-dimensional interaction information into an accurate, feasibility-verified three-dimensional robot instruction. The invention ensures coordinate precision by introducing a static reference image, supports batch planning and submission through a task queue to reduce network dependence, and realizes safe, accurate, and highly robust human-machine collaborative operation.
Inventors
- LUO CHUN
- WANG FENGXU
- YANG DONG
- WANG XIANCHAO
Assignees
- 杭州宇岛人工智能科技有限公司
Dates
- Publication Date
- 2026-05-08
- Application Date
- 2026-01-27
Claims (10)
- 1. A teleoperation method for a human-machine collaborative robot based on visual interaction, characterized by comprising the following steps: displaying a video picture transmitted in real time by a robot end on a user interface; in response to a user's click operation on the video picture, determining the two-dimensional coordinates of the target in the image; spatially mapping the two-dimensional coordinates with three-dimensional environment data acquired by the robot, and resolving the execution pose of the target in the robot base coordinate system; and performing robot-executability verification on the execution pose.
- 2. The teleoperation method for a human-machine collaborative robot based on visual interaction of claim 1, wherein the click operation comprises a frame-selection operation, and the closed-region information defined by the frame-selection operation serves as the input for triggering the background to perform target recognition and three-dimensional pose calculation.
- 3. The teleoperation method for a human-machine collaborative robot based on visual interaction of claim 1, wherein operable instruction options are provided in the user interface immediately after the user completes the click operation.
- 4. The teleoperation method for a human-machine collaborative robot based on visual interaction of claim 1, wherein spatially mapping the two-dimensional coordinates with the three-dimensional environment data comprises: recording the corresponding video frame timestamp when the click operation occurs; acquiring synchronously captured depth image data according to the timestamp; and, based on the depth image data and the camera intrinsic parameters, back-projecting the two-dimensional coordinates to a three-dimensional spatial point in the robot base coordinate system.
- 5. The teleoperation method for a human-machine collaborative robot based on visual interaction of claim 1, wherein determining the two-dimensional coordinates of the target in the image comprises converting the display coordinates generated by the user's click operation into original-image pixel coordinates and performing normalization processing; and spatially mapping the two-dimensional coordinates with the three-dimensional environment data acquired by the robot comprises restoring the normalized coordinates to pixel coordinates before performing the spatial mapping.
- 6. The teleoperation method for a human-machine collaborative robot based on visual interaction of claim 1, wherein the executability verification comprises verification of at least one of: robot kinematic reachability, collision interference along the motion path, and mechanical stability of the operation process.
- 7. The teleoperation method for a human-machine collaborative robot based on visual interaction of claim 1, wherein tasks that pass the verification are added to an ordered task queue and displayed in the user interface, and, in response to an execution instruction, the robot is controlled to automatically execute the tasks in the queue in sequence.
- 8. The teleoperation method for a human-machine collaborative robot based on visual interaction of claim 7, wherein the task queue is constructed and managed locally on the user interface, and responding to the execution instruction comprises serializing the whole task queue, submitting it to the robot end as a single one-shot transaction, and having the robot end automatically execute the tasks in sequence.
- 9. The teleoperation method for a human-machine collaborative robot based on visual interaction of any one of claims 1-8, wherein, before a task is added to the queue, if multiple candidate targets are detected within the selection area, they are displayed distinctly in the user interface and a secondary confirmation of the target entity is received from the user.
- 10. A teleoperation system for a human-machine collaborative robot based on visual interaction, characterized by comprising: a display and interaction unit for displaying video pictures transmitted in real time by the robot end and receiving the user's click operation; a task processing unit for determining target coordinates and performing spatial mapping and executability verification in response to the click operation; a memory storing a computer program; and a processor configured to execute the program to implement the method steps of any one of claims 1 to 9.
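Claims 7 and 8 describe a client-side ordered task queue whose verified tasks are serialized and submitted to the robot end as a single transaction. The patent does not give an implementation or wire format; the following is a minimal Python sketch under assumed names (`Task`, `TaskQueue`, a JSON payload), not the actual protocol.

```python
import json
from dataclasses import dataclass, field

@dataclass
class Task:
    target_xyz: tuple  # target position in the robot base frame (metres) - assumed representation
    action: str        # e.g. "grasp", "place"

@dataclass
class TaskQueue:
    """Ordered task queue managed locally on the user interface (claims 7-8).

    Verified tasks accumulate locally; on an execution instruction the whole
    queue is serialized and submitted as one transaction, so the robot end
    can execute it sequentially without further network round-trips.
    """
    tasks: list = field(default_factory=list)

    def add(self, task: Task) -> None:
        self.tasks.append(task)

    def submit(self) -> str:
        # Serialize the whole ordered queue as a single payload and clear
        # the local queue; the robot end executes the tasks in order.
        payload = json.dumps(
            [{"action": t.action, "target": list(t.target_xyz)} for t in self.tasks]
        )
        self.tasks.clear()
        return payload
```

Submitting the queue as one payload is what reduces the network dependence the abstract mentions: a transient connection loss after submission does not interrupt execution of the already-transferred batch.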
Description
Remote operation method and system for man-machine collaborative robot based on visual interaction

Technical Field

The invention relates to the technical field of robot control and human-machine interaction, and in particular to a teleoperation method and system for a human-machine collaborative robot based on visual interaction.

Background

Robot teleoperation technology has important applications in fields such as remote operation and hazardous-environment handling. The main existing interaction and control modes have the following limitations:

1. Traditional manual remote control. The operator remotely drives the robot through devices such as a joystick or handle. In this mode the operator can only send coarse direction and speed commands such as "up" or "left", and cannot directly specify the final operation target, such as "grasp that cup". Where the robot goes and what it does depend entirely on the operator's continuous fine-grained control, as if completing delicate manipulation remotely with a game controller; the operator's burden is heavy and errors are easy to make. More importantly, the system executes an instruction directly once it is sent, without the key step of automatically checking the action's reachability and safety, so safety depends on the operator's personal judgment.

2. Voice control. The user controls the robot through voice commands. Although more natural, this mode is less reliable in noisy industrial environments or outdoor scenes, and speech recognition is susceptible to interference. A more fundamental drawback is that voice commands have difficulty describing an exact spatial position and attitude (e.g. "grasp the red bolt at the front left, against the side of the box"); the robot is prone to misjudgment or mis-operation due to semantic-understanding bias, and there is likewise no automatic verification of spatial feasibility before execution.

3. Pre-programmed fully autonomous mode. The robot runs entirely according to a preset program. Although no manual real-time manipulation is required, this mode lacks flexibility: it cannot accommodate dynamic changes in task content or environment layout, and cannot handle any situation that was not pre-programmed.

In summary, the prior art lacks a human-machine collaboration method that allows a user to intuitively, precisely, and reliably designate a spatial operation target in a complex real environment while the system automatically ensures execution safety and success in closed loop.

Disclosure of Invention

1. Technical problem to be solved by the invention

The invention provides a teleoperation method and system for a robot based on visual interaction, which realize safe, accurate, and highly robust human-machine collaborative operation in complex environments by automatically converting the user's visual clicks into strictly verified robot execution instructions.

2. Technical solution

To solve the above problems, the technical solution provided by the invention is as follows. A teleoperation method for a human-machine collaborative robot based on visual interaction comprises the following steps: displaying a video picture transmitted in real time by the robot end on a user interface; in response to the user's click operation on the video picture, determining the two-dimensional coordinates of the target in the image; spatially mapping the two-dimensional coordinates with three-dimensional environment data acquired by the robot, and resolving the execution pose of the target in the robot base coordinate system; and performing robot-executability verification on the execution pose.
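The executability verification named in the last step above (detailed in claim 6) spans kinematic reachability, path collision, and mechanical stability. As a minimal illustration of the reachability check only, here is a coarse reach-sphere gate in Python; the radius `reach_m`, the base position, and the function name are assumptions for illustration, and a real system would run full inverse kinematics plus collision and stability checks.

```python
import math

def reachable(target_xyz, reach_m=0.85, base_xyz=(0.0, 0.0, 0.0)):
    """Coarse kinematic-reachability gate (illustrative only).

    Accepts a candidate execution pose position only if it lies within the
    arm's assumed maximum-reach sphere around the base. This is a cheap
    first filter before full inverse-kinematics and collision verification.
    """
    # Euclidean distance from the robot base to the target point
    d = math.dist(target_xyz, base_xyz)
    return d <= reach_m
```

A pose that fails this gate is rejected before being added to the task queue, so the user can re-select a target instead of discovering the failure at execution time.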
As a preferred scheme of the invention, the click operation comprises a frame-selection operation, and the closed-region information defined by the frame-selection operation serves as the input for triggering the background to perform target recognition and three-dimensional pose calculation. As a preferred scheme of the invention, the user interface provides the user with operable instruction options immediately after the user completes the click operation. As a preferred scheme of the invention, spatially mapping the two-dimensional coordinates with the three-dimensional environment data comprises: recording the corresponding video frame timestamp when the click operation occurs; acquiring synchronously captured depth image data according to the timestamp; and, based on the depth image data and the camera intrinsic parameters, back-projecting the two-dimensional coordinates to a three-dimensional spatial point in the robot base coordinate system. As a preferred scheme of the invention, determining the two-dimensional coordinates of the target in the image comprises converting the display coordinates generated by the user's click operation into original-image pixel coordinates and performing normalization processing.
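The back-projection step described above (depth image data plus camera intrinsic parameters to a point in the robot base coordinate system) can be sketched with a standard pinhole camera model. The matrix names `K` (intrinsics) and `T_base_cam` (camera-to-base extrinsic transform, in practice obtained from hand-eye calibration) are illustrative assumptions, not names from the patent.

```python
import numpy as np

def backproject(u, v, depth_m, K, T_base_cam):
    """Back-project pixel (u, v) with depth into the robot base frame.

    K          : 3x3 camera intrinsic matrix.
    T_base_cam : 4x4 homogeneous transform from camera frame to base frame.
    Returns the 3D point (x, y, z) in the robot base coordinate system.
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # Pinhole model: pixel + depth -> 3D point in the camera frame
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    p_cam = np.array([x, y, depth_m, 1.0])  # homogeneous coordinates
    # Extrinsic transform: camera frame -> robot base frame
    return (T_base_cam @ p_cam)[:3]
```

For example, with the camera frame coincident with the base frame (`T_base_cam` the identity), a click at the principal point with 1 m of depth back-projects to the point one metre straight along the optical axis.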