
CN-121989241-A - Robot touch perception method and device

CN121989241A

Abstract

The invention discloses a robot tactile perception method and device, relating to the technical field of robot perception, and aims to achieve spatio-temporal decoupling of robot tactile data and full use of its temporal and spatial characteristics, thereby improving the accuracy of robot tactile perception. The method comprises: preprocessing raw tactile data of a robot to obtain tactile data; performing feature extraction and fusion on each single frame of tactile data via a plurality of convolution branches to obtain multi-scale fused tactile feature data; partitioning and flattening the tactile feature data and applying position embedding to obtain a position-embedded block sequence; inputting the position-embedded block sequence into a Transformer encoder layer to extract tactile spatial features; ordering the tactile spatial features by time frame to output time-series spatial features, and extracting tactile temporal features based on Mamba; performing feature-gated fusion of the tactile spatial features and the tactile temporal features; and determining a target tactile perception result from the fused features.

Inventors

  • XU CHI
  • HAN CHENGLONG
  • YU HAIBIN
  • XIA CHANGQING
  • ZENG PENG

Assignees

  • Shenyang Institute of Automation, Chinese Academy of Sciences (中国科学院沈阳自动化研究所)

Dates

Publication Date
2026-05-08
Application Date
2026-02-12

Claims (9)

  1. A robot tactile perception method, the method comprising: acquiring raw tactile data of a robot according to a preset sampling period, and preprocessing the raw tactile data to obtain tactile data; dividing the tactile data into a plurality of single-frame tactile data according to a time-frame sequence, and performing feature extraction and fusion on each single frame of tactile data based on a plurality of convolution branches to obtain multi-scale fused tactile feature data; partitioning the tactile feature data to obtain a flattened block sequence, performing position embedding on the block sequence to obtain a position-embedded block sequence, inputting the position-embedded block sequence into a Transformer encoder layer, and extracting tactile spatial features; ordering the tactile spatial features according to the time-frame sequence, outputting time-series spatial features, and extracting tactile temporal features based on Mamba; and performing feature-gated fusion of the tactile spatial features and the tactile temporal features to obtain fused features, and determining a target tactile perception result according to the fused features.
  2. The method of claim 1, wherein performing feature extraction and fusion on each single frame of tactile data based on the plurality of convolution branches to obtain multi-scale fused tactile feature data comprises: performing feature extraction on the single-frame tactile data with a depthwise separable convolution network to obtain a first branch feature for the first convolution branch; for each of the second through last convolution branches, concatenating the single-frame tactile data with the previous branch feature to obtain a concatenation result, and performing feature extraction on the concatenation result with the depthwise separable convolution network to obtain the second through last branch features, wherein the previous branch feature for the second convolution branch is the first branch feature; calculating a normalization weight for each branch feature, and multiplying each branch feature by the normalization weight of the same convolution branch to obtain a feature extraction result for each convolution branch; inputting each feature extraction result into an activation function to obtain a filtered feature extraction result, and summing all filtered feature extraction results to obtain a multi-scale feature for each single frame of tactile data; and concatenating the multi-scale features of all single-frame tactile data to obtain the multi-scale fused tactile feature data.
  3. The method of claim 1, wherein inputting the position-embedded block sequence into the Transformer encoder layer and extracting the tactile spatial features comprises: inputting the position-embedded block sequence into the Transformer encoder layer, calculating decomposed attention over the position-embedded block sequence with the encoder layer to obtain an attention result, applying layer normalization to the attention result to obtain a normalized result, inputting the normalized result into a feedforward network, and extracting the tactile spatial features.
  4. The method of claim 3, wherein calculating the decomposed attention of the position-embedded block sequence to obtain an attention result comprises: determining elements in the same width direction and elements in the same height direction according to the embedded position information; taking all elements in each width direction as a width sequence, calculating a width sub-self-attention weight for each width sequence, and concatenating all width sub-self-attention weights to obtain a width-direction self-attention weight; taking all elements in each height direction as a height sequence, calculating a height sub-self-attention weight for each height sequence, and concatenating all height sub-self-attention weights to obtain a height-direction self-attention weight; and calculating a value matrix for the position-embedded block sequence, and obtaining the attention result from the width-direction self-attention weight, the height-direction self-attention weight, and the value matrix.
  5. The method of claim 1, wherein extracting tactile temporal features based on Mamba comprises: linearly projecting the time-series spatial features to obtain a time sequence, and determining gating parameters, a state matrix, an input matrix, and an output matrix from the time sequence; discretizing the state matrix and the input matrix to obtain a discretized state matrix and a discretized input matrix; adjusting the discretized state matrix and the discretized input matrix according to the gating parameters to obtain an adjusted state matrix and an adjusted input matrix; and evaluating a state equation with the time sequence, the adjusted state matrix, the adjusted input matrix, and the output matrix to obtain the tactile temporal features.
  6. The method of claim 1, wherein the feature-gated fusion of the tactile spatial features and the tactile temporal features to obtain fused features comprises: calculating gating weights from the tactile spatial features and the tactile temporal features; and computing a weighted sum of the tactile spatial features and the tactile temporal features using the gating weights to obtain the fused features.
  7. A robot tactile perception device, the device comprising: an acquisition module for acquiring raw tactile data of a robot according to a preset sampling period, and preprocessing the raw tactile data to obtain tactile data; a feature extraction and fusion module for dividing the tactile data into a plurality of single-frame tactile data according to a time-frame sequence, and performing feature extraction and fusion on each single frame of tactile data based on a plurality of convolution branches to obtain multi-scale fused tactile feature data; a tactile spatial feature extraction module for partitioning the tactile feature data to obtain a flattened block sequence, performing position embedding on the block sequence to obtain a position-embedded block sequence, inputting the position-embedded block sequence into a Transformer encoder layer, and extracting tactile spatial features; a tactile temporal feature extraction module for ordering the tactile spatial features according to the time-frame sequence, outputting time-series spatial features, and extracting tactile temporal features based on Mamba; and a fusion perception module for performing feature-gated fusion of the tactile spatial features and the tactile temporal features to obtain fused features, and determining a target tactile perception result according to the fused features.
  8. A storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the robot tactile perception method of any one of claims 1 to 6.
  9. A computer device comprising a memory, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor implements the robot tactile perception method of any one of claims 1 to 6 when executing the program.
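
The feature-gated fusion of claims 1 and 6 can be sketched as follows. This is a minimal illustration only: the claims specify gating weights computed from both feature sets followed by a weighted sum, but not the exact form, so the concatenation, the linear projection `w_gate`/`b_gate`, and the sigmoid gate are assumptions.

```python
# Hypothetical sketch of feature-gated fusion (claim 6): a gating weight is
# derived from the concatenated tactile spatial and temporal features, then
# used to take an element-wise weighted sum of the two. The sigmoid gate and
# the linear projection are illustrative assumptions.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(f_spatial, f_temporal, w_gate, b_gate):
    """Fuse spatial and temporal tactile features via a learned gate.

    f_spatial, f_temporal: (d,) feature vectors.
    w_gate: (d, 2d) projection; b_gate: (d,) bias (illustrative parameters).
    """
    concat = np.concatenate([f_spatial, f_temporal])  # (2d,)
    g = sigmoid(w_gate @ concat + b_gate)             # gating weights in (0, 1)
    return g * f_spatial + (1.0 - g) * f_temporal     # element-wise weighted sum

rng = np.random.default_rng(0)
d = 4
fused = gated_fusion(rng.standard_normal(d), rng.standard_normal(d),
                     rng.standard_normal((d, 2 * d)), np.zeros(d))
print(fused.shape)  # (4,)
```

Because the gate lies in (0, 1), each fused component is a convex combination of the corresponding spatial and temporal components, so the fusion cannot amplify either feature beyond its original range.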

Description

Robot touch perception method and device

Technical Field

The invention relates to the technical field of robot perception, and in particular to a robot tactile perception method and device.

Background

Tactile perception enables a robot to sense physical characteristics such as contact force, texture, and shape, supporting finer manipulation and environmental interaction; it is one of the key technologies for improving a robot's autonomy, safety, and human-robot collaboration capability. Taking a robotic grasping task as an example, grasping depends heavily on accurate perception of the real-time state of the target object. When that state changes, for example when the object slides, shifts, or deforms under force before grasping, the robot must dynamically adjust its grasping pose, clamping force, or grasping path to ensure the robustness and success rate of the task; the robot's tactile perception capability therefore needs to be improved so that tactile data can be classified accurately. Tactile perception depends on the classification of tactile data, which is the process of automatically discriminating and categorizing tactile data according to preset class standards, with the goal of letting the robot judge key state information about the current contact scene from the tactile data. The classification result is the core basis for the robot's adaptive control. For example, when the classification result indicates that the object is sliding, the robot control system triggers a closed-loop adjustment strategy, i.e. increasing the clamping force or correcting the hand pose, to ensure the robustness of the grasping task. To achieve accurate tactile perception, however, the temporal and spatial characteristics of the tactile data must be fully exploited.

Tactile data are spatio-temporally coupled, and the tactile temporal and spatial characteristics easily interfere with each other, degrading model performance, while decoupling operations bring higher computational cost and model complexity. In addition, tactile information differs markedly between robots, so the model must be highly robust. Accurate classification of robot tactile data is therefore needed to realize accurate robot tactile perception.

Disclosure of the Invention

In view of the above, the invention provides a robot tactile perception method and device that achieve spatio-temporal decoupling of robot tactile data and full use of its temporal and spatial characteristics, thereby improving the accuracy of robot tactile perception. According to one aspect of the invention, there is provided a robot tactile perception method comprising: acquiring raw tactile data of a robot according to a preset sampling period, and preprocessing the raw tactile data to obtain tactile data; dividing the tactile data into a plurality of single-frame tactile data according to a time-frame sequence, and performing feature extraction and fusion on each single frame of tactile data based on a plurality of convolution branches to obtain multi-scale fused tactile feature data; partitioning the tactile feature data to obtain a flattened block sequence, performing position embedding on the block sequence to obtain a position-embedded block sequence, inputting the position-embedded block sequence into a Transformer encoder layer, and extracting tactile spatial features; ordering the tactile spatial features according to the time-frame sequence, outputting time-series spatial features, and extracting tactile temporal features based on Mamba; and performing feature-gated fusion of the tactile spatial features and the tactile temporal features to obtain fused features, and determining a target tactile perception result according to the fused features. Preferably, performing feature extraction and fusion on each single frame of tactile data based on a plurality of convolution branches to obtain multi-scale fused tactile feature data includes: performing feature extraction on the single-frame tactile data with a depthwise separable convolution network to obtain a first branch feature for the first convolution branch; for each of the second through last convolution branches, concatenating the single-frame tactile data with the previous branch feature to obtain a concatenation result, and performing feature extraction on the concatenation result with the depthwise separable convolution network to obtain the second through last branch features, wherein the previous branch feature for the second convolution branch is the first branch feature; calculating the normalization weight corresponding to each branch feature
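
The Mamba-based temporal extraction of claim 5 (discretize the state and input matrices, adjust them per step, then evaluate a state equation over the time sequence) can be sketched as a selective state-space scan. This is a hedged illustration under common simplifications, not the patent's implementation: a diagonal state matrix, zero-order-hold discretization for the state matrix, an Euler approximation for the input matrix, and a per-step step size `delta` standing in for the gating parameters are all assumptions.

```python
# Minimal sketch of a selective state-space (Mamba-style) scan, assuming a
# diagonal state matrix A and a per-step step size delta acting as the gating
# parameter of claim 5. A_bar = exp(delta * A) (zero-order hold, diagonal
# case) and B_bar = delta * B (Euler approximation) are the discretized
# matrices; the loop evaluates the state equation over the time sequence.
import numpy as np

def selective_ssm(x, A_diag, B, C, delta):
    """x: (T,) input sequence, A_diag: (n,) diagonal state matrix (negative
    for stability), B: (n,) input matrix, C: (n,) output matrix,
    delta: (T,) per-step step sizes (illustrative gating parameters)."""
    n = A_diag.shape[0]
    h = np.zeros(n)                       # hidden state
    ys = []
    for x_t, d_t in zip(x, delta):
        A_bar = np.exp(d_t * A_diag)      # discretized (and step-adjusted) state matrix
        B_bar = d_t * B                   # discretized (and step-adjusted) input matrix
        h = A_bar * h + B_bar * x_t       # state equation: h_t = A_bar h_{t-1} + B_bar x_t
        ys.append(C @ h)                  # output projection: y_t = C h_t
    return np.array(ys)

rng = np.random.default_rng(1)
T, n = 6, 4
y = selective_ssm(rng.standard_normal(T),
                  -np.abs(rng.standard_normal(n)),   # negative diagonal -> stable
                  rng.standard_normal(n),
                  rng.standard_normal(n),
                  0.1 * np.abs(rng.standard_normal(T)))
print(y.shape)  # (6,)
```

With `A_diag` negative and `delta` positive, each `A_bar` entry lies in (0, 1), so the recurrence is a leaky accumulation over time: larger step sizes let more of the current input through while forgetting the past faster, which is the mechanism by which the step-size gating selects which time frames contribute to the tactile temporal features.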