CN-122008269-A - Non-contact robot man-machine interaction system based on multi-sensor fusion
Abstract
The invention discloses a non-contact human-robot interaction system based on multi-sensor fusion. Through the cooperation of the multi-mode sensor module and the data synchronization module, the system comprehensively captures multidimensional data such as physiological signals, environmental parameters, and interaction states; combined with the precise cleaning and standardization performed by the data preprocessing module, this provides a high-quality input basis for the context awareness and creative response generation module. Using the multidimensional information obtained after feature extraction, the context modeling sub-module can better understand user intention and environmental changes, effectively improving the sensing precision of non-contact interaction and enhancing the system's adaptability to complex scenes. This design ensures that the robot can accurately capture user demands without physical contact, reduces interaction latency, and improves the fluency of human-robot cooperation.
Inventors
- SUN BO
- CHENG SHU
- GAO WENJIE
- DING YINYU
Assignees
- 北京海百川科技有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-28
Claims (10)
- 1. A non-contact human-robot interaction system based on multi-sensor fusion, characterized by comprising a multi-mode sensor module, a data preprocessing module, a data synchronization module, a feature extraction module, a multi-element comprehensive evaluation module, a context awareness and creative response generation module, and a response execution and feedback module; the context awareness and creative response generation module is internally provided with a context modeling sub-module, an intention prediction sub-module, and a creative response generation sub-module; the output end of the multi-mode sensor module is connected to the input end of the data synchronization module; the output end of the data synchronization module is connected to the input end of the data preprocessing module; the output end of the data preprocessing module is connected to the input end of the feature extraction module; the output end of the feature extraction module is connected both to the input end of the context awareness and creative response generation module and to the input end of the multi-element comprehensive evaluation module; the output end of the context awareness and creative response generation module is connected to the input end of the response execution and feedback module; and the output end of the response execution and feedback module is connected to the input end of the multi-element comprehensive evaluation module.
- 2. The non-contact human-robot interaction system based on multi-sensor fusion of claim 1, wherein a physiological signal acquisition unit, an environment sensing unit, an interaction state sensor, and a sensor control center are arranged in the multi-mode sensor module; the physiological signal acquisition unit integrates a photoelectric heart rate sensor, an eye-tracking camera, and a facial electromyography sensor; the environment sensing unit comprises a lidar, an RGB-D camera, and a temperature and humidity sensor; the interaction state sensor is provided with a touch pressure sensor and a voice microphone array; and the sensor control center coordinates the multiple devices via a built-in FPGA chip that packages the raw data over an SPI bus and supports dynamic power consumption management.
- 3. The non-contact human-robot interaction system based on multi-sensor fusion of claim 1, wherein a raw data cleaning sub-layer, a spatio-temporal standardization sub-layer, a feature dimension reduction sub-layer, and a time sequence segmentation sub-layer are arranged in the data preprocessing module; the raw data cleaning sub-layer adopts a multi-level filtering mechanism to process physiological signals, environmental point clouds, and voice data; the spatio-temporal standardization sub-layer performs data scale unification, comprising physiological feature normalization, image interpolation and scaling, and coordinate conversion; the feature dimension reduction sub-layer processes high-dimensional data by means of principal component analysis and the t-SNE algorithm; and the time sequence segmentation sub-layer adopts a sliding window technique to divide the data into segments and attach timestamps and data quality labels.
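As a concrete illustration of claim 3's time sequence segmentation sub-layer, sliding-window segmentation with timestamps and quality labels could be sketched as follows. The window length, hop size, and NaN-ratio quality heuristic are illustrative assumptions, not values taken from the claim:

```python
import numpy as np

def segment_stream(samples, timestamps, win_len=128, hop=64, nan_ratio_max=0.1):
    """Split a cleaned sensor stream into overlapping windows, attaching a
    start timestamp and a simple quality label to each segment.
    win_len/hop/nan_ratio_max are illustrative defaults, not patent values."""
    segments = []
    for start in range(0, len(samples) - win_len + 1, hop):
        window = np.asarray(samples[start:start + win_len], dtype=float)
        nan_ratio = np.isnan(window).mean()  # fraction of missing samples
        segments.append({
            "t0": timestamps[start],
            "data": window,
            "quality": "good" if nan_ratio <= nan_ratio_max else "degraded",
        })
    return segments
```

With a 300-sample stream and these defaults, the stream yields three overlapping segments, each carrying the timestamp of its first sample.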
- 4. The non-contact human-robot interaction system based on multi-sensor fusion of claim 1, wherein a hardware-triggered synchronization subsystem, a timestamp calibration algorithm layer, a cross-modal delay compensation layer, and a synchronization quality monitoring layer are arranged in the data synchronization module; the hardware-triggered synchronization subsystem realizes multi-sensor clock alignment through GPIO trigger signals; the timestamp calibration algorithm layer runs a distributed time synchronization protocol using GPS timing and NTP calibration; the cross-modal delay compensation layer establishes a device delay model and aligns the data through a pre-compensation algorithm; and the synchronization quality monitoring layer computes timestamp consistency, data integrity, and event synchronization rate indexes, and triggers a reset mechanism when these indexes are abnormal.
- 5. The non-contact human-robot interaction system based on multi-sensor fusion of claim 1, wherein a time domain feature calculation unit, a frequency domain feature conversion unit, a spatial domain feature coding unit, and a semantic feature analysis unit are arranged in the feature extraction module; the time domain feature calculation unit extracts statistical features from physiological and interaction signals; the frequency domain feature conversion unit processes periodic signals through Fourier transform and wavelet transform; the spatial domain feature coding unit processes visual and spatial data; and the semantic feature analysis unit performs deep coding of text and voice content.
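A minimal sketch of the time-domain and frequency-domain units in claim 5, combining a few statistical features with the dominant frequency found via the FFT. The particular statistics chosen are illustrative; the claim does not enumerate them:

```python
import numpy as np

def basic_features(signal, fs):
    """Time-domain statistics plus the dominant frequency of a 1-D signal
    sampled at fs Hz. The mean is removed before the FFT so the DC bin
    does not mask the dominant periodic component."""
    x = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return {
        "mean": x.mean(),
        "std": x.std(),
        "rms": np.sqrt(np.mean(x ** 2)),
        "dominant_hz": freqs[spectrum.argmax()],
    }
```

For example, a 5 Hz sine sampled at 100 Hz over 200 samples yields a dominant frequency of 5.0 Hz.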
- 6. The non-contact human-robot interaction system based on multi-sensor fusion of claim 1, wherein an evaluation index system construction unit, a dynamic weight distribution unit, a real-time evaluation algorithm unit, and a historical data comparison unit are arranged in the multi-element comprehensive evaluation module; the evaluation index system construction unit defines four types of core indexes; the dynamic weight distribution unit calculates index weights by adopting an improved entropy weight method; the real-time evaluation algorithm unit runs a layered evaluation model; and the historical data comparison unit maintains a time series database of evaluation results.
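Claim 6's dynamic weight distribution unit uses an "improved" entropy weight method whose modification is not specified; the standard formulation it builds on can be sketched as follows. The idea: indicators whose values vary more across samples carry more information and receive larger weights:

```python
import numpy as np

def entropy_weights(X):
    """Standard entropy-weight method over a (n_samples, n_indicators)
    matrix of non-negative benefit-type indicator values. Returns one
    weight per indicator, summing to 1. The claim's improved variant is
    unspecified, so this is the textbook formulation."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    p = X / X.sum(axis=0, keepdims=True)       # column-wise proportions
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(p > 0, p * np.log(p), 0.0)
    e = -plogp.sum(axis=0) / np.log(n)         # normalized entropy per indicator
    d = 1.0 - e                                # degree of divergence
    return d / d.sum()
```

A constant indicator column has maximum entropy and therefore receives zero weight, which matches the method's intent of rewarding discriminative indicators.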
- 7. The non-contact human-robot interaction system based on multi-sensor fusion of claim 1, wherein the context modeling sub-module is configured to integrate and dynamically characterize multi-source heterogeneous data in real time, and comprises: a multi-source data input layer, which synchronously receives three kinds of basic data, namely user state data, environment perception data, and historical interaction data, and performs preprocessing consisting of timestamp alignment and outlier filtering; a dynamic feature fusion layer, which fuses the input features with an adaptive weighting algorithm, wherein the user state weight w_u is dynamically adjusted by the signal-to-noise ratio of the physiological signals, the environmental feature weight w_e incorporates a scene complexity factor, and the historical data weight w_h decays via an exponential moving average; a situation state characterization layer, which constructs a dynamic situation graph based on a heterogeneous graph neural network, wherein the nodes comprise users, environment entities, and interaction events, and the edge weights are dynamically updated through an attention mechanism; and a real-time update engine, which adopts an improved LSTM network to realize the temporal evolution of the situation state, outputs the current situation vector S_t every 100 ms, predicts the situation trend 2 seconds into the future through Kalman filtering, and provides a forward-looking input for the intention prediction sub-module. The dynamic update formula of the situation state is: S_t = w_u(t)·U_t + w_e(t)·E_t + w_h(t)·S_{t−1} + η(t)·∇H(S_{t−1}); wherein: S_t represents the context state vector at time t, a fused characterization of user, environment, and interaction; w_u(t) represents the user state weight at time t, calculated from the signal-to-noise ratio SNR_u(t) of the HRV signal as w_u(t) = 0.5·SNR_u(t) + 0.2; U_t represents the user state vector at time t, formed by concatenating emotion features and attention features; w_e(t) represents the environmental feature weight at time t, positively correlated with the scene complexity factor C(t): w_e(t) = 0.1 + 0.4·C(t)/C_max, C_max being the preset maximum complexity; E_t represents the environmental feature vector at time t, comprising scene semantic coding and dynamic obstacle density; w_h(t) represents the historical state weight at time t, satisfying w_u(t) + w_e(t) + w_h(t) = 1; S_{t−1} represents the context state vector at time t−1; η(t) represents the context entropy gradient coefficient, characterizing the effect of context uncertainty on the state update; ∇H(S_{t−1}) represents the gradient vector of the context entropy H(S_{t−1}) at time t−1; the higher the entropy value, the larger the correction this gradient term applies to S_t.
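One step of claim 7's situation-state update can be sketched directly from the stated weight rules. The entropy gradient ∇H(S_{t−1}) is taken as a given input here, since the claim does not specify how the context entropy is computed:

```python
import numpy as np

def update_context_state(S_prev, U_t, E_t, snr_u, C_t, C_max, eta, grad_H):
    """One claim-7 context-state update step.
    Weights follow the stated rules: w_u = 0.5*SNR_u + 0.2,
    w_e = 0.1 + 0.4*C/C_max, and w_h = 1 - w_u - w_e so that the three
    weights sum to one, plus the additive entropy-gradient correction."""
    w_u = 0.5 * snr_u + 0.2
    w_e = 0.1 + 0.4 * C_t / C_max
    w_h = 1.0 - w_u - w_e
    return w_u * U_t + w_e * E_t + w_h * S_prev + eta * grad_H
```

For instance, with SNR_u = 0.6 and C/C_max = 0.5 the weights come out to (0.5, 0.3, 0.2); keeping w_h non-negative in practice would require bounding SNR_u and C(t), which the claim leaves implicit.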
- 8. The non-contact human-robot interaction system based on multi-sensor fusion of claim 1, wherein the intention prediction sub-module realizes accurate prediction of explicit user instructions and implicit demands by fusing temporal interaction data with knowledge reasoning, and its core composition comprises an input fusion layer, a temporal intention modeling layer, a knowledge-enhanced reasoning layer, and an intention decoding layer; the input fusion layer receives the real-time situation vector S_t output by the context modeling sub-module, the historical interaction intention sequence I_1…I_{t−1}, and knowledge graph embeddings, and weights and concatenates them with a multi-head attention mechanism; the temporal intention modeling layer captures recent intention trends with a bidirectional LSTM network, then mines long-term habit patterns with an LSTM, and outputs the temporal intention feature vector H_t; the knowledge-enhanced reasoning layer processes the knowledge graph based on a graph attention network and dynamically calculates entity association weights to generate the knowledge feature vector K_t; the intention decoding layer fuses H_t and K_t with a gating mechanism and outputs an explicit intention probability distribution via softmax, while a variational autoencoder generates an implicit demand vector, finally outputting a structured intention prediction result (intention category, confidence, and execution priority). The dynamic fusion formula of the intention probability is: P(I_t) = λ(t)·Softmax(H_t) + (1 − λ(t))·GAT(K_t, S_t) + ε(t)·ΔP(I_{t−1}); wherein: P(I_t) is the comprehensive probability distribution of intention I_t at time t; λ(t) is the temporal feature weight; GAT(K_t, S_t) integrates the knowledge feature K_t and the context vector S_t through a GAT network and outputs a knowledge-guided intention probability; ε(t) is the intention evolution coefficient, positively correlated with the temporal continuity of the intention sequence; ΔP(I_{t−1}) is the change in intention probability at time t−1, reflecting the intention trend.
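The probability-fusion step of claim 8 can be sketched as a convex blend of the temporal-model and knowledge-guided distributions plus the trend term. The final renormalization is an added assumption, needed because the trend term can push the blend away from a valid distribution:

```python
import numpy as np

def fuse_intent_probs(p_temporal, p_knowledge, delta_prev, lam, eps):
    """Blend the temporal (softmax over H_t) and knowledge-guided (GAT)
    intention distributions per the claim-8 fusion formula, add the
    epsilon-weighted trend term, then clip and renormalize so the result
    is a valid probability distribution (renormalization is an assumption)."""
    p = (lam * np.asarray(p_temporal)
         + (1.0 - lam) * np.asarray(p_knowledge)
         + eps * np.asarray(delta_prev))
    p = np.clip(p, 0.0, None)
    return p / p.sum()
```

With λ = 0.5, ε = 0, and distributions [0.8, 0.2] and [0.4, 0.6], the fused result is [0.6, 0.4].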
- 9. The non-contact human-robot interaction system based on multi-sensor fusion of claim 1, wherein the creative response generation sub-module comprises a response task decomposition layer, a multi-modal generation engine, and a real-time optimization unit.
- 10. The non-contact human-robot interaction system based on multi-sensor fusion of claim 1, wherein an actuator control layer, a multi-mode output scheduling layer, a feedback signal acquisition layer, and an execution state monitoring layer are arranged in the response execution and feedback module, and the actuator control layer comprises a three-level control architecture.
Description
Non-contact human-robot interaction system based on multi-sensor fusion
Technical Field
The invention belongs to the technical field of robot interaction, and particularly relates to a non-contact human-robot interaction system based on multi-sensor fusion.
Background
Non-contact robot interaction is a technical mode that realizes human-machine communication without physical contact, relying mainly on means such as voice recognition, gesture control, gaze tracking, facial expression recognition, and remote control, so that a user can complete instruction input and information acquisition without directly touching the equipment. This interaction mode is widely applied in scenes such as medical treatment, service, public display, and smart homes, and has notable advantages in environments with high hygiene requirements or accessibility needs. By combining artificial intelligence with multi-mode sensing technology, non-contact robot interaction can analyze user intention in real time and provide accurate feedback, effectively reducing cross-infection risk and improving the convenience and safety of interaction. As the technology continues to mature, non-contact interaction is gradually becoming an important direction in the intelligent development of robots. However, the prior art suffers from problems such as single-dimensional data acquisition, insufficient preprocessing precision, and poor synchronization of multi-mode data; as a result, situation awareness capability is limited, complex interaction intentions are difficult to understand accurately, response modes lack flexibility and personalization, system stability and reliability are insufficient, interaction latency is high, and the requirements of dynamically changing scenes are difficult to meet.
Disclosure of Invention
The invention aims to solve the above problems by providing a non-contact human-robot interaction system based on multi-sensor fusion.
The technical scheme adopted by the invention is as follows: the non-contact human-robot interaction system based on multi-sensor fusion comprises a multi-mode sensor module, a data preprocessing module, a data synchronization module, a feature extraction module, a multi-element comprehensive evaluation module, a context awareness and creative response generation module, and a response execution and feedback module; the context awareness and creative response generation module is internally provided with a context modeling sub-module, an intention prediction sub-module, and a creative response generation sub-module; the output end of the multi-mode sensor module is connected to the input end of the data synchronization module; the output end of the data synchronization module is connected to the input end of the data preprocessing module; the output end of the data preprocessing module is connected to the input end of the feature extraction module; the output end of the feature extraction module is connected both to the input end of the context awareness and creative response generation module and to the input end of the multi-element comprehensive evaluation module; the output end of the context awareness and creative response generation module is connected to the input end of the response execution and feedback module; and the output end of the response execution and feedback module is connected to the input end of the multi-element comprehensive evaluation module.
In a preferred embodiment, a physiological signal acquisition unit, an environment sensing unit, an interaction state sensor, and a sensor control center are arranged in the multi-mode sensor module; the physiological signal acquisition unit integrates a photoelectric heart rate sensor, an eye-tracking camera, and a facial electromyography sensor; the environment sensing unit comprises a lidar, an RGB-D camera, and a temperature and humidity sensor; the interaction state sensor is provided with a touch pressure sensor and a voice microphone array; and the sensor control center coordinates the multiple devices via a built-in FPGA chip that packages the raw data over an SPI bus and supports dynamic power consumption management. In a preferred embodiment, the data preprocessing module is internally provided with a raw data cleaning sub-layer, a spatio-temporal standardization sub-layer, a feature dimension reduction sub-layer, and a time sequence segmentation sub-layer, wherein the raw data cleaning sub-layer adopts a multi-level filtering mechanism to process physiological signals, environmental point clouds, and voice data; the spatio-temporal standardization sub-layer performs data scale unification, comprising physiological feature normalization, image interpolation and scaling, and coordinate conversion; the feature dimension reduction sub-layer processes high-dimensional data through principal component analysis and the t-SNE algorithm; and the time sequen