CN-122020239-A - Hand exoskeleton control method and system integrating fatigue perception and reinforcement learning

CN 122020239 A

Abstract

The invention discloses a hand exoskeleton control method and system that fuse fatigue perception with reinforcement learning. The method comprises: collecting the user's surface electromyographic (sEMG) signals, human-machine interaction force and position information, and vision-based hand gesture data; obtaining the muscle fatigue level with a pre-trained fatigue recognition model and constructing an enhanced state vector; generating optimized virtual stiffness and damping parameters with a reinforcement learning parameter optimizer; updating a virtual hand model from the hand gesture data and performing collision detection; selecting an admittance or hybrid admittance control mode according to the contact state; computing the desired velocity command; and finally driving the exoskeleton. By sensing muscle fatigue in real time and adaptively adjusting force-feedback parameters, the method achieves safe and personalized rehabilitation training; virtual contact triggering and realistic force rendering improve training immersion; and the fusion of physiological, physical and visual information forms an intelligent cooperative control closed loop.

Inventors

  • GUO HAO
  • ZHU YIJIE
  • QI FEI
  • SUN LINING

Assignees

  • Soochow University (苏州大学)

Dates

Publication Date
2026-05-12
Application Date
2025-12-29

Claims (10)

  1. A hand exoskeleton control method integrating fatigue perception and reinforcement learning, characterized by comprising the following steps: S1, acquiring the user's raw surface electromyographic (sEMG) signals, the human-machine interaction force and position information from a hand rehabilitation exoskeleton, and vision-based hand gesture data; S2, inputting the raw sEMG signals into a pre-trained fatigue recognition model to obtain the current muscle fatigue level; constructing an attribute-decoupled time-series generative adversarial network from the human-machine interaction information to synthesize a simulated hand interaction trajectory dataset; and fusing the muscle fatigue level, the human-machine interaction force and the position information to construct an enhanced state vector; S3, inputting the enhanced state vector into a reinforcement learning parameter optimizer pre-trained on the simulated interaction dataset to obtain optimized virtual stiffness and virtual damping parameters; S4, updating a virtual hand model in the virtual environment with the hand gesture data, and performing collision detection on the updated model to determine whether the virtual hand contacts a virtual object, yielding a judgment result comprising a contact state and a contact position; S5, selecting a control mode and computing commands based on the contact state: if the contact state indicates non-contact, selecting the admittance control mode and computing the desired velocity command from the human-machine interaction force; if the contact state indicates contact, selecting the hybrid admittance control mode and computing the desired velocity command by applying the optimized virtual stiffness and damping parameters to the human-machine interaction force and the position error derived from the contact position in the judgment result; and S6, obtaining a motor control command from the desired velocity command and driving the hand rehabilitation exoskeleton to execute it.
  2. The hand exoskeleton control method integrating fatigue perception and reinforcement learning according to claim 1, wherein in step S2 the fatigue recognition model is trained by: preprocessing the raw surface electromyographic signals; segmenting the preprocessed signals with a sliding window; extracting time-domain and frequency-domain features from each signal window; then dividing the preprocessed sEMG data into a labeled dataset and an unlabeled dataset; and jointly feeding both datasets into a convolutional neural network under a semi-supervised learning framework, yielding a recognition model that classifies real-time sEMG signals into multiple fatigue levels.
  3. The hand exoskeleton control method integrating fatigue perception and reinforcement learning according to claim 1, wherein in step S2 constructing the attribute-decoupled time-series generative adversarial network, synthesizing the simulated hand interaction trajectory dataset, and fusing the muscle fatigue level, human-machine interaction force and position information into an enhanced state vector comprises: collecting human-machine interaction data generated while the user executes tasks under different fatigue states and different impedance-parameter combinations, the data comprising impedance parameters as static attributes, hand position trajectories and human-machine interaction force trajectories as dynamic time-series features, and real-time muscle fatigue levels; constructing, from the collected data, an attribute-decoupled time-series generative adversarial network comprising a generator and a discriminator, wherein: the generator maps random noise to static attributes, derives corresponding normalization factors from the static attributes, and uses a long short-term memory (LSTM) network to synthesize dynamic time-series data containing hand position, interaction force and fatigue information from the static attributes, normalization factors and noise; the discriminator separately evaluates the authenticity of the static attributes and of the whole time series in the generated samples; and alternately training the generator and the discriminator until the generator synthesizes simulated trajectory data consistent with the real interaction-data distribution, which is used to construct the enhanced state vector input to the reinforcement learning parameter optimizer.
  4. The hand exoskeleton control method integrating fatigue perception and reinforcement learning according to claim 1, wherein in step S3 inputting the enhanced state vector into the pre-trained reinforcement learning parameter optimizer yields an action vector containing a virtual damping adjustment amount and a virtual stiffness adjustment amount, and the current virtual damping and stiffness parameters are updated by these adjustment amounts to obtain the optimized virtual stiffness and damping parameters.
  5. The hand exoskeleton control method integrating fatigue perception and reinforcement learning according to claim 1 or 4, wherein in step S3 the reinforcement learning parameter optimizer is trained with a composite reward function comprising: a trajectory reward term associated with the muscle fatigue level and the tracking error of the user's motion trajectory; a smoothness penalty term suppressing the adjustment amplitude of the virtual stiffness and damping parameters; and a sparse positive reward term triggered when the average tracking error of the motion trajectory falls below a preset threshold.
  6. The hand exoskeleton control method integrating fatigue perception and reinforcement learning according to claim 1 or 4, wherein in step S3 the reinforcement learning parameter optimizer is trained with an actor-critic framework, in which the critic network evaluates the value of a given state and is updated by minimizing the value-prediction error, and the actor network outputs actions according to the current state and is updated by optimizing an objective function.
  7. The hand exoskeleton control method integrating fatigue perception and reinforcement learning according to claim 1, wherein in step S5 the desired velocity command v_d in the admittance control mode is computed as v_d = F / B_0, where F is the human-machine interaction force and B_0 is a preset admittance damping coefficient; and the desired velocity command v_d in the hybrid admittance control mode is computed as v_d = (F − K(x − x_c)) / B, where K is the optimized virtual stiffness parameter, B is the optimized virtual damping parameter, x_c is the contact position in the judgment result, and x is the current position.
  8. A hand exoskeleton control system integrating fatigue perception and reinforcement learning, characterized by comprising the following modules: an information acquisition module, comprising a hand rehabilitation exoskeleton and a vision-based hand gesture acquisition unit, for acquiring the user's raw surface electromyographic signals, the human-machine interaction force and position information from the hand rehabilitation exoskeleton, and vision-based hand gesture data; a state processing module for inputting the raw sEMG signals into a pre-trained fatigue recognition model to obtain the current muscle fatigue level, constructing an attribute-decoupled time-series generative adversarial network from the human-machine interaction information to synthesize a simulated hand interaction trajectory dataset, and fusing the muscle fatigue level, interaction force and position information into an enhanced state vector; a parameter optimization module for inputting the enhanced state vector into a reinforcement learning parameter optimizer pre-trained on the simulated interaction dataset to obtain optimized virtual stiffness and damping parameters; a virtual interaction module for updating a virtual hand model in the virtual environment with the hand gesture data and performing collision detection on the updated model to determine whether the virtual hand contacts a virtual object, yielding a judgment result comprising a contact state and a contact position; an instruction generation module for selecting a control mode and computing commands based on the contact state: if the contact state indicates non-contact, selecting the admittance control mode and computing the desired velocity command from the interaction force; if the contact state indicates contact, selecting the hybrid admittance control mode and computing the desired velocity command by applying the optimized virtual stiffness and damping parameters to the interaction force and the position error derived from the contact position in the judgment result; and an execution control module for obtaining a motor control command from the desired velocity command and driving the hand rehabilitation exoskeleton.
  9. An electronic device, comprising a processor, a memory and a bus system, wherein the processor and the memory are connected through the bus system, the memory stores instructions, and the processor executes the instructions stored in the memory to implement the hand exoskeleton control method integrating fatigue perception and reinforcement learning according to any one of claims 1 to 7.
  10. A computer storage medium storing a computer software product comprising instructions for causing a computer device to perform the hand exoskeleton control method integrating fatigue perception and reinforcement learning according to any one of claims 1 to 7.
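The S1-S6 pipeline of claim 1 can be condensed into one control cycle. The Python sketch below is an illustration, not the patent's implementation: every interface name (`sensors`, `fatigue_model`, `rl_optimizer`, `virtual_env`) and every numeric gain is a hypothetical stand-in.

```python
def control_cycle(sensors, fatigue_model, rl_optimizer, virtual_env, b0=4.0):
    """One cycle of the S1-S6 loop in claim 1 (hypothetical interfaces)."""
    # S1: acquire sEMG, interaction force, position, and hand gesture data
    semg, force, pos, gesture = sensors()
    # S2: fatigue level from the pre-trained model -> enhanced state vector
    state = (fatigue_model(semg), force, pos)
    # S3: the RL parameter optimizer returns virtual stiffness K and damping B
    k_opt, b_opt = rl_optimizer(state)
    # S4: update the virtual hand and run collision detection
    in_contact, x_c = virtual_env(gesture)
    # S5: admittance mode in free space, hybrid admittance mode in contact
    if in_contact:
        v_des = (force - k_opt * (pos - x_c)) / b_opt
    else:
        v_des = force / b0
    # S6: downstream, v_des would be converted into a motor control command
    return v_des

# Toy stand-ins for the sensing/model interfaces (illustration only)
v = control_cycle(
    sensors=lambda: ([0.1, 0.2], 2.0, 0.105, "fist"),
    fatigue_model=lambda semg: 1,
    rl_optimizer=lambda state: (100.0, 20.0),
    virtual_env=lambda gesture: (True, 0.1),
)
print(round(v, 4))  # 0.075
```

The toy call exercises the contact branch: with a 2 N interaction force, 5 mm of penetration past the contact point, K = 100 N/m and B = 20 N·s/m, the commanded velocity comes out to 0.075 m/s.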
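Claim 2's sliding-window segmentation with per-window time-domain and frequency-domain features can be sketched as follows. The concrete choices here (RMS as the time-domain feature, spectral mean frequency as the frequency-domain feature, 256-sample windows with 50% overlap) are common in sEMG fatigue analysis but are assumptions, not taken from the patent.

```python
import numpy as np

def semg_features(signal: np.ndarray, fs: float, win: int, step: int):
    """Segment a preprocessed sEMG channel with a sliding window and
    extract one time-domain (RMS) and one frequency-domain (mean
    frequency) feature per window (claim 2; illustrative features)."""
    feats = []
    for start in range(0, len(signal) - win + 1, step):
        w = signal[start:start + win]
        rms = np.sqrt(np.mean(w ** 2))               # time-domain feature
        spec = np.abs(np.fft.rfft(w)) ** 2           # power spectrum
        freqs = np.fft.rfftfreq(win, d=1.0 / fs)
        mnf = np.sum(freqs * spec) / np.sum(spec)    # mean frequency
        feats.append((rms, mnf))
    return np.array(feats)

# 1 s of synthetic 1 kHz "sEMG": an 80 Hz tone plus mild noise
rng = np.random.default_rng(0)
t = np.arange(1000) / 1000.0
x = np.sin(2 * np.pi * 80 * t) + 0.1 * rng.standard_normal(1000)
f = semg_features(x, fs=1000.0, win=256, step=128)
print(f.shape)  # one (RMS, MNF) pair per window
```

A downward drift of the mean-frequency column over successive windows is the classic sEMG fatigue signature, which is what a fatigue classifier like the one in claim 2 would learn from.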
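Claim 4's update step is simply an increment of the current parameters by the action vector's adjustment amounts. In the sketch below the clipping bounds are an illustrative safety assumption, not part of the claim.

```python
def apply_action(k: float, b: float, d_k: float, d_b: float,
                 k_range=(10.0, 500.0), b_range=(1.0, 50.0)):
    """Claim 4: update the current virtual stiffness k and damping b by
    the action's adjustment amounts d_k, d_b. The clipping ranges are
    hypothetical safety bounds, not specified in the patent."""
    clip = lambda v, lo, hi: max(lo, min(hi, v))
    return clip(k + d_k, *k_range), clip(b + d_b, *b_range)

print(apply_action(100.0, 20.0, 15.0, -2.5))  # (115.0, 17.5)
```

Bounding the parameters keeps the RL optimizer from commanding an unstable or unsafe virtual impedance, which matters on a device physically coupled to a patient's hand.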
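Claim 5's composite reward can be sketched as below. Only the three-term structure (trajectory term, smoothness penalty, sparse bonus) comes from the claim; the weights, the way fatigue scales the trajectory term, and the threshold are illustrative assumptions.

```python
import numpy as np

def composite_reward(track_err: np.ndarray, fatigue: int,
                     d_k: float, d_b: float,
                     w_track=1.0, w_smooth=0.01, bonus=5.0, thresh=0.01):
    """Composite reward of claim 5 (weights/threshold are illustrative):
    trajectory term coupling tracking error with the fatigue level,
    smoothness term penalizing large stiffness/damping adjustments,
    and a sparse bonus when mean tracking error is under a threshold."""
    r_track = -w_track * (1 + fatigue) * float(np.mean(track_err ** 2))
    r_smooth = -w_smooth * (d_k ** 2 + d_b ** 2)
    r_sparse = bonus if float(np.mean(np.abs(track_err))) < thresh else 0.0
    return r_track + r_smooth + r_sparse

err = np.array([0.002, 0.004])
r_rested = composite_reward(err, fatigue=0, d_k=1.0, d_b=0.5)
r_tired = composite_reward(err, fatigue=3, d_k=1.0, d_b=0.5)
print(r_rested > r_tired)  # the same error costs more under fatigue
```

Coupling the tracking penalty to the fatigue level is one plausible reading of "a trajectory reward term associated with the muscle fatigue level and the tracking error": it pushes the optimizer toward more compliant assistance as the user tires.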
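Claim 7's two control laws reduce to a first-order admittance in free space and a spring-damper relation in contact: v_d = F / B_0 and v_d = (F − K(x − x_c)) / B. The published text elides the formulas, so these symbol names and the numeric values below are reconstructions for illustration.

```python
def admittance_velocity(force: float, b0: float) -> float:
    """Free-space admittance mode of claim 7: v_d = F / B0."""
    return force / b0

def hybrid_admittance_velocity(force: float, pos: float, contact_pos: float,
                               k_opt: float, b_opt: float) -> float:
    """Contact mode of claim 7: v_d = (F - K*(x - x_c)) / B, so the
    virtual spring opposes penetration past the contact position."""
    return (force - k_opt * (pos - contact_pos)) / b_opt

# 2 N through a 4 N*s/m admittance damping -> 0.5 m/s in free space
print(admittance_velocity(2.0, 4.0))
# In contact, 5 mm past x_c with K = 100 N/m, B = 20 N*s/m
print(round(hybrid_admittance_velocity(2.0, 0.105, 0.1, 100.0, 20.0), 6))
```

Note the sign convention: pushing further into the virtual object (x > x_c) reduces the commanded velocity, which is what renders the object as a stiff, springy surface to the user.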

Description

Hand exoskeleton control method and system integrating fatigue perception and reinforcement learning

Technical Field

The invention relates to the technical field of rehabilitation robots and human-machine interaction control, in particular to a hand exoskeleton control method and system integrating fatigue perception and reinforcement learning.

Background

For hand motor dysfunction caused by diseases such as stroke and spinal cord injury, hand rehabilitation robots have become an important auxiliary tool for clinical rehabilitation training. In a typical control scheme, the patient wears a motor-driven rigid exoskeleton whose linkage mechanism drives the patient's fingers through rehabilitation movements such as flexion and extension. The control system usually offers several preset training modes: in passive training, the system moves the patient's hand repeatedly along a preset trajectory at a preset speed and constant resistance; in active training, the patient must actively apply a force above a fixed threshold to trigger the robot's assistance. To improve patient engagement, some systems also provide a display screen that gives visual feedback on the movement through a simple progress bar or basic animation.

However, these prior-art solutions have several significant drawbacks in practice that restrict the safety, effectiveness and user experience of rehabilitation training.

First, the control strategies are inflexible and cannot respond dynamically to the user's physiological state. Once the motion trajectory, speed, resistance and other parameters are preset, they remain unchanged throughout training, ignoring the patient's dynamic physiological changes; in particular, muscle fatigue that develops during training can be neither perceived nor responded to. When the patient's muscles are fatigued, the system still forces the training task at the original intensity, which not only causes discomfort and reduces the rehabilitation effect, but in severe cases can cause secondary injuries such as muscle strain, posing a safety hazard.

Second, the human-machine interaction experience is monotonous and lacks immersion and fidelity. The force feedback of existing systems is mostly a fixed, programmed resistance, far removed from the rich, dynamic haptic experience of gripping different objects (e.g. a sponge or a hard ball) in the real world. Meanwhile, the visual feedback is severely disconnected from the physical interaction and cannot construct a meaningful, immersive interaction scene. This mechanical, repetitive training pattern bores the patient, breeds resistance, results in poor training compliance, and fails to stimulate the cortical plasticity critical to neural function remodeling.

Third, the system control logic is fragmented and cannot deeply fuse multi-source information. In existing architectures, the physical control loop that processes forces and positions, the visual feedback loop that renders the virtual scene, and the module that monitors physiological state are often independent of one another, lacking deep information exchange and cooperation. In particular, the patient's real-time physiological state (such as the muscle fatigue level) is excluded from the main control loop and requires active intervention by the rehabilitation therapist; the system cannot render matching physical force feedback in real time and accurately in response to key events in the virtual scene (such as the virtual hand touching an elastic object), nor can it adaptively adjust the difficulty of the virtual task and the strength of the physical assistance according to the patient's real-time physiological state. This makes the whole system a simple stack of functions rather than an intelligent, unified, organically coordinated interactive whole, severely limiting the personalization and overall effectiveness of rehabilitation training.

Disclosure of Invention

The invention therefore aims to solve the technical problems that, in the prior art, the control strategy of a hand rehabilitation robot is rigid and cannot respond to changes in the user's physiological state, the human-machine interaction experience is monotonous and lacks immersion, and fragmented multi-source information is difficult to coordinate. To solve these problems, the invention provides a hand exoskeleton control method integrating fatigue perception an