CN-122009052-A - In-vehicle control method and system based on multi-mode interaction

Abstract

The invention discloses an in-vehicle control method and system based on multi-modal interaction, which aims to solve technical problems of existing in-vehicle control schemes such as limited applicable scenarios, missing safety mechanisms, and insufficient multi-modal interaction. The method builds a face-gesture-vehicle-state ternary cooperative control model. Multi-source data are collected through a near-infrared camera, a millimeter-wave radar, and an in-vehicle environment sensor; a multi-modal fusion engine performs data processing and feature extraction; a safety arbitration layer arbitrates control instructions according to preset rules; and finally a control instruction execution layer drives equipment such as the vehicle windows, air conditioner, and entertainment system to complete the corresponding operations. By introducing the vehicle's CAN bus signals as a control-enabling condition and integrating radio-frequency liveness detection with 3D gesture spatial positioning, the invention improves the safety, accuracy, and scene adaptability of in-vehicle control, reduces the false-trigger rate, and is suitable for the safe control of high-frequency interaction scenarios during driving.

Inventors

Rao Zhangmin
CHEN SHIRUI
LI CHAO
WU YOU
LIU BO

Assignees

Dongfeng Motor Group Co., Ltd. (东风汽车集团股份有限公司)

Dates

Publication Date: 2026-05-12
Application Date: 2026-02-06

Claims (10)

1. An in-vehicle control method based on multi-modal interaction, characterized by comprising the following steps: acquiring multi-modal vehicle data through a plurality of sensors, wherein the multi-modal data comprise facial image data of occupants, gesture spatial trajectory data, in-vehicle environment data, and vehicle running state data; performing face liveness detection based on the facial image data to obtain a face detection result; constructing a safety arbitration rule base, wherein the rule base comprises correspondences between control instructions and constraints on the face detection result, the gesture matching result, the in-vehicle environment data, and the vehicle running state data, and performing safety arbitration on a control instruction according to the face recognition result, the gesture recognition confidence, the in-vehicle environment data, and the vehicle running state data; and issuing the control instruction that passes the safety arbitration to the corresponding in-vehicle equipment through a control instruction execution layer, thereby driving the equipment to complete the operation specified by the control instruction.
2. The control method according to claim 1, wherein acquiring the multi-modal vehicle data through a plurality of sensors specifically comprises: acquiring the facial image data of occupants through a near-infrared camera, acquiring the gesture spatial trajectory data through a millimeter-wave radar, acquiring the in-vehicle environment data through an in-vehicle environment sensor, and acquiring the vehicle running state data through a CAN bus.
3. The control method according to claim 1, wherein the vehicle running state data include vehicle speed and gear, and the in-vehicle environment data include an in-vehicle […] concentration and light intensity.
4. The control method according to claim 1, wherein performing face liveness detection based on the facial image data to obtain a face detection result specifically comprises: processing the facial image data with a radio-frequency liveness detection method to distinguish a real person from a spoofed target, and improving face recognition accuracy under head deflection through a local feature alignment network.
5. The control method according to claim 1, wherein matching the gesture based on the gesture spatial trajectory data to obtain a gesture matching result specifically comprises: constructing a 3D gesture trajectory model from fused millimeter-wave radar and RGB camera data to realize gesture spatial positioning.
6. The control method according to claim 1, wherein the specific rules of the safety arbitration rule base include: when the control instruction is to fully lower a vehicle window, the conditions that face detection passes, gesture matching succeeds, and the vehicle speed is below a preset speed must all be satisfied, otherwise execution of the instruction is refused; when the control instruction is air-conditioner temperature adjustment, only successful gesture matching is required, with no vehicle speed or gear restriction; when the control instruction is emergency ventilation start, the conditions that face detection passes and the in-vehicle […] concentration is greater than a preset threshold must be satisfied, no gesture trigger is required, and the operations of slightly opening the sunroof and switching the air conditioner to external circulation are executed; when the control instruction is temporary release of the child lock, the conditions that an administrator's face detection passes, a specific gesture is successfully matched, and the gear is P must all be satisfied, otherwise the rear-row control restriction is not released.
7. The control method according to claim 1, further comprising an incremental user adaptation step, which specifically comprises: generating adversarial samples with a GAN to expand the gesture library, realizing online learning of new gestures based on a lightweight CNN classifier, and automatically associating a user's face ID with that user's commonly used gesture set to realize personalized control instruction matching.
8. The control method according to claim 1, wherein the gestures comprise non-contact hover gestures, including mid-air rotation to adjust volume, pressing the palm downward to lower a window, and rotating a clenched fist to adjust temperature.
9. An in-vehicle control system based on multi-modal interaction, characterized by comprising a data acquisition module, a multi-modal fusion module, a safety arbitration module, and a control instruction execution module, wherein: the data acquisition module is used for acquiring multi-modal vehicle data through a plurality of sensors, the multi-modal data comprising facial image data of occupants, gesture spatial trajectory data, in-vehicle environment data, and vehicle running state data; the multi-modal fusion module is used for performing face liveness detection based on the facial image data to obtain a face detection result; the safety arbitration module is used for constructing a safety arbitration rule base, wherein the rule base comprises correspondences between control instructions and constraints on the face detection result, the gesture matching result, the in-vehicle environment data, and the vehicle running state data, and for performing safety arbitration on a control instruction according to the face recognition result, the gesture recognition confidence, the in-vehicle environment data, and the vehicle running state data; and the control instruction execution module is used for issuing the control instruction that passes the safety arbitration to the corresponding in-vehicle equipment through a control instruction execution layer, thereby driving the equipment to complete the operation specified by the control instruction.
10. An electronic device, comprising: one or more processors; and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the control method of any one of claims 1 to 8.
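
The four arbitration rules recited in claim 6 can be illustrated with a small rule table. The following is a minimal sketch and not the patented implementation: the names `Command`, `Context`, `RULES`, and `arbitrate`, and both threshold values, are hypothetical placeholders chosen for illustration, since the claims only speak of a "preset speed" and a "preset threshold".

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Dict


class Command(Enum):
    WINDOW_FULLY_LOWER = auto()
    AC_TEMPERATURE_ADJUST = auto()
    EMERGENCY_VENTILATION = auto()
    CHILD_LOCK_TEMP_RELEASE = auto()


@dataclass(frozen=True)
class Context:
    face_passed: bool      # face liveness detection passed
    is_admin: bool         # the detected face belongs to an administrator
    gesture_matched: bool  # gesture matching succeeded
    speed_kmh: float       # vehicle speed read from the CAN bus
    gear: str              # current gear, e.g. "P" or "D"
    concentration: float   # in-vehicle concentration reading


# Assumed values; the claims do not state the preset speed or threshold.
PRESET_SPEED_KMH = 60.0
PRESET_CONCENTRATION = 1000.0

# One predicate per control instruction, mirroring the rules of claim 6.
RULES: Dict[Command, Callable[[Context], bool]] = {
    Command.WINDOW_FULLY_LOWER: lambda c: (
        c.face_passed and c.gesture_matched and c.speed_kmh < PRESET_SPEED_KMH
    ),
    # Gesture match only; no speed or gear restriction.
    Command.AC_TEMPERATURE_ADJUST: lambda c: c.gesture_matched,
    # No gesture trigger required.
    Command.EMERGENCY_VENTILATION: lambda c: (
        c.face_passed and c.concentration > PRESET_CONCENTRATION
    ),
    Command.CHILD_LOCK_TEMP_RELEASE: lambda c: (
        c.face_passed and c.is_admin and c.gesture_matched and c.gear == "P"
    ),
}


def arbitrate(command: Command, ctx: Context) -> bool:
    """Return True if the instruction may be executed, False if it is refused."""
    rule = RULES.get(command)
    if rule is None:
        return False  # instructions without a rule are refused by default
    return bool(rule(ctx))
```

Keeping the rules in a table rather than in nested conditionals means each instruction's enabling conditions can be read and audited on their own, and an instruction with no entry is refused by default.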

Description

In-vehicle control method and system based on multi-modal interaction

Technical Field

The invention relates to the technical field of human-machine interaction for intelligent vehicles, and in particular to an in-vehicle control method and system based on multi-modal interaction.

Background

With the rapid development of vehicle intelligence, in-vehicle interactive control is gradually evolving in a multi-modal direction. The mainstream in-vehicle control modes currently fall into two types: manual operation based on in-vehicle soft buttons, and intelligent voice control based on voice instruction recognition.

The manual operation scheme integrates various vehicle control buttons into the in-vehicle interface, through which a user can play music, adjust the windows, control the air conditioner, and so on. This scheme has obvious defects: the user must use hands and eyes at the same time, which distracts the driver while the vehicle is moving and creates a hazard to safe driving; some control buttons sit deep in the operation path and are hard to locate quickly while driving; and when the vehicle is bumping, it is difficult to tap the target button accurately on the touch screen.

In the intelligent voice control scheme, the vehicle owner wakes an intelligent voice assistant by voice and issues various vehicle control instructions to control the equipment.
However, this scheme has the following problems: voice recognition places high demands on the in-vehicle environment, so voice instructions are difficult to recognize accurately when the cabin is noisy; voice operation can be awkward and may disturb the rest of other occupants when others are in the vehicle; and if the vehicle owner speaks with a strong dialect, recognition accuracy drops and misrecognition may occur.

Therefore, there is a need for an in-vehicle control method and system based on multi-modal interaction that solves these problems of the prior art.

Disclosure of Invention

The invention aims to solve at least one of the technical problems in the prior art and provides an in-vehicle control method and system based on multi-modal interaction.

In a first aspect, an embodiment of the present invention provides an in-vehicle control method based on multi-modal interaction, including: acquiring multi-modal vehicle data through a plurality of sensors, wherein the multi-modal data comprise facial image data of occupants, gesture spatial trajectory data, in-vehicle environment data, and vehicle running state data; performing face liveness detection based on the facial image data to obtain a face detection result; constructing a safety arbitration rule base, wherein the rule base comprises correspondences between control instructions and constraints on the face detection result, the gesture matching result, the in-vehicle environment data, and the vehicle running state data, and performing safety arbitration on a control instruction according to the face recognition result, the gesture recognition confidence, the in-vehicle environment data, and the vehicle running state data; and issuing the control instruction that passes the safety arbitration to the corresponding in-vehicle equipment through a control instruction execution layer, thereby driving the equipment to complete the operation specified by the control instruction.
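
The flow of the first aspect (acquire, detect, arbitrate, execute) can be sketched as a single control step. This is an illustrative sketch under stated assumptions, not the disclosed implementation: `MultiModalFrame`, `control_step`, and `demo` are hypothetical names, the detector, matcher, arbiter, and executor are passed in as plain callables, and the stub thresholds in `demo` are invented for the example. The sketch covers only gesture-triggered instructions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Point3D = Tuple[float, float, float]


@dataclass
class MultiModalFrame:
    """One synchronized sample of the four data sources named in the method."""
    face_image: bytes                # facial image data (near-infrared camera)
    gesture_track: List[Point3D]     # gesture spatial trajectory data
    environment: Dict[str, float]    # in-vehicle environment data
    vehicle_state: Dict[str, object] # running state data, e.g. speed and gear


def control_step(
    frame: MultiModalFrame,
    detect_face: Callable[[bytes], bool],
    match_gesture: Callable[[List[Point3D]], Tuple[Optional[str], float]],
    arbitrate: Callable[[str, bool, float, Dict[str, float], Dict[str, object]], bool],
    execute: Callable[[str], None],
) -> Optional[str]:
    """Run one pass; return the issued instruction, or None if nothing was issued."""
    face_ok = detect_face(frame.face_image)
    command, confidence = match_gesture(frame.gesture_track)
    if command is None:
        return None  # no gesture recognized, nothing to arbitrate
    if not arbitrate(command, face_ok, confidence,
                     frame.environment, frame.vehicle_state):
        return None  # refused by the safety arbitration
    execute(command)  # hand over to the control instruction execution layer
    return command


def demo() -> List[str]:
    """Wire the step with stub components (all values are placeholders)."""
    issued: List[str] = []
    frame = MultiModalFrame(
        face_image=b"\x00",
        gesture_track=[(0.0, 0.0, 0.3), (0.0, -0.1, 0.3)],
        environment={"light": 300.0},
        vehicle_state={"speed_kmh": 20.0, "gear": "D"},
    )
    control_step(
        frame,
        detect_face=lambda img: True,
        match_gesture=lambda track: ("window_lower", 0.92),
        arbitrate=lambda cmd, face, conf, env, state: (
            face and conf >= 0.8 and state["speed_kmh"] < 30.0
        ),
        execute=issued.append,
    )
    return issued
```

Passing the components in as callables keeps the arbitration step as the only gate between recognition and execution, which matches the ordering described in the method.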
Specifically, the method comprises acquiring the facial image data of occupants through a near-infrared camera, acquiring the gesture spatial trajectory data through a millimeter-wave radar, acquiring the in-vehicle environment data through an in-vehicle environment sensor, and acquiring the vehicle running state data through a CAN bus.

Further, the vehicle running state data include vehicle speed and gear, and the in-vehicle environment data include an in-vehicle […] concentration and light intensity.

Further, performing face liveness detection based on the facial image data to obtain a face detection result comprises: processing the facial image data with a radio-frequency liveness detection method to distinguish a real person from a spoofed target, and improving face recognition accuracy under head deflection through a local feature alignment network.

Further, matching the gesture based on the gesture spatial trajectory data to obtain a gesture matching result specifically comprises: constructing a 3D gesture trajectory model from fused millimeter-wave radar and RGB camera data to realize gesture spatial positioning.

Further, the specific rules of the safety arbitration rule base include: when the control command is that the vehicle window is completely lowered, the con