CN-115861981-B - Driver fatigue behavior detection method and system based on video posture invariance

CN115861981B

Abstract

The invention discloses a driver fatigue behavior detection method and system based on video posture invariance, in the technical field of computer vision. The invention provides a key frame selection model based on facial geometric information and a head-face action information fusion spatio-temporal network. First, driver video captured by an in-vehicle camera is serialized and the image data are preprocessed. Then, a key frame selection model based on facial geometric information is constructed from the geometric characteristics of facial key points and a two-stage decision mechanism, and key frames are extracted from the video sequence. Finally, facial action patterns under arbitrary postures are extracted through face frontalization, and a head-face action information fusion spatio-temporal network is constructed in combination with head posture attributes obtained from head pose estimation; this network detects driver states such as yawning, speaking, and normal. The invention fully accounts for head posture attributes, is highly robust to posture variation, and can effectively distinguish fatigue behaviors such as yawning from other driver states.

Inventors

  • Chang Faliang
  • Lu Yansha
  • Liu Chunsheng
  • Huang Yiming
  • Liu Hui

Assignees

  • Shandong University (山东大学)

Dates

Publication Date
2026-05-08
Application Date
2022-11-25

Claims (4)

  1. A driver fatigue behavior detection method based on video posture invariance, characterized by comprising the following steps: serializing the acquired driver video and preprocessing the video image data; extracting facial geometric features from facial key points, designing a two-stage decision mechanism, constructing a key frame selection model based on facial geometric information, and extracting key frames from the video sequence; wherein the first stage of the two-stage decision mechanism comprises: computing inter-point distance ratios and angle relations for each video frame from the facial key point set to construct facial geometric features, computing the similarity between consecutive frames using the Euclidean distance to obtain a similarity set, and determining a similarity threshold to select candidate key frames; and the second stage comprises: selecting video frames with outlier characteristics from the candidate key frame queue to obtain the key frame set, the selection being based on two similarity metrics and outlier frame detection, wherein the similarity metrics are the Euclidean distance and the mean squared error, and the outlier frame detection uses the median absolute deviation; extracting facial action patterns under arbitrary postures through face frontalization on the selected key frames, and constructing a head-face action information fusion spatio-temporal network, in combination with head posture attributes obtained from head pose estimation, to detect fatigue behavior; wherein extracting facial action patterns under arbitrary postures through face frontalization comprises: adopting an encoder-decoder backbone network for representation learning, with two auxiliary mechanisms, illumination preservation and attention, introduced to generate a realistic illumination-preserving frontal image; and constructing the head-face action information fusion spatio-temporal network on a 3D convolutional network, which fuses the head posture attributes and the facial action patterns in a two-channel classifier to achieve posture invariance.
  2. The driver fatigue behavior detection method based on video posture invariance according to claim 1, wherein a face detection algorithm is used to detect the driver's face region and segment the driver's head-face action region; image denoising is then performed with a fast median filtering algorithm; illumination normalization is performed with contrast-limited adaptive histogram equalization; and finally facial key points are detected with a cascade of regression trees based on the dlib library.
  3. The driver fatigue behavior detection method based on video posture invariance according to claim 1, wherein obtaining the head posture attributes through head pose estimation comprises: designing a SqueezeNet-based head pose estimation method, and normalizing the Euler-angle representation of the head pose with sine and cosine functions to obtain the head posture attributes.
  4. A driver fatigue behavior detection system based on video posture invariance, characterized by comprising: a preprocessing module configured to serialize the acquired driver video and preprocess the video image data; a key frame module configured to build a key frame selection model based on facial geometric information and perform key frame selection, by extracting facial geometric features from facial key points, designing a two-stage decision mechanism, constructing the key frame selection model, and extracting key frames from the video sequence; wherein the first stage of the two-stage decision mechanism comprises: computing inter-point distance ratios and angle relations for each video frame from the facial key point set to construct facial geometric features, computing the similarity between consecutive frames using the Euclidean distance to obtain a similarity set, and determining a similarity threshold to select candidate key frames; and the second stage comprises: selecting video frames with outlier characteristics from the candidate key frame queue to obtain the key frame set, the selection being based on two similarity metrics and outlier frame detection, wherein the similarity metrics are the Euclidean distance and the mean squared error, and the outlier frame detection uses the median absolute deviation; and a fatigue behavior detection module configured to extract facial action patterns under arbitrary postures through face frontalization on the selected key frames, and to construct a head-face action information fusion spatio-temporal network, in combination with head posture attributes obtained from head pose estimation, to detect fatigue behavior; wherein extracting facial action patterns under arbitrary postures through face frontalization comprises: adopting an encoder-decoder backbone network for representation learning, with two auxiliary mechanisms, illumination preservation and attention, introduced to generate a realistic illumination-preserving frontal image; and constructing the head-face action information fusion spatio-temporal network on a 3D convolutional network, which fuses the head posture attributes and the facial action patterns in a two-channel classifier to achieve posture invariance.
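The two-stage key frame selection recited in claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the landmark pairs used for the distance-ratio and angle features, the similarity threshold, and the MAD multiplier are all illustrative assumptions, since the claims do not disclose the exact point set or threshold values.

```python
import numpy as np

def geometric_features(landmarks):
    """Build a simple facial geometry descriptor from 2D key points.

    Hypothetical feature: one inter-point distance ratio and the angle
    at a reference point (the exact landmark pairs are illustrative)."""
    pts = np.asarray(landmarks, dtype=float)
    d01 = np.linalg.norm(pts[0] - pts[1])
    d02 = np.linalg.norm(pts[0] - pts[2])
    d12 = np.linalg.norm(pts[1] - pts[2])
    ratio = d01 / (d02 + 1e-8)
    # angle at point 0 via the law of cosines
    cos_a = (d01**2 + d02**2 - d12**2) / (2 * d01 * d02 + 1e-8)
    angle = np.arccos(np.clip(cos_a, -1.0, 1.0))
    return np.array([ratio, angle])

def select_keyframes(feature_seq, sim_thresh=0.5, mad_k=2.5):
    """Two-stage selection: Euclidean distance between consecutive
    frames picks candidates (stage 1); median absolute deviation (MAD)
    keeps the outlier, i.e. most informative, candidates (stage 2)."""
    feats = np.asarray(feature_seq, dtype=float)
    # stage 1: Euclidean distance between consecutive frame features
    dists = np.linalg.norm(np.diff(feats, axis=0), axis=1)
    candidates = np.where(dists > sim_thresh)[0] + 1  # frames that changed
    if candidates.size == 0:
        return []
    # stage 2: MAD-based outlier detection over the candidate distances
    cd = dists[candidates - 1]
    med = np.median(cd)
    mad = np.median(np.abs(cd - med)) + 1e-8
    keep = candidates[np.abs(cd - med) / mad > mad_k]
    # fall back to all candidates if none are strong outliers
    return keep.tolist() if keep.size else candidates.tolist()
```

In a full pipeline, `geometric_features` would run on the dlib landmark set of every frame, and `select_keyframes` on the resulting feature sequence; the claims additionally use the mean squared error as a second similarity metric, omitted here for brevity.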
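The sine/cosine normalization of claim 3 can be shown directly. The SqueezeNet-based pose estimator itself is not reproduced; the sketch assumes some estimator already supplies yaw, pitch, and roll in degrees.

```python
import numpy as np

def head_pose_attribute(yaw, pitch, roll):
    """Encode Euler angles (degrees) as a [sin, sin, sin, cos, cos, cos]
    attribute vector. The encoding is continuous and periodic, so poses
    near the +/-180 degree wrap-around map to nearby vectors, which a
    raw angle representation does not guarantee."""
    angles = np.radians([yaw, pitch, roll])
    return np.concatenate([np.sin(angles), np.cos(angles)])
```

The resulting 6-dimensional attribute is what the claims fuse with the facial action patterns in the two-channel classifier.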

Description

Driver fatigue behavior detection method and system based on video posture invariance

Technical Field

The invention relates to the technical field of computer vision, in particular to a driver fatigue behavior detection method and system based on video posture invariance.

Background

The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art. Vision-based Advanced Driver Assistance Systems (ADAS) mainly provide environmental awareness, driver monitoring, early warning, and assisted vehicle control. In recent years, ADAS has been a subject of intense research aimed at improving road safety and traffic efficiency. Fatigue driving is a dangerous driving state: a fatigued driver often suffers physiological and psychological impairment, and driving skill objectively deteriorates. Fatigue driving is one of the main causes of serious traffic accidents worldwide. Visual fatigue driving detection methods focus mainly on behavioral characteristics: driver video is collected by a camera, which is non-contact and convenient to deploy, so such methods can monitor the driver's state and issue timely warnings, and they have high practical value for reducing traffic accidents. Over the past few decades, many researchers have proposed driver fatigue detection methods to help drivers drive safely and improve traffic safety. Behavioral characteristics of fatigued driving include blinking, nodding, eye closure, yawning, and the like; among them, yawning is one of the main manifestations of fatigue.
In a real driving environment, because of high real-time requirements, complex facial expressions, and changeable head poses, existing methods struggle to detect fatigue behaviors such as yawning accurately and robustly; how to detect fatigue behaviors efficiently and accurately in the actual driving environment has therefore become an urgent problem.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a driver fatigue behavior detection method and system based on video posture invariance, which addresses the problems that video frames are excessively redundant, that detection under arbitrary postures cannot be achieved effectively, and that yawning cannot be accurately distinguished from similar behaviors, and improves the accuracy and robustness of fatigue behavior detection. To achieve the above object, the invention is realized by the following technical scheme. The first aspect of the invention provides a driver fatigue behavior detection method based on video posture invariance, comprising the following steps: serializing the acquired driver video and preprocessing the video image data; designing a key frame selection model based on facial geometric information and performing key frame selection; and extracting facial action patterns under arbitrary postures through face frontalization on the selected key frames, and constructing a head-face action information fusion spatio-temporal network, in combination with head posture attributes obtained from head pose estimation, to detect fatigue behavior. Further, preprocessing of the video image data includes image denoising, histogram normalization, facial key point detection, and face segmentation.
Further, a face detection algorithm detects the driver's face region and segments the driver's head-face action region; image denoising is then performed with a fast median filtering algorithm; illumination normalization is performed with contrast-limited adaptive histogram equalization (CLAHE); and finally facial key points are detected with a cascade of regression trees (ERT) based on the dlib library. Further, designing the key frame selection model based on facial geometric information and performing key frame selection comprises: extracting facial geometric features from facial key points, designing a two-stage decision mechanism, constructing the key frame selection model, and extracting key frames from the video sequence. Further, the first stage of the two-stage decision mechanism comprises: computing inter-point distance ratios and angle relations for each video frame from the facial key point set to construct facial geometric features, computing the similarity between consecutive frames using the Euclidean distance to obtain a similarity set, and determining a similarity threshold to select candidate key frames.
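The denoising and illumination-normalization steps above can be sketched in plain NumPy. This is a naive illustration under stated simplifications: the patent uses a fast median filtering algorithm and CLAHE (available in OpenCV as `cv2.medianBlur` and `cv2.createCLAHE`), whereas the versions below are a brute-force median filter and global histogram equalization, which convey the operations without the speed-up or the per-tile contrast limiting.

```python
import numpy as np

def median_denoise(img, k=3):
    """Brute-force k x k median filter on a grayscale image.
    (The patent's fast median filter computes the same result with an
    incrementally updated histogram; this version shows the operation.)"""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = np.median(padded[y:y + k, x:x + k])
    return out

def equalize(img):
    """Global histogram equalization on an 8-bit grayscale image.
    CLAHE applies the same remapping per tile with a clip limit on the
    histogram; the global form below conveys the idea."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(float)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min() + 1e-8)
    lut = np.round(255 * cdf).astype(np.uint8)  # intensity look-up table
    return lut[img]
```

In the pipeline these would run on the segmented head-face region before landmark detection, so that salt-and-pepper noise and uneven in-cabin illumination do not perturb the geometric features.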