CN-121982684-A - Driver state early warning system based on large model


Abstract

The invention relates to the field of traffic monitoring and discloses a driver state early warning system based on a large model. The system comprises a roadside image acquisition module, an edge calculation module, a central analysis module and an early warning module. The edge calculation module processes video streams in real time, identifies forbidden articles inside the vehicle, and extracts the microscopic action time-sequence features of the driver and the motion trajectory features of the vehicle. The central analysis module uses a pre-trained multi-modal large model to perform spatio-temporal association and semantic understanding on these features and to comprehensively determine dangerous driving behaviors and risk levels. The system further comprises a multi-level persistence state verification module, which, after an initial high-risk determination, performs secondary verification of the trajectory and of the physiological damage state and makes a graded decision according to the verification results. Through multi-modal fusion analysis and the multi-level verification mechanism, the invention improves the accuracy and reliability of identifying concealed dangerous driving behaviors and can automatically generate a structured law-enforcement evidence package.

Inventors

CHEN YI
CHEN GUANGZHU
XU SHAOQING
HUANG ANLI
YU JIE

Assignees

Chizhou Municipal Public Security Bureau (池州市公安局)

Dates

Publication Date: 2026-05-05
Application Date: 2026-01-14

Claims (9)

1. A driver state early warning system based on a large model, comprising: a roadside image acquisition module for acquiring a video stream containing the face and hand areas of the driver of a target vehicle; an edge calculation module, deployed at the roadside, for processing the video stream in real time, the processing comprising identifying, based on a pre-trained target detection model, whether a preset forbidden article appears in the vehicle, and extracting microscopic action time-sequence features of the driver and motion trajectory features of the vehicle; a central analysis module for receiving the forbidden article identification result, the microscopic action time-sequence features and the motion trajectory features and inputting them into a pre-trained multi-modal large model; and an early warning module for automatically generating an early warning instruction when high-risk behavior is determined.
2. The large model-based driver state early warning system of claim 1, wherein the analysis process of the multi-modal large model specifically comprises: aligning the presence state of the forbidden articles, the microscopic action time-sequence features and the motion trajectory features, and mapping the aligned features into a unified time-sequence embedding space; calculating, based on an attention mechanism, cross-modal correlation weights among the features of different modalities in the time-sequence embedding space, so as to infer behavioral intention and the degree of driving ability damage; and performing a context-aware weighted evaluation of the risk level in combination with the type of road section on which the target vehicle is located and the time information.
3. The large model-based driver state early warning system of claim 2, wherein the microscopic action time-sequence features of the driver are extracted and calculated by: making a determination based on the spatial position relationship between the hand key points and the bounding box of the identified forbidden article, and determining a handheld state when the center point of the article bounding box remains, for more than a first preset number of frames, within a region centered on a hand key point with a preset pixel distance as its radius; if D(t) decreases from greater than a first distance threshold to less than a second distance threshold within a second preset number of frames, and then remains below the second distance threshold for more than a third preset number of frames, determining a suspected sucking action; and, at the same time, calculating the standard deviations of the head pitch angle and yaw angle within a short time window as a quantitative index of reduced head control.
4. The large model-based driver state early warning system of claim 3, wherein the motion trajectory features of the vehicle are extracted and calculated from the video stream by: performing perspective transformation and lane line detection on each frame of image, establishing a coordinate system referenced to the lane center line, and determining the lateral pixel position X(t) of the vehicle in the image from the vehicle detection box; converting X(t) into an approximate real-world coordinate system with the lane center as the zero point to obtain a lateral offset sequence f(t), calculating the standard deviation of f(t) within an evaluation window, and calculating the cumulative proportion of time during which |f(t)| exceeds a preset safety threshold; and, after low-pass filtering the lateral position sequence X(t), calculating the mean of the absolute values of the first-order differences of X(t) as a quantitative index of trajectory jitter.
5. The large model-based driver state early warning system of claim 1, further comprising a multi-level persistence state verification module; the multi-level persistence state verification module is activated after the central analysis module first outputs a high-risk determination and performs first-level persistence verification, specifically: continuing to analyze the motion trajectory features of the vehicle within a first verification time window after the initial determination; and if, among the motion trajectory features, the standard deviation of the lateral offset sequence f(t), the cumulative proportion of time exceeding the preset safety threshold, or the quantitative index of trajectory jitter continuously exceeds its corresponding threshold, determining that the trajectory anomaly persists.
6. The large model-based driver state early warning system of claim 5, wherein the multi-level persistence state verification module performs second-level deep verification if it is determined that the trajectory anomaly persists, specifically: instructing the edge calculation module to increase the sampling frequency of the driver's facial features for the target vehicle, and obtaining updated microscopic action time-sequence features; calculating, based on the updated microscopic action time-sequence features, a comprehensive physiological damage persistence risk index $R$ by the following formulas: $R = w_1 \cdot I_P + w_2 \cdot \max(\sigma_\theta, \sigma_\psi)$; $I_P = \int_{t_1}^{t_2} \max\bigl(0,\; P(t) - P_0\bigr)\, dt$; wherein: $t_1$ and $t_2$ are the start and end times of the second-level deep verification time window; $P(t)$ is the pupil aspect ratio at time $t$, and $P_0$ is the pupil aspect ratio reference value of the driver in a normal state; $I_P$ represents the time integral, over the second-level deep verification time window, of the positive deviation of the pupil aspect ratio above the normal baseline; $\sigma_\theta$ and $\sigma_\psi$ are the standard deviations of the head pitch angle and yaw angle, respectively, within the second-level deep verification window; $\max(\sigma_\theta, \sigma_\psi)$ takes the larger of the two as the representative value of head axial instability; $w_1$ and $w_2$ are preset weight coefficients; and if $R$ is greater than a preset physiological damage risk threshold, determining that the physiological damage state persists.
7. The large model-based driver state early warning system of claim 6, wherein the multi-level persistence state verification module further performs the following: when the first-level persistence verification determines that the trajectory anomaly persists and the second-level deep verification determines that the physiological damage state persists, confirming that the initial high-risk determination is valid and raising the final risk confirmation level; when only one level of verification is satisfied, marking the early warning as requiring manual review; and when neither level of verification is satisfied, automatically cancelling the initial high-risk determination.
8. The large model-based driver state early warning system of claim 5, wherein the duration of the first verification time window is dynamically adjusted according to the risk confidence at the time of the initial determination, the average speed on the current road section, and the weather visibility.
9. The large model-based driver state early warning system of claim 1, further comprising: an evidence generation module for synchronously preserving, when the early warning instruction is automatically generated, the key video clips containing the whole action process, binding the key video clips with the vehicle identity information, the behavior type label, the time and the geographic position, and generating a structured law-enforcement evidence package.
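
The micro-action rules of claim 3 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The function names, the per-frame input layout, and the reading of D(t) as a per-frame hand-to-mouth distance are all assumptions made here for illustration, since the claim does not define D(t) explicitly. The population standard deviation is likewise an assumed choice.

```python
import math
from statistics import pstdev


def is_handheld(hand_pts, box_centers, radius_px, min_frames):
    """Handheld state: the article box centre stays within radius_px of the
    hand key point for more than min_frames consecutive frames."""
    run = 0
    for hand, center in zip(hand_pts, box_centers):
        if math.dist(hand, center) <= radius_px:
            run += 1
            if run > min_frames:
                return True
        else:
            run = 0
    return False


def is_suspected_sucking(d, far_thr, near_thr, max_approach_frames, min_hold_frames):
    """Suspected sucking action on a distance sequence d (assumed D(t)):
    d falls from above far_thr to below near_thr within max_approach_frames,
    then stays below near_thr for more than min_hold_frames frames."""
    last_far = None      # index of the most recent frame with d > far_thr
    hold = 0             # consecutive frames with d < near_thr
    approached = False   # whether the current hold began with a fast approach
    for t, dist in enumerate(d):
        if dist > far_thr:
            last_far = t
        if dist < near_thr:
            if hold == 0:
                approached = (last_far is not None
                              and t - last_far <= max_approach_frames)
            hold += 1
            if approached and hold > min_hold_frames:
                return True
        else:
            hold = 0
            approached = False
    return False


def head_instability(pitch_deg, yaw_deg):
    """Standard deviations of head pitch and yaw over a short window."""
    return pstdev(pitch_deg), pstdev(yaw_deg)
```

All thresholds and frame counts are passed in as parameters because the claim only calls them "preset"; suitable values would depend on camera frame rate and resolution.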
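
The trajectory features of claim 4 can be sketched in the same way. This is an illustrative sketch under stated assumptions, not the patented method: the linear pixel-to-metre conversion, the 3.75 m default lane width, and the moving-average filter standing in for the unspecified low-pass filter are all choices made here, and every function name is hypothetical.

```python
from statistics import pstdev


def to_lateral_offset(x_px, lane_center_px, lane_width_px, lane_width_m=3.75):
    """Convert lateral pixel positions X(t) to approximate offsets f(t) in
    metres from the lane centre (assumes a linear scale after perspective
    transformation; lane_width_m is an assumed nominal lane width)."""
    scale = lane_width_m / lane_width_px
    return [(x - lane_center_px) * scale for x in x_px]


def lateral_offset_features(f, safety_thr_m):
    """Standard deviation of f(t) and the fraction of frames in the
    evaluation window where |f(t)| exceeds the safety threshold."""
    sigma = pstdev(f)
    over_ratio = sum(abs(v) > safety_thr_m for v in f) / len(f)
    return sigma, over_ratio


def moving_average(x, k=3):
    """Simple moving-average low-pass filter over windows of k samples."""
    return [sum(x[i:i + k]) / k for i in range(len(x) - k + 1)]


def jitter_index(x, k=3):
    """Mean absolute first-order difference of the filtered X(t).
    Requires at least k + 1 samples."""
    xs = moving_average(x, k)
    diffs = [abs(b - a) for a, b in zip(xs, xs[1:])]
    return sum(diffs) / len(diffs)
```

With a fixed frame rate, the fraction of frames over the threshold equals the cumulative proportion of time named in the claim. A steady lateral drift yields a nonzero jitter index under this definition, so a deployment would presumably pair it with a threshold tuned on real traffic video.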
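
The two-level verification and graded decision of claims 5 to 7 can be sketched as follows. The sketch assumes that the risk index of claim 6 is a weighted sum of (i) the time integral of the positive deviation of the pupil aspect ratio above its baseline and (ii) the larger of the head pitch and yaw standard deviations, as the symbol definitions in claim 6 suggest. The trapezoidal integration, the threshold dictionary keys, and the decision labels are hypothetical. The first-level check is simplified to indices already computed over the whole verification window.

```python
def trajectory_anomaly_persists(sigma_f, over_ratio, jitter, thr):
    """First-level check: any trajectory index above its threshold.
    thr is a dict with hypothetical keys 'sigma_f', 'over_ratio', 'jitter'."""
    return (sigma_f > thr["sigma_f"]
            or over_ratio > thr["over_ratio"]
            or jitter > thr["jitter"])


def damage_risk_index(times, pupil_ratio, baseline, sigma_pitch, sigma_yaw, w1, w2):
    """Assumed form of the claim 6 index:
    w1 * integral of max(0, P(t) - P0) dt + w2 * max(sigma_pitch, sigma_yaw).
    The integral is approximated with the trapezoidal rule."""
    excess = [max(0.0, p - baseline) for p in pupil_ratio]
    integral = sum(
        (times[i + 1] - times[i]) * (excess[i] + excess[i + 1]) / 2.0
        for i in range(len(times) - 1)
    )
    return w1 * integral + w2 * max(sigma_pitch, sigma_yaw)


def graded_decision(track_persists, damage_persists):
    """Graded decision of claim 7 (labels are illustrative)."""
    if track_persists and damage_persists:
        return "confirmed_escalated"   # confirm and raise the risk level
    if track_persists or damage_persists:
        return "manual_review"         # only one level satisfied
    return "cancelled"                 # neither level satisfied
```

In a deployment following claim 6, `damage_risk_index` would only be evaluated after `trajectory_anomaly_persists` returns true, and its result would be compared against the preset physiological damage risk threshold before calling `graded_decision`.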

Description

Driver state early warning system based on large model

Technical Field

The invention relates to the field of traffic monitoring, in particular to a driver state early warning system based on a large model.

Background

Currently, driver state monitoring mainly relies on single-modality detection, which typically analyzes a single type of sensor data. For example:
Visual behavior analysis captures facial images of the driver through an on-board or roadside camera and uses computer vision algorithms to analyze features such as eye openness, gaze direction and head posture in order to judge fatigue or distraction. However, this approach has difficulty identifying the specific, complex hand-to-mouth coordinated action of sucking forbidden articles.
Vehicle behavior analysis analyzes vehicle trajectories from vehicle video and monitors anomalies such as lane deviation and steering wheel angle fluctuation. The early warning signal of this method lags: it is usually triggered only after driving ability has already been substantially impaired, and it cannot distinguish whether the impairment is caused by substance abuse, sudden illness or ordinary distraction.

Disclosure of Invention

The invention aims to provide a driver state early warning system based on a large model that solves the above technical problems.
The aim of the invention can be achieved by the following technical scheme: a driver state early warning system based on a large model, comprising: a roadside image acquisition module for acquiring a video stream containing the face and hand areas of the driver of a target vehicle; an edge calculation module, deployed at the roadside, for processing the video stream in real time, the processing comprising identifying, based on a pre-trained target detection model, whether a preset forbidden article appears in the vehicle, and extracting microscopic action time-sequence features of the driver and motion trajectory features of the vehicle; a central analysis module for receiving the forbidden article identification result, the microscopic action time-sequence features and the motion trajectory features and inputting them into a pre-trained multi-modal large model; and an early warning module for automatically generating an early warning instruction when high-risk behavior is determined.
As a further technical solution, the analysis process of the multi-modal large model specifically includes: aligning the presence state of the forbidden articles, the microscopic action time-sequence features and the motion trajectory features, and mapping the aligned features into a unified time-sequence embedding space; calculating, based on an attention mechanism, cross-modal correlation weights among the features of different modalities in the time-sequence embedding space, so as to infer behavioral intention and the degree of driving ability damage; and performing a context-aware weighted evaluation of the risk level in combination with the type of road section on which the target vehicle is located and the time information.
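
As a rough illustration of the attention step described above, the following Python sketch computes scaled dot-product attention weights between three modality embeddings that are assumed to already share one embedding space. It is a toy stand-in, not the pre-trained multi-modal large model: it omits learned query and key projections, and the modality names, the multiplicative context factors and the cap at 1.0 are hypothetical choices made here.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_weights(embeddings):
    """embeddings: dict mapping modality name -> vector of equal length.
    Returns, for each query modality, softmax-normalised scaled dot-product
    weights over all modalities."""
    names = list(embeddings)
    dim = len(embeddings[names[0]])
    weights = {}
    for q in names:
        scores = [
            sum(a * b for a, b in zip(embeddings[q], embeddings[k])) / math.sqrt(dim)
            for k in names
        ]
        weights[q] = dict(zip(names, softmax(scores)))
    return weights


def context_weighted_risk(base_risk, road_factor, time_factor):
    """Hypothetical context-aware weighting: scale a base risk score by
    road-section and time-of-day factors, capped at 1.0."""
    return min(1.0, base_risk * road_factor * time_factor)
```

Each row of the returned weights sums to 1, and modalities whose embeddings point in similar directions receive larger mutual weights, which is the sense in which the weights express cross-modal correlation in this sketch.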
As a further technical solution, the microscopic action time-sequence features of the driver are extracted and calculated by the following steps: making a determination based on the spatial position relationship between the hand key points and the bounding box of the identified forbidden article, and determining a handheld state when the center point of the article bounding box remains, for more than a first preset number of frames, within a region centered on a hand key point with a preset pixel distance as its radius; if D(t) decreases from greater than a first distance threshold to less than a second distance threshold within a second preset number of frames, and then remains below the second distance threshold for more than a third preset number of frames, determining a suspected sucking action; and, at the same time, calculating the standard deviations of the head pitch angle and yaw angle within a short time window as a quantitative index of reduced head control.
As a further technical solution, the motion trajectory features of the vehicle are extracted and calculated from the video stream by the following steps: performing perspective transformation and lane line detection on each frame of image, establishing a coordinate system referenced to the lane center line, and determining the lateral pixel position X(t) of the vehicle in the image from the vehicle detection box; converting X(t) into an approximate real-world coordinate system taking the lane center as