CN-121982662-A - Intelligent video traffic event detection system and method based on device-edge-cloud collaboration
Abstract
The invention discloses an intelligent detection system and method for video traffic events based on device-edge-cloud collaboration. The system comprises cameras at the device layer, NPU edge computing equipment at the edge layer, and a cloud server at the cloud layer. The cloud uses a large model to automatically complete road-scene semantic understanding and geometric calibration of the camera view, and the generated calibration data are transmitted to the edge device cache. Based on the cached data, the edge device performs local lightweight target tracking and traffic-parameter calculation on the video stream to achieve preliminary identification of suspected events; only when a suspected event is identified are the associated video clips and metadata packaged and uploaded. The cloud performs fusion verification and misjudgment-prevention processing on the reported events and outputs standardized event primitives to the traffic management system. The method realizes automatic calibration and rapid deployment, greatly reduces network bandwidth consumption and cloud load through edge-local processing and event-triggered uploading, and effectively improves the overall accuracy and reliability of event detection through cloud secondary verification.
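The event-triggered uploading described above can be illustrated with a minimal sketch of an edge-side event package. The patent specifies the package contents (event type, occurrence time, location coordinates, related object ID, and a download address for the key video clip) but not a concrete schema, so all field names and the URL below are illustrative assumptions:

```python
import json
import time

def build_event_package(event_type, object_id, lon, lat, clip_url):
    """Edge side: pack event metadata plus a download address for the key
    video clip. The clip itself stays on the edge node; the cloud fetches
    it on demand via the address, which keeps bandwidth use low."""
    return json.dumps({
        "event_type": event_type,           # e.g. "suspected_wrong_way"
        "occurred_at": time.time(),         # occurrence time (epoch seconds)
        "location": {"lon": lon, "lat": lat},
        "object_id": object_id,             # tracked target that triggered it
        "clip_url": clip_url,               # cloud pulls the clip if needed
    })

# Hypothetical edge node and clip path, for illustration only.
pkg = build_event_package("suspected_wrong_way", 42, 112.98, 28.19,
                          "https://edge-node-07.example/clips/evt_0042.mp4")
```

The cloud layer would parse this package, download the referenced clip, and run its verification pipeline before emitting a standardized event primitive.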
Inventors
- GONG GUOHUI
- SUI QIANG
- XIA YIMIN
Assignees
- 湖南长城银河科技有限公司
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2026-01-23
Claims (10)
- 1. An intelligent video traffic event detection system based on device-edge-cloud collaboration, characterized by comprising: a device layer comprising a plurality of video cameras arranged along the road, used for collecting real-time video streams of the road area and transmitting them to the edge layer over a network; an edge layer comprising edge computing equipment deployed close to the cameras and provided with NPU acceleration hardware, used for receiving and decoding the video streams, locally executing lightweight multi-target detection and tracking, obtaining traffic parameters based on the tracking results, performing space-constraint analysis based on the traffic parameters and cached road calibration data to preliminarily identify suspected events, packaging each preliminarily identified suspected event together with its associated event metadata and a download address of the key video clip into an event package, and uploading the event package to the cloud layer; and a cloud layer comprising a cloud server deployed in a data center and integrating a large-model reasoning engine and an event fusion platform, used for performing road-scene semantic understanding and geometric calibration on camera images uploaded by the edge layer based on a large model, generating road calibration data and transmitting it to the edge layer, receiving event packages uploaded by the edge layer, acquiring the corresponding key video clips according to the download addresses in the event packages, performing fusion verification and misjudgment-prevention processing on the reported preliminary events, and generating standardized final event primitives for output to the traffic information management system; wherein the edge computing device uploads the camera image to the cloud server to acquire and cache calibration data when first deployed or when recalibration is needed, and in normal operation uploads an event package containing the download address of the key video clip to the cloud for verification only when a suspected event is identified.
- 2. The intelligent video traffic event detection system according to claim 1, wherein the lightweight multi-target detection and tracking performed by the edge computing device comprises vehicle detection and multi-target tracking, with vehicle speed estimation and density statistics performed on the tracking results to obtain the traffic parameters; the space-constraint analysis applies the region of interest and the pixel-to-geography mapping relationship contained in the road calibration data to the traffic parameters to preliminarily identify suspected events; and the preliminarily identified suspected events comprise suspected vehicle breakdown, suspected wrong-way driving, illegal intrusion and local congestion events.
- 3. The intelligent video traffic event detection system according to claim 2, wherein the large-model reasoning engine integrated in the cloud server comprises a large-scale visual model based on the Transformer architecture, used for performing fully automatic road semantic analysis and geometric calibration on the input camera image; the semantic analysis comprises road semantic segmentation and lane-line extraction on the image, the road semantic segmentation at least identifying the drivable road area, non-motorized lanes, crosswalks and the median; and the geometric calibration comprises dividing regions of interest in the image, extracting key geometric feature points, establishing a mapping relationship between image pixel coordinates and actual geographic distances, and automatically identifying the number and direction of lanes to generate a bird's-eye-view projection matrix.
- 4. The intelligent video traffic event detection system according to claim 3, wherein the road calibration data is in JSON format, generated by the cloud server and issued to the edge computing device for persistent storage and local caching, and the edge computing device is configured to trigger a self-checking calibration procedure periodically or when recalibration is required, and to request updated road calibration data from the cloud server.
- 5. The intelligent video traffic event detection system according to claim 4, wherein the event package comprises the event metadata and the download address of the key video clip, the download address being used by the cloud server to obtain the corresponding video clip, and the event metadata at least comprising the event type, occurrence time, location coordinates and related object ID.
- 6. The intelligent video traffic event detection system according to claim 5, wherein the fusion verification and misjudgment-prevention processing of reported preliminary events by the cloud server comprises: time alignment and spatial matching of event streams from multiple edge nodes to judge whether they are duplicate reports of the same event from different viewing angles; context verification combining auxiliary information on weather, holidays and historical traffic patterns; multi-model voting cross-verification by activating a standby model; and dynamic confidence scoring that adjusts the judgment threshold according to environmental complexity.
- 7. An intelligent video traffic event detection method based on device-edge-cloud collaboration, characterized by being applied to the system as claimed in any one of claims 1 to 6 and comprising the following stages: an installation and calibration stage, in which, when the edge computing equipment is deployed for the first time or needs recalibration, a camera image is uploaded to the cloud server, and the cloud server performs semantic understanding and geometric calibration on the image based on a large model, generates road calibration data and sends it to the edge computing equipment for caching; and an operation detection stage, in which the edge computing equipment performs multi-target detection and tracking on the real-time video stream based on the cached calibration data, performs space-constraint analysis on the traffic parameters obtained from tracking together with the calibration data to preliminarily identify events, packages each preliminarily identified suspected event with its associated event metadata and the download address of the key video clip as an event package, and uploads it to the cloud server; the cloud server obtains the corresponding key video clip according to the download address, performs fusion verification and misjudgment-prevention processing on the reported preliminary event, and generates and outputs the final event primitives.
- 8. The intelligent video traffic event detection method according to claim 7, wherein the installation and calibration stage specifically comprises: the camera starts up and pushes its video stream to the edge computing equipment; the edge computing equipment initiates a calibration request carrying the camera ID and an initial video frame to the cloud server; the cloud server invokes the pre-trained large model to perform semantic analysis on the input image, segments the road area, extracts geometric feature points, establishes the mapping relationship between pixel coordinates and geographic distances, identifies lane information and generates the bird's-eye-view projection matrix; and the cloud server returns the calibration result to the edge computing equipment for persistent storage.
- 9. The intelligent video traffic event detection method according to claim 8, wherein the operation detection stage specifically comprises: the edge computing equipment receives the real-time video stream and uses the cached calibration data to delimit the analysis area; starts the multi-target detection and tracking algorithm to continuously track vehicle position, speed and direction; calculates the actual vehicle running speed, headway and area occupancy indices based on the calibrated spatial parameters; and triggers the preliminary event judgment logic and packages the event-related information into an event package for upload; the cloud server receives the event package, acquires the key video clip according to the download address in the event package, and performs time alignment and spatial matching, context verification, multi-model voting and dynamic confidence scoring; and confirmed valid events are turned into standard event primitives and pushed to the traffic information management system through an API interface.
- 10. The intelligent video traffic event detection method according to claim 9, wherein the preliminary event judgment logic comprises: if a vehicle remains stationary for longer than a threshold time, judging a suspected vehicle breakdown event; if a target continuously moves in the reverse direction for multiple frames, judging a suspected wrong-way driving event; if a pedestrian appears on a closed expressway section, judging an illegal intrusion event; and if the vehicle density in a local area suddenly increases and the average speed falls below a threshold, judging a local congestion event.
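The four screening rules of the preliminary event judgment logic in claim 10 can be sketched as edge-side Python. All thresholds, field names and the `Track` structure below are illustrative assumptions; the patent does not specify concrete values or data types:

```python
from dataclasses import dataclass

# Illustrative thresholds -- the patent leaves the concrete values open.
STOP_TIME_S = 120.0         # stationary longer than this -> suspected breakdown
WRONG_WAY_FRAMES = 15       # consecutive reverse-motion frames -> wrong-way
DENSITY_JUMP = 1.5          # density ratio vs. baseline -> congestion check
CONGESTION_SPEED_KMH = 20.0 # average speed below this -> congestion check

@dataclass
class Track:
    object_id: int
    object_class: str       # "vehicle" or "pedestrian"
    speed_kmh: float        # estimated via the pixel-to-geography mapping
    stationary_s: float     # time the target has not moved
    reverse_frames: int     # consecutive frames moving against lane direction

def preliminary_events(tracks, zone_density, avg_zone_speed_kmh,
                       baseline_density, road_is_closed_section=True):
    """First-pass screening on the edge node; returns suspected events only.
    Anything returned here is packaged and sent to the cloud for verification."""
    events = []
    for t in tracks:
        if t.object_class == "vehicle" and t.stationary_s > STOP_TIME_S:
            events.append(("suspected_breakdown", t.object_id))
        if t.object_class == "vehicle" and t.reverse_frames >= WRONG_WAY_FRAMES:
            events.append(("suspected_wrong_way", t.object_id))
        if t.object_class == "pedestrian" and road_is_closed_section:
            events.append(("illegal_intrusion", t.object_id))
    # Local congestion is a zone-level rule, not tied to a single target.
    if (zone_density > DENSITY_JUMP * baseline_density
            and avg_zone_speed_kmh < CONGESTION_SPEED_KMH):
        events.append(("local_congestion", None))
    return events

tracks = [Track(1, "vehicle", 0.0, 150.0, 0),   # long-stationary vehicle
          Track(2, "pedestrian", 4.0, 0.0, 0)]  # pedestrian on closed section
events = preliminary_events(tracks, zone_density=10,
                            avg_zone_speed_kmh=50.0, baseline_density=8)
```

Note that these checks deliberately err toward over-reporting: the cloud's fusion verification (claim 6) is responsible for filtering out false positives, so the edge only needs cheap, conservative rules.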
Description
Video traffic event intelligent detection system and method based on device-edge-cloud collaboration

Technical Field
The invention belongs to the technical field of intelligent traffic systems and computer vision, and particularly relates to an intelligent video traffic event detection system and method based on a device-edge-cloud three-tier collaborative architecture, particularly suitable for detecting and handling, in real time, traffic events such as congestion, abnormal parking, wrong-way driving and pedestrian intrusion in scenes such as urban roads and highways.

Background
With the advancement of smart city construction, road traffic monitoring systems are becoming increasingly widespread. Traditional video event detection mainly relies on edge equipment operating independently or on centralized processing by a central server, and suffers from the following problems: (1) limited edge computing power: ordinary cameras or low-end NPU devices struggle to support complex models (such as fused large-model image segmentation and target tracking algorithms), so detection accuracy is low; (2) high cloud processing latency: uploading all video streams to the cloud for analysis creates enormous bandwidth pressure and response delay and cannot meet real-time requirements; (3) calibration dependent on manual work: camera view calibration, road region division and similar tasks in traditional systems usually rely on manual labeling or on-site debugging, which is inefficient and error-prone; and (4) a high false-alarm rate: single-node judgment lacks a context verification mechanism and is prone to misjudgment caused by factors such as illumination changes and occlusion.
In recent years, device-edge-cloud collaborative computing architectures have gradually emerged in Internet of Things and AI applications, but no systematic scheme has yet been applied to the whole process of highway event detection; in particular, a design combining large-model-assisted calibration with a multistage event-judgment error-prevention mechanism remains a blank. Prior art schemes fall mainly into two categories. The first is the central-server processing architecture, which transmits all video streams back to a cloud GPU server for analysis; although its computing power is strong, it suffers from high network transmission delay and heavy bandwidth dependence, making real-time alarm requirements difficult to meet. The second is the traditional embedded edge detection architecture, which runs conventional image algorithms on a low-compute processor; although it reduces bandwidth pressure, its severely insufficient computing power cannot support high-precision deep learning models, so detection performance is poor. In summary, neither of these existing schemes achieves a good balance among accuracy, real-time performance and resource consumption. There is therefore a need for an efficient, accurate and scalable device-edge-cloud collaborative highway event detection scheme.

Disclosure of Invention
In view of the above technical problems, the invention provides an intelligent video traffic event detection system and method based on device-edge-cloud collaboration.
The technical scheme adopted to solve the technical problems is as follows: an intelligent video traffic event detection system based on device-edge-cloud collaboration, comprising: a device layer comprising a plurality of video cameras arranged along the road, used for collecting real-time video streams of the road area and transmitting them to the edge layer over a network; an edge layer comprising edge computing equipment deployed close to the cameras and provided with NPU acceleration hardware, used for receiving and decoding the video streams, locally executing lightweight multi-target detection and tracking, obtaining traffic parameters based on the tracking results, performing space-constraint analysis based on the traffic parameters and cached road calibration data to preliminarily identify suspected events, and uploading each preliminarily identified suspected event with its associated event metadata and the download address of the key video clip to the cloud layer as an event package; and a cloud layer comprising a cloud server deployed in a data center and integrating a large-model reasoning engine and an event fusion platform, used for performing road-scene semantic understanding and geometric calibration on camera images uploaded by the edge layer based on a large model, generating road calibration data and transmitting it to the edge layer, receiving event packages uploaded by the edge layer, acquiring the corresponding key video clips according to the download addresses in the