CN-122024130-A - 3D high-altitude parabolic detection method and system based on monocular camera

Abstract

Embodiments of this specification provide a 3D high-altitude parabolic detection method and system based on a monocular camera. The method comprises: acquiring video images of a monitoring area in real time and preprocessing them; obtaining, from the preprocessed video images, a depth map containing depth information for each object in the image; synchronously inputting the preprocessed video images into a pre-trained target detection model to identify and locate potential parabolic objects, track their motion trajectories, and output a class label and a two-dimensional bounding box in the image coordinate system for each potential parabolic object; reconstructing a three-dimensional motion trajectory for each tracked potential parabolic object by combining its sequence of two-dimensional bounding boxes across consecutive image frames with the depth maps of the corresponding frames; and analyzing the three-dimensional motion trajectory, calculating the kinematic features of the object in the vertical direction, and determining that a high-altitude parabolic event has occurred when those features satisfy preset high-altitude parabolic determination conditions.

Inventors

  • SHAO YI
  • TIAN LEI
  • WANG FEI

Assignees

  • 中建材信息技术股份有限公司
  • 中建材信云智联科技有限公司
  • 中建材信息科技有限公司
  • 中建材信云智联科技有限公司北京分公司
  • 中建材信云智联科技(北京)有限公司

Dates

Publication Date
2026-05-12
Application Date
2025-12-31

Claims (10)

  1. A 3D high-altitude parabolic detection method based on a monocular camera, characterized by comprising the following steps:
     S1, acquiring video images of a monitoring area in real time through a monocular camera deployed in the monitoring area, and preprocessing the video images;
     S2, inputting the preprocessed video images into a pre-trained monocular depth estimation model to obtain a depth map corresponding to each image, wherein the depth map contains depth information of each object in the image;
     S3, synchronously inputting the preprocessed video images into a pre-trained target detection model, identifying and locating potential parabolic objects in the images, tracking the motion trajectories of the identified targets, and outputting a class label for each potential parabolic object and its two-dimensional bounding box in the image coordinate system;
     S4, reconstructing a three-dimensional motion trajectory of each tracked potential parabolic object by combining its sequence of two-dimensional bounding boxes in consecutive image frames with the depth maps of the corresponding frames;
     S5, analyzing the three-dimensional motion trajectory, calculating the kinematic features of the object in the vertical direction, and determining that a high-altitude parabolic event has occurred when the kinematic features satisfy preset high-altitude parabolic determination conditions.
  2. The method according to claim 1, wherein the monocular depth estimation model is constructed on a convolutional neural network with a multi-scale feature fusion mechanism, is trained on a depth estimation data set covering a variety of indoor and outdoor scenes, and uses a loss function that assigns a higher weight to depth prediction errors in distant regions of the image.
  3. The method according to claim 1, wherein the target detection model is based on the YOLOv architecture, a convolutional attention module is embedded in the network, and the preset anchor box sizes of the model are adapted to the statistical size distribution of typical objects in high-altitude parabolic scenes.
  4. The method according to claim 1, wherein reconstructing a three-dimensional motion trajectory of each tracked potential parabolic object by combining its sequence of two-dimensional bounding boxes in consecutive image frames with the depth maps of the corresponding frames comprises:
     using a multi-target tracking algorithm to associate the two-dimensional bounding boxes detected in consecutive image frames that belong to the same object, forming a two-dimensional image trajectory sequence for the object;
     for each frame in the two-dimensional image trajectory sequence, back-projecting a key pixel point at the bottom of the object's bounding box, based on the depth map of that frame and the intrinsic matrix of the camera, to obtain the three-dimensional spatial coordinates of the key pixel point at the time of that frame; and
     filtering and smoothing the sequence of three-dimensional spatial coordinates with a Kalman filter, and predicting the future motion state of the object, to generate a stable three-dimensional motion trajectory.
  5. The method of claim 4, wherein the key pixel point is the pixel at the midpoint of the bottom edge of the object's two-dimensional bounding box.
  6. The method of claim 1, wherein the kinematic features include the instantaneous velocity and acceleration of the object in the vertical direction, and wherein the preset high-altitude parabolic determination conditions comprise:
     the vertically downward acceleration of the object lies within a preset error range centered on the gravitational acceleration;
     the vertically downward velocity of the object keeps increasing;
     the total displacement of the object in the vertical direction exceeds a height threshold preset according to the monitoring environment; and
     the above conditions continue to hold over multiple consecutive frames.
  7. The method of claim 1, further comprising triggering an alarm and storing event-related data if it is determined that a high-altitude parabolic event has occurred.
  8. A 3D high-altitude parabolic detection system based on a monocular camera, characterized by comprising:
     a video acquisition and processing module, used for acquiring video images of a monitoring area in real time through a monocular camera deployed in the monitoring area, and preprocessing the video images;
     a depth estimation module, used for inputting the preprocessed video images into a pre-trained monocular depth estimation model to obtain a depth map corresponding to each image, wherein the depth map contains depth information of each object in the image;
     a target detection module, used for synchronously inputting the preprocessed video images into a pre-trained target detection model, identifying and locating potential parabolic objects in the images, tracking the motion trajectories of the identified targets, and outputting a class label for each potential parabolic object and its two-dimensional bounding box in the image coordinate system;
     a trajectory analysis module, used for reconstructing a three-dimensional motion trajectory of each tracked potential parabolic object by combining its sequence of two-dimensional bounding boxes in consecutive image frames with the depth maps of the corresponding frames; and
     a determination module, used for analyzing the three-dimensional motion trajectory, calculating the kinematic features of the object in the vertical direction, and determining that a high-altitude parabolic event has occurred when the kinematic features satisfy preset high-altitude parabolic determination conditions.
  9. An electronic device, comprising:
     a processor; and
     a memory arranged to store computer-executable instructions which, when executed, cause the processor to implement the steps of the monocular-camera-based 3D high-altitude parabolic detection method of any one of claims 1 to 7.
  10. A storage medium storing computer-executable instructions which, when executed, implement the steps of the monocular-camera-based 3D high-altitude parabolic detection method of any one of claims 1 to 7.
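
As an illustration of the loss weighting described in claim 2, the following minimal Python sketch gives larger weights to pixels whose ground-truth depth is larger. The claim does not state the form of the weighting; the linear weight 1 + far_gain * (target / max_target), the use of an L1 error, and the names distance_weighted_l1 and far_gain are assumptions made for this example only.

```python
from typing import Sequence


def distance_weighted_l1(
    pred: Sequence[float],
    target: Sequence[float],
    far_gain: float = 1.0,
) -> float:
    """Weighted mean absolute depth error over a flattened depth map.

    Assumption (not from the patent): each pixel's weight grows linearly
    with its ground-truth depth, from 1.0 up to 1.0 + far_gain.
    """
    if len(pred) != len(target) or not target:
        raise ValueError("pred and target must be non-empty and equal in length")
    max_depth = max(target)
    if max_depth <= 0:
        raise ValueError("ground-truth depths must contain a positive value")

    weighted_error = 0.0
    weight_sum = 0.0
    for p, t in zip(pred, target):
        weight = 1.0 + far_gain * (t / max_depth)  # farther pixel, larger weight
        weighted_error += weight * abs(p - t)
        weight_sum += weight
    return weighted_error / weight_sum
```

With this weighting, a one-metre error on a distant pixel contributes more to the loss than the same error on a nearby pixel, which is the behaviour the claim describes.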
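
The back-projection step of claims 4 and 5 can be sketched with the standard pinhole camera model. This is a minimal example under stated assumptions, not the patented implementation: the depth map is assumed to hold metric depth along the optical axis and to be indexed as depth_map[row][col], the intrinsic matrix is assumed to reduce to the four parameters fx, fy, cx, cy (no skew, no lens distortion), and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
Point3D = Tuple[float, float, float]     # camera coordinates (X right, Y down, Z forward)


def bottom_midpoint(box: Box) -> Tuple[float, float]:
    """Key pixel of claim 5: midpoint of the bounding box's bottom edge."""
    x_min, _, x_max, y_max = box
    return ((x_min + x_max) / 2.0, y_max)


def back_project(u: float, v: float, depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Invert the pinhole projection u = fx * X / Z + cx, v = fy * Y / Z + cy."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def track_to_3d(boxes: Sequence[Box],
                depth_maps: Sequence[Sequence[Sequence[float]]],
                fx: float, fy: float, cx: float, cy: float) -> List[Point3D]:
    """Turn one object's per-frame boxes and depth maps into 3D points."""
    points = []
    for box, depth_map in zip(boxes, depth_maps):
        u, v = bottom_midpoint(box)
        # Clamp to the image, since a box edge may lie on the image border.
        row = min(max(int(round(v)), 0), len(depth_map) - 1)
        col = min(max(int(round(u)), 0), len(depth_map[0]) - 1)
        points.append(back_project(u, v, depth_map[row][col], fx, fy, cx, cy))
    return points
```

The resulting point sequence is what the Kalman filtering step of claim 4 would then smooth; that step is not shown here.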
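
The determination conditions of claim 6 can be illustrated with a short Python sketch. It assumes the vertical position of the object is available in metres, measured positive downward, at a fixed frame interval dt, and it estimates velocity and acceleration by finite differences. The thresholds accel_tol, min_drop and min_frames are placeholder values chosen for this example; the patent only says they are preset.

```python
from typing import List, Sequence

G = 9.81  # gravitational acceleration in m/s^2


def finite_diff(values: Sequence[float], dt: float) -> List[float]:
    """First-order difference quotient between consecutive samples."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def is_falling_event(down_pos: Sequence[float], dt: float,
                     accel_tol: float = 2.0,
                     min_drop: float = 3.0,
                     min_frames: int = 3) -> bool:
    """Check the four conditions of claim 6 on one vertical trajectory.

    down_pos: vertical positions in metres, positive downward (assumption).
    accel_tol, min_drop, min_frames: illustrative thresholds, not from the patent.
    """
    velocity = finite_diff(down_pos, dt)
    accel = finite_diff(velocity, dt)
    if len(accel) < min_frames:
        return False

    run = 0  # number of consecutive samples satisfying the per-frame checks
    for i, a in enumerate(accel):
        near_gravity = abs(a - G) <= accel_tol          # acceleration close to g
        speeding_up = velocity[i + 1] > velocity[i]     # downward speed increasing
        run = run + 1 if (near_gravity and speeding_up) else 0
        total_drop = down_pos[i + 2] - down_pos[0]      # displacement since track start
        if run >= min_frames and total_drop >= min_drop:
            return True
    return False
```

Requiring the per-frame checks to hold for min_frames consecutive samples reflects the last condition of the claim and reduces the chance that a single noisy depth estimate triggers an alarm.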

Description

3D high-altitude parabolic detection method and system based on monocular camera

Technical Field

This document relates to the technical field of computer vision, and in particular to a 3D high-altitude parabolic detection method and system based on a monocular camera.

Background

With accelerating urbanization, the number of high-rise buildings keeps growing, and the problem of high-altitude throwing is becoming increasingly prominent. High-altitude throwing not only seriously threatens the lives of pedestrians and causes casualties and property loss, but also poses a challenge to social order and public safety. For example, a small object falling from a height can, under gravitational acceleration, strike with enough force to seriously or even fatally injure a pedestrian. According to related statistics, casualties caused by high-altitude throwing have been rising in recent years, placing a heavy burden on society.

Traditional high-altitude parabolic detection relies mainly on manual patrols and playback of surveillance video. Manual patrols have limited coverage, are inefficient, and cannot respond in real time, so they cannot monitor high-altitude throwing comprehensively or effectively. Video playback requires a great deal of time and labor to review footage after an event has occurred, so the throwing behavior cannot be warned of or stopped in time, and the person responsible cannot be traced accurately. In addition, complex environmental factors such as illumination changes, weather conditions, and background interference further increase the difficulty for traditional detection methods.

Depth detection technology based on a monocular camera provides an innovative and effective solution for high-altitude parabolic detection.
A monocular camera is low in cost and easy to deploy, which makes it suitable for large-scale application. With suitable algorithms, depth analysis can be performed on the monocular video images, so that the motion trajectory and position of an object can be detected and its height and speed can be estimated from the depth information, allowing high-altitude throwing to be identified effectively. Compared with traditional methods, this technology offers strong real-time performance, high detection accuracy, and a low false alarm rate; it can discover and warn of high-altitude parabolic events in time and provides strong support for public safety.

Disclosure of Invention

One or more embodiments of this specification provide a 3D high-altitude parabolic detection method based on a monocular camera, comprising:

S1, acquiring video images of a monitoring area in real time through a monocular camera deployed in the monitoring area, and preprocessing the video images;
S2, inputting the preprocessed video images into a pre-trained monocular depth estimation model to obtain a depth map corresponding to each image, wherein the depth map contains depth information of each object in the image;
S3, synchronously inputting the preprocessed video images into a pre-trained target detection model, identifying and locating potential parabolic objects in the images, tracking the motion trajectories of the identified targets, and outputting a class label for each potential parabolic object and its two-dimensional bounding box in the image coordinate system;
S4, reconstructing a three-dimensional motion trajectory of each tracked potential parabolic object by combining its sequence of two-dimensional bounding boxes in consecutive image frames with the depth maps of the corresponding frames;
S5, analyzing the three-dimensional motion trajectory, calculating the kinematic features of the object in the vertical direction, and determining that a high-altitude parabolic event has occurred when the kinematic features satisfy preset high-altitude parabolic determination conditions.

Further, the monocular depth estimation model is constructed on a convolutional neural network with a multi-scale feature fusion mechanism, is trained on a depth estimation data set covering a variety of indoor and outdoor scenes, and uses a loss function that assigns a higher weight to depth prediction errors in distant regions of the image.

Further, the target detection model takes the YOLOv architecture as its base model, a convolutional attention module is embedded in the network, and the anchor box sizes preset by the model are adapted to the statistical size distribution of typical objects in high-altitude parabolic scenes.

Further, for each tracked potential parabolic object, reconstructin