CN-116704436-B - Target detection method and device, electronic equipment and storage medium


Abstract

The embodiments of the application relate to the technical field of artificial intelligence and provide a target detection method and device, electronic equipment, and a storage medium. A target bounding box is generated by detecting a target in a video frame, and the preceding and following video frames are then searched to determine whether the target bounding box also exists in them. If it does not, a first detection bounding box is generated in the preceding and following frames, and the degree of correlation between the first detection bounding box and a real target bounding box is calculated. If the degree of correlation indicates that the current video frame is a difficult case, the video frame is added to a difficult case set. The model is updated once model update data has been obtained, and the video stream is then detected using the updated model. The embodiments of the application can improve the ability of the target detection model to recognize the target, as well as the robustness of the model when the target is recognized in an environment containing similar objects.

Inventors

ZHANG NAN
WANG JIANZONG
QU XIAOYANG

Assignees

平安科技（深圳）有限公司

Dates

Publication Date: 20260505
Application Date: 20230607

Claims (8)

1. A target detection method, characterized in that it is applied to an edge device, the method comprising the steps of: acquiring a video stream through a camera unit; performing target detection on each video frame in the video stream through a target detection model located on the edge device, so as to generate target bounding boxes in each video frame; acquiring the target bounding boxes existing in the currently traversed video frame; for each acquired target bounding box, determining whether the target bounding box exists in adjacent video frames; if the target bounding box exists in the adjacent video frames, determining the target bounding box as a real target bounding box; if the target bounding box does not exist in the adjacent video frames, determining the target bounding box as a suspected pseudo-target bounding box; generating first detection bounding boxes matching the suspected pseudo-target bounding box in the first M video frames and the last N video frames of the currently traversed video frame, respectively; if the intersection over union, calculated as the degree of correlation between the first detection bounding boxes and the real target bounding box, is smaller than a threshold, determining that the currently traversed video frame is a difficult case and adding the currently traversed video frame to a difficult case set, wherein M is equal to or greater than 0; sending the difficult case set to a server, so that the server trains a copy of the target detection model according to the difficult case set to obtain model update data; receiving the model update data sent by the server; updating the target detection model according to the model update data to obtain an updated target detection model; and performing target detection on the video stream acquired by the camera unit through the updated target detection model.
2. The target detection method according to claim 1, wherein before calculating the degree of correlation from the first detection bounding box and the real target bounding box, the method further comprises: for each of the first M video frames and the last N video frames, enlarging the first detection bounding box to generate a second detection bounding box in the video frame; and calculating a normalized cross-correlation (NCC) value according to the second detection bounding box and the suspected pseudo-target bounding box, and taking the video frame as a retained video frame if the NCC value is greater than or equal to a first preset threshold.
3. The target detection method according to claim 1, wherein generating first detection bounding boxes matching the suspected pseudo-target bounding box in the first M video frames and the last N video frames of the currently traversed video frame, respectively, comprises: for each video frame in the first M video frames and the last N video frames, searching the video frame for a region matching the suspected pseudo-target bounding box through a template matching algorithm; and determining the first detection bounding box according to the region found.
4. The target detection method according to claim 1, wherein training a copy of the target detection model according to the difficult case set to obtain model update data comprises: labeling the video frames in the difficult case set using a teacher model to obtain a labeled target bounding box corresponding to each video frame; inputting the video frames in the difficult case set into the copy of the target detection model to obtain a predicted target bounding box; determining a loss value according to the labeled target bounding box and the predicted target bounding box corresponding to each video frame in the difficult case set; and adjusting model parameters of the copy of the target detection model according to the loss value until a preset training end condition is met.
5. The target detection method according to claim 1, wherein before updating the target detection model according to the model update data, the method further comprises: obtaining a target detection backup model; updating the target detection backup model according to the model update data to obtain an updated target detection backup model; and performing target detection on the video stream acquired by the camera unit through the updated target detection backup model.
6. A target detection device, the device comprising: a video stream acquisition module, used for acquiring a video stream through a camera unit; a first target detection module, used for performing target detection on each video frame in the video stream through a target detection model located on the edge device, so as to generate target bounding boxes in each video frame; a difficult case mining module, used for traversing each video frame in the video stream and executing the following processing on the currently traversed video frame: acquiring the target bounding boxes existing in the currently traversed video frame; for each acquired target bounding box, determining whether the target bounding box exists in adjacent video frames; if the target bounding box exists in the adjacent video frames, determining the target bounding box as a real target bounding box; if the target bounding box does not exist in the adjacent video frames, determining the target bounding box as a suspected pseudo-target bounding box; generating first detection bounding boxes matching the suspected pseudo-target bounding box in the first M video frames and the last N video frames of the currently traversed video frame, respectively; and, if the intersection over union of the first detection bounding boxes and the real target bounding box is smaller than a threshold, determining that the currently traversed video frame is a difficult case and adding the currently traversed video frame to a difficult case set, wherein M is equal to or greater than 0; a sending module, used for sending the difficult case set to a server, so that the server trains a copy of the target detection model according to the difficult case set to obtain model update data; a receiving module, used for receiving the model update data sent by the server; an updating module, used for updating the target detection model according to the model update data to obtain an updated target detection model; and a second target detection module, used for performing target detection on the video stream acquired by the camera unit through the updated target detection model.
7. An electronic device comprising a memory storing a computer program or instructions and a processor that, when executing the computer program or instructions, implements the method of any one of claims 1 to 5.
8. A computer-readable storage medium, characterized in that it has stored thereon a computer program or instructions which, when executed by a processor, implement the method according to any of claims 1 to 5.
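
The box classification and difficult-case test recited in claim 1 can be illustrated with a minimal Python sketch. It rests on several assumptions that the claims do not state: boxes are `(x1, y1, x2, y2)` tuples, a box "exists" in an adjacent frame when some box there overlaps it with an intersection over union of at least `match_iou`, a box must be found in both neighbours to count as real, and a frame is a difficult case when any first detection box overlaps any real box by less than `threshold`. The names `split_boxes`, `is_hard_case`, `match_iou`, and `threshold` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def exists_in(box: Box, frame_boxes: List[Box], match_iou: float) -> bool:
    """Assumed existence test: some box in the frame overlaps enough."""
    return any(iou(box, other) >= match_iou for other in frame_boxes)


def split_boxes(current: List[Box], prev_boxes: List[Box],
                next_boxes: List[Box], match_iou: float = 0.5):
    """Split the current frame's boxes into (real, suspected) lists."""
    real, suspected = [], []
    for box in current:
        # Assumption: the box must appear in both adjacent frames.
        if (exists_in(box, prev_boxes, match_iou)
                and exists_in(box, next_boxes, match_iou)):
            real.append(box)
        else:
            suspected.append(box)
    return real, suspected


def is_hard_case(first_detection_boxes: List[Box], real_boxes: List[Box],
                 threshold: float = 0.3) -> bool:
    """Assumed rule: difficult case if any pair overlaps below threshold."""
    return any(iou(d, r) < threshold
               for d in first_detection_boxes for r in real_boxes)
```

For example, with `current = [(0, 0, 10, 10), (50, 50, 60, 60)]`, a previous frame containing `(1, 1, 11, 11)`, and a next frame containing `(0, 0, 10, 10)`, the first box is classified as real and the second as suspected. The default thresholds are placeholders, not values taken from the patent.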

Description

Target detection method and device, electronic equipment and storage medium

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular to a target detection method and apparatus, an electronic device, and a storage medium, which may be applied to financial scenarios.

Background

With the development of artificial intelligence technology, its application in financial scenarios is becoming increasingly common, and the traditional financial industry is gradually shifting toward financial technology (Fintech). Target detection is an important application of artificial intelligence technology. In the security field, target detection can help a monitoring system discover abnormal conditions in time and provide effective early warnings and alarms. The financial industry, including banks, postal services, and securities firms, is also a key area for security work. Safety precaution is important daily work in the financial industry and is of great significance for the normal operation of enterprises and for obtaining good economic and social benefits.
When a target detection model is deployed on a lightweight edge device and the edge device performs target detection, the edge device generally computes on the pixels of an image with the aid of related algorithms. When the pixel groups of two objects are similar, existing target detection technology will, with some probability, misidentify a false object that resembles the target object. That is, the target detection model on the edge device may be affected by similar objects: when an object resembling the target appears in the content being recognized, the model becomes unstable and produces false detections, which introduces unreliability into security monitoring in the financial industry.
Disclosure of Invention

The main purpose of the embodiments of the application is to provide a target detection method and device, electronic equipment, and a storage medium, aiming to improve the ability of a model to recognize a target and the robustness of the model when the target is recognized in an environment containing similar objects, and thereby to improve the reliability of security monitoring in the financial industry.
In order to achieve the above object, a first aspect of the embodiments of the present application provides a target detection method applied to an edge device, the method including: acquiring a video stream through a camera unit; performing target detection on each video frame in the video stream through a target detection model located on the edge device, so as to generate target bounding boxes in each video frame; if a real target bounding box and a suspected pseudo-target bounding box exist in the currently traversed video frame, generating first detection bounding boxes matching the suspected pseudo-target bounding box in the first M video frames and the last N video frames of the currently traversed video frame, respectively, calculating the degree of correlation according to the first detection bounding boxes and the real target bounding box, and, if the currently traversed video frame is determined to be a difficult case according to the degree of correlation, adding the currently traversed video frame to a difficult case set; sending the difficult case set to a server, so that the server trains a copy of the target detection model according to the difficult case set to obtain model update data; receiving the model update data sent by the server; updating the target detection model according to the model update data to obtain an updated target detection model; and performing target detection on the video stream acquired by the camera unit through the updated target detection model.
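
The step of generating a first detection bounding box that matches the suspected pseudo-target bounding box in a neighbouring frame can be sketched as an exhaustive template search. This is only an illustration under stated assumptions: the source names template matching and a normalized cross-correlation (NCC) value in the claims but does not say how the search is scored, so scoring each candidate window with zero-mean NCC is an assumption here, as are the grayscale list-of-rows image layout and the hypothetical names `ncc` and `match_template`.

```python
import math
from typing import List, Tuple

# Hypothetical image layout: grayscale pixels as a list of equal-length rows.
Image = List[List[float]]


def ncc(patch: Image, template: Image) -> float:
    """Zero-mean normalized cross-correlation of two equal-sized patches."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p)
                    * sum((b - mean_t) ** 2 for b in t))
    # A constant patch has zero variance; treat it as uncorrelated.
    return num / den if den > 0 else 0.0


def match_template(frame: Image,
                   template: Image) -> Tuple[Tuple[int, int, int, int], float]:
    """Slide the template over the frame; return the best box and its score.

    The box is (x1, y1, x2, y2) in pixel coordinates, with x2 and y2 exclusive.
    """
    th, tw = len(template), len(template[0])
    best_box, best_score = (0, 0, tw, th), -1.0
    for y in range(len(frame) - th + 1):
        for x in range(len(frame[0]) - tw + 1):
            patch = [row[x:x + tw] for row in frame[y:y + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_box, best_score = (x, y, x + tw, y + th), score
    return best_box, best_score
```

In this sketch the template would be the pixels cropped from the suspected pseudo-target bounding box in the currently traversed frame, and `match_template` would be called once for each of the first M and last N frames. A brute-force search like this is quadratic in image size, so a deployment on an edge device would likely rely on an optimized library routine instead.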
In some possible embodiments of the present application, determining that the real target bounding box and the suspected pseudo-target bounding box exist in the currently traversed video frame includes: acquiring the target bounding boxes existing in the currently traversed video frame; for each acquired target bounding box, determining whether the target bounding box exists in adjacent video frames; if the target bounding box exists in the adjacent video frames, determining the target bounding box as the real target bounding box; and if the target bounding box does not exist in the adjacent video frames, determining the target bounding box as the suspected pseudo-target bounding box.
In some possible embodiments of the present application, before calculating the degree of correlation from the first detection bounding box and the real target bounding box, the method further comprises: enlarging the first detection bounding box for each of the first M video frames and the last N video frames, generating a second detection boundi