US-20260127866-A1 - SYSTEM AND METHOD FOR LOW-POWER OBJECT DETECTION WITH REDUCED FALSE POSITIVES

Abstract

Systems and methods for low-power object detection with reduced false positives in video analytics are disclosed. A video analysis system receives a video stream, isolates frames, and executes a first object-detection algorithm to identify a potential object in the frame. The frame is reduced in size to focus on the object and is passed to one or more additional algorithms that confirm detection within progressively smaller regions of interest. Object detection is confirmed based on agreement or composite confidence scores among the algorithms. By limiting analysis to relevant frame regions, the system improves detection accuracy while substantially reducing processing time, energy consumption, and false detections.
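
By way of non-limiting illustration, the staged flow described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the names `crop`, `bright_region_detector`, and `run_cascade` are hypothetical, frames are modeled as plain lists of grayscale pixel rows, and the brightness-threshold detector merely stands in for whatever object-detection algorithms a given embodiment uses. How the per-stage scores are combined into a final decision is deliberately left open here.

```python
from typing import Callable, List, Optional, Tuple

Frame = List[List[int]]           # grayscale frame: rows of pixel intensities
Box = Tuple[int, int, int, int]   # (left, top, right, bottom); right/bottom exclusive
Detector = Callable[[Frame], Optional[Tuple[Box, float]]]


def crop(frame: Frame, box: Box) -> Frame:
    """Return the reduced-size frame covering only the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


def bright_region_detector(threshold: int) -> Detector:
    """Hypothetical stand-in detector: bounding box of pixels >= threshold."""
    def detect(frame: Frame) -> Optional[Tuple[Box, float]]:
        hits = [(x, y) for y, row in enumerate(frame)
                for x, value in enumerate(row) if value >= threshold]
        if not hits:
            return None
        xs = [x for x, _ in hits]
        ys = [y for _, y in hits]
        box = (min(xs), min(ys), max(xs) + 1, max(ys) + 1)
        area = (box[2] - box[0]) * (box[3] - box[1])
        return box, len(hits) / area  # toy confidence: fill ratio of the box
    return detect


def run_cascade(frame: Frame,
                detectors: List[Detector]) -> Tuple[Optional[Box], List[float]]:
    """Run each detector only on the region found by the previous one.

    Returns the final region in original-frame coordinates plus the
    per-stage confidence scores, or (None, scores) if a stage finds nothing.
    """
    scores: List[float] = []
    region, offset_x, offset_y = frame, 0, 0
    roi: Optional[Box] = None
    for detect in detectors:
        result = detect(region)
        if result is None:
            return None, scores
        (left, top, right, bottom), confidence = result
        scores.append(confidence)
        # Translate the box back into original-frame coordinates.
        roi = (offset_x + left, offset_y + top,
               offset_x + right, offset_y + bottom)
        # The next stage sees only the reduced-size frame.
        region = crop(region, (left, top, right, bottom))
        offset_x += left
        offset_y += top
    return roi, scores


frame = [
    [0, 0, 0, 0, 0, 0],
    [0, 50, 60, 50, 0, 0],
    [0, 60, 200, 210, 0, 0],
    [0, 50, 220, 230, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
roi, scores = run_cascade(
    frame, [bright_region_detector(50), bright_region_detector(200)])
```

In this toy example the first stage examines all 30 pixels, while the second stage examines only the 9 pixels of the cropped region, which is the source of the savings the abstract describes.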

Inventors

Rick A. Britton

Assignees

DIGITAL MONITORING PRODUCTS, INC.

Dates

Publication Date: 20260507
Application Date: 20251104

Claims (20)

1. A video analysis system for low-power object detection with reduced false positives, comprising: a processor configured to execute computer program instructions; a memory in communication with the processor and configured to store the computer program instructions and data; a video analytics module executable by the processor to: receive a video stream comprising a plurality of video frames; isolate individual frames from the video stream for analysis; execute a first object-detection algorithm on a first frame to detect a potential object within the frame; generate a reduced-size frame corresponding to a region of the first frame in which the potential object was detected; execute at least one subsequent object-detection algorithm on the reduced-size frame to confirm detection of the object; and confirm detection of the object based on a composite confidence score of the executed algorithms.
2. The system of claim 1, wherein each subsequently executed object-detection algorithm is different than the preceding algorithm.
3. The system of claim 1, wherein the video analytics module selects a number and type of object-detection algorithms based on at least one of: scene complexity, motion level, illumination, or available power.
4. The system of claim 1, wherein the video analytics module assigns a confidence score to each algorithm output and confirms detection when an aggregate or average confidence score exceeds a threshold.
5. The system of claim 1, wherein the video analytics module reduces redundant analysis by excluding portions of each frame in which no object is detected.
6. The system of claim 1, further comprising a network interface configured to communicate detected object information or confidence scores to an external system.
7. The system of claim 1, further comprising power management circuitry operable to reduce voltage or clock frequency to conserve power in response to reduced frame size.
8. The system of claim 1, wherein the video analytics module is configured to operate with a plurality of cameras that share object detection data or confidence scores.
9. The system of claim 1, wherein the processor comprises a graphics processing unit (GPU).
10. The system of claim 1, wherein the system is implemented as an embedded processor within a camera housing.
11. A computer-implemented method for low-power object detection with reduced false positives, comprising: receiving, by a processor, a video stream comprising a plurality of video frames; isolating individual frames from the video stream for analysis; executing a first object-detection algorithm on a first frame to detect a potential object within the frame; reducing the frame size to generate a reduced-size frame corresponding to a region in which the potential object was detected; executing at least one subsequent object-detection algorithm on the reduced-size frame; assigning a confidence score to each detection result; and confirming detection of the object based on agreement among the algorithms or a composite confidence score meeting a predetermined threshold.
12. The method of claim 11, further comprising selecting the number and type of algorithms based on real-time parameters including motion level, illumination, or available processing power.
13. The method of claim 11, further comprising adjusting system power parameters based on the size of the analyzed frame or detected computing resources demand.
14. The method of claim 11, wherein each algorithm is executed only on a region of interest identified by a previous algorithm.
15. The method of claim 11, further comprising storing or transmitting confirmed detection results and corresponding confidence scores.
16. The method of claim 11, wherein the first object-detection algorithm and the subsequent algorithm are different algorithms.
17. The method of claim 11, further comprising normalizing video frames to exclude expected background elements from triggering object detection.
18. The method of claim 11, further comprising distributing the execution of object-detection algorithms between local and remote computing systems to reduce local power consumption.
19. The method of claim 11, wherein each subsequent algorithm generates a smaller reduced-size frame until object detection is confirmed or a predetermined maximum number of algorithms is executed.
20. The method of claim 11, further comprising recording operational parameters including total execution time, power usage, and confidence scores.
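
As a non-limiting illustration of the confirmation step recited in claims 4 and 11, the decision rule might be sketched as below. The function names, the use of an arithmetic mean as the composite score, and both numeric thresholds are assumptions made for this example only; the claims do not specify particular values or a particular way of measuring agreement.

```python
from statistics import mean
from typing import Sequence


def algorithms_agree(scores: Sequence[float],
                     per_algorithm_min: float = 0.5) -> bool:
    """Assumed notion of agreement: every algorithm reported a detection
    with at least per_algorithm_min confidence (hypothetical value)."""
    return len(scores) > 0 and all(s >= per_algorithm_min for s in scores)


def composite_score(scores: Sequence[float]) -> float:
    """Composite confidence, taken here to be the plain average."""
    return mean(scores)


def confirm_detection(scores: Sequence[float],
                      composite_threshold: float = 0.7,
                      per_algorithm_min: float = 0.5) -> bool:
    """Confirm when the algorithms agree OR the composite score meets
    the threshold; no scores at all means nothing to confirm."""
    if not scores:
        return False
    return (algorithms_agree(scores, per_algorithm_min)
            or composite_score(scores) >= composite_threshold)


both_confident = confirm_detection([0.9, 0.8])     # agreement holds
one_weak_stage = confirm_detection([0.99, 0.45])   # mean 0.72 carries it
likely_false_positive = confirm_detection([0.95, 0.40])  # neither test passes
```

Treating agreement and the composite score as alternatives follows the "or" in claim 11; an embodiment wanting fewer false positives could instead require both.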

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application No. 63/716,298, filed November 5, 2024, the disclosure of which is hereby incorporated herein in its entirety by reference.
BACKGROUND
Processing video recordings and/or data streams to detect and identify objects within the video stream is a resource-intensive task. Current video processing hardware typically uses multiple processors, with multiple cores within each processor, to analyze video data and determine or detect one or more objects within one or more video frames of the data stream. As the size and resolution of modern cameras and image sensors continue to increase, the volume of data to be analyzed for object detection has grown exponentially, placing an ever-increasing demand on available computing and electrical power resources.
Current methods of processing and analyzing video data to detect objects typically involve a series of steps in which each video frame is treated as an individual image, and objects within each of those individual frames are identified and classified. For example, captured and recorded (or live) video data streams may be split into multiple individual frames, with each frame processed and analyzed as a standalone image to detect the presence of an object or objects. Because the processing of every frame takes time and consumes additional power, the number of individual frames actually analyzed in any given application may vary. For example, every frame of the stream’s multiple frames may be analyzed, frames may only be selected and analyzed periodically, or only specifically selected frames may be processed.
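For illustration only, the three frame-selection strategies just described (every frame, periodic frames, or specifically selected frames) might be expressed as simple generators. The function names and the use of zero-based frame indices are assumptions for this sketch; any iterable of frames can be passed in.

```python
from typing import Iterable, Iterator, Set, TypeVar

T = TypeVar("T")


def every_frame(frames: Iterable[T]) -> Iterator[T]:
    """Analyze every frame of the stream."""
    yield from frames


def periodic_frames(frames: Iterable[T], period: int) -> Iterator[T]:
    """Analyze one frame out of every `period` frames, starting at index 0."""
    if period < 1:
        raise ValueError("period must be at least 1")
    for index, frame in enumerate(frames):
        if index % period == 0:
            yield frame


def selected_frames(frames: Iterable[T], indices: Set[int]) -> Iterator[T]:
    """Analyze only the frames whose zero-based index was explicitly chosen."""
    for index, frame in enumerate(frames):
        if index in indices:
            yield frame


stream = range(10)  # stand-in for ten decoded frames
all_picked = list(every_frame(stream))
periodic_picked = list(periodic_frames(stream, 3))
chosen_picked = list(selected_frames(stream, {2, 5}))
```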
However, in continuous-monitoring video environments - such as security systems, autonomous vehicle navigation systems, or industrial robotic or machine-vision equipment - image processing must typically be performed continuously, with every frame analyzed, thus requiring electrical power and dedicated computer processing hardware for the entire time that the system is operational. Pre-processing each frame before further analysis - for example, resizing the image/frame, normalizing the frame data to adjust pixel values to a standard range, and adjusting the color (if applicable) to match a desired color scale, such as RGB or grayscale - can streamline the analysis process to some extent. Even so, the computing power (and thus the corresponding electrical power) required to process video data to identify and detect one or more objects in a single video frame or a series of video frames remains significant. This high computing demand and corresponding power consumption leads to increased heat generation and reduced operating life, especially for battery-powered or heat-sensitive devices. Furthermore, once an object is initially detected or identified within a video frame, that specific frame must typically be analyzed multiple times in order to confirm the detection of an object and to ensure that the initial detection is not a false detection (i.e., a false positive). Conventional video analysis systems thus often execute the same, or similar, object detection algorithms across the entire video frame for verification, even though only a portion of the frame may actually contain a potential object of interest. This redundant re-analysis of areas of the frame in which an object was not detected requires substantial processing time and power without providing a proportional improvement in the accuracy of the detection.
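The scale of the redundant re-analysis described above can be illustrated with simple arithmetic. The sketch below rests on assumptions that are not taken from this disclosure: a 1920 x 1080 frame, a 320 x 240 region containing the potential object, three analysis passes in total, and a processing cost that is roughly proportional to the number of pixels examined. Actual costs depend on the algorithms and hardware used.

```python
# Assumed, illustrative dimensions (not from the disclosure).
FRAME_WIDTH, FRAME_HEIGHT = 1920, 1080
REGION_WIDTH, REGION_HEIGHT = 320, 240
TOTAL_PASSES = 3  # one initial detection plus two verification passes

frame_pixels = FRAME_WIDTH * FRAME_HEIGHT      # 2,073,600 pixels
region_pixels = REGION_WIDTH * REGION_HEIGHT   # 76,800 pixels

# Conventional approach: every pass re-examines the whole frame.
full_frame_cost = TOTAL_PASSES * frame_pixels

# Region-limited approach: only the first pass sees the whole frame;
# the remaining passes examine just the region of interest.
region_limited_cost = frame_pixels + (TOTAL_PASSES - 1) * region_pixels

pixels_per_region_ratio = frame_pixels / region_pixels
overall_reduction = full_frame_cost / region_limited_cost

print(f"full-frame verification: {full_frame_cost:,} pixel visits")
print(f"region-limited verification: {region_limited_cost:,} pixel visits")
print(f"reduction factor: {overall_reduction:.2f}x")
```

Under these assumptions each verification pass touches 27 times fewer pixels, and the total work across all three passes drops by a factor of about 2.8, with the initial full-frame pass dominating what remains.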
These limitations are particularly troublesome in power-limited environments, such as battery-powered surveillance cameras, unmanned aerial vehicle cameras (e.g., drones), vehicle navigation systems, and embedded Internet-of-Things (IoT) vision devices, where computing power must be balanced against available electrical power resources in order to maximize operational time. Thus, it can be seen that there remains a need in the art for improved systems and methods for detecting objects in video data streams that provide high object detection reliability while reducing the computational and electrical power requirements to do so.
SUMMARY
Embodiments of the invention are defined by the claims below, not this summary. A high-level overview of various aspects of the invention is provided here to introduce certain concepts that are further described in the detailed description section below. This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used in isolation to determine the scope of the claimed subject matter. The present invention is directed to systems and methods for minimizing false positives and reducing power consumption for video-based object detection. Known object detection systems typically analyze entire video frames multiple times, using multiple algorithms to confirm the presence of an object, resulting in significant computational and electrical power demand. The systems and met