CN-115836321-B - Video analysis system

Abstract

The invention relates to a computer-implemented method for sampling and analyzing data from at least one image frame (531) of at least one series of image frames captured by at least one sensor, comprising defining at least one sampling model (501), wherein the sampling model (501) is defined in a virtual 3D vector space (521) and is based on one or more predetermined shapes (505) in the virtual 3D vector space (521), applying at least one sampling model (501) to at least a portion of the at least one image frame (531) of the at least one series of image frames, wherein the application of the at least one sampling model defines at least one region (529) of the at least one image frame (531) from which data is extracted, extracting (423) data from the at least one region (529) of the at least one image frame (531) defined by the sampling model (501), and analyzing the extracted data.

Inventors

Andrew Fitzgerald Runt

Assignees

阿外特人工智能有限公司

Dates

Publication Date: 20260508
Application Date: 20210601
Priority Date: 20200602

Claims (19)

1. A computer-implemented method for sampling and analyzing data from at least one image frame (531), the at least one image frame (531) from at least one series of image frames captured by at least one sensor, the method comprising:
   defining at least one sampling model (501), wherein the sampling model (501) is defined in a virtual 3D vector space (521) and is based on one or more predetermined shapes (505) in the virtual 3D vector space (521), wherein a shape (200) in a virtual 3D vector space (212) on which the sampling model (210) is based is divided into elements or blocks (208) constituting the shape (200);
   applying the at least one sampling model (501) to at least a portion of the at least one image frame (531) in the at least one series of image frames, wherein the application of the at least one sampling model defines at least one region (529) of the at least one image frame (531) from which data is extracted;
   extracting (423) data from the at least one region (529) of the at least one image frame (531) defined by the sampling model (501); and
   analyzing the extracted data.
2. The method of claim 1, wherein the one or more predetermined shapes in the virtual 3D vector space (521, 212, 310) are selected from at least one of a 3D shape (200, 505, 506, 507), a 2D shape (301, 302, 303, 304), a 1D shape, or a 0D shape.
3. The method according to claim 2, wherein the 3D shape (200, 505, 506, 507) is a parallelepiped and/or a polyhedron and/or a sphere and/or a cylinder, and/or wherein the 2D shape is a plane or curved surface and/or a parallelogram (301, 302, 303, 304), and/or wherein the 1D shape is a line segment, and/or wherein the 0D shape is a point.
4. A method according to any one of claims 1 to 3, wherein applying the at least one sampling model (501) to the at least one portion of the at least one image frame (531) in the at least one series of image frames comprises associating the at least one sampling model (501) with one or more reference points in the at least one image frame (531) in the at least one series of image frames.
5. The method of claim 4, wherein the associating further comprises performing a mapping transformation (417) between one or more points of the at least one sampling model (413) and the one or more reference points in the at least one image frame of the at least one series of image frames.
6. The method of claim 5, wherein the mapping transformation is a parallel projection.
7. The method of claim 1, wherein the shape (200) in the virtual 3D vector space (212) on which the sampling model (210) is based is divided, uniformly or non-uniformly in any or all geometric dimensions thereof, into one or more elements or blocks (208) that constitute the shape (200).
8. The method of claim 1, wherein extracting (423) data from the at least a portion of the at least one image frame of the at least one series of image frames to which the sampling model (413) is applied comprises extracting data from image frame pixels in an image frame region (421, 422), the image frame region (421, 422) being contained in or covered by a shape of the sampling model applied to the at least a portion of the at least one image, and storing the extracted data in an array.
9. The method of claim 8, wherein the array is a multi-dimensional array (424, 425).
10. The method of claim 8, comprising extracting data from image frame pixels in an image frame region (421, 422), the image frame region (421, 422) being contained in or covered by an element or block (420) of a shape of the sampling model applied to the at least a portion of the at least one image, and storing the extracted data in at least one array.
11. The method of claim 10, wherein the at least one array is a multi-dimensional array (424, 425).
12. The method according to claim 1, wherein the same sampling model (413) is applied to different parts of the at least one series of image frames, and/or wherein the same sampling model is applied to a plurality of images of the at least one series of image frames, or wherein the same sampling model is applied to all of the images of the at least one series of image frames.
13. The method of claim 1, wherein the extracting (423) data from the at least a portion of the at least one image frame (412) to which the sampling model is applied comprises converting the data.
14. The method of claim 1, wherein the at least one image frame (412) to which the sampling model is applied, or the at least a portion of the at least one image frame to which the sampling model is applied, is preprocessed prior to extracting the data.
15. The method of claim 1, wherein analyzing the extracted data comprises:
   analyzing the extracted data to detect a desired pattern, wherein the pattern can comprise predetermined conditions and/or movements and/or behaviors and/or actions of objects and/or subjects within a real 3D scene (410, 508, 527) represented in the at least a portion of the at least one image frame of the at least one series of image frames captured by the at least one sensor, and providing a notification or alert upon detection of the pattern; and/or
   training a machine learning system to detect a desired pattern using the extracted data as input to the machine learning system, wherein the pattern can comprise predetermined conditions and/or movements and/or behaviors and/or actions of objects and/or subjects within a real 3D scene (410, 508, 527) represented in the at least a portion of the at least one image frame of the at least one series of image frames captured by the at least one sensor; and/or
   using the extracted data as input to a trained machine learning system to detect a desired pattern, wherein the pattern can comprise predetermined conditions and/or movements and/or behaviors and/or actions of objects and/or subjects within a real 3D scene (410, 508, 527) represented in the at least a portion of the at least one image frame of the at least one series of image frames captured by the at least one sensor, and providing a notification or alert upon detection of the pattern.
16. The method according to claim 15, wherein the predetermined condition and/or movement and/or behavior and/or action of objects and/or subjects within a real 3D scene is a fraudulent entry at a control gate.
17. The method of claim 1, wherein applying the at least one sampling model to at least a portion of the at least one image frame of the at least one series of image frames accounts for movement of the at least one sensor while capturing image frames of the at least one series of image frames, and/or wherein the method comprises sampling and analyzing data from a plurality of different series of image frames acquired by a plurality of sensors having different viewpoints for capturing image frames, the different viewpoints of the plurality of sensors being taken into account when applying the at least one sampling model to the image frames acquired by the plurality of sensors.
18. One or more computer-readable storage media having instructions stored therein that, when executed by one or more processors, instruct the one or more processors to perform the method of any one of claims 1-17.
19. A video analysis system, comprising: at least one sensor configured to capture image frames, and at least one computing system comprising one or more processors, the one or more processors being configured to implement the method of any one of claims 1 to 17.
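
As an informal illustration of claims 1 and 5 to 7 (not part of the claims), the following Python sketch divides a box-shaped sampling model in a virtual 3D space into blocks and maps the block corners into image coordinates with a parallel projection. It assumes NumPy and an axis-aligned box; the function names `divide_box` and `parallel_project`, the 2x3 projection matrix, and all numeric values are hypothetical and chosen only for the example.

```python
import numpy as np

def divide_box(origin, size, edges_x, edges_y, edges_z):
    """Return the 8 corners of every block of an axis-aligned box.

    edges_* are fractions in [0, 1] along each axis; unequal spacing
    gives a non-uniform division. Result shape: (nx, ny, nz, 8, 3).
    """
    origin = np.asarray(origin, dtype=float)
    size = np.asarray(size, dtype=float)
    ex, ey, ez = (np.asarray(e, dtype=float) for e in (edges_x, edges_y, edges_z))
    nx, ny, nz = len(ex) - 1, len(ey) - 1, len(ez) - 1
    blocks = np.empty((nx, ny, nz, 8, 3))
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                corners = [
                    (x, y, z)
                    for x in (ex[i], ex[i + 1])
                    for y in (ey[j], ey[j + 1])
                    for z in (ez[k], ez[k + 1])
                ]
                blocks[i, j, k] = origin + np.asarray(corners) * size
    return blocks

def parallel_project(points, basis, offset):
    """Parallel projection as an affine map: p2d = basis @ p3d + offset.

    basis is a 2x3 matrix whose columns are the image-plane directions
    of the three model axes; offset is the image position of the origin.
    """
    points = np.asarray(points, dtype=float)
    return points @ np.asarray(basis, dtype=float).T + np.asarray(offset, dtype=float)

# Hypothetical model: a 2 x 1 x 1 box, split unevenly along x and z.
blocks = divide_box(
    origin=(0, 0, 0), size=(2, 1, 1),
    edges_x=[0, 0.5, 1], edges_y=[0, 1], edges_z=[0, 0.25, 1],
)
# Hypothetical projection; in practice it would follow from reference points.
basis = [[100.0, 0.0, 30.0], [0.0, -100.0, -30.0]]
offset = [50.0, 400.0]
projected = parallel_project(blocks, basis, offset)  # shape (2, 1, 2, 8, 2)
```

Each projected block outlines one image region from which pixel data could then be extracted. An affine map is used because a parallel projection keeps parallel edges parallel, so the 2x3 matrix can in principle be fitted from a few reference points in the image.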

Description

Video analysis system

Technical Field

The present invention relates to a computer-implemented method, a computer-readable storage medium, and a video analysis system for sampling and analyzing data from at least one image frame.

Background

Image and/or video analysis systems are often used to investigate or monitor scenes, places, objects, subjects, persons or groups of people of interest in order to detect, and alert to, the occurrence of specific situations, patterns, movements, behaviors or actions. In the following, the detection of specific or classifiable situations, patterns, movements, behaviors, actions, etc. is referred to simply as a detection problem, or as a problem to be detected by an image and/or video analysis system.

For example, an image and/or video analysis system may use captured image and/or video data to detect fraudulent or illegal entry (access) events, such as tailgating or piggybacking at a control gate. A typical example of such fraudulent entry is the case where one or more subjects or persons attempt to pass through a control gate (e.g. a gate of a subway station) during the safety delay before the gate closes behind a preceding subject or person who has validly passed through. Another example is a subject jumping over the gate. Such illegal entry events may be more diverse in the case of other access control gates, such as a tripod turnstile, where the evader may jump over the turnstile, pass underneath it, pass through the same turnstile together with another passenger (a mode commonly referred to as 2x1), or swing the upper arm of the tripod turnstile back and forth to enter the paid area without validating a fare, using the play in the movement of some turnstiles (commonly found in those designed for access ways).
However, providing reliable, accurate, fast, real-time, automated detection and alerting remains a challenge for current systems and techniques, especially when processing large amounts of image and video data from different scenes, different viewpoints, and different perspectives, for example data captured by a plurality of different cameras, which may be stationary or may themselves be moving.

Disclosure of Invention

Problem(s)

It is therefore an object of the present invention to improve a computer-implemented video analysis method and a video analysis system for analyzing image and/or video data to detect specific situations, patterns, movements, behaviors or actions of interest. For example, this may include improving computer-implemented video analysis methods and video analysis systems, particularly in terms of automation, speed, efficiency, reliability, and simplicity.

Solution

According to the present invention, this object is achieved by a computer-implemented method, a computer-readable storage medium and a video analysis system for sampling and analyzing data from at least one image frame.
An exemplary computer-implemented video analysis method for detecting a desired specific (e.g., classifiable) pattern or problem in image and/or video data in accordance with the present invention may include a computer-implemented method for sampling and analyzing data from at least one image frame of at least one series of image frames captured by at least one sensor, and may include one or some or all of the following steps:
   defining at least one sampling model, wherein the sampling model is defined in a 3D vector space or a virtual 3D vector space and is based on one or more predetermined shapes in the 3D vector space or virtual 3D vector space;
   applying the at least one sampling model to at least a portion of at least one image frame of the at least one series of image frames, wherein said application of the at least one sampling model defines at least one region of the at least one image frame from which data is extracted;
   extracting data from the at least one region of the at least one image frame defined by the sampling model; and
   analyzing the extracted data.

In other words, the exemplary steps described above for sampling and analyzing data from at least one image frame of at least one series of image frames captured by at least one sensor may be used in a computer-implemented video analysis method or video analysis system, such as a video analysis system configured to detect problems.

A series or sequence of image frames may be understood in particular as a video stream, or as a series or sequence of image frames extracted from a video stream, wherein the image frames are captured by a sensor. A sensor is understood herein to be, inter alia, a device that can generate data suitable for presentation as an image. The term camera should in particular be understood to mean all such devices or sensors.
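
The extraction and analysis steps can be illustrated with a minimal Python sketch, assuming NumPy and grayscale frames. As a simplification, the regions are taken to be a grid of axis-aligned pixel rectangles rather than the footprints of a projected sampling model. The per-region mean and the frame-to-frame difference threshold are assumptions standing in for whatever conversion and analysis a real system would use, and the names `extract_block_features` and `detect_pattern` are hypothetical.

```python
import numpy as np

def extract_block_features(frame, row_edges, col_edges):
    """Average the pixels of each rectangular region into a 2D array.

    row_edges and col_edges are pixel indices delimiting the regions,
    so the result has shape (len(row_edges) - 1, len(col_edges) - 1).
    """
    frame = np.asarray(frame, dtype=float)
    n_rows, n_cols = len(row_edges) - 1, len(col_edges) - 1
    features = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            region = frame[row_edges[i]:row_edges[i + 1],
                           col_edges[j]:col_edges[j + 1]]
            features[i, j] = region.mean()
    return features

def detect_pattern(feature_series, threshold):
    """Flag each frame transition where any region changes by more than threshold."""
    diffs = np.abs(np.diff(feature_series, axis=0))
    return diffs.reshape(diffs.shape[0], -1).max(axis=1) > threshold

# Hypothetical series of three 8x8 frames; a bright object appears in the
# top-right region of the last frame.
frames = np.zeros((3, 8, 8))
frames[2, 0:4, 4:8] = 200.0

row_edges, col_edges = [0, 4, 8], [0, 4, 8]
series = np.stack([extract_block_features(f, row_edges, col_edges) for f in frames])
alerts = detect_pattern(series, threshold=50.0)  # one flag per frame transition
```

Applying the same regions to every frame yields one multi-dimensional array per series (frames x rows x columns here), which keeps the input compact and fixed in size regardless of image resolution.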
Further, it should be understood that the image or image frame is/may be in a digital format, such as a digital pixel array, or has/may be converted from an analog format to a digital format. The analysis of the representation of the extracted data herein may be understood as comp