US-12620232-B2 - Systems and methods for detecting fall events
Abstract
Example implementations include a method, apparatus and computer-readable medium for computer vision detection of a fall event, comprising detecting a person in a first image captured at a first time. The implementations further include identifying a plurality of keypoints on the person in the first image, wherein the plurality of keypoints, when connected, indicate a pose of the person. Additionally, the implementations further include detecting, using a second plurality of keypoints identified in a second image, that the person has fallen in response to determining that, subsequent to the pose being a standing pose in the first image, the keypoints of the second plurality of keypoints associated with the shoulders of the person are higher than the keypoints of the second plurality of keypoints associated with the eyes and the ears of the person in the second image. Additionally, the implementations further include generating an alert indicating that the person has fallen.
Inventors
- Abhishek Mitra
- Gopi Subramanian
- Yash Chaturvedi
Assignees
- Sensormatic Electronics, LLC
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2023-10-02
Claims (18)
- 1 . An apparatus for computer vision detection of a fall event, comprising: at least one memory; and at least one processor coupled with the at least one memory and configured, individually or in combination, to: detect a person in a first image captured at a first time; identify a first plurality of keypoints on the person in the first image, wherein the first plurality of keypoints, when connected, indicate a pose of the person, wherein a location of each keypoint has a horizontal component and a vertical component; classify, using the first plurality of keypoints, the pose as a standing pose in response to determining that keypoints of the first plurality of keypoints associated with shoulders of the person are lower than keypoints of the first plurality of keypoints associated with eyes and ears of the person; detect the person in a second image captured at a second time; identify a second plurality of keypoints on the person in the second image; map the first plurality of keypoints to a one-dimensional line based on each respective vertical component of the first plurality of keypoints; determine, based on the one-dimensional line, a first distance between a highest keypoint and a lowest keypoint of the first plurality of keypoints; map the second plurality of keypoints to the one-dimensional line based on each respective vertical component of the second plurality of keypoints; determine, based on the one-dimensional line, a second distance between a highest keypoint and a lowest keypoint of the second plurality of keypoints; and detect that the person has fallen in response to determining that the first distance is greater than a threshold distance and the second distance is not greater than the threshold distance; and generate an alert indicating that the person has fallen.
- 2 . The apparatus of claim 1 , wherein the at least one processor is further configured to: classify the pose as the standing pose further in response to the first distance being greater than the threshold distance.
- 3 . The apparatus of claim 1 , wherein the at least one processor is further configured to generate a first boundary box around the person in the first image and a second boundary box around the person in the second image.
- 4 . The apparatus of claim 3 , wherein to classify the pose as the standing pose is further in response to determining that an aspect ratio of the first boundary box is greater than a threshold aspect ratio.
- 5 . The apparatus of claim 3 , wherein to detect that the person has fallen is further in response to determining that an aspect ratio of the second boundary box is not greater than a threshold aspect ratio.
- 6 . The apparatus of claim 3 , wherein to detect the person in the second image captured at the second time the at least one processor is further configured to: determine that a person detection model has failed to detect the person in the second image; generate at least one proximity search region based on coordinates and dimensions of the first boundary box in response to determining that the first boundary box is a latest boundary box generated for the person; generate at least one input image by cropping the second image to the at least one proximity search region; apply a rotation to the at least one input image; and detect the person in the at least one input image after the rotation is applied.
- 7 . The apparatus of claim 6 , wherein an area of the at least one proximity search region matches an area of the first boundary box, and wherein a center point of the at least one proximity search region is within a threshold distance from a center point of the first boundary box.
- 8 . The apparatus of claim 1 , wherein the at least one processor is further configured to transmit the alert to a compute device.
- 9 . A method for computer vision detection of a fall event, comprising: detecting a person in a first image captured at a first time; identifying a first plurality of keypoints on the person in the first image, wherein the first plurality of keypoints, when connected, indicate a pose of the person, wherein a location of each keypoint has a horizontal component and a vertical component; classifying, using the first plurality of keypoints, the pose as a standing pose in response to determining that keypoints of the first plurality of keypoints associated with shoulders of the person are lower than keypoints of the first plurality of keypoints associated with eyes and ears of the person; detecting the person in a second image captured at a second time; identifying a second plurality of keypoints on the person in the second image; mapping the first plurality of keypoints to a one-dimensional line based on each respective vertical component of the first plurality of keypoints; determining, based on the one-dimensional line, a first distance between a highest keypoint and a lowest keypoint of the first plurality of keypoints; mapping the second plurality of keypoints to the one-dimensional line based on each respective vertical component of the second plurality of keypoints; determining, based on the one-dimensional line, a second distance between a highest keypoint and a lowest keypoint of the second plurality of keypoints; and detecting that the person has fallen in response to determining that the first distance is greater than a threshold distance and the second distance is not greater than the threshold distance; and generating an alert indicating that the person has fallen.
- 10 . The method of claim 9 , further comprising: classifying the pose as the standing pose further in response to the first distance being greater than the threshold distance.
- 11 . The method of claim 9 , further comprising generating a first boundary box around the person in the first image and a second boundary box around the person in the second image.
- 12 . The method of claim 11 , wherein classifying the pose as the standing pose is further in response to determining that an aspect ratio of the first boundary box is greater than a threshold aspect ratio.
- 13 . The method of claim 11 , wherein detecting that the person has fallen is further in response to determining that an aspect ratio of the second boundary box is not greater than a threshold aspect ratio.
- 14 . The method of claim 11 , wherein detecting the person in the second image captured at the second time further comprises: determining that a person detection model has failed to detect the person in the second image; generating at least one proximity search region based on coordinates and dimensions of the first boundary box in response to determining that the first boundary box is a latest boundary box generated for the person; generating at least one input image by cropping the second image to the at least one proximity search region; applying a rotation to the at least one input image; and detecting the person in the at least one input image after the rotation is applied.
- 15 . The method of claim 14 , wherein an area of the at least one proximity search region matches an area of the first boundary box, and wherein a center point of the at least one proximity search region is within a threshold distance from a center point of the first boundary box.
- 16 . The method of claim 9 , further comprising transmitting the alert to a computing device.
- 17 . An apparatus for computer vision detection of a fall event, comprising: means for detecting a person in a first image captured at a first time; means for identifying a first plurality of keypoints on the person in the first image, wherein the first plurality of keypoints, when connected, indicate a pose of the person; means for classifying, using the first plurality of keypoints, the pose as a standing pose in response to determining that keypoints of the first plurality of keypoints associated with shoulders of the person are lower than keypoints of the first plurality of keypoints associated with eyes and ears of the person; means for detecting the person in a second image captured at a second time; means for identifying a second plurality of keypoints on the person in the second image; means for detecting, using the second plurality of keypoints, that the person has fallen in response to determining that, subsequent to the pose being the standing pose in the first image, the keypoints of the second plurality of keypoints associated with the shoulders of the person are higher than the keypoints of the second plurality of keypoints associated with the eyes and the ears of the person in the second image; and means for generating an alert indicating that the person has fallen.
- 18 . A non-transitory computer-readable medium having instructions stored thereon for computer vision detection of a fall event, wherein the instructions are executable by one or more processors, individually or in combination, to: detect a person in a first image captured at a first time; identify a first plurality of keypoints on the person in the first image, wherein the first plurality of keypoints, when connected, indicate a pose of the person, wherein a location of each keypoint has a horizontal component and a vertical component; classify, using the first plurality of keypoints, the pose as a standing pose in response to determining that keypoints of the first plurality of keypoints associated with shoulders of the person are lower than keypoints of the first plurality of keypoints associated with eyes and ears of the person; detect the person in a second image captured at a second time; identify a second plurality of keypoints on the person in the second image; map the first plurality of keypoints to a one-dimensional line based on each respective vertical component of the first plurality of keypoints; determine, based on the one-dimensional line, a first distance between a highest keypoint and a lowest keypoint of the first plurality of keypoints; map the second plurality of keypoints to the one-dimensional line based on each respective vertical component of the second plurality of keypoints; determine, based on the one-dimensional line, a second distance between a highest keypoint and a lowest keypoint of the second plurality of keypoints; and detect that the person has fallen in response to determining that the first distance is greater than a threshold distance and the second distance is not greater than the threshold distance; and generate an alert indicating that the person has fallen.
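The independent claims recite a concrete, computable test: project each keypoint onto a one-dimensional line using only its vertical component, measure the span between the highest and lowest keypoints, and flag a fall when that span collapses below a threshold between two images. A minimal Python sketch under assumed conventions (keypoints as (x, y) pixel pairs, a hypothetical threshold, and a boundary box as (x, y, width, height) for the dependent-claim aspect-ratio check):

```python
def vertical_span(keypoints):
    """Map keypoints to a one-dimensional line using only their vertical
    components and return the distance between the highest and lowest
    keypoint, as in the mapping steps of claims 1, 9 and 18."""
    ys = [y for _, y in keypoints]
    return max(ys) - min(ys)

def detect_fall(first_keypoints, second_keypoints, threshold):
    """Distance test of claim 1: the span exceeds the threshold in the
    first image (person upright) but not in the second (person collapsed)."""
    return (vertical_span(first_keypoints) > threshold
            and vertical_span(second_keypoints) <= threshold)

def box_is_upright(box, threshold_ratio=1.5):
    """Boundary-box check of claims 4-5. The aspect ratio is read here as
    height/width, and the 1.5 default is a hypothetical value; the claims
    leave both unspecified. `box` is (x, y, width, height)."""
    _, _, w, h = box
    return h / w > threshold_ratio
```

With a threshold tuned to the camera geometry, a person whose keypoint span is 300 pixels while standing and 50 pixels in the next image would trigger the fall branch, after which an alert would be generated.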
Description
CLAIM OF PRIORITY

The present Application for Patent claims priority to U.S. Provisional Application No. 63/378,116, filed on Oct. 3, 2022, assigned to the assignee hereof, and hereby expressly incorporated by reference.

TECHNICAL FIELD

The described aspects relate to fall event detection systems.

BACKGROUND

Fall events result in more than 2.8 million injuries treated in emergency departments annually, including over 800,000 hospitalizations and more than 27,000 deaths. Early fall detection ensures prompt notification to, and a quick response from, health professionals, thereby reducing the negative outcomes of a fall event. Conventional systems often fail to provide timely detection and recognition of the fall event, resulting in delayed alerts that may not be recognized by security personnel as emergencies. Accordingly, there exists a need for improvements in conventional fall event detection systems.

SUMMARY

The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.

An example aspect includes a method for computer vision detection of a fall event, comprising detecting a person in a first image captured at a first time. The method further includes identifying a first plurality of keypoints on the person in the first image, wherein the first plurality of keypoints, when connected, indicate a pose of the person.
Additionally, the method further includes classifying, using the first plurality of keypoints, the pose as a standing pose in response to determining that keypoints of the first plurality of keypoints associated with shoulders of the person are lower than keypoints of the first plurality of keypoints associated with eyes and ears of the person. Additionally, the method further includes detecting the person in a second image captured at a second time. Additionally, the method further includes identifying a second plurality of keypoints on the person in the second image. Additionally, the method further includes detecting, using the second plurality of keypoints, that the person has fallen in response to determining that, subsequent to the pose being the standing pose in the first image, the keypoints of the second plurality of keypoints associated with the shoulders of the person are higher than the keypoints of the second plurality of keypoints associated with the eyes and the ears of the person in the second image. Additionally, the method further includes generating an alert indicating that the person has fallen.

Another example aspect includes an apparatus for computer vision detection of a fall event, comprising at least one memory and at least one processor coupled with the at least one memory and configured, individually or in combination, to detect a person in a first image captured at a first time. The at least one processor is further configured to identify a first plurality of keypoints on the person in the first image, wherein the first plurality of keypoints, when connected, indicate a pose of the person.
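The shoulders-versus-head comparison described above can be sketched directly. The keypoint indexing below follows the COCO person convention as an assumption (the patent does not fix a keypoint format), and in image coordinates y increases downward, so a "higher" keypoint has a smaller y value:

```python
# Hypothetical COCO-style keypoint indices; keypoints are index -> (x, y).
LEFT_EYE, RIGHT_EYE, LEFT_EAR, RIGHT_EAR = 1, 2, 3, 4
LEFT_SHOULDER, RIGHT_SHOULDER = 5, 6

HEAD = (LEFT_EYE, RIGHT_EYE, LEFT_EAR, RIGHT_EAR)

def is_standing(kps):
    """Standing pose: shoulders are lower than the eyes and ears.
    In image coordinates 'lower' means a larger y value."""
    shoulder_y = min(kps[LEFT_SHOULDER][1], kps[RIGHT_SHOULDER][1])
    head_y = max(kps[i][1] for i in HEAD)
    return shoulder_y > head_y

def has_fallen(prev_kps, curr_kps):
    """Fall: the person was standing in the previous image, and now the
    shoulders sit higher (smaller y) than the eyes and ears."""
    if not is_standing(prev_kps):
        return False
    shoulder_y = max(curr_kps[LEFT_SHOULDER][1], curr_kps[RIGHT_SHOULDER][1])
    head_y = min(curr_kps[i][1] for i in HEAD)
    return shoulder_y < head_y
```

Note the strict reading chosen here: every shoulder keypoint must clear every eye/ear keypoint in the relevant direction; a looser per-side comparison would be an equally valid interpretation of the claim language.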
Additionally, the at least one processor is further configured to classify, using the first plurality of keypoints, the pose as a standing pose in response to determining that keypoints of the first plurality of keypoints associated with shoulders of the person are lower than keypoints of the first plurality of keypoints associated with eyes and ears of the person. Additionally, the at least one processor is further configured to detect the person in a second image captured at a second time. Additionally, the at least one processor is further configured to identify a second plurality of keypoints on the person in the second image. Additionally, the at least one processor is further configured to detect, using the second plurality of keypoints, that the person has fallen in response to determining that, subsequent to the pose being the standing pose in the first image, the keypoints of the second plurality of keypoints associated with the shoulders of the person are higher than the keypoints of the second plurality of keypoints associated with the eyes and the ears of the person in the second image. Additionally, the at least one processor is further configured to generate an alert indicating that the person has fallen.

Another example aspect includes an apparatus for computer vision detection of a fall event, comprising means for detecting a person in a first image captured at a first time. The apparatus further includes means for identifying a first plurality of keypoints on the person in the first image, wherein the first plurality of keypoints, when connected, indicate a pose of the person.