US-12616391-B2 - Neural networks to determine respiration rates
Abstract
In some examples, a non-transitory computer-readable medium stores executable code, which, when executed by a processor, causes the processor to receive a video of at least part of a human torso, use a neural network to produce multiple vector fields based on the video, the multiple vector fields representing movement of the human torso, and determine a respiration rate of the human torso using the multiple vector fields.
Inventors
- Tianqi Guo
- Qian Lin
- Jan Allebach
Assignees
- HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.
- PURDUE RESEARCH FOUNDATION
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2020-10-29
Claims (18)
- 1. A non-transitory computer-readable medium storing executable code, which, when executed by a processor, causes the processor to: receive a video of at least part of a human torso; use a first neural network to produce multiple vector fields based on the video, the multiple vector fields representing movement of the human torso; determine a respiration rate of the human torso using the multiple vector fields; use a second neural network and the video to produce a segmentation mask; and multiply the segmentation mask by one of the multiple vector fields to remove vectors in the one of the multiple vector fields, the removed vectors corresponding to a background to the human torso.
- 2. The computer-readable medium of claim 1, wherein the executable code, when executed by the processor, causes the processor to: determine an average value of multiple vectors in each of the multiple vector fields; determine a frequency distribution of the average values; and designate the dominant frequency in the frequency distribution as the respiration rate.
- 3. The computer-readable medium of claim 2, wherein the executable code, when executed by the processor, causes the processor to: determine an average horizontal value of horizontal components of multiple vectors in at least one of the multiple vector fields; determine an average vertical value of vertical components of multiple vectors in the at least one of the multiple vector fields; and designate the greater of the average horizontal value and the average vertical value as the average value.
- 4. The computer-readable medium of claim 1, wherein the first neural network includes a contraction portion that is to cause the processor to reduce spatial resolution of images in the video, and wherein the first neural network includes an expansion portion that is to cause the processor to recover spatial resolution of the images.
- 5. The computer-readable medium of claim 1, wherein the first neural network is a convolutional neural network.
- 6. The computer-readable medium of claim 1, wherein the second neural network is a convolutional neural network.
- 7. An electronic device, comprising: an interface to receive a video of at least part of a human torso; a memory storing executable code; and a processor coupled to the interface and to the memory, wherein, as a result of executing the executable code, the processor is to: iteratively use a neural network to produce multiple vector fields, each vector field indicative of movement of the human torso between a different pair of consecutive images of the video; calculate multiple average values, each average value calculated using vectors of a different one of the multiple vector fields; determine a frequency distribution of the multiple average values; and designate a frequency in the frequency distribution as a respiration rate of the human torso.
- 8. The electronic device of claim 7, wherein the interface is a network interface, and wherein the video is a live-stream video.
- 9. The electronic device of claim 7, wherein the interface is a peripheral interface.
- 10. The electronic device of claim 7, wherein the frequency is the dominant frequency in the frequency distribution.
- 11. The electronic device of claim 7, wherein the neural network is a convolutional neural network.
- 12. The electronic device of claim 7, further comprising a camera to capture the video.
- 13. The electronic device of claim 12, wherein the electronic device is a smartphone.
- 14. A method, comprising: receiving a video of at least part of a human torso; obtaining a first pair of consecutive images from the video and a second pair of consecutive images from the video; using a neural network to produce a first vector field based on the first pair of consecutive images, the first vector field representative of movement of the human torso; using the neural network to produce a second vector field based on the second pair of consecutive images, the second vector field representative of movement of the human torso; calculating first and second average values using the first and second vector fields, respectively; and determining a respiration rate of the human torso using the first and second average values.
- 15. The method of claim 14, wherein receiving the video comprises receiving a live-stream video via a network interface.
- 16. The method of claim 14, wherein calculating the first average value using the first vector field comprises: calculating an average horizontal value of horizontal components of multiple vectors in the first vector field; calculating an average vertical value of vertical components of the multiple vectors in the first vector field; and designating the greater of the average horizontal value and the average vertical value as the first average value.
- 17. The method of claim 14, comprising: using a second neural network and the video to produce a segmentation mask; and filtering the first vector field using the segmentation mask prior to calculating the first average value.
- 18. The method of claim 14, wherein determining the respiration rate using the first and second average values includes determining a frequency distribution of the first and second average values and designating the dominant frequency in the frequency distribution as the respiration rate.
Description
BACKGROUND
The human respiration rate is frequently measured in a variety of contexts to obtain information regarding pulmonary, cardiovascular, and overall health. For example, doctors often measure respiration rate in clinics and hospitals.
BRIEF DESCRIPTION OF THE DRAWINGS
Various examples will be described below referring to the following figures:
FIG. 1 is a block diagram of an electronic device to determine respiration rates using neural networks, in accordance with various examples.
FIG. 2 is a flow diagram of a method to determine respiration rates using neural networks, in accordance with various examples.
FIGS. 3A-3F are a process flow to determine respiration rates using neural networks, in accordance with various examples.
FIGS. 4 and 5 are block diagrams of electronic devices to determine respiration rates using neural networks, in accordance with various examples.
FIG. 6 is a flow diagram of a method to determine respiration rates using neural networks, in accordance with various examples.
FIG. 7 is a block diagram of an electronic device to determine respiration rates using neural networks, in accordance with various examples.
DETAILED DESCRIPTION
Various techniques and devices can be used to measure respiration rate. Many such techniques and devices operate based on the fact that inhalation and exhalation during respiration cycles are associated with pulmonary volume changes as well as the expansion and contraction of the anteroposterior diameters of the rib cage and abdomen. Accordingly, common techniques for measuring respiration rate include visual observation, impedance pneumography, and respiration belts that include accelerometers, force sensors, and pressure sensors that sense motions of the chest wall. These approaches for measuring respiration rate have multiple disadvantages.
For example, because the subject must be present in person for her respiration rate to be measured, she is at risk of transmission of pathogens via respiration rate monitoring devices or via the air, and she spends time and money traveling to and from the clinic at which her respiration rate is to be measured. Techniques for measuring respiration rate from a remote location using a camera suffer from poor accuracy, particularly in challenging conditions (e.g., in poorly-lit environments).

This disclosure describes various examples of a technique for using a camera to remotely measure respiration rate in a variety of challenging conditions. In examples, the technique includes receiving a video of a human torso (e.g., including the shoulders, chest, back, and/or abdomen of a subject), such as a live-stream video via a network interface or a stored video via a peripheral interface. Other body parts, such as the head, also may be used, although the accuracy may be less than if the shoulders, chest, back, and/or abdomen are used. The term human torso, as used herein, may include any such body part(s), including the head, shoulders, chest, back, and/or abdomen of a subject. The technique includes applying multiple pairs of consecutive images from the video to a convolutional neural network (CNN) to produce multiple vector fields, each vector field corresponding to a different pair of consecutive images from the video and indicating movement (e.g., respiratory movement) between the consecutive images in the pair. In some examples, images from the video may be applied to another CNN to produce segmentation masks, and these segmentation masks may be applied to corresponding vector fields to filter out vectors in the vector fields that do not correspond to the human torso. For instance, multiplying a vector field by a segmentation mask may cause the vectors corresponding to a background or to another subject(s) in the video to be removed from the vector field.
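The mask-multiplication step above can be sketched numerically. This is an illustrative sketch, not the patent's implementation: it assumes the flow network outputs an H x W x 2 vector field (horizontal and vertical displacement per pixel) and the segmentation network outputs a binary H x W mask with 1 for torso pixels and 0 for background.

```python
import numpy as np

def filter_vector_field(vector_field, mask):
    """Zero out flow vectors outside the torso mask.

    vector_field: array of shape (H, W, 2), per-pixel (horizontal, vertical)
        displacement between a pair of consecutive video frames.
    mask: array of shape (H, W), 1 where the torso is, 0 for background.
    """
    # Broadcasting the mask over the last axis multiplies both the
    # horizontal and vertical components by 0 or 1, removing background
    # vectors while leaving torso vectors unchanged.
    return vector_field * mask[..., np.newaxis]
```

With a hard 0/1 mask this elementwise product is exactly the "multiply the segmentation mask by the vector field" operation recited in claim 1; a soft (probabilistic) mask would instead down-weight uncertain pixels.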
The technique also includes, for each vector field, calculating an average horizontal value using the horizontal components of some or all of the vectors in the vector field, and calculating an average vertical value using the vertical components of some or all of the vectors in the vector field. The greater of the average horizontal and average vertical values is selected as an average value that is representative of respiratory movement of the subject. These average values may be plotted over a target length of time (e.g., 60 seconds), for example, on a graph of time versus spatial displacement. The set of average values may be converted to the frequency domain to produce a frequency distribution, and the dominant frequency in the frequency distribution (e.g., the frequency with the greatest normalized coefficient) may be designated as the respiration rate of the subject in the video. The CNN used to produce the vector fields may be trained using data sets having various lighting conditions, subjects with different types of clothing, etc., to mitigate the effect of these variables on the accuracy of the determined respiration rate. FIG. 1 is a block diagram of an el
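The per-field averaging and frequency-domain steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: fields are H x W x 2 numpy arrays, "greater" is taken by magnitude (the patent text does not specify sign handling), and the frequency distribution is obtained with a plain FFT rather than whatever transform an actual implementation might use.

```python
import numpy as np

def respiration_rate(vector_fields, fps):
    """Estimate a respiration rate in breaths per minute.

    vector_fields: sequence of (H, W, 2) flow fields, one per consecutive
        frame pair, sampled at `fps` fields per second.
    """
    averages = []
    for field in vector_fields:
        avg_h = field[..., 0].mean()  # average horizontal component
        avg_v = field[..., 1].mean()  # average vertical component
        # Keep whichever average is greater in magnitude (assumption:
        # "greater" compares absolute displacement).
        averages.append(avg_h if abs(avg_h) >= abs(avg_v) else avg_v)

    # Convert the 1-D signal of average values to the frequency domain.
    signal = np.asarray(averages) - np.mean(averages)  # remove DC offset
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)  # bin frequencies, Hz

    # The dominant (largest-magnitude) non-DC frequency is the estimate.
    dominant_hz = freqs[np.argmax(spectrum[1:]) + 1]
    return dominant_hz * 60.0  # Hz -> breaths per minute
```

For example, 60 seconds of fields at 10 fields/second whose vertical components oscillate at 0.25 Hz would yield an estimate of 15 breaths per minute.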