CN-116468884-B - Visual stimulus response monitoring method based on head action and intelligent terminal

CN116468884B

Abstract

The invention discloses a visual stimulus response monitoring method based on head actions. The method comprises: acquiring image or video data containing a head image; inputting the image or video data into a head detection network model for training to obtain a head detection rectangular frame; inputting the head detection rectangular frame into a head key point positioning model for training to obtain head key point position coordinates; and obtaining a visual stimulus direction and a visual stimulus response according to the head key point position coordinates. The invention applies stimulation to the subject through a simple test task, monitors head movement parameters, and estimates the subject's cognitive state from those parameters. The deep-learning-based method for locating and tracking head feature points reduces hardware requirements and increases the universality of visual stimulus response monitoring.

Inventors

  • ZHOU YONGJIN
  • CAI SIJIN
  • LIU YANJIE

Assignees

  • 深圳大学

Dates

Publication Date
2026-05-05
Application Date
2023-03-31

Claims (7)

  1. A method of monitoring visual stimulus response based on head movements, the method comprising: acquiring image or video data containing a head image; inputting the image or video data into a head detection network model for training to obtain a head detection rectangular frame; inputting the head detection rectangular frame into a head key point positioning model for training to obtain head key point position coordinates; and obtaining a visual stimulus direction and a visual stimulus response according to the head key point position coordinates; wherein inputting the image or video data into the head detection network model for training to obtain the head detection rectangular frame comprises: inputting the head image data into the head detection network model for training to obtain a detection result, wherein the head detection network model is a MobileFaceNet model, the MobileFaceNet model comprises a first separable convolution module and a second separable convolution module, the first separable convolution module comprises one 5×5 convolution kernel and is used for the shallow layers of the network, the second separable convolution module comprises two 5×5 convolution kernels and is used for the deep layers of the network, and the MobileFaceNet model uses feature maps of 16×16 and 8×8 feature dimensions for candidate frame learning and six 8×8 feature maps for candidate-frame-related feature extraction; and performing a non-maximum suppression post-processing operation on the detection result to obtain the head detection rectangular frame; and wherein inputting the head detection rectangular frame into the head key point positioning model for training to obtain the head key point position coordinates comprises: inputting the head image in the head detection rectangular frame into a MobileMeshNet network model, wherein the MobileMeshNet network model comprises 2D convolution layers and residual modules, each residual module comprises two 2D convolution layers with a convolution kernel size of 3×3, the MobileMeshNet network model adopts an asymmetric network structure, and connections use max pooling with a pooling size of 2×2; training the head image with a downsampling residual module to obtain dependency relationships between key point features; and decoding the head image with a convolution layer-residual module-convolution layer head according to the dependency relationships between the key point features to obtain the head key point position coordinates.
  2. The method for monitoring the visual stimulus response based on head actions according to claim 1, wherein obtaining the visual stimulus direction and the visual stimulus response according to the head key point position coordinates comprises: obtaining the head position, the coordinates of the head characteristic points, and relative depth estimates of the head characteristic points according to the head key point position coordinates; obtaining head action parameters according to the head position, the coordinates of the head characteristic points, and the relative depth estimates of the head characteristic points; and obtaining the visual stimulus direction and the visual stimulus response according to the head action parameters.
  3. The method for monitoring the visual stimulus response based on head movements according to claim 2, wherein the head movement parameters include the head orientation, the head rotation direction, the head rotation speed, the head movement direction, the head movement speed, the head movement locus area, the head movement acceleration, and the response time.
  4. The method of monitoring visual stimulus response based on head movements as recited in claim 3, wherein deriving the visual stimulus direction and the visual stimulus response from the head movement parameters comprises: obtaining the visual stimulus direction according to the head orientation; and obtaining the visual stimulus response according to the head rotation direction, the head rotation speed, the head movement direction, the head movement speed, the head movement track area, and the head movement acceleration.
  5. A visual stimulus response monitoring device based on head movements, for implementing the steps of the visual stimulus response monitoring method based on head movements as claimed in any one of claims 1-4, the device comprising: an image acquisition module for acquiring image or video data containing head images; a head detection rectangular frame acquisition module for inputting the image or video data into a head detection network model for training to obtain a head detection rectangular frame; a head key point position coordinate acquisition module for inputting the head detection rectangular frame into a head key point positioning model for training to obtain head key point position coordinates; and a visual stimulus response acquisition module for obtaining the visual stimulus direction and the visual stimulus response according to the head key point position coordinates.
  6. A smart terminal comprising a memory, a processor, and a head-action-based visual stimulus response monitoring program stored in the memory and executable on the processor, the processor implementing the steps of the head-action-based visual stimulus response monitoring method of any one of claims 1-4 when executing the head-action-based visual stimulus response monitoring program.
  7. A computer readable storage medium, wherein a visual stimulus response monitoring program based on head movements is stored on the computer readable storage medium, which, when executed by a processor, implements the steps of the visual stimulus response monitoring method based on head movements as claimed in any one of claims 1-4.
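The non-maximum suppression post-processing named in claim 1 can be sketched as follows. This is an illustrative Python sketch of the standard NMS algorithm, not the patented implementation; the corner-coordinate box format and the 0.5 overlap threshold are assumptions.

```python
# Illustrative sketch of non-maximum suppression (NMS) on head detection
# candidates. Boxes are assumed to be (x1, y1, x2, y2) corner coordinates;
# the IoU threshold of 0.5 is a common default, not taken from the patent.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, suppressing heavily overlapping ones.

    Returns the indices of the kept boxes, ordered by descending score.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard remaining candidates that overlap the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```

For example, given two near-duplicate detections of one head and a separate detection elsewhere, `nms([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)], [0.9, 0.8, 0.7])` keeps the first and third boxes and suppresses the overlapping second one.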

Description

Visual stimulus response monitoring method based on head action and intelligent terminal

Technical Field

The invention relates to the field of visual stimulus response monitoring, and in particular to a visual stimulus response monitoring method based on head actions and an intelligent terminal.

Background

Monitoring of responses to visual stimuli has wide application in psychology and neuroscience research, infant research, and related fields. In clinical research, responses to visual stimuli can be used to diagnose ophthalmic diseases and brain and neurological disorders, such as autism and Parkinson's disease, and can also provide early sign information for some diseases, such as Alzheimer's disease and Parkinson's disease. Specific parameter data allow objective quantification in studies of disease or rehabilitation progression. At present, eye movement tracking is the main method for monitoring the visual stimulus response of a user in front of a computer or mobile device: the user's cognitive function state is analyzed by monitoring eyeball movement during reading or task completion. However, the eye tracker used for eye movement analysis is expensive and inconvenient to carry. A method that performs eye movement analysis with a single camera can replace the eye tracker to some extent, but it places high demands on the clarity and resolution of the images or videos captured by the camera.
Moreover, such a method cannot adapt to poor lighting conditions such as backlight or strong light, or to situations in which the eyes are not fully captured during shooting. In addition, the eye-opening range of elderly people, for whom cognitive function evaluation is most needed, is generally smaller than that of young people, so eye movement tracking based on a single camera is difficult to realize and the visual stimulus response is difficult to monitor. Accordingly, the prior art is still in need of improvement and development.

Disclosure of Invention

The invention aims to address the above defects in the prior art by providing a visual stimulus response monitoring method based on head actions and an intelligent terminal, in order to solve the problems that eye movement detection is costly to implement, that eye movement tracking based on a single camera is difficult to realize, and that the visual stimulus response is difficult to monitor. The technical scheme adopted to solve these problems is as follows. In a first aspect, the invention provides a method of monitoring visual stimulus response based on head movements, the method comprising: acquiring image or video data containing a head image; inputting the image or video data into a head detection network model for training to obtain a head detection rectangular frame; inputting the head detection rectangular frame into a head key point positioning model for training to obtain head key point position coordinates; and obtaining the visual stimulus direction and the visual stimulus response according to the head key point position coordinates.
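The first-aspect pipeline ends by deriving head action parameters from per-frame key point coordinates. The following Python sketch illustrates two such parameters (movement speed and movement trajectory area) under assumptions of my own: the head position is approximated as the mean of the key points, and the trajectory area as an axis-aligned bounding box. These choices are for illustration only and are not the patented computation.

```python
# Illustrative sketch: deriving head-movement parameters from key point
# trajectories. Assumptions (not from the patent): head position = mean of
# key points; trajectory area = axis-aligned bounding box of the centers.
import math

def head_center(keypoints):
    """Mean (x, y) of the detected head key points for one frame."""
    xs = [p[0] for p in keypoints]
    ys = [p[1] for p in keypoints]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def movement_speed(centers, fps):
    """Average displacement of the head center per second, in pixels/s."""
    dist = sum(math.dist(a, b) for a, b in zip(centers, centers[1:]))
    duration = (len(centers) - 1) / fps
    return dist / duration if duration else 0.0

def trajectory_area(centers):
    """Area of the axis-aligned bounding box swept by the head center."""
    xs = [c[0] for c in centers]
    ys = [c[1] for c in centers]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))
```

For instance, a head center moving from (0, 0) to (3, 4) and then staying still over three frames at 2 frames per second travels 5 pixels in 1 second, giving a movement speed of 5.0 pixels/s and a trajectory area of 12.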
In one implementation, inputting the image or video data into the head detection network model for training to obtain the head detection rectangular frame comprises: inputting the head image data into the head detection network model for training to obtain a detection result; and performing a non-maximum suppression post-processing operation on the detection result to obtain the head detection rectangular frame. In one implementation, the head detection network model is a MobileFaceNet model, wherein the MobileFaceNet model includes a first separable convolution module and a second separable convolution module, the first separable convolution module includes one 5×5 convolution kernel for the shallow layers of the network, and the second separable convolution module includes two 5×5 convolution kernels for the deep layers of the network. In one implementation, inputting the head detection rectangular frame into the head key point positioning model for training to obtain the head key point position coordinates comprises: inputting the head image in the head detection rectangular frame into the MobileMeshNet network model, wherein the MobileMeshNet network model comprises 2D convolution layers and a residual module, the residual module comprises two 3×3 2D convolution layers, and the convolution kernel size of the 2D convolution layers is 3×3; training the head image with a downsampling residual module to obtain dependency relationships between key point features; and decoding the head image with a convolution layer-residual module-convolution layer head according to the dependency relationships between the key point features to obtain the head key point position coordinates. In one i