
CN-121971030-A - Pupil and reflection facula self-adaptive tracking method and system integrating multiple sensors

CN 121971030 A

Abstract

The invention discloses a multi-sensor-fused adaptive tracking method and system for the pupil and corneal reflection light spots, belonging to the field of eye-tracking signal processing. The method comprises: generating corneal reflection light spots with an eye-tracking device and collecting eye images and events; obtaining a spot template, a spot coordinate list, and a pupil-fitting ellipse from the images; in the eye-tracking stage, screening the events against the pupil-fitting ellipse and the spot template to obtain pupil events and spot events; from the pupil events, sequentially deriving the pupil motion direction angle and the spot event template at the current moment, and from these the spot candidate positions; screening the spot candidate positions against the previous moment's spot coordinate list to update the spot coordinate list and the spot template; and finally comparing the pupil-fitting ellipses of the previous and current moments to decide whether the eye-tracking round has ended. By fusing frame images and event streams, the invention achieves pupil and spot tracking with high precision and high temporal resolution.

Inventors

  • WANG KAIWEI
  • LI BIHUA
  • TIAN WENJUN
  • GONG YANYUN
  • MA YUQIN
  • YANG ZIXIN
  • HU WEIJIAN
  • ZHU HUIXING
  • ZHANG YU
  • WANG LUMING
  • XIE SENDONG
  • HAN JIANGTAO

Assignees

  • Zhejiang University (浙江大学)
  • Sunny Optical (Zhejiang) Research Institute Co., Ltd. (舜宇光学(浙江)研究院有限公司)

Dates

Publication Date
2026-05-05
Application Date
2026-01-21

Claims (10)

  1. A multi-sensor-fused pupil and reflected-light-spot adaptive tracking method, characterized by comprising the following steps: S1, infrared LEDs emit infrared light toward the pupil to be tracked, forming reflected light spots on the cornea; eye images and events generated by eye brightness changes are acquired and transmitted as image frames and an event stream, respectively; S2, extracting a light-spot image from the eye image to obtain the spot template shape; S3, extracting a light-spot image from the eye image to obtain the initial-moment spot coordinate list and the next-moment spot template position; S4, screening events within the current moment based on the previous moment's pupil-fitting ellipse to obtain pupil events, and calculating from them the pupil motion direction angle at the current moment; screening events within the current moment according to the spot template shape and the current-moment spot template position to obtain spot events, and combining them with the previous-moment or current-moment spot event template to obtain spot candidate positions; S5, screening the spot candidate positions against the previous-moment spot coordinate list to obtain, in turn, the current-moment spot coordinate list and the next-moment spot template position; S6, combining all effective pupil events within the current moment with the previous moment's pupil-fitting ellipse to obtain the current moment's pupil-fitting ellipse, wherein the effective pupil events are obtained by screening the pupil events according to the current moment's pupil motion direction angle, the previous moment's pupil-fitting ellipse, and the current-moment spot coordinate list; S7, comparing whether the difference between the vertical axis lengths of the pupil-fitting ellipses at the previous and current moments exceeds the pupil axis-length change threshold; if so, ending the current round of eye tracking and starting a new round; if not, repeating S4-S7.
  2. The multi-sensor-fused pupil and reflected-light-spot adaptive tracking method according to claim 1, wherein in S4 the events within the current moment are screened based on the previous moment's pupil-fitting ellipse to obtain pupil events, specifically: an elliptical edge 5 pixels wide is drawn from the ellipse parameters of the previous moment's pupil-fitting ellipse, forming the pupil elliptical edge mask for the current moment; the events within the current moment are then screened against this mask, and only events falling inside the mask are kept as pupil events.
  3. The multi-sensor-fused pupil and reflected-light-spot adaptive tracking method according to claim 1, wherein in S4 the pupil motion direction angle at the current moment is calculated from the pupil events, specifically: the pupil center coordinates at the previous moment are obtained from the previous moment's pupil-fitting ellipse; for each pupil event, when the event is a positive event, the difference between the previous moment's pupil center coordinates and the coordinates of the pixel generating the event is taken as that event's pupil motion direction vector; all such vectors within the current moment are summed to obtain the pupil motion direction vector at the current moment, from which the direction angle, i.e. the pupil motion direction angle at the current moment, is calculated.
  4. The multi-sensor-fused pupil and reflected-light-spot adaptive tracking method according to claim 1, characterized in that in S4 the spot event template is constructed as follows: the template bisector perpendicular to the vector ray corresponding to the previous moment's pupil motion direction angle serves as the boundary; the part of the template ahead of the vector ray takes the value +1 and the part behind it takes the value -1; the resulting pattern is the spot event template at the current moment.
  5. The multi-sensor-fused pupil and reflected-light-spot adaptive tracking method according to claim 1, wherein in S4 the spot candidate positions are obtained by combining the spot event template at the current moment, specifically: all spot events within the current moment are traversed to obtain the positive and negative spot-event accumulation frames for the current moment; the spot event template at the current moment is convolved with each of the two accumulation frames; the convolution result of the negative accumulation frame is subtracted from that of the positive accumulation frame to obtain the spot response map at the current moment, and pixels whose values exceed a spot-response threshold are kept to form the spot candidate regions; based on the candidate regions, the centroid of the region corresponding to each spot is calculated; each centroid is then moved 1 pixel along the previous moment's pupil motion direction, and all moved centroids form the spot candidate positions.
  6. The multi-sensor-fused pupil and reflected-light-spot adaptive tracking method according to claim 1, wherein in S4 the spot candidate positions are obtained by combining the spot event template of the previous moment, specifically: according to the type of each spot event within the current moment, the previous moment's spot event template is accumulated in turn, centered on the pixel generating the event, to obtain the spot response map at the current moment; the centroid of the candidate region corresponding to each spot within the spot candidate regions is calculated; each centroid is moved 1 pixel along the previous moment's pupil motion direction, and all moved centroids form the spot candidate positions; applying different processing to the previous moment's spot event template according to event type comprises inverting the template when the spot event is positive and keeping it unchanged when the spot event is negative.
  7. The multi-sensor-fused pupil and reflected-light-spot adaptive tracking method according to claim 1, wherein in S5 the spot template shape is kept unchanged and the spot template is translated until the sum of the distance differences between each spot in the current moment's spot coordinate list and its corresponding reference spot on the template is minimized; the resulting template position is taken as the spot template position at the next moment.
  8. The multi-sensor-fused pupil and reflected-light-spot adaptive tracking method according to claim 1, characterized in that in S6 the pupil-fitting ellipse at the current moment is obtained by combining all effective pupil events within the current moment with the previous moment's pupil-fitting ellipse, specifically: some points on the previous moment's pupil-fitting ellipse are randomly sampled, and an ellipse is fitted to these points together with all the effective pupil events to obtain the pupil-fitting ellipse at the current moment.
  9. The multi-sensor-fused pupil and reflected-light-spot adaptive tracking method according to claim 1, wherein in S6 the effective pupil events are obtained by screening the pupil events according to the current moment's pupil motion direction angle, the previous moment's pupil-fitting ellipse, and the current-moment spot coordinate list, specifically: for each pupil event, the pixel generating the event is connected to the center of the previous moment's pupil-fitting ellipse, giving the intersection of this line with the edge of the previous moment's pupil-fitting ellipse; the product of the distance vector between the event pixel and the intersection point and the current moment's pupil motion direction vector is then calculated; if the result is smaller than a tolerance threshold, the pupil event is filtered out and the calculation ends; if it is greater than or equal to the tolerance threshold, the coordinates of the pupil event are further compared with all coordinates in the current-moment spot coordinate list; if any coordinate distance is smaller than a distance threshold, the pupil event is filtered out and the calculation ends; otherwise the pupil event is recorded as an effective pupil event.
  10. A multi-sensor-fused pupil and reflected-light-spot adaptive tracking system for implementing the multi-sensor-fused pupil and reflected-light-spot adaptive tracking method of claim 1, comprising: an eye-tracking device, composed of a plurality of infrared LEDs and two hybrid sensors, for emitting infrared light toward the pupil to be tracked, generating light spots on the cornea, collecting eye images and transmitting them to the pupil tracking module and the spot detection module, and collecting events generated by eye brightness changes and transmitting them one by one to the pupil tracking module and the spot detection module; a pupil tracking module, for extracting pupil images from the eye images to obtain the initial-moment pupil-fitting ellipse, screening events within the current moment based on the previous moment's pupil-fitting ellipse to obtain pupil events, calculating from them the pupil motion direction angle at the current moment, combining all effective pupil events within the current moment with the previous moment's pupil-fitting ellipse to obtain the current moment's pupil-fitting ellipse, screening the pupil events according to the current moment's pupil motion direction angle, the previous moment's pupil-fitting ellipse, and the current-moment spot coordinate list, and comparing whether the difference between the vertical axis lengths of the previous and current moments' pupil-fitting ellipses exceeds the pupil axis-length change threshold, ending the current round of eye tracking and restarting a new round if it does, and continuing the current round if it does not; and a spot detection module, for extracting a spot image from the eye image to obtain the spot template shape, extracting the spot image from the eye image to obtain the initial-moment spot coordinate list and the next-moment spot template position, screening events within the current moment according to the spot template shape and the current-moment spot template position to obtain spot events, combining the previous-moment or current-moment spot event template to obtain the spot candidate positions, and screening the spot candidate positions against the previous-moment spot coordinate list to obtain, in turn, the current-moment spot coordinate list and the next-moment spot template position.
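As a minimal illustration of claim 1, step S7, the round-termination test can be sketched in Python; the function name, parameter names, and the default threshold value are assumptions, since the patent gives no concrete numbers:

```python
def tracking_round_finished(prev_axis_len, cur_axis_len, axis_threshold=8.0):
    """Claim 1, S7 (sketch): a large jump in the vertical axis length of the
    pupil-fitting ellipse (e.g. at a blink) ends the current tracking round;
    otherwise steps S4-S7 repeat. Threshold value is an assumption."""
    return abs(cur_axis_len - prev_axis_len) > axis_threshold
```
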
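The pupil-event screening of claim 2, which keeps only events falling inside a 5-pixel-wide elliptical edge mask, can be sketched with NumPy. The band test against the normalized ellipse equation approximates a drawn 5-pixel edge, and the ellipse parameterization (cx, cy, a, b, theta) is an assumption:

```python
import numpy as np

def pupil_edge_mask_filter(events, ellipse, edge_width=5.0):
    """Claim 2 sketch: keep events within a band of edge_width pixels
    around the previous moment's pupil-fitting ellipse.
    events: (N, 2) array of (x, y); ellipse: (cx, cy, a, b, theta)."""
    cx, cy, a, b, theta = ellipse
    dx, dy = events[:, 0] - cx, events[:, 1] - cy
    # Rotate into the ellipse's axis-aligned frame.
    xr = dx * np.cos(theta) + dy * np.sin(theta)
    yr = -dx * np.sin(theta) + dy * np.cos(theta)
    r = np.sqrt((xr / a) ** 2 + (yr / b) ** 2)   # 1.0 exactly on the ellipse
    band = edge_width / 2.0 / min(a, b)          # half-width, normalized units
    return events[np.abs(r - 1.0) <= band]
```
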
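The direction-angle voting of claim 3 can be sketched as follows; the claim only specifies the contribution of positive events, so negative events are simply ignored here:

```python
import numpy as np

def pupil_motion_angle(event_pixels, polarities, prev_center):
    """Claim 3 sketch: each positive pupil event votes with the vector from
    its pixel to the previous pupil center; the votes are summed and the
    angle of the summed vector is the pupil motion direction angle."""
    pixels = np.asarray(event_pixels, float)
    pos = pixels[np.asarray(polarities) > 0]       # positive events only
    v = (np.asarray(prev_center, float) - pos).sum(axis=0)
    return np.arctan2(v[1], v[0])
```
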
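The ±1 spot event template of claim 4 can be sketched as a square grid split by the bisector through its center perpendicular to the motion direction; the template size and the tie convention (cells exactly on the bisector counted as "ahead") are assumptions:

```python
import numpy as np

def spot_event_template(size, motion_angle):
    """Claim 4 sketch: split an odd-sized square template along the bisector
    through its center perpendicular to the motion direction; cells ahead of
    the motion vector get +1, cells behind get -1."""
    c = size // 2
    ys, xs = np.mgrid[0:size, 0:size]
    # Signed projection of each cell onto the motion direction.
    proj = (xs - c) * np.cos(motion_angle) + (ys - c) * np.sin(motion_angle)
    return np.where(proj >= 0, 1.0, -1.0)
```
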
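The response-map construction of claim 5 (convolve the template with the positive and negative accumulation frames, subtract, threshold) can be sketched with a direct 'same'-size 2D convolution; a library routine such as a FFT-based convolution would normally replace the explicit loops:

```python
import numpy as np

def conv2_same(frame, kernel):
    """Direct 'same'-size 2D convolution (kernel flipped, zero padding)."""
    k = np.flip(kernel)
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(np.asarray(frame, float), ((ph, ph), (pw, pw)))
    h, w = frame.shape
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * k)
    return out

def spot_response_map(pos_frame, neg_frame, template, threshold):
    """Claim 5 sketch: response = conv(positive frame) - conv(negative
    frame); pixels above the threshold form the spot candidate regions."""
    resp = conv2_same(pos_frame, template) - conv2_same(neg_frame, template)
    return resp, resp > threshold
```
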
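The event-wise accumulation of claim 6, which stamps the previous moment's template at each spot-event pixel with a polarity-dependent sign, can be sketched as:

```python
import numpy as np

def accumulate_spot_response(shape, events, polarities, prev_template):
    """Claim 6 sketch: stamp the previous moment's spot event template at
    each spot-event pixel -- inverted for positive events, unchanged for
    negative ones -- and sum the stamps into a response map."""
    resp = np.zeros(shape)
    k = prev_template.shape[0] // 2
    h, w = shape
    for (x, y), p in zip(events, polarities):
        t = -prev_template if p > 0 else prev_template
        # Clip the stamp at the image borders.
        y0, y1 = max(y - k, 0), min(y + k + 1, h)
        x0, x1 = max(x - k, 0), min(x + k + 1, w)
        resp[y0:y1, x0:x1] += t[k - (y - y0):k + (y1 - y),
                                k - (x - x0):k + (x1 - x)]
    return resp
```
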
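The template-alignment step of claim 7 can be sketched under the assumption that "sum of distance differences" means the summed squared distance over an already-known spot correspondence; for that cost, the optimal rigid translation is simply the difference of the two centroids:

```python
import numpy as np

def align_spot_template(current_spots, reference_spots):
    """Claim 7 sketch: translate the rigid spot template so the summed
    squared distance between each spot in the current coordinate list and
    its corresponding reference spot on the template is minimized."""
    cur = np.asarray(current_spots, float)
    ref = np.asarray(reference_spots, float)
    shift = cur.mean(axis=0) - ref.mean(axis=0)   # centroid difference
    return ref + shift, shift
```
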
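The ellipse fitting of claim 8 is not tied to a specific algorithm in the text; a simple algebraic least-squares conic fit serves as an illustration. In the claimed method the input would mix randomly sampled points of the previous ellipse with all effective pupil events:

```python
import numpy as np

def fit_conic_lstsq(points):
    """Claim 8 sketch (assumed method): least-squares fit of the conic
    a*x^2 + b*x*y + c*y^2 + d*x + e*y = 1 to the given points."""
    pts = np.asarray(points, float)
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([x * x, x * y, y * y, x, y])
    coef, *_ = np.linalg.lstsq(A, np.ones_like(x), rcond=None)
    return coef  # (a, b, c, d, e)
```

A production implementation would typically use a constrained direct ellipse fit (or OpenCV's fitEllipse) instead, which guarantees an elliptical solution.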
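The effective-event screening of claim 9 can be sketched in simplified form: the patent measures the displacement from the intersection of the center-to-pixel line with the previous ellipse edge, whereas this sketch measures it from the ellipse center directly; that simplification and the threshold values are assumptions:

```python
import numpy as np

def is_effective_pupil_event(pixel, prev_center, motion_vec, spot_list,
                             tol=0.0, spot_dist=6.0):
    """Claim 9 sketch (simplified): reject a pupil event whose displacement
    points against the current pupil motion vector (trailing-event noise),
    or which lies too close to a reflected light spot."""
    p = np.asarray(pixel, float)
    disp = p - np.asarray(prev_center, float)
    if np.dot(disp, np.asarray(motion_vec, float)) < tol:
        return False                 # moving against the pupil: filter out
    for s in spot_list:
        if np.linalg.norm(p - np.asarray(s, float)) < spot_dist:
            return False             # inside a glint region: filter out
    return True
```
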

Description

Pupil and reflection facula self-adaptive tracking method and system integrating multiple sensors

Technical Field

The invention relates to the field of eye-tracking signal processing, and in particular to a multi-sensor-fused adaptive tracking method and system for the pupil and reflected light spots.

Background Art

Currently, eye tracking mainly uses frame cameras: the positions of the pupil and the light spots are obtained from grayscale images, a three-dimensional eyeball model is built, and gaze-point estimation is then performed. However, this approach has relatively high end-to-end power consumption, which is a heavy burden for XR devices. The dynamic vision sensor offers high temporal resolution and low power consumption, and promises eye tracking at lower power while maintaining accuracy. Unlike conventional frame cameras, which output images at fixed time intervals, dynamic vision sensors asynchronously output brightness-change events on the image plane. During eye movement, the lower-intensity pupil moves together with the higher-intensity corneal reflection spots, triggering event responses.

Eye-tracking methods based on dynamic vision sensors fall mainly into two types: traditional methods and learning-based methods. One typical traditional method is a model-fitting method that combines frame-camera images with event-stream information: pupil edges, eyelid edges, and reflection spots are first extracted from the frame images by morphological operations; the pupil is fitted with an ellipse, the eyelids with parabolas, and the spots with circles; asynchronous event streams then adjust the fitting results online between frames, providing eye-movement state feedback at a high frame rate.

When only a dynamic vision sensor is relied on, a neural network is used to regress the pupil center position from accumulated event frames or voxel grids. For the reflected light spots, a coded differential-illumination method based on the dynamic vision sensor achieves accurate extraction and correspondence of the spots. Differential illumination means that each illumination source consists of two closely attached LEDs; researchers have verified that its signal-to-noise ratio is significantly higher than that of a single-LED spot. Coded illumination further establishes the correspondence between LED numbers and light spots. However, this method currently only extracts and matches the spots and does not also track the pupil position.

Current tracking methods do not achieve simultaneous tracking of the pupil and multiple reflection spots from pure event information, so the acquired information is insufficient for stable output of the pupil-corneal-reflection method. With multiple reflection spots, the spots and the pupil edge can interfere with each other, easily causing inaccurate tracking. Meanwhile, schemes based only on dynamic vision sensors are generally low in accuracy because the event camera's response contains various kinds of noise, of which trailing-event noise affects estimation the most. A trailing event means that an event is more likely to occur in an area where events have already occurred; events therefore keep firing in areas the pupil edge has already passed, and these points can make the fitted pupil position inaccurate, which current methods do not take into account.

In existing hybrid-sensor eye-tracking schemes that fuse an image sensor with a dynamic vision sensor, the dynamic vision sensor mainly serves to increase the feedback frame rate of the eye-movement data; it neither reduces the frame camera's original power consumption nor fully exploits the low power consumption of the dynamic vision sensor.

Disclosure of the Invention

The invention dynamically combines frame images to reduce the overall power consumption of the system, uses the input of the dynamic vision sensor to track multiple reflected light spots and the pupil at an adaptive frame rate, and thereby provides stable input for the subsequent pupil-corneal-reflection method. In a first aspect, the invention provides a multi-sensor-fused pupil and reflected-light-spot adaptive tracking method, including: S1, infrared LEDs emit infrared light toward the pupil to be tracked, forming reflected light spots on the cornea; eye images and events generated by eye brightness changes are acquired and transmitted as image frames and an event stream, respectively; S2, extracting a light-spot image from the eye