CN-122024212-A - Early warning prompting method, device, equipment and storage medium


Abstract

The application provides an early warning prompting method, device, equipment and storage medium, applied to a target device that is connected with a main view camera and at least one auxiliary camera. The method comprises: acquiring a main view image and an auxiliary image, wherein the main view image is a target face image shot by the main view camera and the auxiliary image is an image of a target environment shot by the auxiliary camera; estimating the user gazing direction from the main view image to obtain the user gazing direction of a target user in the main view image; identifying an object position of a potential gazing object in the auxiliary image and a relative depth of the potential gazing object, wherein the relative depth reflects the distance between the potential gazing object and the target user; determining a user gazing object and a user attention degree corresponding to the user gazing object according to the user gazing direction and the object position; and issuing an early warning prompt according to the user attention degree corresponding to the user gazing object and the relative depth of the user gazing object. The technical scheme can reduce the application cost of distraction early warning.

Inventors

SUN TIANYUAN
WANG MEIQI
ZHENG ZHONGYANG
LIU HONGBO

Assignees

广州视源电子科技股份有限公司
广州视源人工智能创新研究院有限公司

Dates

Publication Date: 2026-05-12
Application Date: 2024-11-12

Claims (11)

1. An early warning prompting method, characterized by being applied to a target device, wherein the target device is connected with a main view camera and at least one auxiliary camera, the main view camera is arranged facing a user, and the auxiliary camera is arranged facing the user's gazing direction, the method comprising: acquiring a main view image and an auxiliary image, wherein the main view image is a target face image shot by the main view camera, the target face image is an image of the face of a target user, the auxiliary image is an image of a target environment shot by the auxiliary camera, and the target environment is an environment within the visual field range of the target user; estimating the user gazing direction from the main view image to obtain the user gazing direction of the target user in the main view image; identifying an object position of a potential gazing object in the auxiliary image and a relative depth of the potential gazing object, the relative depth being used to reflect a distance of the potential gazing object relative to the target user, the potential gazing object being an object in the target environment; determining a user gazing object and a user attention degree corresponding to the user gazing object according to the user gazing direction and the object position; and issuing an early warning prompt according to the user attention degree corresponding to the user gazing object and the relative depth of the user gazing object.
2. The method of claim 1, wherein the number of potential gazing objects is at least one; the determining of the user gazing object and the user attention degree corresponding to the user gazing object according to the user gazing direction and the object position comprises: determining at least one gazing object matching value according to the user gazing direction and the object position of the at least one potential gazing object, wherein each gazing object matching value corresponds to one of the at least one potential gazing object and reflects the degree of matching between the object position of that potential gazing object and the user gazing direction; and, according to the at least one gazing object matching value, determining a potential gazing object whose gazing object matching value is greater than a preset threshold as the user gazing object, and determining the gazing object matching value of the user gazing object as the user attention degree corresponding to the user gazing object.
3. The method of claim 2, wherein the determining of at least one gazing object matching value according to the user gazing direction and the object position of the at least one potential gazing object comprises: determining a projection distance of a target position vector relative to a gazing position vector to obtain a target projection distance, wherein the target position vector is a position vector of a target object position, the target object position is the object position of any one of the at least one potential gazing object, and the gazing position vector is a position vector of the user gazing direction; and determining the gazing object matching value of the potential gazing object corresponding to the target object position according to the target projection distance.
4. The method of claim 3, wherein the determining of the gazing object matching value of the potential gazing object corresponding to the target object position according to the target projection distance comprises: acquiring a confidence of the user gazing direction, wherein the confidence reflects the credibility of the user gazing direction; and performing a weighted summation of the confidence of the user gazing direction and the target projection distance to obtain the gazing object matching value of the potential gazing object corresponding to the target object position.
5. The method of claim 1, wherein the issuing of an early warning prompt according to the user attention degree corresponding to the user gazing object and the relative depth of the user gazing object comprises: if the user attention degree corresponding to the user gazing object is lower than a preset attention threshold within a preset time period, and the relative depth of the user gazing object decreases within the preset time period, outputting an early warning prompt.
6. The method according to any one of claims 1-5, wherein the estimating of the user gazing direction from the main view image to obtain the user gazing direction of the target user in the main view image comprises: inputting the main view image into a gazing direction estimation model to estimate the user gazing direction, thereby obtaining the user gazing direction of the target user in the main view image.
7. The method according to claim 6, wherein the gazing direction estimation model is obtained by training a preset network model, the preset network model including a feature extraction network, a first output branch and a second output branch, the first output branch and the second output branch being respectively connected to the feature extraction network, the feature extraction network being configured to perform feature extraction on a sample face image to obtain image features, the sample face image being a face image used as a training sample, the first output branch being configured to output a gazing direction of a user in the sample face image based on the image features, the second output branch being configured to output facial landmark points of the user in the sample face image based on the image features, the facial landmark points being used to adjust parameters of the feature extraction network during training of the preset network model, and the gazing direction estimation model including the feature extraction network and the first output branch obtained by training.
8. The method of claim 7, wherein the preset network model further comprises a third output branch connected to the feature extraction network, the third output branch being configured to output a head rotation direction of the user in the sample face image based on the image features, the head rotation direction being used to adjust parameters of the feature extraction network during the training process.
9. An early warning prompting device, characterized by being applied to a target device, wherein the target device is connected with a main view camera and at least one auxiliary camera, the main view camera is arranged facing a user, and the auxiliary camera is arranged facing the user's gazing direction, the device comprising: an image acquisition module, used for acquiring a main view image and an auxiliary image, wherein the main view image is a target face image shot by the main view camera, the target face image is an image of the face of a target user, the auxiliary image is an image of a target environment shot by the auxiliary camera, and the target environment is an environment within the visual field range of the target user; a gazing direction estimation module, used for estimating the user gazing direction from the main view image to obtain the user gazing direction of the target user in the main view image; an object recognition module, used for identifying an object position of a potential gazing object in the auxiliary image and a relative depth of the potential gazing object, wherein the relative depth is used for reflecting a distance between the potential gazing object and the target user, and the potential gazing object is an object in the target environment; an attention degree determining module, used for determining a user gazing object and a user attention degree corresponding to the user gazing object according to the user gazing direction and the object position; and an early warning prompt module, used for issuing an early warning prompt according to the user attention degree corresponding to the user gazing object and the relative depth of the user gazing object.
10. A computer device comprising a memory and a processor, the memory being connected to the processor, the processor being for executing one or more computer programs stored in the memory, the processor, when executing the one or more computer programs, causing the computer device to implement the method of any of claims 1-8.
11. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method of any of claims 1-8.
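
The matching step of claims 2 to 4 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the claims do not define how the projection distance is normalized or which weights are used, so the sketch assumes the projection distance is the scalar projection of the unit object position vector onto the unit gazing vector, and the weights `w_conf`, `w_proj` and the `threshold` value are hypothetical placeholders. All function names are illustrative.

```python
import math


def unit(v):
    """Return v scaled to unit length; reject zero-length vectors."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / norm for c in v)


def projection_distance(target_vec, gaze_vec):
    """Scalar projection of the unit target position vector onto the unit
    gazing vector (an assumed reading of the 'projection distance' in
    claim 3). 1.0 means the object lies exactly along the gazing direction."""
    t = unit(target_vec)
    g = unit(gaze_vec)
    return sum(a * b for a, b in zip(t, g))


def gaze_matching_value(target_vec, gaze_vec, confidence,
                        w_conf=0.3, w_proj=0.7):
    """Weighted sum of gazing-direction confidence and projection distance
    (claim 4). The weights are hypothetical, not taken from the source."""
    return w_conf * confidence + w_proj * projection_distance(target_vec, gaze_vec)


def select_gazed_objects(objects, gaze_vec, confidence, threshold=0.8):
    """Keep objects whose matching value exceeds the threshold (claim 2).
    The matching value of a kept object is used as its attention degree."""
    gazed = {}
    for name, position in objects.items():
        value = gaze_matching_value(position, gaze_vec, confidence)
        if value > threshold:
            gazed[name] = value
    return gazed


if __name__ == "__main__":
    candidates = {"car": (0.1, 0.0, 1.0), "sign": (1.0, 0.0, 0.2)}
    print(select_gazed_objects(candidates, gaze_vec=(0.0, 0.0, 1.0), confidence=0.9))
```

With these assumed weights, an object almost directly along the gazing direction (`car`) passes the threshold, while one far off to the side (`sign`) does not.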

Description

Early warning prompting method, device, equipment and storage medium

Technical Field

The application relates to the field of gaze point tracking, and in particular to an early warning prompting method, device, equipment and storage medium.

Background

With the development of technology, intelligent automobiles mainly rely on driving assistance systems, which mainly include a lane keeping assistance system, an automatic parking assistance system, a braking assistance system, a reversing assistance system, a driving assistance system and the like. At present, driving assistance systems are improved mainly from the perspective of the automobile, making the automobile more intelligent. However, as in-vehicle driving systems become more complex and in-vehicle screens become larger, the driver is easily distracted, and once the driver is distracted or loses focus, the consequences could be disastrous. Therefore, driver distraction needs to be detected in time and an early warning prompt given. How to perform such early warning prompting has become a technical problem to be solved urgently.

Disclosure of Invention

The application provides an early warning prompting method, device, equipment and storage medium, aimed at providing early warning prompts for distraction.
In a first aspect, an early warning prompting method is provided and applied to a target device, wherein the target device is connected with a main view camera and at least one auxiliary camera, the main view camera is arranged facing a user, and the auxiliary camera is arranged facing the user's gazing direction. The method comprises: acquiring a main view image and an auxiliary image, wherein the main view image is a target face image shot by the main view camera, the target face image is an image of the face of a target user, the auxiliary image is an image of a target environment shot by the auxiliary camera, and the target environment is an environment within the visual field range of the target user; estimating the user gazing direction from the main view image to obtain the user gazing direction of the target user in the main view image; identifying an object position of a potential gazing object in the auxiliary image and a relative depth of the potential gazing object, the relative depth being used to reflect a distance of the potential gazing object relative to the target user, the potential gazing object being an object in the target environment; determining a user gazing object and a user attention degree corresponding to the user gazing object according to the user gazing direction and the object position; and issuing an early warning prompt according to the user attention degree corresponding to the user gazing object and the relative depth of the user gazing object.
In this technical scheme, after the main view image and the auxiliary image are acquired, user gazing direction estimation is performed on the main view image to obtain the user gazing direction of the target user in the main view image, and the object position and the relative depth of each potential gazing object in the auxiliary image are identified. The user gazing object and its corresponding user attention degree are then determined according to the user gazing direction and the object position of the potential gazing object, and finally an early warning prompt is issued according to the user attention degree corresponding to the user gazing object and the relative depth of the user gazing object, so that distraction early warning can be realized. The main view image is a face image of the user shot by the main view camera, which faces the user, and the auxiliary image is an image of the environment within the user's field of view shot by the auxiliary camera, which faces the user's gazing direction. Therefore, once the object positions of the potential gazing objects have been identified from the auxiliary image, the user gazing object and its user attention degree can be determined accurately from the relationship between those positions and the user gazing direction, and combining the user attention degree with the relative depth of the user gazing object allows the early warning prompt to be issued reasonably. Because the early warning is realized with the cameras alone, without additional dedicated equipment, the application cost of distraction early warning can be saved.
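The final step, issuing a warning from the attention degree and the relative depth, can be sketched in Python using the condition stated in claim 5. This is a minimal sketch under stated assumptions, not the applicant's implementation: it assumes that "lower than the threshold within the preset time period" means every sample in the window is below the threshold, that a smaller relative depth means the object is closer, and that "reduced" means the last depth in the window is smaller than the first. The `Observation` type, the function name, and the default `window_s` and `attention_threshold` values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    timestamp: float       # seconds
    attention: float       # user attention degree for the gazed object
    relative_depth: float  # assumed: smaller value = closer to the user


def should_warn(history: List[Observation],
                window_s: float = 2.0,
                attention_threshold: float = 0.5) -> bool:
    """Return True when attention stayed below the threshold for the whole
    window and the object's relative depth decreased over that window."""
    if not history:
        return False
    latest = history[-1].timestamp
    # Keep only the samples that fall inside the preset time period.
    window = [o for o in history if latest - o.timestamp <= window_s]
    if len(window) < 2:
        return False  # not enough samples to judge a depth trend
    low_attention = all(o.attention < attention_threshold for o in window)
    approaching = window[-1].relative_depth < window[0].relative_depth
    return low_attention and approaching


if __name__ == "__main__":
    samples = [
        Observation(0.0, 0.2, 10.0),
        Observation(1.0, 0.3, 8.0),
        Observation(2.0, 0.1, 6.0),
    ]
    print(should_warn(samples))
```

Requiring both conditions avoids warning about objects the user is already watching, or about objects that are not getting closer.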
In the method, the number of potential gazing objects is at least one, and determining the user gazing object and the user attention degree corresponding to the user gazing object according to the user gazing direction and the object position comprises: determining at least one gazing object matching value according to the user gazing direction and the object position of the at least one potential gazing object, wherein the at