CN-115273221-B - Video surface examination assisting method, device, equipment and storage medium

Abstract

The invention relates to the technical field of artificial intelligence and discloses a video surface examination assisting method, device, equipment and storage medium for improving the accuracy with which fraudulent conduct is identified in video surface examination. The method comprises: obtaining corresponding audio and video data when an approver and a target person are in a preset audio and video detection area, and performing audio detection based on the audio and video data; if the audio of the approver exists in the audio and video data, recognizing multiple actions of the target person when the audio of the approver ends, the multiple actions comprising a gaze action, a head action and a hand action; if the gaze action conforms to a preset gaze action, the head action conforms to a preset head action, or the hand action occludes the face of the target person, determining that the target person exhibits surface examination fraud; and generating reminding information corresponding to the surface examination fraud and sending the reminding information to a surface examination reminding terminal.

Inventors

XIONG WENSHUO
ZENG FANTAO
LIU YUYU

Assignees

Ping An Technology (Shenzhen) Co., Ltd. (平安科技（深圳）有限公司)

Dates

Publication Date: 20260508
Application Date: 20220617

Claims (8)

1. A video surface examination assisting method, characterized by comprising the following steps: when an approver and a target person are in a preset audio and video detection area, obtaining corresponding audio and video data, and performing audio detection based on the audio and video data; if the audio of the approver exists in the audio and video data, recognizing multiple actions of the target person when the audio of the approver ends, wherein the multiple actions comprise a gaze action, a head action and a hand action; if the gaze action conforms to a preset gaze action, the head action conforms to a preset head action, or the hand action occludes the face of the target person, determining that the target person exhibits surface examination fraud, wherein the preset gaze action comprises a slow sideways glance, a fast sideways glance and gaze jitter, and the preset head action comprises fast head rotation, leftward head rotation and rightward head rotation; and generating reminding information corresponding to the surface examination fraud, and sending the reminding information to a surface examination reminding terminal; wherein the recognizing multiple actions of the target person when the audio of the approver ends, if the audio of the approver exists in the audio and video data, comprises: if the audio of the approver exists in the audio and video data, acquiring a face video of the target person when the audio of the approver ends; performing gaze action recognition according to the face video to obtain a gaze action recognition result, which comprises: mapping the gaze landing-point angle value of the target person in each frame of the face video onto a plane rectangular coordinate system, and connecting the gaze coordinate points corresponding to the gaze landing-point angle values in the frames to generate gaze action line segments of the target person, wherein the abscissa of each gaze coordinate point indicates the video frame and the ordinate indicates the gaze landing-point angle value; invoking a preset gaze-point detection model to perform template matching on the gaze action line segments; if the matching distance between any one of the gaze action line segments and a preset gaze action curve template is greater than or equal to a preset gaze action matching distance, determining that the gaze action of the target person conforms to the preset gaze action, wherein the preset gaze action comprises a slow sideways glance, a fast sideways glance and gaze jitter; and if the matching distance between each of the gaze action line segments and the preset gaze action curve template is smaller than the preset gaze action matching distance, determining that the gaze action of the target person does not conform to the preset gaze action; performing head action recognition according to the face video to obtain a head action recognition result; and performing hand action recognition according to the face video to obtain a hand action recognition result.
2. The video surface examination assisting method according to claim 1, wherein the performing head action recognition according to the face video to obtain a head action recognition result comprises: mapping the head posture angle value of the target person in each frame of the face video onto a plane rectangular coordinate system, and connecting the head posture coordinate points corresponding to the head posture angle values in the frames to generate head action line segments of the target person, wherein the abscissa of each head posture coordinate point indicates the video frame and the ordinate indicates the head posture angle value; performing template matching on the head action line segments through a preset head posture detection model; if the matching distance between any one of the head action line segments and a preset head action curve template is greater than or equal to a preset head action matching distance, determining that the head action of the target person conforms to the preset head action, wherein the preset head action comprises fast head rotation, leftward head rotation and rightward head rotation; and if the matching distance between each of the head action line segments and the preset head action curve template is smaller than the preset head action matching distance, determining that the head action of the target person does not conform to the preset head action.
3. The video surface examination assisting method according to claim 1, wherein the performing hand action recognition according to the face video to obtain a hand action recognition result comprises: generating a face region position frame of the target person according to the face video; performing hand detection on the face video; if a hand exists in the face video, generating a hand position frame corresponding to the hand; calculating an intersection value between the face region position frame and the hand position frame, wherein the intersection value indicates the ratio of the area of the overlapping region between the face region position frame and the hand position frame to the total area of the face region position frame and the hand position frame; if the intersection value is greater than or equal to a preset value, determining that the hand action of the target person occludes the face of the target person; and if the intersection value is smaller than the preset value, determining that the hand action of the target person does not occlude the face of the target person.
4. The video surface examination assisting method according to claim 1, wherein the obtaining corresponding audio and video data when the approver and the target person are in the preset audio and video detection area, and performing audio detection based on the audio and video data, comprises: when the approver and the target person are in the preset audio and video detection area, obtaining corresponding audio and video data; extracting the audio from the audio and video data to obtain audio data; extracting voiceprint features from the audio data to obtain a voiceprint feature sequence; if the voiceprint feature sequence matches a preset approver voiceprint feature sequence, determining that the audio of the approver exists in the audio and video data; and if the voiceprint feature sequence does not match the preset approver voiceprint feature sequence, determining that the audio of the approver does not exist in the audio and video data.
5. The video surface examination assisting method according to any one of claims 1 to 4, wherein, after the obtaining corresponding audio and video data when the approver and the target person are in the preset audio and video detection area and performing audio detection based on the audio and video data, and before the generating reminding information corresponding to the surface examination fraud and sending the reminding information to the surface examination reminding terminal, the method further comprises: if the audio of the approver exists in the audio and video data, acquiring a face video of the target person when the audio of the approver ends; performing color detection on the ears of the target person according to the face video; and if the color of an ear conforms to a preset color, determining that the target person exhibits surface examination fraud.
6. A video surface examination assisting device, comprising: an audio detection module, configured to obtain corresponding audio and video data when an approver and a target person are in a preset audio and video detection area, and to perform audio detection based on the audio and video data; an action recognition module, configured to recognize multiple actions of the target person when the audio of the approver ends if the audio of the approver exists in the audio and video data, wherein the multiple actions comprise a gaze action, a head action and a hand action; a first determining module, configured to determine that the target person exhibits surface examination fraud if the gaze action conforms to a preset gaze action, the head action conforms to a preset head action, or the hand action occludes the face of the target person, wherein the preset gaze action comprises a slow sideways glance, a fast sideways glance and gaze jitter, and the preset head action comprises fast head rotation, leftward head rotation and rightward head rotation; and an information sending module, configured to generate reminding information corresponding to the surface examination fraud and to send the reminding information to a surface examination reminding terminal; wherein the action recognition module is specifically configured to: if the audio of the approver exists in the audio and video data, acquire a face video of the target person when the audio of the approver ends; perform gaze action recognition according to the face video to obtain a gaze action recognition result, which comprises: mapping the gaze landing-point angle value of the target person in each frame of the face video onto a plane rectangular coordinate system, and connecting the gaze coordinate points corresponding to the gaze landing-point angle values in the frames to generate gaze action line segments of the target person, wherein the abscissa of each gaze coordinate point indicates the video frame and the ordinate indicates the gaze landing-point angle value; invoking a preset gaze-point detection model to perform template matching on the gaze action line segments; if the matching distance between any one of the gaze action line segments and a preset gaze action curve template is greater than or equal to a preset gaze action matching distance, determining that the gaze action of the target person conforms to the preset gaze action, wherein the preset gaze action comprises a slow sideways glance, a fast sideways glance and gaze jitter; and if the matching distance between each of the gaze action line segments and the preset gaze action curve template is smaller than the preset gaze action matching distance, determining that the gaze action of the target person does not conform to the preset gaze action; perform head action recognition according to the face video to obtain a head action recognition result; and perform hand action recognition according to the face video to obtain a hand action recognition result.
7. Video surface examination assisting equipment, characterized by comprising a memory and at least one processor, wherein the memory stores instructions, and the at least one processor invokes the instructions in the memory to cause the video surface examination assisting equipment to perform the video surface examination assisting method of any one of claims 1 to 5.
8. A computer-readable storage medium having instructions stored thereon which, when executed by a processor, implement the video surface examination assisting method of any one of claims 1 to 5.
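
The intersection value recited in claim 3 can be illustrated with a short Python sketch. It is only one possible reading: the claim does not say whether the "total area" of the two position frames is their union or the plain sum of their areas, so the sketch assumes the union (the usual intersection-over-union ratio). The names `Box`, `intersection_value` and `hand_occludes_face`, the corner-coordinate layout of a position frame, and the threshold used in the usage note are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned position frame given by its top-left and bottom-right corners."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def area(self) -> float:
        # Degenerate (inverted) frames are treated as having zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def intersection_value(face: Box, hand: Box) -> float:
    """Overlap area divided by the total (assumed: union) area of both frames."""
    overlap_w = min(face.x2, hand.x2) - max(face.x1, hand.x1)
    overlap_h = min(face.y2, hand.y2) - max(face.y1, hand.y1)
    overlap = max(0.0, overlap_w) * max(0.0, overlap_h)
    total = face.area + hand.area - overlap  # assumption: "total area" means union
    return overlap / total if total > 0 else 0.0


def hand_occludes_face(face: Box, hand: Box, preset_value: float) -> bool:
    """Claim 3 rule: occlusion when the intersection value reaches the preset value."""
    return intersection_value(face, hand) >= preset_value
```

For example, with a face frame `Box(0, 0, 10, 10)` and a hand frame `Box(5, 5, 15, 15)`, the overlap is 25 and the union is 175, so a hypothetical preset value of 0.1 would report occlusion. If "total area" were read as the plain sum of the two areas, only the `total` line would change.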

Description

Video surface examination assisting method, device, equipment and storage medium

Technical Field

The present invention relates to the field of artificial intelligence technologies, and in particular to a video surface examination assisting method, device, equipment and storage medium.

Background

Video surface examination refers to manually reviewing a user over video. The approver must combine review experience with the materials provided by the user to pose review questions, and must watch for abnormal behavior while the user answers, in order to judge whether the user is lying or committing fraud. Because approvers apply inconsistent standards to abnormal behavior, some abnormal behaviors do not receive enough attention, which leaves a large hidden fraud risk. In addition, approvers vary in skill, and during long working sessions their attention easily lapses, so abnormal user behavior is missed and fraud risk is hard to detect.

Disclosure of Invention

The invention provides a video surface examination assisting method, device, equipment and storage medium for improving the accuracy with which fraudulent conduct is identified in video surface examination.
The first aspect of the invention provides a video surface examination assisting method, which comprises: obtaining corresponding audio and video data when an approver and a target person are in a preset audio and video detection area, and performing audio detection based on the audio and video data; if the audio of the approver exists in the audio and video data, recognizing multiple actions of the target person when the audio of the approver ends, wherein the multiple actions comprise a gaze action, a head action and a hand action; if the gaze action conforms to a preset gaze action, the head action conforms to a preset head action, or the hand action occludes the face of the target person, determining that the target person exhibits surface examination fraud, wherein the preset gaze action comprises a slow sideways glance, a fast sideways glance and gaze jitter, and the preset head action comprises fast head rotation, leftward head rotation and rightward head rotation; and generating reminding information corresponding to the surface examination fraud, and sending the reminding information to a surface examination reminding terminal.
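The decision step of the first aspect can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the three recognition results are taken as already computed booleans, and the names `ActionResults`, `triggered_signals` and `build_reminder`, as well as the wording of the reminder text, are hypothetical. Returning no reminder when the approver's audio is absent is also an assumption; the text above only describes the case where that audio exists.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ActionResults:
    """Outcomes of the three recognitions run after the approver's audio ends."""
    gaze_matches_preset: bool   # slow sideways glance, fast sideways glance or jitter
    head_matches_preset: bool   # fast, leftward or rightward head rotation
    hand_occludes_face: bool    # hand position frame overlaps the face enough


def triggered_signals(results: ActionResults) -> List[str]:
    """List which of the three conditions fired, in a fixed order."""
    signals = []
    if results.gaze_matches_preset:
        signals.append("gaze action")
    if results.head_matches_preset:
        signals.append("head action")
    if results.hand_occludes_face:
        signals.append("hand occlusion")
    return signals


def build_reminder(approver_audio_present: bool,
                   results: ActionResults) -> Optional[str]:
    """Return reminder text if any one condition holds (logical OR), else None."""
    if not approver_audio_present:
        # Assumption: without approver audio, no recognition is triggered.
        return None
    signals = triggered_signals(results)
    if not signals:
        return None
    return "Possible surface examination fraud: " + ", ".join(signals)
```

A single satisfied condition is enough to produce a reminder, which mirrors the "or" in the determination step; the returned text stands in for the reminding information sent to the surface examination reminding terminal.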
In a possible implementation manner, the recognizing multiple actions of the target person when the audio of the approver ends, if the audio of the approver exists in the audio and video data, wherein the multiple actions comprise a gaze action, a head action and a hand action, comprises: if the audio of the approver exists in the audio and video data, acquiring a face video of the target person when the audio of the approver ends; performing gaze action recognition according to the face video to obtain a gaze action recognition result; performing head action recognition according to the face video to obtain a head action recognition result; and performing hand action recognition according to the face video to obtain a hand action recognition result.
In a possible implementation manner, the performing gaze action recognition according to the face video to obtain a gaze action recognition result comprises: mapping the gaze landing-point angle value of the target person in each frame of the face video onto a plane rectangular coordinate system, and connecting the gaze coordinate points corresponding to the gaze landing-point angle values in the frames to generate gaze action line segments of the target person, wherein the abscissa of each gaze coordinate point indicates the video frame and the ordinate indicates the gaze landing-point angle value; invoking a preset gaze-point detection model to perform template matching on the gaze action line segments; if the matching distance between any one of the gaze action line segments and a preset gaze action curve template is greater than or equal to a preset gaze action matching distance, determining that the gaze action of the target person conforms to the preset gaze action, wherein the preset gaze action comprises a slow sideways glance, a fast sideways glance and gaze jitter; and if the matching distance between each of the gaze action line segments and the preset gaze action curve template is smaller than the preset gaze action matching distance, determining that the gaze action of the target person does not conform to the preset gaze action.
In a possible implementation manner, the head motion recognition is performed according to the face