CN-116740804-B - Detection method and device for photographing behaviors, electronic equipment and storage medium
Abstract
The invention discloses a method, a device, electronic equipment and a storage medium for detecting candid-photographing behavior. First, a pedestrian image of a target area is collected; then the arm pose of each pedestrian in the image is recognized by machine recognition, and a candid-photographing suspect is determined from the arm poses. Next, the type of object held by the suspect and the dressing type of the pedestrian in the corresponding photographed image area are identified, and from these it is finally judged whether the suspect is engaged in candid photographing. Once the behavior is confirmed, the candid-photograph image can be cropped and sent to a monitoring terminal, enabling evidence retention and secondary review of the behavior.
Inventors
- HUANG JINYE
- CHEN LEI
Assignees
- 深圳市旗扬特种装备技术工程有限公司 (Shenzhen Qiyang Special Equipment Technology Engineering Co., Ltd.)
Dates
- Publication Date
- 2026-05-08
- Application Date
- 2023-05-06
Claims (9)
- 1. A method for detecting candid-photographing behavior, comprising the following steps: acquiring a pedestrian image of a target area, wherein the pedestrian image contains at least two pedestrians; extracting an image area for each pedestrian from the pedestrian image, and performing pose recognition on the pedestrian in each image area to obtain each pedestrian's arm pose; determining a candid-photographing suspect from among the pedestrians based on their arm poses; performing target detection on the image area corresponding to the suspect to obtain the type of object held by the suspect, and judging whether that held-object type belongs to a preset set of types; if so, determining the photographed image area from a target image area based on the suspect's image area, wherein the target image area is what remains of an image set after the suspect's image area is deleted, the image set containing the image areas of all pedestrians; performing dressing recognition on the pedestrian in the photographed image area to obtain the dressing type of the photographed subject, and judging whether that dressing type belongs to a preset set of dressing types; if so, cropping a candid-photograph image from the pedestrian image, the candid-photograph image containing both the suspect's image area and the photographed image area; and sending the candid-photograph image to a monitoring terminal, completing detection of candid-photographing behavior among the pedestrians in the pedestrian image. Performing pose recognition on the pedestrian in each image area to obtain each pedestrian's arm pose comprises: inputting any one image area into a pose-recognition model for pose pre-recognition to obtain a joint score map and a joint-neighbor score map of the target pedestrian, the target pedestrian being the pedestrian in that image area, wherein every pixel of the joint score map corresponds to 14 joint-class probabilities, the joint classes corresponding to those 14 probabilities are the same for every pixel, and the joint-neighbor score map characterizes the bias between any one of the 14 joint classes contained in the joint score map and the remaining 13 joint classes; inputting the image area into a body-part-recognition model to perform body-part recognition and obtain a body-part score map of the target pedestrian, the body-part score map comprising six body-part labels and one background label; determining a candidate joint set of the target pedestrian based on the joint score map; calculating a joint label value for each candidate joint in the set from the joint score map, the joint-neighbor score map and the body-part score map, each joint label value corresponding to a human joint; performing pose configuration on the joint label values of all candidate joints to obtain several pose-configuration sequences, each containing at least three joint label values; and determining, from among the several pose-configuration sequences, the sequence closest to the target pedestrian's arm pose, so as to determine the target pedestrian's arm pose from that closest sequence.
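The screening logic of claim 1 can be sketched as follows. This is a minimal illustration only: the per-pedestrian attributes (arm-pose flag, held-object type, dressing type) are assumed to come from the upstream recognition models, and all names here are hypothetical, not taken from the patent.

```python
def detect_candid_behavior(pedestrians, preset_object_types, preset_dressing_types):
    """Sketch of claim 1's screening steps. Each pedestrian is a dict with
    hypothetical keys filled in by upstream models:
      'arm_pose_suspicious' (bool), 'held_object' (str), 'dressing' (str).
    Returns alert records pairing each confirmed suspect with the remaining
    pedestrians whose dressing type matches a preset dressing type."""
    alerts = []
    for suspect in pedestrians:
        # step 1: the arm pose marks a provisional suspect
        if not suspect["arm_pose_suspicious"]:
            continue
        # step 2: the held-object type must belong to a preset type
        if suspect["held_object"] not in preset_object_types:
            continue
        # step 3: the remaining pedestrians (the image set minus the
        # suspect's image area) are checked for a preset dressing type
        others = [p for p in pedestrians if p is not suspect]
        subjects = [p for p in others if p["dressing"] in preset_dressing_types]
        if subjects:
            alerts.append({"suspect": suspect, "subjects": subjects})
    return alerts
```

In this sketch, only a suspect that passes all three checks produces an alert record, mirroring the claim's "if so" chain.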
- 2. The method of claim 1, wherein determining the candidate joint set of the target pedestrian based on the joint score map comprises: for the k-th pixel in the joint score map, sorting the 14 joint-class probabilities corresponding to the k-th pixel in descending order, and taking the joint classes corresponding to the first 6 probabilities of the sorted sequence as the candidate joints of the k-th pixel; incrementing k by 1 and again sorting the 14 joint-class probabilities corresponding to the k-th pixel in descending order, until k equals m, so as to obtain the candidate joints of every pixel, wherein the initial value of k is 1 and m is the total number of pixels in the joint score map; and forming the candidate joint set of the target pedestrian from the candidate joints of all pixels.
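The per-pixel selection in claim 2 (keep the 6 highest of the 14 joint-class probabilities at every pixel) can be sketched with NumPy; the (H, W, 14) array layout is an assumption, not specified by the patent.

```python
import numpy as np

def candidate_joints(joint_score_map, top_k=6):
    """For each pixel of the joint score map, sort its 14 joint-class
    probabilities in descending order and keep the top_k classes as that
    pixel's candidate joints (claim 2). Layout (H, W, 14) is assumed."""
    h, w, c = joint_score_map.shape
    flat = joint_score_map.reshape(-1, c)                  # one row per pixel
    order = np.argsort(flat, axis=1)[:, ::-1][:, :top_k]   # descending top-k
    return order.reshape(h, w, top_k)                      # joint-class indices
```

Sorting every pixel vector at once replaces the claim's explicit k-loop, but produces the same candidate set.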
- 3. The method of claim 1, wherein each candidate joint in the candidate joint set corresponds to a joint feature value, and wherein calculating a joint label value for each candidate joint in the set from the joint score map, the joint-neighbor score map and the body-part score map comprises: for the i-th candidate joint in the set, calculating its joint label value from the joint score map, the joint-neighbor score map and the body-part score map according to formula (1), together with auxiliary formulas (2) to (4) (the formulas appear as images in the original publication). In formula (1), the joint label value of the i-th candidate joint is expressed through a unary term in the joint feature value of the i-th candidate joint and a binary term that sums, over the edge set connecting the i-th candidate joint to each j-th candidate joint in the set, contributions weighted by the probability that the i-th and j-th candidate joints belong to the same human body, n being the total number of candidate joints in the set. In formula (2), the unary term is given by the joint-class probability of the i-th candidate joint in the joint score map. In formulas (3) and (4), the binary term is built from the joint-neighbor score map and the body-part score map through an intermediate parameter and a joint weight, using a logistic-regression operation applied to the feature vector between the i-th and j-th candidate joints. Then incrementing i by 1 and again calculating the joint label value of the i-th candidate joint from the joint score map, the joint-neighbor score map, the body-part score map and formula (1), until i equals n, thereby obtaining the joint label value of every candidate joint, wherein the initial value of i is 1.
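Formulas (1) to (4) survive only as prose symbol descriptions, from which they read like a pairwise (CRF-style) labeling objective. The reconstruction below is therefore a hedged sketch: every symbol name is an assumption chosen to match the surrounding description, not the patent's actual notation.

```latex
% Hedged reconstruction -- the original formulas are images, so the
% exact forms below are assumptions consistent with the prose only.
E(x_i) = \varphi(x_i) + \sum_{(v_i, v_j) \in \mathcal{E}} \psi(x_i, x_j)\, c_{ij}
  \qquad \text{(1)}

\varphi(x_i) = S(x_i) \qquad \text{(2)}

\psi(x_i, x_j) = w^{\top} h_{ij} \qquad \text{(3)}

h_{ij} = \sigma\big(f_{ij}(L, B)\big) \qquad \text{(4)}
```

Here $E(x_i)$ stands for the joint label value of the i-th candidate joint, $x_i$ and $x_j$ for joint feature values, $\mathcal{E}$ for the edge set, $c_{ij}$ for the probability that the two joints belong to the same human body, $S$ for the joint score map, $L$ for the joint-neighbor score map, $B$ for the body-part score map, $h_{ij}$ for the intermediate parameter, $w$ for the joint weight, $\sigma$ for the logistic-regression operation, and $f_{ij}$ for the feature vector between the i-th and j-th candidate joints.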
- 4. The method according to claim 3, wherein the probability that the i-th and j-th candidate joints belong to the same human body is obtained by the following steps: acquiring a first direct vector pointing from the i-th candidate joint to the j-th candidate joint and a second direct vector pointing from the j-th candidate joint to the i-th candidate joint; calculating a first estimated candidate joint of the i-th candidate joint and a second estimated candidate joint of the j-th candidate joint based on the joint-neighbor score map; determining, from the i-th candidate joint and the first estimated candidate joint, a third direct vector pointing from the i-th candidate joint to the first estimated candidate joint, and determining, from the j-th candidate joint and the second estimated candidate joint, a fourth direct vector pointing from the j-th candidate joint to the second estimated candidate joint; and calculating, from the first, second, third and fourth direct vectors, the probability that the i-th and j-th candidate joints belong to the same human body.
- 5. The method of claim 4, wherein calculating the first estimated candidate joint of the i-th candidate joint based on the joint-neighbor score map comprises: calculating the first estimated candidate joint according to formula (5) (reproduced as an image in the original publication), in which the first estimated candidate joint is derived from the bias, in the joint-neighbor score map, between the joint class of the i-th candidate joint and the joint class of the j-th candidate joint; and, correspondingly, calculating the probability that the i-th and j-th candidate joints belong to the same human body from the first, second, third and fourth direct vectors comprises: calculating a feature association value between the i-th and j-th candidate joints according to formula (6) (likewise an image in the original), in which the feature association value is computed from the four direct vectors by means of a vector Euclidean-distance operator and a vector included-angle operator; and obtaining, from the feature association value between the i-th and j-th candidate joints, the probability that they belong to the same human body.
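Formula (6) combines the four direct vectors through a Euclidean-distance operator and an included-angle operator, but the exact combination is an image in the original. The sketch below is one assumed form: agreement between each direct vector and its estimated-neighbor counterpart (small distance, small angle) yields a high association value.

```python
import numpy as np

def feature_association(v1, v2, v3, v4):
    """Sketch of formula (6): score how well the direct vectors between the
    i-th and j-th candidate joints (v1, v2) agree with the vectors toward
    their estimated neighbor joints (v3, v4). The way the distance and
    angle terms are combined below is assumed, not taken verbatim."""
    def included_angle(a, b):
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        return np.arccos(np.clip(cos, -1.0, 1.0))   # radians
    dist = np.linalg.norm(v1 - v3) + np.linalg.norm(v2 - v4)
    ang = included_angle(v1, v3) + included_angle(v2, v4)
    # perfect agreement (zero distance, zero angle) -> association 1.0
    return 1.0 / (1.0 + dist + ang)
```

A value near 1.0 would then map to a high probability that the two candidate joints belong to the same human body.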
- 6. The method of claim 1, wherein determining, from among several pose-configuration sequences, the sequence closest to the target pedestrian's arm pose comprises: determining, from the several pose-configuration sequences, the human joints contained in each sequence; cropping, from the image area, a joint region corresponding to the human joints contained in each pose-configuration sequence; acquiring the center coordinates of the joint region corresponding to each pose-configuration sequence and the center coordinates of the image area; calculating, from those center coordinates, the distance between each joint region and the image area; and taking the pose-configuration sequence whose joint region has the smallest distance from the image area as the sequence closest to the target pedestrian's arm pose.
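Claim 6's final selection reduces to a nearest-center search. A minimal sketch, assuming the joint-region centers have already been computed from the cropped regions:

```python
import math

def closest_pose_sequence(sequence_centers, region_center):
    """Claim 6 sketch: among several pose-configuration sequences, return
    the id of the one whose joint-region center lies nearest the center of
    the pedestrian's image area. `sequence_centers` maps a sequence id to
    the (x, y) center of its cropped joint region (hypothetical layout)."""
    def dist(center):
        return math.hypot(center[0] - region_center[0],
                          center[1] - region_center[1])
    return min(sequence_centers, key=lambda sid: dist(sequence_centers[sid]))
```

The winning sequence is then taken as the target pedestrian's arm pose.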
- 7. A device for detecting candid-photographing behavior, configured to perform the detection method according to any one of claims 1-6, the device comprising: an acquisition unit, configured to acquire a pedestrian image of a target area, the pedestrian image containing at least two pedestrians; a pose-recognition unit, configured to extract an image area for each pedestrian from the pedestrian image and to perform pose recognition on the pedestrian in each image area to obtain each pedestrian's arm pose; a determination unit, configured to determine a candid-photographing suspect from among the pedestrians based on their arm poses; a target-detection unit, configured to perform target detection on the image area corresponding to the suspect, obtain the type of object held by the suspect, and judge whether that held-object type belongs to a preset set of types; an image-extraction unit, configured to determine, when the held-object type is judged to belong to a preset type, the photographed image area from a target image area based on the suspect's image area, wherein the target image area is what remains of an image set after the suspect's image area is deleted, the image set containing the image areas of all pedestrians; a dressing-recognition unit, configured to perform dressing recognition on the pedestrian in the photographed image area, obtain the dressing type of the photographed subject, and judge whether that dressing type belongs to a preset set of dressing types; the image-extraction unit being further configured to crop, when the dressing type is judged to belong to a preset dressing type, a candid-photograph image from the pedestrian image, the candid-photograph image containing both the suspect's image area and the photographed image area; and a sending unit, configured to send the candid-photograph image to a monitoring terminal, completing detection of candid-photographing behavior among the pedestrians in the pedestrian image.
- 8. An electronic device comprising a memory, a processor and a transceiver communicatively connected in sequence, wherein the memory is configured to store a computer program, the transceiver is configured to send and receive messages, and the processor is configured to read the computer program and execute the method for detecting candid-photographing behavior according to any one of claims 1-6.
- 9. A storage medium having instructions stored thereon which, when run on a computer, cause the computer to perform the method for detecting candid-photographing behavior according to any one of claims 1-6.
Description
Detection method and device for photographing behaviors, electronic equipment and storage medium
Technical Field
The invention belongs to the technical field of image recognition, and in particular relates to a method and a device for detecting candid-photographing behavior, electronic equipment and a storage medium.
Background
Electronics have developed rapidly in recent years, and the automation of all kinds of products has made modern life more comfortable and convenient. Yet while electronic devices bring convenience, safeguarding personal privacy has become ever more difficult. In particular, the widespread use of miniature cameras and smartphones has greatly increased intrusions into private life: on crowded public transport such as buses and subways, offenders exploit the ultra-high-definition cameras of mobile phones to photograph others covertly, violating their privacy. When candid material is uploaded to the internet it causes irreversible psychological and emotional harm to the victims and seriously affects their life and work, so candid photographing on public transport must be prevented. At present, detection of candid photographing on public transport relies mostly on manual inspection and on the precautions of the persons concerned, which is time-consuming and labor-intensive, cannot detect the behavior effectively, and fails to deter offenders. Against this background, how to provide a detection method that detects candid photographing both effectively and efficiently has become an urgent problem.
Disclosure of Invention
The invention aims to provide a method, a device, electronic equipment and a storage medium for detecting candid-photographing behavior, so as to solve the problems of the prior art that manual inspection is time-consuming and labor-intensive and cannot detect candid photographing effectively. To this end, the invention adopts the following technical scheme. In a first aspect, a method for detecting candid-photographing behavior is provided, comprising: acquiring a pedestrian image of a target area, the pedestrian image containing at least two pedestrians; extracting an image area for each pedestrian from the pedestrian image, and performing pose recognition on the pedestrian in each image area to obtain each pedestrian's arm pose; determining a candid-photographing suspect from among the pedestrians based on their arm poses; performing target detection on the image area corresponding to the suspect to obtain the type of object held by the suspect, and judging whether that held-object type belongs to a preset set of types; if so, determining the photographed image area from a target image area based on the suspect's image area, wherein the target image area is what remains of an image set after the suspect's image area is deleted, the image set containing the image areas of all pedestrians; performing dressing recognition on the pedestrian in the photographed image area to obtain the dressing type of the photographed subject, and judging whether that dressing type belongs to a preset set of dressing types; if so, cropping a candid-photograph image from the pedestrian image, the candid-photograph image containing both the suspect's image area and the photographed image area; and sending the candid-photograph image to a monitoring terminal, completing detection of candid-photographing behavior among the pedestrians in the pedestrian image.
Based on the above disclosure, the method first collects a pedestrian image of the target area, segments each pedestrian in the image to obtain per-pedestrian image areas, and performs pose recognition on each area to obtain each pedestrian's arm pose. From the arm poses a candid-photographing suspect is provisionally determined. Target detection is then performed on the suspect's image area to recognize whether the type of the object held by the suspect is a preset type, such as a mobile phone or a handheld bag, and from the type of the handheld object it is judged whether the next step of candid-photographing detection can be carried out