CN-116188538-B - Behavior track tracking method for multiple cameras
Abstract
The invention relates to a behavior track tracking method of multiple cameras, belonging to the technical field of computer vision, and comprising the following steps: S1, designating multiple cameras in a public service hall, respectively collecting the coordinate information of the service areas under the cameras, and binding the coordinate information with the cameras; S2, tracking the customer track by combining a target detection model and a pedestrian re-identification model, and establishing a customer activity track library of the multiple cameras; S3, calculating the IoU and a depth map of the service areas under the multiple cameras and the customer target detection frame to confirm whether the customer reaches a service area; S4, identifying the behaviors from the customer entering the service area to the customer leaving the service area; S5, constructing a customer behavior track library under the multiple cameras, aligning the behaviors through time information, and screening the behaviors to obtain the behavior track of the customer. By tracking customers to obtain more reliable customer behavior tracks under a plurality of cameras, the method can be used to assist in formulating and dividing the business flow of the public service hall.
Inventors
- LIU XIN
- WANG XINYI
- QIAN YING
- WAN BANGRUI
- CHEN FENG
- LIANG JINZHOU
- CHEN XUE
- KE LILING
Assignees
- Chongqing University of Posts and Telecommunications (重庆邮电大学)
Dates
- Publication Date: 20260508
- Application Date: 20221122
Claims (4)
- 1. A behavior track tracking method of multiple cameras, characterized by comprising the following steps:
S1, calibrating a plurality of cameras, respectively acquiring the coordinate information of the service areas under the cameras, and binding the coordinate information with the cameras;
S2, tracking a customer track by combining a target detection model and a pedestrian re-identification model, and establishing a customer activity track library of a plurality of cameras, wherein step S2 specifically comprises the following steps:
S21, using the target detection model to locate the customers under the plurality of cameras: extracting frames from the pictures of the cameras in the service hall at a certain frame rate, sequentially running the trained target detection model over the monitoring videos of the plurality of cameras, and sampling at a designated step length to obtain the frame pictures {f_1,…,f_i,…,f_N} under a camera and the corresponding time information {t_1,…,t_i,…,t_N}, wherein t_i records the time of the sampled frame f_i and N represents the total frame number; then using the target detection model to detect the customers in video frame f_i and obtain the coordinate information of each target detection frame (x_{i,j,1}, y_{i,j,1}), (x_{i,j,2}, y_{i,j,2}), wherein j represents the j-th customer, subscript 1 represents the upper-left corner coordinate and subscript 2 the lower-right corner coordinate; the set of the location information of all customers detected in the frame f_i at time t_i is expressed as P_i = {p_{i,1},…,p_{i,j},…,p_{i,n_i}}, wherein p_{i,j} = ((x_{i,j,1}, y_{i,j,1}), (x_{i,j,2}, y_{i,j,2})) represents the coordinate information of the j-th customer in the i-th frame and there are n_i customers in the frame in total; after obtaining the coordinate information of the customers, it is bound with the frame and time information, expressed as (f_i, t_i, P_i);
S22, establishing a customer id set, and storing the frame sequences of the same customer according to it to form an activity track set: according to the customer location information p_{i,j} stored in each frame image f_i in S21, the customer is cropped out and input into the pedestrian re-identification model PCB, the id value corresponding to the customer at each position is identified and stored in the customer id set; if the customer does not exist in the customer id set and does not belong to a staff id, the customer is a newly appearing customer and is given a new id which is stored in the customer id set; if the customer id set already contains the information of the customer, the id is extracted as the identification for track tracking; the frame images containing the same customer id, in the time order stored in S21, are saved as the customer's activity track, expressed as T_id = {F_1,…,F_m,…,F_M}, wherein m ∈ {1,…,M}, T_id represents the activity track of the customer whose id value is id, and F_m represents the m-th frame picture carrying the customer's coordinate information and time information; traversing all customers under camera C_k to acquire the different activity tracks, and storing them as the customer activity track set T = {T_1,…,T_id,…,T_num}, wherein id ∈ {1,…,num} and there are num customers in total;
S3, calculating the intersection-over-union IoU of the service areas under the cameras and the customer target detection frame, and when IoU is larger than a certain threshold, determining with a depth map detection model whether the customer reaches the service area, wherein step S3 specifically comprises the following steps:
S31, taking out from the activity track set the activity track sequence T_id of customer id under camera C_k, and sequentially calculating the IoU of the customer target detection frame A and the service area target detection frame B; the coordinate information of the customer and the service area is taken from steps S21 and S12, respectively (x_{A,1}, y_{A,1}), (x_{A,2}, y_{A,2}) and (x_{B,1}, y_{B,1}), (x_{B,2}, y_{B,2}); S_A = (x_{A,2} - x_{A,1}) · (y_{A,2} - y_{A,1}), S_B = (x_{B,2} - x_{B,1}) · (y_{B,2} - y_{B,1}); A ∩ B forms a rectangle C with width w = min{x_{A,2}, x_{B,2}} - max{x_{A,1}, x_{B,1}} and height h = min{y_{A,2}, y_{B,2}} - max{y_{A,1}, y_{B,1}}, then S_{A∩B} = w · h, and finally IoU = S_{A∩B} / (S_A + S_B - S_{A∩B}); if IoU is greater than a threshold α, the customer target frame overlaps the service area target detection frame, the depth map detection model is started, the video frames overlapping the service area target detection frame are screened out from the customer activity track, and the video frame sequence is stored as V;
S32, sending the video frames V in which the customer target detection frame and the service area target detection frame obtained in step S31 overlap into the depth map detection model MEGADEPTH for detection, obtaining the pixel value of each pixel point on the customer and the service area, wherein the pixel value represents the depth of the point from the camera; the depth pixel set in the customer target detection frame is {d_1,…,d_a,…}, wherein d_a represents the pixel value of the a-th pixel point of the customer, and the depth pixel set in the service area target detection frame is {e_1,…,e_b,…}, wherein e_b represents the pixel value of the b-th pixel point of the service area; the data are processed with the median absolute deviation MAD, finally obtaining the optimized customer pixel value set {d'_1,…} and service area pixel value set {e'_1,…};
S33, respectively averaging the customer and service area depth pixel value sets optimized in step S32 and subtracting them to obtain the depth matching value m_c; if the depth matching value m_c is less than a certain threshold K, the customer has entered the service area;
S34, deleting from the activity track set the video frames whose depth matching value m_c is greater than K, which represent that the customer does not stay in the service area in those frames; the optimized activity track set is expressed as T'_id = {F'_1,…,F'_m,…}; when the target detection frame of the customer has no intersection with the service area and the depth matching value m_c is larger than K, the customer is not near the service area or has left the service area;
S35, repeating steps S31-S34, continuing to track the customer, and after the customer enters the next service area, storing the tracking information into the activity track set of the customer;
S4, identifying with a behavior identification model the behaviors from the customer entering the service area to the customer leaving the service area, wherein step S4 specifically comprises the following steps:
S41, training the behavior recognition model: extracting the coordinate information of the activity track sequence of the customer in each service area, marking the customer behavior frame by frame according to the coordinate information, extracting during training the 63 frames following a marked frame, inputting the 64 frames as one video clip into the behavior recognition model for training, the same 64-frame activity track sequence being fed into two network branches at different sampling frequencies;
S42, establishing the behavior set of camera C_k: detecting the behaviors of the customer in the service area with the trained behavior recognition model SlowFast, calculating the residence time of the customer in the service area, and storing the obtained behaviors into the activity track set to form the behavior track set; the behavior of customer id in a certain service area is expressed as b ∈ B = {b_1,…,b_act}, wherein B represents all possible actions of the customer in the service lobby and act represents the total number of behavior categories; the frame sequence {F'_1,…} under the specific service area A_q obtained in step S34 is sampled at one frame per 64 frames so that each retained frame represents two seconds, the total number of retained frames is counted and multiplied by 2 and saved as the customer residence time st in the service area; adding the customer time information t, the behavior track under camera C_k is expressed as W_{id,k} = {(A_q, b, st_q, t),…}, wherein q represents the q-th of the Q service areas, b represents one of the possible actions of the customer, and st_q represents the residence time of the customer in service area A_q;
S43, traversing all cameras C_1,…,C_cam and repeating steps S41-S42 to obtain the customer behavior track sets under the plurality of cameras, thereby forming the customer behavior track library under the plurality of cameras;
S5, constructing the customer behavior track library under the plurality of cameras, deleting repeated behaviors through time information alignment, and, when different behaviors appear under different cameras at the same time, selecting the final behavior according to priority, so as to obtain the customer behavior track, wherein step S5 specifically comprises the following steps:
S51, establishing the connection between a certain customer's behavior tracks under the plurality of cameras: the behavior track of customer id under the single camera C_k is W_{id,k}; according to the customer id, the behavior track set of the same customer under the plurality of cameras is stored as the behavior track library, namely W_id = {W_{id,1},…,W_{id,k},…,W_{id,cam}}, wherein W_{id,k} indicates the behavior track of the customer under the k-th camera;
S52, aligning the customer behavior tracks of the cameras through time information: after acquiring in S22 the activity track of the same customer under the cameras, the information T_id of customer id under camera C_k is queried, the time at which the customer reaches the service area is obtained and added into the customer behavior track to obtain W'_{id,k}; the behavior tracks W'_{id,1},…,W'_{id,cam} of the customer are compared sequentially in time order to remove repeated behaviors, and when different cameras recognize different behaviors of a certain customer at the same time, the behavior with the highest priority is selected as the final behavior.
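The IoU test of step S31 and the depth-matching rule of step S33 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the box representation, the threshold names `alpha` and `K`, and the sample depth values are assumptions, and the real depths would come from the MEGADEPTH model.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as ((x1, y1), (x2, y2)) corner pairs."""
    (ax1, ay1), (ax2, ay2) = box_a
    (bx1, by1), (bx2, by2) = box_b
    w = min(ax2, bx2) - max(ax1, bx1)  # width of the intersection rectangle
    h = min(ay2, by2) - max(ay1, by1)  # height of the intersection rectangle
    if w <= 0 or h <= 0:               # boxes do not overlap
        return 0.0
    inter = w * h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)

def entered_service_area(customer_depths, area_depths, iou_value,
                         alpha=0.3, K=5.0):
    """Step S33 sketch: compare mean depths only when the boxes overlap.

    alpha and K are illustrative thresholds, not values from the patent.
    """
    if iou_value <= alpha:
        return False
    mean_c = sum(customer_depths) / len(customer_depths)
    mean_a = sum(area_depths) / len(area_depths)
    return abs(mean_c - mean_a) < K  # depth matching value m_c below K
```

Note that the intersection width and height use min of the right edges minus max of the left edges; if either is non-positive the boxes are disjoint and IoU is zero.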
- 2. The behavior track tracking method of multiple cameras according to claim 1, wherein step S1 specifically comprises the following steps: S11, acquiring the shooting information under the plurality of cameras in the public service hall, designating the corresponding cameras and marking them as {C_1,…,C_k,…,C_cam}, wherein k ∈ {1,…,cam}, C_k represents the k-th camera, and there are cam different cameras in total; S12, acquiring the coordinate information of the public service hall service areas under the plurality of cameras by using the target detection model: the collected image information is marked and sorted and then sent into the model for training to obtain a target detection model for identifying the service areas; according to the image information of different visual angles obtained by the different cameras {C_1,…,C_k,…,C_cam}, the coordinate information of the service areas under the plurality of cameras is obtained sequentially through the trained target detection model, the coordinate information comprising the upper-left and lower-right corner coordinates of the target detection frame, respectively expressed as (x_{q,1}, y_{q,1}), (x_{q,2}, y_{q,2}), wherein q represents the q-th service area, there are Q service areas in total, subscript 1 represents the upper-left corner and subscript 2 the lower-right corner; the coordinate information acquired under camera C_k is expressed as A_{k,q}: {(x_{q,1}, y_{q,1}), (x_{q,2}, y_{q,2})}.
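The camera-to-service-area binding of steps S11 and S12 amounts to a lookup from a camera identifier to the detection-box corners of the areas it sees. A minimal sketch (the camera names and box values here are illustrative assumptions):

```python
# Each camera C_k is bound to the detection boxes of the service areas it
# sees, stored as ((x_q1, y_q1), (x_q2, y_q2)) upper-left / lower-right pairs.
service_areas = {
    "C1": [((120, 80), (340, 260)), ((400, 90), (620, 270))],
    "C2": [((50, 100), (280, 300))],
}

def areas_for(camera):
    """Return the service-area boxes bound to a camera (empty if none)."""
    return service_areas.get(camera, [])
```

In practice the boxes would be produced once by the trained service-area detector and stored per camera, so that step S31 can iterate over `areas_for(C_k)` for every sampled frame.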
- 3. The behavior track tracking method of multiple cameras according to claim 1, wherein in step S32 the pixel values in the customer target detection frame are processed with the median absolute deviation MAD: first the median M of all pixel values is calculated, then the absolute deviation |d_a - M| of every element from the median is calculated, the median of these absolute deviations is taken as MAD, a parameter n is then determined, and all data are adjusted accordingly, retaining only the pixel values whose absolute deviation from the median does not exceed n · MAD; finally the optimized customer pixel value set {d'_1,…} and the service area pixel value set {e'_1,…} are obtained.
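The MAD screening of claim 3 can be sketched as follows; the cutoff `n` and the behavior when MAD is zero are assumptions, since the claim does not fix them:

```python
def mad_filter(values, n=3.0):
    """Median-absolute-deviation outlier filter (claim 3 / step S32 sketch).

    Keeps values whose absolute deviation from the median is at most
    n * MAD. The parameter n = 3.0 is an assumed default.
    """
    s = sorted(values)
    mid = len(s) // 2
    median = s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2
    deviations = [abs(v - median) for v in values]
    d = sorted(deviations)
    mad = d[mid] if len(d) % 2 else (d[mid - 1] + d[mid]) / 2
    if mad == 0:  # degenerate case: no spread, keep everything
        return list(values)
    return [v for v, dev in zip(values, deviations) if dev <= n * mad]
```

For example, a distant background pixel with depth 50 among customer depths near 10 would be dropped, which is the point of the filtering before the mean depths are compared in step S33.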
- 4. The behavior track tracking method of multiple cameras according to claim 1, wherein in step S52 the customer behaviors are prioritized, and when different cameras recognize different behaviors of a certain customer at the same time, the behavior with the highest priority is selected as the final behavior; the behavior track of the customer in the service hall is finally obtained as W_id = {(t, A_q, b, st),…}, wherein each element represents that customer id reaches service area A_q at time t, performs behavior b, and has a residence time of st.
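The time alignment and priority selection of step S52 and claim 4 can be sketched as a merge over per-camera records. The record layout `(time, area, behavior, stay)`, the behavior names, and the priority values are all assumptions for illustration; the patent only states that duplicates are removed by time and conflicts resolved by priority.

```python
# Assumed behavior priority: when two cameras report different behaviors for
# the same customer at the same time, the higher-priority one wins.
PRIORITY = {"transact": 3, "queue": 2, "walk": 1}

def merge_tracks(tracks_per_camera):
    """Merge per-camera behavior tracks into one track (step S52 sketch).

    Each record is (time, service_area, behavior, stay_seconds). Records
    sharing (time, area) are duplicates across cameras; keep one, resolving
    conflicting behaviors by PRIORITY, then order the result by time.
    """
    merged = {}
    for track in tracks_per_camera:
        for time, area, behavior, stay in track:
            key = (time, area)
            if key not in merged or PRIORITY[behavior] > PRIORITY[merged[key][2]]:
                merged[key] = (time, area, behavior, stay)
    return sorted(merged.values())  # tuples sort by time first
```

Keying on `(time, area)` implements the "delete repeated behaviors through time information alignment" step, and the `PRIORITY` comparison implements the final-behavior selection of claim 4.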
Description
Behavior track tracking method for multiple cameras
Technical Field
The invention belongs to the technical field of computer vision and relates to a behavior track tracking method of multiple cameras.
Background
The public service lobby receives a large number of customers each day, and some service transacting processes are complicated. At present there is no good way to identify the behavior track of a customer. By identifying the behavior track of the customer, staff can be helped to grasp the business handling process and the residence time of the customer, thereby optimizing business process formulation and service hall equipment placement. Combining the business information of the customer across a plurality of cameras makes it possible to identify the customer's whole business handling flow. At present, research means for behavior track recognition can be divided into two types: computer-vision-based means and sensor-based means that establish a dynamic model. Sensor-based methods require customers to wear the corresponding sensors, but a public service hall receives many customers each day, so sensor-based methods are costly and therefore limited in applicability. Computer-vision-based methods model and simulate the real scene and obtain relatively accurate identification results. They do not need the cooperation of customers, thereby saving time and money. A computer-vision-based method can combine the customer behavior tracks of a plurality of cameras, avoid erroneous recognition and occlusion, obtain more effective behavior tracks according to the service handling priority of customers, and be used for assisting in formulating and dividing the business processes of the public service hall.
Disclosure of Invention
Therefore, the invention aims to provide a behavior track tracking method with multiple cameras, which tracks the behavior track of a customer in a service hall and grasps the business handling process and the residence time of the customer, so as to optimize the business process and the equipment placement in the service hall. The movement track of the customer in the service hall is judged according to the position information of the customer and the key service areas. In key service areas, customer behavior is recorded. The behavior track information of different angles shot by the cameras is aligned and integrated through time information, finally forming the behavior track of the customer's business handling in the service hall. In order to achieve the above purpose, the present invention provides the following technical solutions: a behavior track tracking method of multiple cameras comprises the following steps: S1, calibrating a plurality of cameras, respectively acquiring the coordinate information of the service areas under the cameras, and binding the coordinate information with the cameras; S2, tracking a customer track by combining a target detection model and a pedestrian re-identification model, and establishing a customer activity track library of a plurality of cameras; S3, calculating the intersection-over-union IoU of the service areas under the cameras and the customer target detection frame, and when IoU is larger than a certain threshold, determining with a depth map detection model whether the customer reaches the service area; S4, identifying with a behavior identification model the behaviors from the customer entering the service area to the customer leaving the service area; S5, constructing a customer behavior track library under a plurality of cameras, deleting repeated behaviors through time information alignment, and, when different behaviors appear under different cameras at the same time, selecting the final behavior according to priority so as to obtain the behavior track of the customer. Further, step S1 specifically includes the following steps: S11, acquiring the shooting information under a plurality of cameras in a public service hall, designating the corresponding cameras and marking them as {C_1,…,C_k,…,C_cam}, wherein k ∈ {1,…,cam}, C_k represents the k-th camera, and there are cam different cameras in total. S12, acquiring the coordinate information of the service areas of the public service hall under the cameras by using the target detection model: the collected image information is marked and sorted and then sent into the model for training to obtain the target detection model for identifying the service areas. According to the image information of different visual angles obtained by the different cameras {C_1,…,C_k,…,C_cam}, the trained target detection model is applied sequentially to obtain the coordinate information of the service areas under the plurality of cameras, wherein the coor