CN-121999435-A - Robot following sensing method based on fine-grained pedestrian re-recognition and related device
Abstract
The invention discloses a robot following sensing method based on fine-grained pedestrian re-recognition, and a related device. The method comprises: acquiring a video and extracting the pixel mask area of each pedestrian in each frame of image to obtain pedestrian masks; assigning a unique tracking ID to each segmented pedestrian mask across consecutive frames of the video; inputting each pedestrian mask that has been assigned a tracking ID into a trained fine-grained pedestrian re-recognition model, wherein the model comprises a feature extraction network and two branches with shared parameters, namely a global branch and a local branch; extracting feature vectors through the trained model; comparing the feature vectors with pedestrian feature vectors stored in advance in a master feature library; and returning the corresponding tracking ID as the retrieval result based on the comparison result. The invention enables more accurate pedestrian recognition and following.
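The comparison step at the end of the abstract can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes cosine similarity as the metric (the abstract names none; claim 7 does), and the function names, the dictionary layout of `track_features`, and the sample vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_tracking_id(track_features, master_library):
    """Return (tracking_id, score) for the tracked pedestrian that best matches the library.

    track_features: dict mapping tracking ID -> feature vector (hypothetical layout).
    master_library: list of feature vectors stored in advance for the person to follow.
    Returns (None, 0.0) when there is nothing to compare.
    """
    best_id, best_score = None, 0.0
    for tracking_id, feature in track_features.items():
        for stored in master_library:
            # Compare one by one and keep the highest similarity seen so far.
            score = cosine_similarity(feature, stored)
            if best_id is None or score > best_score:
                best_id, best_score = tracking_id, score
    return best_id, best_score


# Made-up 2-D vectors; real re-recognition features would be much longer.
tracks = {7: [1.0, 0.0], 9: [0.6, 0.8]}
library = [[0.0, 1.0], [0.5, 0.9]]
best_id, best_score = retrieve_tracking_id(tracks, library)
print(best_id)  # 9
```

A deployed system would probably also need a minimum-similarity threshold so that no ID is returned while the target is out of view; the text here does not specify one.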
Inventors
- Bian Kaikai
Assignees
- 苏州深庭纪智能科技有限公司
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-01-27
Claims (10)
- 1. A robot following sensing method based on fine-grained pedestrian re-recognition, characterized by comprising the following steps: acquiring a video, and extracting the pixel mask area of each pedestrian in each frame of image to obtain a pedestrian mask; assigning a unique tracking ID to each segmented pedestrian mask across consecutive frames of the video, and inputting each pedestrian mask that has been assigned a tracking ID into a trained fine-grained pedestrian re-recognition model, wherein the fine-grained pedestrian re-recognition model comprises a feature extraction network and two branches with shared parameters, namely a global branch and a local branch; and extracting feature vectors through the trained fine-grained pedestrian re-recognition model, comparing the feature vectors with pedestrian feature vectors stored in advance in a master feature library, and returning the corresponding tracking ID as the retrieval result based on the comparison result.
- 2. The robot following sensing method based on fine-grained pedestrian re-recognition according to claim 1, wherein extracting the pixel mask area of each pedestrian in each frame of image to obtain the pedestrian mask comprises the following steps: identifying each pedestrian in each frame of image using a YOLOv network, wherein the YOLOv network comprises a backbone network, a neck network and a head network; extracting multi-scale feature maps through the backbone network; performing up-sampling and down-sampling operations on the multi-scale feature maps through the neck network to obtain fused features; predicting on the fused features through a detection head in the head network to generate the coordinates, classification probability and target confidence of each target bounding box; and segmenting the pedestrian from the background within the target bounding box based on the coordinates, the classification probability and the target confidence, to obtain a pixel mask area that retains only the pedestrian.
- 3. The robot following sensing method based on fine-grained pedestrian re-recognition according to claim 1, wherein the ByteTrack multi-object tracking algorithm is used to assign a unique tracking ID to each segmented pedestrian mask across consecutive frames of the video.
- 4. The robot following sensing method based on fine-grained pedestrian re-recognition according to claim 1, wherein the training process of the fine-grained pedestrian re-recognition model comprises the following steps: extracting features of the pedestrian mask through the feature extraction network to obtain a global feature map; inputting the global feature map into the global branch and the local branch respectively; in the global branch, inputting the global feature map into a fully connected layer for classification prediction, and calculating a global classification loss; in the local branch, aggregating the global feature map along the channel dimension to obtain a global activation map, sliding anchor boxes of different sizes over the global activation map to generate a plurality of sliding windows, calculating the average activation value of each sliding window, and performing sorting and non-maximum suppression to obtain N local windows; inputting the pixel regions of the pedestrian mask corresponding to the N local windows into the feature extraction network to extract features of the N local windows, inputting these features into the fully connected layer for classification prediction, and calculating the corresponding local classification losses; and summing the global classification loss and the N local classification losses to obtain a total loss, and updating the model parameters by back-propagation until the model converges.
- 5. The robot following sensing method based on fine-grained pedestrian re-recognition according to claim 4, wherein the feature extraction network is a ResNet-50 backbone network.
- 6. The robot following sensing method based on fine-grained pedestrian re-recognition according to claim 4, wherein the global classification loss and the local classification loss both employ cross-entropy loss.
- 7. The robot following sensing method based on fine-grained pedestrian re-recognition according to claim 1, wherein comparing the feature vectors with the pedestrian feature vectors stored in advance in the master feature library and returning the corresponding tracking ID as the retrieval result based on the comparison result comprises the following steps: calculating, one by one, the cosine similarity between the feature vector and each pedestrian feature vector stored in advance in the master feature library; and returning the tracking ID corresponding to the highest cosine similarity.
- 8. A robot following sensing device based on fine-grained pedestrian re-recognition, characterized by comprising: a pedestrian segmentation module, configured to acquire a video and extract the pixel mask area of each pedestrian in each frame of image to obtain a pedestrian mask; a target tracking module, configured to assign a unique tracking ID to each segmented pedestrian mask across consecutive frames of the video, and to input each pedestrian mask that has been assigned a tracking ID into a trained fine-grained pedestrian re-recognition model, wherein the fine-grained pedestrian re-recognition model comprises a feature extraction network and two branches with shared parameters, namely a global branch and a local branch; and a re-recognition module, configured to extract feature vectors through the trained fine-grained pedestrian re-recognition model, compare the feature vectors with pedestrian feature vectors stored in advance in the master feature library, and return the corresponding tracking ID as the retrieval result based on the comparison result.
- 9. A robot, characterized by comprising: a body; and a control system in communication with the body, the control system comprising a memory for storing a computer program and a processor for implementing the method of any one of claims 1 to 7 when the computer program is executed.
- 10. A readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, implements the method according to any of claims 1 to 7.
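The local-window selection recited in claim 4 (channel aggregation, sliding anchor windows, average activation, sorting, non-maximum suppression) can be illustrated with the sketch below. It is an illustration under stated assumptions, not the claimed implementation: the function names, the anchor size, the IoU threshold of 0.25, and the toy feature map are all hypothetical, since the claims give no concrete values. In practice the feature map would come from the feature extraction network (ResNet-50 per claim 5).

```python
from itertools import product


def activation_map(feature_map):
    """Sum a C x H x W feature map (nested lists) over channels into an H x W map."""
    height, width = len(feature_map[0]), len(feature_map[0][0])
    return [[sum(channel[y][x] for channel in feature_map) for x in range(width)]
            for y in range(height)]


def iou(a, b):
    """Intersection over union of two (y0, x0, y1, x1) boxes with exclusive ends."""
    inter_h = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def select_local_windows(act, anchor_sizes, n_windows, iou_threshold=0.25):
    """Pick up to n_windows high-activation windows, suppressing heavy overlaps."""
    height, width = len(act), len(act[0])
    scored = []
    for anchor_h, anchor_w in anchor_sizes:
        # Slide each anchor over every valid top-left position.
        for y, x in product(range(height - anchor_h + 1), range(width - anchor_w + 1)):
            total = sum(act[r][c] for r in range(y, y + anchor_h)
                        for c in range(x, x + anchor_w))
            scored.append((total / (anchor_h * anchor_w), (y, x, y + anchor_h, x + anchor_w)))
    # Sort by average activation, highest first.
    scored.sort(key=lambda item: item[0], reverse=True)
    kept = []
    for _, box in scored:
        # Non-maximum suppression: skip boxes overlapping an already kept box too much.
        if all(iou(box, other) <= iou_threshold for other in kept):
            kept.append(box)
            if len(kept) == n_windows:
                break
    return kept


# Toy 2-channel 4 x 4 feature map with activity in two opposite corners.
features = [
    [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 5]],
    [[2, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 4]],
]
act = activation_map(features)
windows = select_local_windows(act, anchor_sizes=[(2, 2)], n_windows=2)
print(windows)  # [(2, 2, 4, 4), (0, 0, 2, 2)]
```

Each returned box is given in activation-map coordinates; mapping it back to the pixel region of the pedestrian mask would require scaling by the stride of the feature extraction network, which is left out here.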
Description
Robot following sensing method based on fine-grained pedestrian re-recognition and related device
Technical Field
The invention relates to the technical field of robots, in particular to a robot following sensing method based on fine-grained pedestrian re-recognition and a related device.
Background
With the rapid development of intelligent mobile platforms such as service robots, logistics robots and personal companion robots, stable autonomous following of a specific person has become a key requirement. The core challenge of this task is that the robot must continuously and accurately perceive and track a specific following target in a complex dynamic environment, overcoming disturbances such as occlusion, illumination change, changes in target appearance and viewpoint switching.
Traditional robot following schemes rely mainly on several types of technology, all of which have obvious limitations:
Color/feature tracking methods, such as mean shift and correlation filtering, track the target using color histograms or hand-crafted features (e.g., HOG, SIFT). Such methods are computationally simple but extremely sensitive to target occlusion, rapid movement and appearance changes, and once the target is lost it is difficult to re-identify.
Beacon-based methods require the target person to wear or carry an RFID tag, a UWB tag, a marker of a specific color, a two-dimensional code or the like. These methods are stable, but they greatly sacrifice the naturalness and convenience of the user experience and are not generally applicable.
Detection-and-tracking methods continuously detect the "person" class with a deep learning object detector (e.g., YOLO, SSD) and perform data association with a tracker (e.g., Kalman filtering, SORT/DeepSORT).
However, in crowded scenes or under long-term occlusion, this approach cannot distinguish between individual pedestrians and is highly prone to target identity switches, that is, the robot may mistakenly follow someone else.
To overcome the above limitations, researchers have in recent years introduced pedestrian re-recognition into robot following systems, creating a more advanced perception paradigm. Pedestrian re-recognition aims to match pedestrians across cameras and across time intervals, that is, to judge whether pedestrians in different images or video sequences are the same person. Its technical advantages are as follows:
Strong discriminative ability: the model learns fine-grained visual characteristics (such as clothing and body shape) that distinguish different pedestrians, rather than merely identifying the category "person".
Robustness to occlusion and loss: when the target temporarily leaves the field of view or reappears after being occluded, the pedestrian re-recognition model can re-recognize the target from its appearance features, so that the robot resumes tracking rather than permanently losing the target.
No special marker required: pedestrians are followed unobtrusively in their natural state, which gives a better user experience.
However, existing pedestrian re-recognition schemes still face challenges:
Adaptability to drastic appearance changes: when the target pedestrian changes clothing, carries large items, or the lighting conditions change drastically, a pedestrian re-recognition model based on appearance features may fail.
Feature confusion and similar-looking distractors: mismatches occur easily in scenes containing pedestrians of similar appearance.
Poor recognition of distant small targets: because of low image resolution, a lack of feature information, and severe occlusion and pose changes, recognition performance drops markedly for distant small targets, which is one of the core challenges faced by robot following algorithms based on pedestrian re-recognition.
Therefore, there is a strong need in the art for an efficient, robust autonomous following algorithm that can run stably on a resource-constrained robotic platform and cope with the challenges of complex real-world scenes.
Disclosure of Invention
To solve the above technical problems, the invention provides a robot following sensing method based on fine-grained pedestrian re-recognition and a related device, which further improve the performance and practicality of the robot following system by optimizing pedestrian re-recognition feature learning so that the learned features include not only the overall features of a pedestrian but also discriminative local features.
To achieve the above purpose, the technical scheme of the invention is as follows: the robot following sensing method based on fine-grained pedestrian re-recognition comprises the following steps: