CN-122024294-A - Face dynamic identification system and synchronization method based on edge calculation and cloud cooperation
Abstract
The invention relates to the technical field of dynamic face recognition, in particular to a dynamic face recognition system and synchronization method based on edge computing and cloud collaboration. The hardware adopts an edge-cloud layered architecture to realize reasonable distribution of computing resources and efficient interaction of data. The system comprises a front-end edge computing terminal and a back-end cloud processing system: the front-end edge computing terminal is subdivided into a multi-mode sensing module, an edge computing unit and a communication module, and the back-end cloud processing system comprises an encrypted comparison server cluster and a dynamically updated feature database. A closed-loop flow of "detection - preliminary screening - fine screening - tracking" is constructed through hierarchical processing by an edge computing layer and a cloud collaborative layer, the steps being divided into real-time preprocessing at the edge computing layer and deep processing at the cloud collaborative layer. The synchronization method is based on the back-end cloud processing system and comprises the encrypted comparison server cluster, the dynamically updated feature database and a track prediction module, wherein the encrypted comparison server cluster consists of three parallel servers whose output is the comparison result.
Inventors
- HUANG HONGJIN
Assignees
- 超感纪数字科技(东莞)有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20251231
Claims (11)
- 1. A dynamic face recognition tracking system based on edge computing and cloud collaboration, characterized in that the system hardware adopts an edge-cloud layered architecture to achieve reasonable distribution of computing resources and efficient interaction of data; the system comprises a front-end edge computing terminal (10) and a back-end cloud processing system (20); the front-end edge computing terminal (10) is subdivided into a multi-mode sensing module (11), an edge computing unit (12) and a communication module (13); the multi-mode sensing module (11) integrates a dual-mode imaging unit combining a visible-light camera and an infrared camera, supporting an effective working distance of 0.5-5 m (meeting the requirements of short-distance law-enforcement interrogation and medium-distance monitoring); the visible-light camera adopts a 1920×1080-resolution sensor supporting 30fps dynamic video acquisition, and the infrared camera is provided with an 850nm/940nm dual-band light-supplementing module with a light-supplementing distance of 3-5 m (conforming to the IEC 62471 photobiological safety standard), so that stable imaging is achieved in a wide-dynamic illumination environment; the edge computing unit (12) carries an embedded NPU chip (model: Ascend 310, computing power 4 TOPS) and integrates a lightweight AI processing module supporting dynamic loading and updating of the model; the unit is internally provided with a 512MB cache and can locally store the face feature templates (128-dimensional feature vectors) of 500-1000 high-risk personnel for fast edge-side comparison; the communication module (13) integrates a 4G/5G wireless communication module (supporting NSA/SA dual mode) and a Wi-Fi 6 transmission unit to realize encrypted data interaction between the edge end and the cloud end.
The communication protocol adopts the TLS 1.3 encryption standard, so that the security in the characteristic data transmission process is ensured.
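As an illustrative sketch (not part of the claims), the TLS 1.3 requirement can be enforced on the edge-side client with Python's standard `ssl` module:

```python
import ssl

def make_edge_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that refuses anything below TLS 1.3,
    matching the claim-1 requirement for the edge-to-cloud encrypted link."""
    ctx = ssl.create_default_context()            # verifies server certificates
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3  # reject TLS 1.2 and earlier
    return ctx
```

The context would then wrap the socket used to upload suspicious feature packets; certificate provisioning is deployment-specific and outside the scope of this sketch.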
- 2. The dynamic face recognition tracking system according to claim 1, wherein the back-end cloud processing system (20) comprises an encrypted comparison server cluster (21) and a dynamically updated feature database (22); the encrypted comparison server cluster (21) is composed of distributed computing nodes, each node being configured with an NVIDIA GPU (model: A100) and supporting parallel comparison over millions of face features (single-node processing capacity ≥ 100,000 comparisons per second). The cluster adopts a load-balancing algorithm to dynamically allocate computing tasks and ensure response efficiency under high-concurrency scenarios; the dynamically updated feature database (22) stores the full face feature data (supporting ≥ 10 million feature items) and has automatic de-duplication and incremental update functions. The database update mechanism comprises version-number verification and bidirectional authentication, ensuring the integrity and legality of the feature data, with an update delay ≤ 5 seconds.
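The version-number verification, de-duplication and incremental-update behaviour described above can be sketched as follows; the dictionary layout and function name are illustrative assumptions, not taken from the patent:

```python
def apply_incremental_update(db: dict, update: dict) -> bool:
    """Illustrative sketch of the claim-2 update mechanism: an update is
    accepted only if its version number directly succeeds the database's
    current version, and already-present feature IDs are skipped
    (automatic de-duplication)."""
    if update["version"] != db["version"] + 1:
        return False                        # version-number verification failed
    for fid, feat in update["features"].items():
        if fid not in db["features"]:       # automatic duplicate removal
            db["features"][fid] = feat      # incremental insert only
    db["version"] = update["version"]
    return True
```

A real deployment would pair this with the bidirectional authentication mentioned in the claim; that handshake is omitted here.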
- 3. The dynamic face recognition tracking system according to claim 1, wherein the system constructs a closed-loop flow of "detection - preliminary screening - fine screening - tracking" through hierarchical processing by an edge computing layer and a cloud collaborative layer; the real-time preprocessing at the edge computing layer (front-end equipment) comprises the following steps: S1, dynamic face detection and quality assessment: frame-by-frame detection is performed on the real-time video stream with a YOLO-Lite optimized algorithm (model parameter scale ≤ 8 MB), supporting simultaneous multi-target recognition (≥ 10 detection targets per frame); a face quality scoring model runs synchronously and automatically filters low-quality images with an occlusion area above 40% or blur above threshold (calculated from LBP texture features), improving the effectiveness of subsequent comparison; S2, fast local feature extraction: a 128-dimensional feature vector is extracted from each detected face region by a MobileFaceNet compression model (model size ≤ 15 MB), with a feature extraction time ≤ 15 ms/frame (measured at 4 TOPS computing power); the model is trained with Triplet Loss optimization to enhance inter-class discrimination and improve the robustness of edge feature representation; S3, preliminary local blacklist comparison: cosine similarity is computed between the extracted feature vector and the locally cached high-risk personnel feature library against a primary comparison threshold (default 0.85); targets with similarity ≥ 0.85 are marked as suspicious objects, triggering an edge pre-alarm, and their features are packaged for cloud upload, while targets with similarity < 0.85 are filtered directly as invalid data, reducing data transmission.
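The S3 cosine-similarity blacklist comparison can be sketched as follows; the function name and use of NumPy are illustrative assumptions, while the 0.85 default threshold comes from the claim:

```python
import numpy as np

def edge_blacklist_match(feature, blacklist, threshold=0.85):
    """Compare one 128-dim face feature against the locally cached
    high-risk feature library by cosine similarity (claim 3, step S3).
    Returns (matched index, similarity) or (None, best similarity)."""
    # L2-normalise so that a dot product equals cosine similarity
    f = np.asarray(feature, dtype=float)
    f = f / np.linalg.norm(f)
    lib = np.asarray(blacklist, dtype=float)
    lib = lib / np.linalg.norm(lib, axis=1, keepdims=True)
    sims = lib @ f                       # one similarity per cached template
    best = int(np.argmax(sims))
    if sims[best] >= threshold:
        return best, float(sims[best])   # suspicious: package for cloud upload
    return None, float(sims[best])       # below threshold: filter locally
```

With 500-1000 cached 128-dimensional templates, this is a single small matrix-vector product, which is what makes the edge-side pre-screen cheap enough to run per frame.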
- 4. The dynamic face recognition tracking system according to claim 1, wherein the system constructs a closed-loop flow of "detection - preliminary screening - fine screening - tracking" through hierarchical processing by an edge computing layer and a cloud collaborative layer; the deep processing at the cloud collaborative layer comprises the following steps: T1, encrypted feature transmission and format verification: the feature vector of a suspicious object (128-dimensional floating-point data, about 512 bytes/target) is uploaded by the edge end to the cloud through a TLS encrypted channel, with a transmission delay ≤ 50ms in a 5G network environment; T2, precise comparison against the million-level database: the ArcFace algorithm performs a secondary comparison of the feature vector, matching it against the cloud full feature database (supporting ≥ 10 million items); a confidence weighting mechanism is introduced in the comparison process, generating a comprehensive confidence score (range 0-1) by combining the face quality score detected at the edge end (weight 0.3) and the cloud comparison similarity (weight 0.7); meanwhile, the cloud builds a target motion model based on monitoring point data (including longitude, latitude and timestamp) from the past 30 minutes, adopts a Kalman filtering algorithm to predict the moving track over the next 30 seconds, and generates tracking guidance instructions (such as adjusting camera pan-tilt angles and coordinating deployment of peripheral equipment), with an instruction issuing delay ≤ 100ms.
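The T2 confidence-weighting mechanism (edge quality score weighted 0.3, cloud similarity weighted 0.7) reduces to a weighted sum clamped to the 0-1 range; this sketch is illustrative, with the weights taken from the claim:

```python
def fused_confidence(edge_quality_score: float, cloud_similarity: float,
                     w_quality: float = 0.3, w_similarity: float = 0.7) -> float:
    """Comprehensive confidence score per claim 4, step T2: edge face-quality
    score weighted 0.3 and cloud ArcFace comparison similarity weighted 0.7."""
    score = w_quality * edge_quality_score + w_similarity * cloud_similarity
    return min(max(score, 0.0), 1.0)   # clamp to the claimed 0-1 range
```

Because the cloud similarity carries the larger weight, a high-quality edge capture cannot by itself push a weak match over the alarm thresholds used downstream.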
- 5. The system according to claim 1, wherein the hardware sensing and processing unit mainly comprises a multi-mode camera module (51), an edge computing unit core (52), a communication and power supply unit (53), a power management module (54) and a peripheral interface (55); in the multi-mode camera module (51), the visible-light camera (marked "1920×1080@30fps") and the infrared camera (marked "850nm light-supplementing module") are drawn in parallel and connected to the NPU chip through a CSI interface; the edge computing unit core (52) has at its center the marking "NPU chip (4 TOPS computing power)", peripherally connected to a "cache (512 MB)" and a local Flash memory (8 GB), used respectively for feature comparison and model storage; the communication and power supply unit (53) comprises a 5G module (marked "Sub-6GHz") and a Wi-Fi 6 chip, connected to the NPU through a PCIe interface and outputting an "encrypted data link" to the cloud; the power management module (54) accepts a 9-36V DC input (vehicle-mounted power) and outputs "5V/3.3V" to each component; the peripheral interface (55) provides a "wake-up control" function and reserves an "RS-485/USB interface" (marked "external display device/memory expansion") to adapt to diversified access requirements of the scene; the working temperature range is marked "-20℃ to 60℃", embodying the environmental adaptability of the hardware.
- 6. A synchronization method for the dynamic face recognition tracking system based on edge computing and cloud collaboration, characterized by comprising a front-end edge computing terminal (10), a back-end cloud processing system (20) and data interaction logic (30); the front-end edge computing terminal (10) is divided into a multi-mode sensing module (11), an edge computing unit (12) and a communication module (13); the multi-mode sensing module comprises a visible-light camera (marked "visible light sensor") and an infrared camera (marked "infrared light-supplementing module"), both connected to the edge computing unit through double-headed arrows indicating real-time image data input; the edge computing unit comprises a core processing module, an NPU chip and a local feature cache, which receives the camera data and then outputs suspicious feature packets to the communication module; the communication module is connected through an encrypted channel (marked "TLS 1.3 encrypted link") that transmits the suspicious feature data upstream and receives cloud instructions (such as tracking guidance signals).
- 7. The synchronization method of the dynamic face recognition tracking system according to claim 1, wherein the back-end cloud processing system (20) comprises an encrypted comparison server cluster (21), a dynamically updated feature database (22) and a track prediction module (23); the encrypted comparison server cluster (21), represented by 3 parallel server icons and marked "distributed computing nodes", takes as input the feature data uploaded by the edge end and outputs a comparison result; the dynamically updated feature database (22) is connected to the server cluster and marked "≥ 10 million feature items" and "version-number verification mechanism"; the track prediction module (23) receives the comparison result from the server cluster, generates a target track prediction curve by combining it with historical monitoring point data, and outputs it to the communication module for front-end equipment coordination.
- 8. The synchronization method of a dynamic face recognition tracking system according to claim 1, wherein the data interaction logic (30) comprises edge-to-cloud (31) and cloud-to-edge (32) paths; the edge-to-cloud (31) solid arrow is marked "suspicious feature vector (128 dimensions)" and "encrypted transmission", and the cloud-to-edge (32) solid arrow is marked "comparison result" and "tracking instruction", embodying a bidirectional collaborative relationship.
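The "suspicious feature vector (128 dimensions)" payload size follows directly from the encoding: 128 float32 values occupy 128 × 4 = 512 bytes, the per-target figure quoted in claims 4 and 10. An illustrative packing sketch (function names are assumptions, not from the patent):

```python
import struct

def pack_feature(vector):
    """Serialize one 128-dimensional face feature as little-endian float32s.
    128 floats × 4 bytes each = exactly 512 bytes per target."""
    if len(vector) != 128:
        raise ValueError("expected a 128-dimensional feature vector")
    return struct.pack("<128f", *vector)

def unpack_feature(payload):
    """Inverse operation on the cloud side, before comparison."""
    return list(struct.unpack("<128f", payload))
```

The fixed 512-byte payload is what the edge-to-cloud arrow (31) carries inside the TLS channel; any framing or authentication wrapper would sit around it.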
- 9. The method according to claim 1, wherein the multi-stage recognition process includes two processing stages, "edge computing layer" and "cloud collaboration layer", and the edge computing layer workflow is as follows: the U1 rectangular frame is marked "YOLO-Lite face detection + quality scoring", with "real-time video stream" as input and "valid face region" as output (filtering images with occlusion > 40% or blur); the U2 rectangular frame is marked "MobileFaceNet feature extraction", with "valid face region" as input and "128-dimensional feature vector" as output; the U3 diamond is marked "local blacklist comparison (threshold 0.85)", with branch arrows pointing to "filter non-target" (similarity < 0.85) and "package suspicious feature" (similarity ≥ 0.85) respectively.
- 10. The method according to claim 1, wherein the multi-stage recognition process includes two processing stages, "edge computing layer" and "cloud collaboration layer", and the cloud collaboration layer workflow is as follows: the U4 rectangular frame is marked "TLS encrypted transmission and format verification", with "suspicious feature packet" as input and "valid feature data" as output; the U5 rectangular frame is marked "ArcFace million-level comparison + confidence calculation", with "valid feature data" as input and "comprehensive confidence score (0-1)" as output; the U6 rectangular frame is marked "hierarchical early warning + track prediction", outputting "primary/secondary early warning" according to the confidence (≥ 0.9 / 0.8-0.9) and generating a "predicted track" through "Kalman filtering"; U7 is an auxiliary annotation giving the processing time of each step (e.g., feature extraction ≤ 15 ms/frame, cloud comparison delay ≤ 100 ms), reflecting the real-time advantage; U8 is a data-volume annotation (e.g., "512 bytes/target") illustrating the lightweight edge-side design.
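The "Kalman filtering" track prediction of U6 can be sketched with a constant-velocity model filtered over historical monitoring points and then propagated 30 seconds ahead; the state layout and noise parameters below are illustrative assumptions, not values from the patent:

```python
import numpy as np

def predict_track(positions, dt=1.0, horizon=30.0):
    """Constant-velocity Kalman filter over historical (lon, lat) monitoring
    points, then propagation `horizon` seconds ahead (claims 4 and 10).
    State x = [px, py, vx, vy]; noise covariances Q, R are assumed values."""
    F = np.eye(4); F[0, 2] = F[1, 3] = dt          # constant-velocity transition
    H = np.zeros((2, 4)); H[0, 0] = H[1, 1] = 1.0  # only position is observed
    Q = np.eye(4) * 1e-4                           # process noise (assumption)
    R = np.eye(2) * 1e-2                           # measurement noise (assumption)
    x = np.array([positions[0][0], positions[0][1], 0.0, 0.0])
    P = np.eye(4)
    for z in positions[1:]:                        # filter over the history
        x, P = F @ x, F @ P @ F.T + Q              # predict step
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
        x = x + K @ (np.asarray(z) - H @ x)        # update with measurement
        P = (np.eye(4) - K @ H) @ P
    Fh = np.eye(4); Fh[0, 2] = Fh[1, 3] = horizon  # propagate the horizon ahead
    return (Fh @ x)[:2]                            # predicted (lon, lat)
```

The predicted point is what the cloud would translate into tracking guidance instructions (pan-tilt adjustment, peripheral equipment coordination) for the edge terminals.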
- 11. The synchronization method of a dynamic face recognition tracking system according to claim 1, wherein the front-end edge computing terminal (10) applies the following large-model face comparison techniques: image preprocessing, feature extraction, comparison algorithm selection, multi-factor comprehensive judgment and threshold setting.
Description
Face dynamic identification system and synchronization method based on edge calculation and cloud cooperation

Technical Field

The application relates to face recognition technology in the fields of public safety and law enforcement, in particular to a dynamic face recognition tracking system based on edge computing and cloud collaboration and a synchronization method thereof.

Background

In the fields of public safety and law enforcement, face recognition technology based on video monitoring has become an important auxiliary means. The conventional law-enforcement monitoring system generally adopts the traditional "front-end acquisition - back-end processing" architecture, whose technical bottlenecks are mainly embodied in the following aspects:

1. Response delay caused by strong network dependence. The traditional scheme relies on a front-end camera transmitting the original video stream back to a remote server in real time, where a back-end computing cluster completes face detection, feature extraction and database comparison. In this mode, data transmission occupies a large amount of bandwidth and is significantly affected by network stability. For example, in a 4G network environment, the end-to-end delay from video acquisition to comparison-result feedback usually exceeds 200ms and cannot meet the real-time early-warning requirement of emergencies in law-enforcement scenarios (such as immediate response when a suspected person moves quickly).

2. Insufficient recognition robustness in complex environments. The existing system is easily influenced during dynamic recognition by factors such as illumination change (e.g., day-night alternation, direct strong light), face occlusion (e.g., wearing a mask or sunglasses) and posture change (e.g., side face, lowered head). For example, when the ambient illumination is below 50 Lux (low-illumination scene) or above 50000 Lux (strong-light scene), the imaging quality of a traditional single-mode camera is significantly reduced, so that the miss rate of face detection rises above 15% and the cosine-similarity matching accuracy of the feature extraction model falls below 80%, which fails the high-reliability identification requirements of law-enforcement scenarios.

3. Insufficient automatic tracking capability. In the prior art, target tracking mainly depends on manually demarcated monitoring areas or preset fixed rules and lacks continuous tracking capability for dynamic targets. When multiple monitoring points work cooperatively, the back-end server needs to integrate multi-source video data in real time; due to the lack of an effective track prediction algorithm, targets are often lost across camera fields of view, and the response speed of manual intervention lags, so that a complete target motion track chain cannot be formed.

Disclosure of Invention

The invention provides a law-enforcement dynamic face recognition tracking system based on edge computing and cloud collaboration, which solves the problems of high response delay, low recognition rate in complex environments and insufficient automatic tracking capability in the traditional scheme through hardware architecture innovation, multi-stage processing flow design and core algorithm optimization. The specific technical scheme is as follows:

1. Hardware architecture

The system hardware adopts an edge-cloud layered architecture to realize reasonable distribution of computing resources and efficient interaction of data:

(1) Front-end edge computing terminal

- A multi-modal sensing module: a dual-mode imaging unit integrating a visible-light camera and an infrared camera supports an effective working distance of 0.5-5 meters (meeting the requirements of short-distance law-enforcement checking and medium-distance monitoring). The visible-light camera adopts a 1920×1080-resolution sensor supporting 30fps dynamic video acquisition; the infrared camera is provided with an 850nm/940nm dual-band light-supplementing module with a light-supplementing distance of 3-5 m (meeting the IEC 62471 photobiological safety standard), and can image stably in a 50-50000 Lux wide-dynamic illumination environment.
- An edge computing unit: an embedded NPU chip (model: Ascend 310, computing power 4 TOPS) is mounted, and a lightweight AI processing module is integrated to support dynamic loading and updating of the model. The unit is internally provided with a 512MB cache and can locally store the face feature templates (feature vector dimension 128) of 500-1000 high-risk personnel for fast edge-side comparison.
- A communication module: integrating a 4G/5G wireless communication module (supporting NSA/SA dual mode) and a Wi-Fi 6 transmission unit to realize encrypted data interaction between the edge end and the cloud end.