CN-121982639-A - Visual detection method and system based on customer taking cigarettes in remote watch

Abstract

The invention relates to the technical field of computer vision, and in particular to a visual detection method and system based on a customer taking cigarettes in remote watch. The method comprises the following steps: S1, data preparation and augmentation; S2, model improvement and training; S3, model lightweighting and optimization; and S4, cloud or edge inference and deployment. The invention strengthens small-target feature extraction through an attention mechanism and combines it with CIoU Loss to optimize bounding-box regression, so that it performs well on small targets and in occluded scenes. Based on a lightweight single-stage detector together with pruning and quantization, it supports real-time processing of multiple video streams on a cloud server and also provides real-time inference on edge devices. Through large-scale data augmentation and transfer learning, the model adapts to varied illumination, backgrounds and packaging, overcoming the poor environmental adaptability of traditional methods. The invention also enables low-cost, efficient automated monitoring and is suitable for large-scale deployment.

Inventors

WU JIABAO
CHEN JINGZHOU

Assignees

快进时代(厦门)科技有限公司

Dates

Publication Date: 20260505
Application Date: 20260120

Claims (9)

1. A visual detection method based on a customer taking cigarettes in remote watch, characterized by comprising the following steps: S1, data preparation and augmentation; S2, model improvement and training; S3, model lightweighting and optimization; and S4, cloud or edge inference and deployment.
2. The visual detection method based on a customer taking cigarettes in remote watch according to claim 1, wherein S1 comprises the following steps: S101, constructing a cigarette image dataset covering multiple scenes, lighting conditions, angles, brands and occlusion conditions; S102, annotating the bounding boxes and categories of the images in COCO format; and S103, applying data augmentation, including random rotation, random scaling, brightness and contrast adjustment, noise addition and horizontal flipping, to improve the generalization capability of the model.
3. The visual detection method based on a customer taking cigarettes in remote watch according to claim 2, wherein S2 comprises the following steps: S201, model architecture: selecting an advanced single-stage object detection model as the basis; S202, transfer learning: initializing from weights pre-trained on a large dataset, and fine-tuning on the cigarette dataset using a stochastic gradient descent optimizer with momentum and cosine annealing learning-rate scheduling; and S203, loss function optimization: adopting CIoU Loss as the regression loss, which jointly optimizes the overlap area, the center-point distance and the aspect ratio, to achieve more accurate bounding-box localization.
4. The visual detection method based on a customer taking cigarettes in remote watch according to claim 3, wherein step S201 further comprises making targeted improvements to the single-stage object detection model: a. embedding an attention mechanism: a convolutional attention module is introduced into the feature fusion network to strengthen the model's attention to small-target regions and suppress background interference; b. adaptive anchor box initialization: the K-means++ clustering algorithm is used to optimize the sizes and aspect ratios of the initial anchor boxes for the elongated, strip-like appearance of cigarettes, improving localization efficiency and accuracy.
5. The visual detection method based on a customer taking cigarettes in remote watch according to claim 1, wherein S3 comprises the following steps: S301, performing channel pruning to remove redundant convolution channels, compressing the model size while keeping the accuracy loss under control; S302, performing model quantization, converting the FP32 weights to INT8 precision to further reduce the model size.
6. The visual detection method based on a customer taking cigarettes in remote watch according to claim 5, wherein step S4 comprises the following steps: S401, deploying the optimized lightweight model to a cloud server or an edge computing device to form a flexible cloud-edge collaborative architecture; S402, receiving a real-time video stream from a camera and performing decoding and image preprocessing; S403, feeding the preprocessed image into the deployed model and running forward inference to obtain cigarette bounding boxes, confidence scores and category labels; S404, filtering out overlapping redundant boxes using non-maximum suppression and outputting the final detection result.
7. The visual detection method based on a customer taking cigarettes in remote watch according to claim 6, wherein S4 further comprises: S405, result linkage and alarming: triggering an alarm when a cigarette-taking event is detected, and pushing alarm information including a timestamp, a location and a screenshot to a monitoring platform or a management terminal.
8. A visual detection system based on a customer taking cigarettes in remote watch, applicable to the visual detection method based on a customer taking cigarettes in remote watch according to any one of claims 1 to 7, characterized by comprising the following modules: a video acquisition module, consisting of high-definition network cameras deployed in the monitored scene and responsible for continuously capturing the on-site video stream; and a cloud analysis server serving as the core processing unit, comprising: a preprocessing unit for decoding, scaling and normalizing the uploaded video stream and converting it into the model input format; a model inference service for loading and running the optimized cigarette detection model to perform batch or real-time inference; and a post-processing unit for performing non-maximum suppression and for parsing and filtering the model output.
9. The visual detection system based on a customer taking cigarettes in remote watch according to claim 8, further comprising the following modules: an edge inference node for deploying the lightweight model to edge devices in scenarios requiring low latency or having limited network connectivity, performing local real-time inference and synchronizing key events and data with the cloud; an alarm and communication module for triggering real-time alarms according to the detection results and uploading structured alarm information to the management platform over the network; and a cloud management platform for aggregating and storing alarm and detection data from each node and providing visual monitoring, historical query, report generation and system management functions.
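
For illustration only, the CIoU regression loss named in step S203 of claim 3 can be sketched as follows. This is a minimal sketch of the commonly published CIoU formulation (1 - IoU, plus a normalized center-distance term, plus a weighted aspect-ratio term), not the applicant's implementation. The function name `ciou_loss`, the corner-format boxes `(x1, y1, x2, y2)` and the stabilizer `eps` are assumptions introduced here.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """CIoU loss for two boxes given as (x1, y1, x2, y2). Hypothetical helper."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Overlap area term: plain intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Center-point distance term: squared center distance divided by the
    # squared diagonal of the smallest box enclosing both boxes.
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
        + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio term: v measures the mismatch, alpha is its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike a plain IoU loss, the distance term keeps the loss informative when the predicted and ground-truth boxes do not overlap at all, which is a common situation for small targets.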

Description

Visual detection method and system based on customer taking cigarettes in remote watch
Technical Field
The invention relates to the technical field of computer vision, and in particular to a visual detection method and system based on a customer taking cigarettes in remote watch.
Background
With the spread of unmanned retail and remote-watch operating modes, automatically and accurately monitoring the taking of controlled or high-value goods (such as cigarettes) has become a key technical requirement. Existing technical solutions mainly have the following shortcomings:
Methods based on traditional image processing rely on hand-crafted features (e.g., color, shape). They have poor robustness, are easily affected by illumination, background, packaging differences and occlusion, adapt poorly, and are difficult to deploy at scale across scenes.
Methods based on early deep learning outperform the traditional methods but are limited on the cigarette detection task: their small-target detection capability is insufficient, cigarettes occupy only a small proportion of the pixels in the frame, which causes missed detections, and similar objects such as pens and snack bars easily cause false detections.
Methods based on two-stage detectors (such as Faster R-CNN) achieve higher detection accuracy but have high computational complexity and low inference speed, making it difficult to meet real-time video stream processing requirements; achieving real-time performance requires a high-performance GPU (graphics processing unit), which raises deployment and operating costs.
Therefore, the field needs a cigarette-taking detection solution for cloud platforms or edge devices that offers high accuracy, high speed and strong robustness and can be flexibly deployed. To this end, a visual detection method and system based on a customer taking cigarettes in remote watch is proposed.
Disclosure of Invention
The invention aims to provide a visual detection method and system based on a customer taking cigarettes in remote watch, so as to solve the problems described in the Background.
To solve the above technical problems, the invention adopts the following technical solution: a visual detection method based on a customer taking cigarettes in remote watch, comprising the following steps: S1, data preparation and augmentation; S2, model improvement and training; S3, model lightweighting and optimization; and S4, cloud or edge inference and deployment.
Preferably, step S1 comprises the following steps: S101, constructing a cigarette image dataset covering multiple scenes, lighting conditions, angles, brands and occlusion conditions; S102, annotating the bounding boxes and categories of the images in COCO format; and S103, applying data augmentation, including random rotation, random scaling, brightness and contrast adjustment, noise addition and horizontal flipping, to improve the generalization capability of the model.
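
As a non-limiting illustration of two of the augmentations listed in step S103, the sketch below applies horizontal flipping and brightness and contrast adjustment to a grayscale image stored as nested lists, and keeps COCO-style `[x, y, width, height]` boxes consistent with the flipped image. All function names, the 0.5 flip probability and the sampling ranges are assumptions chosen for the example and are not values given in the source. Random rotation, random scaling and noise addition are omitted for brevity.

```python
import random


def hflip_image(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def hflip_bbox(bbox, image_width):
    """Mirror a COCO-style [x, y, width, height] box left to right."""
    x, y, w, h = bbox
    return [image_width - x - w, y, w, h]


def adjust_brightness_contrast(image, alpha, beta):
    """Map each 8-bit pixel p to alpha * p + beta, clipped to 0..255."""
    return [
        [min(255, max(0, int(round(alpha * p + beta)))) for p in row]
        for row in image
    ]


def augment(image, bboxes, rng):
    """Randomly flip and re-light one sample; boxes follow the flip."""
    width = len(image[0])
    if rng.random() < 0.5:  # assumed flip probability
        image = hflip_image(image)
        bboxes = [hflip_bbox(b, width) for b in bboxes]
    alpha = rng.uniform(0.8, 1.2)  # assumed contrast range
    beta = rng.uniform(-20, 20)    # assumed brightness range
    return adjust_brightness_contrast(image, alpha, beta), bboxes
```

A call such as `augment(image, bboxes, random.Random(0))` returns a new image and box list of the same shape. Geometric transforms must update the annotations together with the pixels, whereas photometric transforms leave the boxes unchanged.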
Preferably, step S2 comprises the following steps: S201, model architecture: selecting an advanced single-stage object detection model as the basis; S202, transfer learning: initializing from weights pre-trained on a large dataset, and fine-tuning on the cigarette dataset using a stochastic gradient descent optimizer with momentum and cosine annealing learning-rate scheduling; and S203, loss function optimization: adopting CIoU Loss as the regression loss, which jointly optimizes the overlap area, the center-point distance and the aspect ratio, to achieve more accurate bounding-box localization.
Preferably, step S201 further comprises making targeted improvements to the single-stage object detection model: a. embedding an attention mechanism: a convolutional attention module is introduced into the feature fusion network to strengthen the model's attention to small-target regions and suppress background interference; b. adaptive anchor box initialization: the K-means++ clustering algorithm is used to optimize the sizes and aspect ratios of the initial anchor boxes for the elongated, strip-like appearance of cigarettes, improving localization efficiency and accuracy.
Preferably, step S3 comprises the following steps: S301, performing channel pruning to remove redundant convolution channels, compressing the model size while keeping the accuracy loss under control; S302, performing model quantization, converting the FP32 weights to INT8 precision to further reduce the model size.
Preferably, step S4 comprises the following steps: S401, deploying the optimized lightweight model to a cloud server or an edge computing device to form a flexible cloud-edge collaborative architecture; S402, receiving a real-time video stream from a camera, and pe