CN-122023754-A - Two-dimensional code detection and identification method and device

Abstract

The invention discloses a two-dimensional code detection and identification method and device based on an AGV (automatic guided vehicle). The method comprises: acquiring an image file captured by a shooting device; preprocessing the image file to generate a multi-scale tensor pyramid; inputting the first-level classification tensor into a lightweight classification network and outputting an integrity confidence for the wide-field image based on a bottleneck-type multi-scale feature fusion module and a channel attention mechanism; and executing a gating decision in real time based on a comparison of the integrity confidence with a preset safety threshold. The beneficial effect is that the front-end lightweight classification network, which integrates the bottleneck-type multi-scale feature fusion module and the channel attention mechanism and is injected with "simulated truncation" prior knowledge, can accurately identify invalid frames such as edge-truncated, severely damaged, or pure-background frames, so that the risk of misjudging an incomplete two-dimensional code is intercepted at the source of inference.
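
The gating decision described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the names `gate_frame`, `GateResult`, `run_detector` and `release_detection_tensor` are hypothetical, and the classifier and detector are stand-in callables. It only shows the control flow, in which a frame below the safety threshold releases the detection tensor and reports no target, and any other frame is passed to the high-precision detector.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Corner = Tuple[float, float]


@dataclass
class GateResult:
    """Outcome of the gating decision for one frame (hypothetical type)."""
    has_target: bool
    confidence: float
    corners: Optional[Sequence[Corner]] = None


def gate_frame(
    confidence: float,
    threshold: float,
    run_detector: Callable[[], Sequence[Corner]],
    release_detection_tensor: Callable[[], None],
) -> GateResult:
    """Compare the integrity confidence with the safety threshold.

    Below the threshold: release the detection tensor and report no target,
    without ever calling the detector. Otherwise: run the detector.
    """
    if confidence < threshold:
        release_detection_tensor()
        return GateResult(has_target=False, confidence=confidence)
    return GateResult(has_target=True, confidence=confidence,
                      corners=run_detector())
```

Passing the detector as a callable means the expensive stage is never invoked on a rejected frame, which is the point of the early exit.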

Inventors

LU YASHI
LI WEIJUN
HU ZHIGUANG
Huang Lielie

Assignees

浙江迈睿机器人有限公司

Dates

Publication Date: 20260512
Application Date: 20260122

Claims (10)

1. A two-dimensional code detection and identification method based on an AGV (industrial automatic guided vehicle), characterized by comprising the following steps: acquiring an image file captured by a shooting device; preprocessing the image file to generate a multi-scale tensor pyramid, wherein the multi-scale tensor pyramid comprises a first-level classification tensor for integrity judgment and a second-level detection tensor for high-precision positioning; inputting the first-level classification tensor into a lightweight classification network, and outputting an integrity confidence of the image based on a bottleneck-type multi-scale feature fusion module and a channel attention mechanism; executing, in real time, a gating decision based on a comparison of the integrity confidence with a preset safety threshold: if the integrity confidence is lower than the safety threshold, triggering an early-exit mechanism, releasing the video memory occupied by the second-level detection tensor, and outputting a no-target signal; if the integrity confidence is not lower than the safety threshold, activating a high-precision detection network and performing sub-pixel corner regression on the second-level detection tensor to obtain geometric parameters of the two-dimensional code; and performing geometric topology verification on the regressed corner points and, after the verification passes, mapping the corner points into navigation pose information in a physical coordinate system.
2. The two-dimensional code detection and identification method according to claim 1, wherein the preprocessing step further comprises: writing the image file stream directly into a page-locked memory area in physical memory using direct memory mapping or shared memory technology, wherein the area is mapped to the GPU unified addressing space.
3. The two-dimensional code detection and identification method according to claim 1, wherein preprocessing the image file to generate the multi-scale tensor pyramid, which comprises the first-level classification tensor for integrity judgment and the second-level detection tensor for high-precision positioning, comprises: performing demosaicing (Bayer removal), histogram equalization and normalization on the image file through precompiled CUDA kernel functions, wherein the normalization formula is $I_{\text{norm}} = (I - \mu)/\sigma$, where $\mu$ and $\sigma$ are preset statistical constants of the industrial scene and $I$ is the image file; and, on this basis, constructing a dual-scale tensor pyramid in parallel using a bilinear interpolation operator, the pyramid comprising the first-level classification tensor and the second-level detection tensor.
4. The two-dimensional code detection and identification method according to claim 1, wherein the lightweight classification network comprises: a bottleneck layer for compressing the number of channels of the input features; multiple parallel depthwise separable convolution layers, which use convolution kernels of different scales to extract multi-scale features; a channel attention module for adaptively adjusting the weights of the feature channels according to simulated-truncation prior knowledge injected in the training stage; and a fully connected classification layer, through which the weight-calibrated feature vector is passed to output the integrity confidence.
5. The two-dimensional code detection and identification method according to claim 1, wherein the channel attention module being used for adaptively adjusting the weights of the feature channels according to simulated-truncation prior knowledge injected in the training stage comprises: performing global average pooling on the fused feature map through the channel attention mechanism, extracting channel statistics descriptors, and generating a weight vector using two fully connected layers: $s = \sigma(W_2\,\delta(W_1 z))$; where $z$ represents the channel statistics after global average pooling, reflecting the response intensity of each feature channel over the whole image; $W_1$ is the dimension-reducing weight matrix, used to compress the channel features and capture cross-channel interaction information; $\delta$ is the ReLU activation function, used to introduce non-linearity; $W_2$ is the dimension-raising weight matrix, used to restore the feature dimension; and $\sigma$ is the Sigmoid function, used to map the output to normalized weights in (0, 1).
6. The two-dimensional code detection and identification method according to claim 1, wherein, if the integrity confidence is lower than the safety threshold, triggering the early-exit mechanism, releasing the video memory occupied by the second-level detection tensor, and outputting the no-target signal comprises: sending a termination instruction to a task scheduler to physically block subsequent enqueuing and execution of computation related to the detection network; performing an atomic operation to release the second-level detection tensor cached in the video memory buffer and marking it as overwritable so that it can serve the next frame; and writing the no-target signal to the output interface.
7. The two-dimensional code detection and identification method according to claim 1, wherein, if the integrity confidence is not lower than the safety threshold, activating the high-precision detection network and performing sub-pixel corner regression on the second-level detection tensor to obtain the geometric parameters of the two-dimensional code comprises: triggering an activation mode and generating a high-priority terminal signal; activating the CUDA stream dedicated to the downstream high-precision detection network; and marking the second-level detection tensor as locked and transferring it to the computing unit, thereby formally starting the subsequent high-precision positioning process.
8. The two-dimensional code detection and identification method according to claim 1 or 7, wherein the high-precision detection network comprises: an improved CSPDarknet backbone network adopting a hierarchical mixed activation strategy, which includes using the computationally simple ReLU activation function in the compute-intensive, high-resolution shallow layers to reduce video memory access latency; an edge-enhanced spatially decoupled attention module, which performs one-dimensional global pooling along the horizontal direction and the vertical direction respectively and retains spatial position information; taking feature aggregation in the horizontal direction as an example, the mathematical expression is $z_c^h(h) = \frac{1}{W}\sum_{0 \le i < W} x_c(h, i)$, where $x_c(h, i)$ is the response value of the input feature map in channel $c$ at position $(h, i)$ and $W$ is the width of the feature map; the formula aggregates along the horizontal direction to generate the feature vector $z_c^h$, capturing the long-range dependence of the $h$-th row while retaining accurate position information in the vertical direction, and pooling along the vertical direction likewise retains position information in the horizontal direction; and a detection head for regressing the geometric parameters of the four corner points, including the corner coordinates, the center point coordinates and the rotation angle of the two-dimensional code.
9. The two-dimensional code detection and identification method according to claim 1, wherein performing geometric topology verification on the regressed corner points and, after the verification passes, mapping the corner points into navigation pose information in the physical coordinate system comprises: obtaining the geometric parameters of the four corner points, and calculating the convexity of the quadrilateral formed by the four corner points and the eccentricity of the diagonal intersection; if the geometry is found to be severely distorted, judging the prediction to be a false target and forcibly discarding it; using the calibrated camera intrinsic matrix $K$ and distortion coefficients $D$, mapping the pixel coordinate system back to the physical space coordinate system by the back-projection principle of the pinhole camera model, the mapping relation being $[X_c, Y_c, Z_c]^T = Z_c\,K^{-1}\,[u, v, 1]^T$, where $(u, v)$ are the pixel coordinates and $(X_c, Y_c, Z_c)$ are the physical coordinates in the camera coordinate system; obtaining, through perspective transformation, the accurate physical position of the two-dimensional code center point relative to the AGV body and the yaw angle; and transmitting them to the AGV motion controller via the TCP/IP protocol.
10. A two-dimensional code detection and identification device arranged on an AGV (industrial automatic guided vehicle), characterized by comprising: an acquisition unit configured to acquire an image file captured by a shooting device; a preprocessing unit configured to preprocess the image file to generate a multi-scale tensor pyramid, which comprises a first-level classification tensor for integrity judgment and a second-level detection tensor for high-precision positioning; a network processing module configured to input the first-level classification tensor into a lightweight classification network and output an integrity confidence of the wide-field image based on a bottleneck-type multi-scale feature fusion module and a channel attention mechanism; a comparison module configured to execute, in real time, a gating decision based on a comparison of the integrity confidence with a preset safety threshold: if the integrity confidence is lower than the safety threshold, triggering an early-exit mechanism, releasing the video memory occupied by the second-level detection tensor, and outputting a no-target signal; if the integrity confidence is not lower than the safety threshold, activating a high-precision detection network and performing sub-pixel corner regression on the second-level detection tensor to obtain geometric parameters of the two-dimensional code; and a verification module configured to perform geometric topology verification on the regressed corner points and, after the verification passes, map the corner points into navigation pose information in a physical coordinate system.
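
As an illustration of the channel attention computation in claim 5 (global average pooling followed by two fully connected layers with ReLU and Sigmoid), the following is a minimal pure-Python sketch. The function names, the nested-list tensor layout `feature_map[channel][row][column]`, and the omission of bias terms are assumptions made for illustration only; a real network would use learned weights and a tensor library.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def global_average_pool(feature_map: Sequence[Matrix]) -> List[float]:
    """Return one mean response per channel (the channel statistics)."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def matvec(matrix: Matrix, vector: Sequence[float]) -> List[float]:
    """Multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def channel_attention_weights(
    feature_map: Sequence[Matrix], w1: Matrix, w2: Matrix
) -> List[float]:
    """Pool, reduce with w1 + ReLU, expand with w2, squash with Sigmoid."""
    z = global_average_pool(feature_map)
    hidden = [max(0.0, x) for x in matvec(w1, z)]          # ReLU
    return [1.0 / (1.0 + math.exp(-x)) for x in matvec(w2, hidden)]  # Sigmoid
```

Each returned weight lies strictly between 0 and 1 and would be multiplied channel-wise onto the fused feature map to recalibrate it.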
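
The geometric verification and back-projection of claim 9 can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, pixel coordinates are assumed to be already undistorted, the depth along the optical axis is assumed known (for example from the fixed camera mounting height), and the intrinsic matrix is reduced to focal lengths `fx`, `fy` and principal point `cx`, `cy`. The eccentricity check on the diagonal intersection is omitted.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def is_convex_quad(corners: Sequence[Point]) -> bool:
    """Return True if the four ordered corners form a strictly convex quad.

    All cross products of consecutive edges must share the same sign;
    a sign change means the shape is concave or self-intersecting.
    """
    n = len(corners)
    signs = []
    for i in range(n):
        ax, ay = corners[i]
        bx, by = corners[(i + 1) % n]
        cx, cy = corners[(i + 2) % n]
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross == 0:
            return False  # degenerate: three collinear corners
        signs.append(cross > 0)
    return all(signs) or not any(signs)


def back_project(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Tuple[float, float, float]:
    """Map pixel (u, v) at a known depth to camera-frame coordinates
    using the pinhole model without skew."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

A detection whose corners fail `is_convex_quad` would be discarded as a false target before any pose is computed.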

Description

Two-dimensional code detection and identification method and device

Technical Field

The invention relates to the field of identification, in particular to a two-dimensional code detection and identification method and device.

Background

With the advance of "Industry 4.0" and intelligent manufacturing, machine vision technology is widely applied in scenarios such as logistics warehousing and automated production lines. Among these, two-dimensional barcodes such as Data Matrix and QR Code have become the core carrier for navigation landmarks and material tracking for industrial automatic guided vehicles (AGVs), thanks to their large information capacity and strong error correction capability. When an AGV moves at high speed or an industrial assembly line runs fast, the visual navigation system must meet two core requirements at the same time: it must judge the integrity of the two-dimensional code in the field of view at high frequency to avoid navigation errors, and it must achieve accurate sub-pixel positioning to ensure navigation accuracy.

Existing industrial two-dimensional code visual analysis mainly relies on single-stage deep learning object detection methods (such as the YOLO series and SSD), which use end-to-end inference and feed every acquired wide-field frame directly into a deep convolutional neural network. The network extracts multi-scale features through a feature pyramid network (FPN), and the detection head simultaneously regresses the bounding box, the confidence and the corner coordinates of the two-dimensional code. To reduce the risk of missed detections, existing systems run the complete detection inference on every frame in the video stream, regardless of whether the image contains a complete two-dimensional code target.
The existing single-stage detection architecture performs no front-end screening of two-dimensional code integrity and no dynamic allocation of computation. As a result, under the limited hardware resources of an embedded terminal, it cannot deliver both highly reliable integrity judgment and highly real-time sub-pixel positioning. Either forced regression on an incomplete two-dimensional code that is damaged or truncated at the edge produces model hallucination, creating a safety risk of AGV navigation deviation or collision, or the full high-compute pipeline is run on invalid frames, wasting the resources of the embedded terminal and limiting the real-time processing frame rate of the system. Ultimately, safety and reliability are difficult to balance against operating efficiency.

Disclosure of Invention

The invention aims to solve the above technical problems in the prior art and to provide a two-dimensional code detection and identification method and device.
The application provides a two-dimensional code detection and identification method based on an AGV (industrial automatic guided vehicle), comprising the following steps: acquiring an image file captured by a shooting device; preprocessing the image file to generate a multi-scale tensor pyramid, wherein the multi-scale tensor pyramid comprises a first-level classification tensor for integrity judgment and a second-level detection tensor for high-precision positioning; inputting the first-level classification tensor into a lightweight classification network, and outputting an integrity confidence of the wide-field image based on a bottleneck-type multi-scale feature fusion module and a channel attention mechanism; executing, in real time, a gating decision based on a comparison of the integrity confidence with a preset safety threshold: if the integrity confidence is lower than the safety threshold, triggering an early-exit mechanism, releasing the video memory occupied by the second-level detection tensor, and outputting a no-target signal; if the integrity confidence is not lower than the safety threshold, activating a high-precision detection network and performing sub-pixel corner regression on the second-level detection tensor to obtain geometric parameters of the two-dimensional code; and performing geometric topology verification on the regressed corner points and, after the verification passes, mapping the corner points into navigation pose information in a physical coordinate system.

Preferably, the preprocessing further comprises: writing the image file stream directly into a page-locked memory area in physical memory using direct memory mapping or shared memory technology, wherein the area is mapped to the GPU unified addressing space.

Preferably, preprocessing the image file to generate a multi-scale tensor pyramid, wherein the multi-scale tensor pyramid comprises