CN-116071426-B - Method and system for detecting pose of filling opening of steel cylinder


Abstract

The invention discloses a method for detecting the pose of a filling port of a steel cylinder. Images acquired with the filling port in different poses are preprocessed; the preprocessed images are segmented into blocks and the blocks are coded; coordinates in the object coordinate system are converted into coded coordinates of the coded image; and block number masks are obtained for each dimension. A pose detection model of the filling port is then established. At run time, an image acquired in real time is preprocessed, the block number masks are obtained through the detection model and decoded into a real coordinate feature map, two-dimensional coordinates and the corresponding three-dimensional coordinates are extracted, and the correspondences between them are converted into pose information of the filling port by the EPnP algorithm. Having the deep neural network predict the novel partition coding coordinate feature map proportionally reduces the average error over all pixels of the feature map and improves accuracy.

Inventors

XU LIYUN
ZHANG JIAN
FU XIUFENG
MA ZONGHENG

Assignees

上海旻实智能科技有限公司

Dates

Publication Date: 20260508
Application Date: 20221230

Claims (7)

1. A method for detecting the pose of a filling port of a steel cylinder, characterized by comprising the following steps: (1) acquiring images of the filling port of the steel cylinder in different poses, performing preliminary processing on the acquired images to obtain preprocessed images, and at the same time obtaining position information of the filling port, specifically comprising: (1.1) obtaining RGB images of the filling port of the steel cylinder in different poses; (1.2) graying the acquired RGB image; (1.3) performing adaptive threshold binarization on the grayed image; (1.4) performing straight line detection, quadrilateral detection and corner point extraction on the processed image; (1.5) determining an analysis plane, and calculating a perspective transformation matrix from the image plane of the processed image to the analysis plane; (1.6) determining the four vertex coordinates $(a_i, b_i)$, $i \in \{1, 2, 3, 4\}$, of the bounding box in the analysis plane according to the cylinder style; (1.7) projecting the four vertices $(a_i, b_i)$ determined in the analysis plane into the image plane as $(u_i, v_i)$ by right-multiplying by the perspective transformation matrix; (1.8) obtaining the pixel coordinate of the center of the bounding box in the image plane as $(c_x, c_y) = ([\max(u_i) + \min(u_i)]/2, [\max(v_i) + \min(v_i)]/2)$ and the length and width of the bounding box as $(S_x, S_y) = (\max(u_i) - \min(u_i), \max(v_i) - \min(v_i))$; (1.9) obtaining the image of interest inside the bounding box and scaling it to a length and width of 256×256 using bilinear interpolation; (2) dividing the preprocessed image into blocks, converting the block number $n \in [0, N]$ into a code consisting of 0s and 1s following the idea of Gray coding, and converting coordinates in the object coordinate system into coded coordinates of the coded image to obtain a feature map of the partition coding coordinates; (3) obtaining a mask image from the feature map of the partition coding coordinates, and dividing the mask image into blocks to obtain block number masks of each dimension; (4) taking the preprocessed image as input and the obtained feature map of the partition coding coordinates, mask image and block number masks as output, and establishing a steel cylinder filling port pose detection model; (5) acquiring an image of the filling port of the steel cylinder in real time, performing preliminary processing on the acquired image to obtain a preprocessed image, and obtaining the block number masks in each dimension from the preprocessed image through the steel cylinder filling port pose detection model; (6) extracting the two-dimensional coordinates of pixels in the image and the corresponding three-dimensional coordinates in the three-dimensional object coordinate system; (7) converting the obtained correspondences between the two-dimensional coordinates and the three-dimensional coordinates into pose information of the filling port by the EPnP algorithm.
2. The method of claim 1, wherein the target image data in the acquired image is obtained by a bounding box detection algorithm in the step (1).
3. The method for detecting the pose of a filling port of a steel cylinder according to claim 2, wherein in step (2) the preprocessed image is divided into blocks by dividing equally along each of the x, y and z dimensions of the image, the number of blocks is set to N, and the block number $n \in [0, N]$ is converted into a code consisting of 0s and 1s following the idea of Gray coding.
4. The method for detecting the pose of a filling port of a steel cylinder according to claim 3, wherein the functional relationship between the coded coordinates ZCC and the corresponding coordinates C in the object coordinate system is: [formula missing from the source text]; where $i \in \{0, 1, 2\}$ corresponds to the three dimensions x, y and z, $\mathrm{max}_i$ is half the span of the cylinder filling port image in the i-th dimension, and the interval is [missing from the source text].
5. The method for detecting the pose of a filling port of a steel cylinder according to claim 4, wherein a ResNet-34 network structure is adopted as the backbone network to extract image information, a 64 × 3 feature map of the partition coding coordinates is adopted as an intermediate output layer, a 64 × 1 mask image is adopted as an output layer, and a block number mask of each dimension of size 64 × [symbol missing from the source text] is adopted as an output layer, where [symbol missing from the source text] is the number of channels of the block number mask over all of the x, y and z dimensions.
6. The method for detecting the pose of the filling port of the steel cylinder according to claim 1, wherein in the step (6), the decoding process is to read a block number mask in each dimension to obtain a block number of a point corresponding to a coding coordinate in an object coordinate system, and the three-dimensional point coordinate in the object coordinate system is obtained according to an inverse function of the coding function of the partition coding coordinate.
7. A system for detecting the pose of a filling port of a steel cylinder, characterized by comprising a camera, an image preliminary processing module, an image reprocessing module, a model building module and a computing module, wherein: the camera is used for acquiring pose images of the filling port of the steel cylinder; the image preliminary processing module is used for performing preliminary processing on the images acquired by the camera to obtain preprocessed images, specifically comprising: (1.1) obtaining RGB images of the filling port of the steel cylinder in different poses; (1.2) graying the acquired RGB image; (1.3) performing adaptive threshold binarization on the grayed image; (1.4) performing straight line detection, quadrilateral detection and corner point extraction on the processed image; (1.5) determining an analysis plane, and calculating a perspective transformation matrix from the image plane of the processed image to the analysis plane; (1.6) determining the four vertex coordinates $(a_i, b_i)$, $i \in \{1, 2, 3, 4\}$, of the bounding box in the analysis plane according to the cylinder style; (1.7) projecting the four vertices $(a_i, b_i)$ determined in the analysis plane into the image plane as $(u_i, v_i)$ by right-multiplying by the perspective transformation matrix; (1.8) obtaining the pixel coordinate of the center of the bounding box in the image plane as $(c_x, c_y) = ([\max(u_i) + \min(u_i)]/2, [\max(v_i) + \min(v_i)]/2)$ and the length and width of the bounding box as $(S_x, S_y) = (\max(u_i) - \min(u_i), \max(v_i) - \min(v_i))$; (1.9) obtaining the image of interest inside the bounding box and scaling it to a length and width of 256×256 using bilinear interpolation; the image reprocessing module is used for dividing the image preprocessed by the image preliminary processing module into blocks, converting the block number $n \in [0, N]$ into a code consisting of 0s and 1s following the idea of Gray coding, and converting coordinates in the object coordinate system into coded coordinates of the coded image to obtain a feature map of the partition coding coordinates; the model building module is used for establishing a steel cylinder filling port pose detection model by taking the preprocessed image as input and the obtained feature map of the partition coding coordinates, mask image and block number masks as output; the computing module is used for obtaining the block number masks in each dimension from the preprocessed image through the steel cylinder filling port pose detection model, decoding the block number masks to obtain a real coordinate feature map, extracting the two-dimensional coordinates of pixels in the image and the corresponding three-dimensional coordinates in the three-dimensional object coordinate system, and converting the obtained correspondences between the two-dimensional coordinates and the three-dimensional coordinates into pose information of the filling port by the EPnP algorithm.
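The block coding of claims 1 and 3 and the decoding of claim 6 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the coding function of claim 4 is not legible in this text, so the sketch assumes a coordinate range of [-half_span, half_span] divided equally into 2^bits blocks, a standard reflected Gray code for the block number, and decoding to the block centre. The function names `encode_coordinate` and `decode_coordinate` are hypothetical and do not come from the patent.

```python
from typing import List


def to_gray(n: int) -> int:
    """Reflected binary Gray code of a non-negative block number."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Inverse of to_gray: XOR together all right shifts of the code."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def encode_coordinate(c: float, half_span: float, bits: int) -> List[int]:
    """Assumed scheme: split [-half_span, half_span] into 2**bits equal blocks
    and return the Gray code of the block containing c, most significant bit first."""
    num_blocks = 1 << bits
    n = int((c + half_span) / (2 * half_span) * num_blocks)
    n = min(max(n, 0), num_blocks - 1)  # clamp the upper boundary into the last block
    g = to_gray(n)
    return [(g >> k) & 1 for k in range(bits - 1, -1, -1)]


def decode_coordinate(code: List[int], half_span: float) -> float:
    """Assumed inverse: recover the block number and return the block centre."""
    g = 0
    for bit in code:
        g = (g << 1) | bit
    n = from_gray(g)
    width = 2 * half_span / (1 << len(code))
    return -half_span + (n + 0.5) * width
```

With 3 bits and a half span of 8, the coordinate 3.0 falls in block 5, whose Gray code is 111. Neighbouring blocks differ in exactly one bit, so a single mispredicted bit near a block boundary moves the decoded coordinate by only one block under this assumed scheme.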

Description

Method and system for detecting pose of filling opening of steel cylinder

Technical Field

The invention relates to the automatic filling of refrigerant, and in particular to a method and a system for detecting the pose of a filling port of a steel cylinder.

Background

In refrigerant production, the filling operation is relatively simple and highly repetitive, and material leakage poses a certain danger to workers, so refrigerant filling is an industrial field in which automation has advanced comparatively quickly. Current refrigerant filling uses both semi-automatic and fully automatic production lines, which improve production efficiency and reduce production cost. Within this process, unscrewing the screw cap, filling the refrigerant and reinstalling the cap is one of the main difficulties. To accomplish it, the pose of the filling port of the check valve, that is, the transformation from the object coordinate system set on the filling port to the camera coordinate system, must be acquired through a vision sensor; this is the key to having a vision-guided robot align the cap-screwing gun actuator with the filling port of the check valve.

Detection of the 6D pose of the filling port belongs to one of the research hot spots in vision in recent years. Based on deep neural networks, by continuously learning various prior pose data, three-dimensional information can be read from a single RGB image at the application stage and the pose of the target object can be estimated. Generally, the network structure is U-shaped and divided into an encoder and a decoder: the encoder analyzes the input RGB image and extracts information, and the decoder generates the required output.
Generally, object pose detection methods can be divided into direct methods and indirect methods. A direct method directly regresses the rotation matrix and translation vector of the object. An indirect method regresses various feature maps created from coordinate information in the object coordinate system, then derives from the feature maps the correspondence between two-dimensional pixel coordinates and three-dimensional coordinates in the object coordinate system, and then obtains the transformation from the object coordinate system to the camera coordinate system through the EPnP algorithm. One indirect approach, represented by Pix2Pose and CDPN, takes a feature map of three-dimensional coordinates in the object coordinate system as output: the object-coordinate-system coordinates corresponding to each image pixel are obtained from a depth map captured by a depth camera and pose information obtained by detecting ArUco corner points, normalized to the interval [-1, 1] for prediction, and then decoded into correspondences between 2D points and 3D points for EPnP processing. Its accuracy, however, is still insufficient.

Disclosure of Invention

In view of the above deficiencies, the invention provides a high-accuracy method for detecting the pose of a filling port of a steel cylinder. The invention also provides a system for detecting the pose of a filling port of a steel cylinder.
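The indirect pipeline outlined in the Background ends with a set of 2D-3D correspondences handed to EPnP. A minimal Python sketch of that extraction step follows. The bounding-box centre and size use the max/min formulas of step (1.8) in claim 1; everything else is an assumption made for illustration: the coordinate map is taken to hold values normalized to [-1, 1], each map cell is mapped back to the centre of its footprint in the original image, and the names `bounding_box`, `extract_correspondences` and `half_spans` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def bounding_box(vertices: Sequence[Point2D]) -> Tuple[Point2D, Point2D]:
    """Centre (c_x, c_y) and size (S_x, S_y) of the axis-aligned box around
    the projected vertices (u_i, v_i), as in step (1.8)."""
    us = [u for u, _ in vertices]
    vs = [v for _, v in vertices]
    center = ((max(us) + min(us)) / 2, (max(vs) + min(vs)) / 2)
    size = (max(us) - min(us), max(vs) - min(vs))
    return center, size


def extract_correspondences(
    coord_map: Sequence[Sequence[Point3D]],
    mask: Sequence[Sequence[int]],
    half_spans: Point3D,
    center: Point2D,
    size: Point2D,
) -> Tuple[List[Point2D], List[Point3D]]:
    """Collect (pixel, object-point) pairs for every masked cell.

    coord_map[r][c] holds coordinates normalized to [-1, 1] (assumption);
    half_spans holds half the object extent along x, y and z.
    """
    rows, cols = len(mask), len(mask[0])
    cx, cy = center
    sx, sy = size
    pts2d: List[Point2D] = []
    pts3d: List[Point3D] = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue  # background cell, no correspondence
            # Centre of the map cell expressed in original-image pixels.
            u = cx - sx / 2 + (c + 0.5) * sx / cols
            v = cy - sy / 2 + (r + 0.5) * sy / rows
            x, y, z = coord_map[r][c]
            pts2d.append((u, v))
            # Undo the [-1, 1] normalization per dimension.
            pts3d.append((x * half_spans[0], y * half_spans[1], z * half_spans[2]))
    return pts2d, pts3d
```

The two returned lists are in the form that a PnP solver expects; with OpenCV, for instance, they could be converted to arrays and passed to `cv2.solvePnP` with `flags=cv2.SOLVEPNP_EPNP` together with the camera intrinsics, which are not specified in this text.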
To solve the above problems, the invention adopts a method for detecting the pose of a filling port of a steel cylinder, comprising the following steps: (1) acquiring images of the filling port of the steel cylinder in different poses, performing preliminary processing on the acquired images to obtain preprocessed images, and acquiring position information of the filling port of the steel cylinder; (2) dividing the preprocessed image into blocks, coding the different blocks, and converting coordinates in the object coordinate system into coded coordinates of the coded image to obtain a feature map of the partition coding coordinates; (3) performing image masking on the feature map of the partition coding coordinates to obtain block number masks of all dimensions; (4) taking the preprocessed image as input and the obtained feature map of the partition coding coordinates, mask image and block number masks as output, and establishing a steel cylinder filling port pose detection model; (5) acquiring an image of the filling port of the steel cylinder in real time, performing preliminary processing on the acquired image to obtain a preprocessed image, and obtaining the block number masks in each dimension from the preprocessed image through the steel cylinder filling port pose detection model; (6) extracting two-dimensional coordinates of pixels in the image and three-dimensional coordinates of corresponding three-dimensional