CN-121999025-A - Ore stacking volume measurement method based on binocular stereoscopic vision


Abstract

The invention relates to the technical field of ore stockpile volume measurement, and in particular to an ore stockpile volume measurement method based on binocular stereo vision. The method comprises: collecting left and right views of an ore stockpile with a binocular camera and constructing a stereo matching data set; rectifying the acquired left and right views with an improved multi-scale adaptive Bouguet stereo rectification algorithm to eliminate lens distortion and align the pixel rows of the two views, yielding a rectified stereo image pair; converting the rectified pair to grayscale by the weighted average method; applying contrast-limited adaptive histogram equalization to the grayscale images to highlight the edge and texture information of the stockpile; constructing an MS-BGNet stereo matching model based on fused transfer learning and inputting the left and right views of the stockpile into the model to generate a disparity map; and calculating the stockpile volume from the generated disparity map by a triangular prism micro-element splitting method.

Inventors

JIANG SONG
GUO LI
WANG SEN
HONG YONG
CHEN YING
MA LIANJING
ZHAO YIFEI
ZHANG NA
LIU LEILEI
ZHANG YAO
CUI ZHIXIANG
WU MINGZE
GU QINGHUA
RAO BINJIAN
AI QINGWU
ZHANG SAI
LIU DI

Assignees

西安建筑科技大学
太原优迈智采科技有限公司

Dates

Publication Date: 20260508
Application Date: 20251205

Claims (8)

1. An ore stockpile volume measurement method based on binocular stereo vision, characterized by comprising the following steps:
Step one: collecting left and right views of the ore stockpile with a binocular camera and constructing a stereo matching data set, wherein the stereo matching data of the large-granularity ore stockpile and of the small-granularity ore stockpile each comprise 300 pairs;
Step two: rectifying the left and right views acquired in step one with an improved multi-scale adaptive Bouguet stereo rectification algorithm to eliminate lens distortion and align the pixel rows of the left and right views, thereby obtaining a rectified stereo image pair; then converting the rectified stereo image pair to grayscale by the weighted average method; and finally applying contrast-limited adaptive histogram equalization to the grayscale images to highlight the edge and texture information of the ore stockpile;
Step three: constructing an MS-BGNet stereo matching model based on fused transfer learning and training it in two stages, first pre-training on an open-source large-scale stereo matching data set and then fine-tuning the pre-trained stereo matching network on the self-built ore stockpile data set; after fine-tuning is completed, inputting the left and right views of the ore stockpile obtained in step two into the model to generate a disparity map;
Step four: calculating the volume of the ore stockpile from the disparity map generated in step three by a triangular prism micro-element splitting method.
2. The method for measuring the volume of an ore stockpile based on binocular stereo vision according to claim 1, wherein in step two the improved multi-scale adaptive Bouguet stereo rectification algorithm comprises the following steps:
Step 1: after the basic Bouguet stereo rectification is completed, introducing a multi-resolution fusion rectification module to balance large-scale distortion against local detail; the original image is decomposed at multiple scales and stereo rectification is applied at each resolution level to build a pyramid, constructed as: $I_l^{(k)} = \mathrm{pyrDown}(I_l^{(k-1)})$, $I_r^{(k)} = \mathrm{pyrDown}(I_r^{(k-1)})$; where $I_l^{(k)}$ and $I_r^{(k)}$ are the left and right images at the k-th pyramid level, pyrDown(·) is the pyramid downsampling operation, $I_l^{(k-1)}$ and $I_r^{(k-1)}$ are the left and right images at the previous pyramid level, and k is the pyramid level index;
Step 2: after local rectification is completed at each scale, fusing the rectification results at the different resolutions to obtain a final rectified image that is globally consistent and retains local detail, the fusion process being described as: [formulas not reproduced in the source text]; where the result is the final fused rectified image, pyrUp(·) is the pyramid upsampling operation, and $w_k$ is the weight of the k-th scale in the final fusion;
Step 3: after multi-scale rectification fusion is completed, introducing an adaptive ROI module that automatically calculates an optimal cropping window by analyzing the valid areas of the left and right rectified images; the non-zero pixel areas of the left and right images are detected first, and their intersection is taken as the valid ROI: $\mathrm{ROI} = \{(x, y) \mid I_l(x, y) > \tau \wedge I_r(x, y) > \tau\}$; where (x, y) are the pixel coordinates in the image, $I_l(x, y)$ and $I_r(x, y)$ are the pixel intensities of the left and right rectified images at position (x, y), τ is the pixel intensity threshold, and ∧ is the logical AND;
Step 4: based on the valid ROI, calculating the cropping boundary that maximizes the retained field of view, ensuring that the left and right image views are consistent while minimizing the invalid black area, according to: [formula not reproduced in the source text]; where $(x_{min}, y_{min})$ are the upper-left corner coordinates of the final cropped ROI, $(x_{max}, y_{max})$ are its lower-right corner coordinates, and the remaining terms are the upper-left and lower-right corner coordinates of the valid areas of the left image and of the right image, respectively;
Step 5: finally, obtaining a high-quality rectified stereo image pair with maximized field of view through the multi-scale fusion and adaptive ROI cropping.
3. The method for measuring the volume of an ore stockpile based on binocular stereo vision according to claim 2, wherein in step two the weighted average method performs grayscale conversion by taking a weighted average of the pixel values of the R, G and B channels, converting an RGB three-channel image that contains rich color information into a single-channel image carrying only one-dimensional grayscale information, according to: $\mathrm{Gray}(i, j) = 0.299\,R(i, j) + 0.587\,G(i, j) + 0.114\,B(i, j)$; where Gray denotes the gray value, i and j denote the row number and column number respectively, (i, j) denotes the pixel coordinates in the two-dimensional image, R, G and B denote the red, green and blue channels, and the three weights 0.299, 0.587 and 0.114 are the luma coefficients of the YUV (4:2:0) format.
4. The method for measuring the volume of an ore stockpile based on binocular stereo vision according to claim 3, wherein in step two the contrast-limited adaptive histogram equalization specifically comprises the following steps:
Step 1: image partitioning, namely dividing the grayscale image into non-overlapping rectangular context areas and calculating a gray-level histogram for each area;
Step 2: contrast limitation and histogram clipping, namely setting a clipping threshold and clipping each local histogram so that its amplitude does not exceed that threshold, according to: [formula not reproduced in the source text]; where $T_{clip}$ is the clipping threshold, $N_{pixels}$ is the total number of pixels in the context area, and $N_{bins}$ is the number of gray-level bins of the histogram; the pixel counts exceeding $T_{clip}$ are uniformly redistributed to all gray levels;
Step 3: local histogram equalization, namely calculating the cumulative distribution function of each context area from its clipped and redistributed local histogram and applying that function as the histogram equalization transform of the area;
Step 4: bilinear interpolation synthesis, namely, to avoid blocking artifacts, obtaining the final gray value of any pixel in the image by bilinear interpolation of the transform functions of its four adjacent context areas.
5. The method for measuring the volume of an ore stockpile based on binocular stereo vision according to claim 4, wherein in step three the MS-BGNet stereo matching model based on fused transfer learning comprises a feature extraction module, a cost volume construction and aggregation module, a bilateral-grid-based cost volume upsampling module, a multi-scale-feature-guided cost volume enhancement module and a residual disparity refinement module, wherein:
the feature extraction module adopts a ResNet-like architecture to extract multi-scale features from the input stereo image pair and outputs a feature pyramid at three resolution levels, 1/2, 1/4 and 1/8;
the cost volume construction and aggregation module constructs group-wise correlation cost volumes from the multi-scale features, the group-wise correlation being computed as: [formula not reproduced in the source text]; where $C_g$ is the matching cost of the g-th group, $N_g$ is the total number of feature channels, G is the number of groups, and the two remaining terms are the feature vectors of the left image and the right image in the g-th group;
the bilateral-grid-based cost volume upsampling module upsamples the low-resolution cost volume to high resolution through a slicing operation, defined mathematically as: [formula not reproduced in the source text]; where B is the bilateral grid, the first scale factor is the ratio of the width or height of the grid to that of the high-resolution cost volume, and the second scale factor is the ratio of the number of gray levels of the grid to that of the guidance map G;
the multi-scale-feature-guided cost volume enhancement module fuses the 1/2, 1/4 and 1/8 multi-scale features from the feature extraction module and enhances the cost volume through a spatial-disparity dual attention mechanism, the spatial attention weight being computed as: [formula not reproduced in the source text]; where the activation is the sigmoid function, $W_S$ is the convolution weight, and the concatenation operator joins the feature maps at the 1/2, 1/4 and 1/8 scales along the channel dimension;
the residual disparity refinement module regresses the final disparity map from the enhanced high-resolution cost volume through a soft argmin function, defined as: $D_{pred}(x, y) = \sum_{d=0}^{D_{max}} d \cdot \mathrm{softmax}_d\big(-C_{enhanced}(x, y, d)\big)$; where $D_{pred}(x, y)$ is the final predicted disparity value, d is the candidate disparity, $D_{max}$ is the maximum disparity, and $C_{enhanced}(x, y, d)$ is the enhanced cost volume; and the model is optimized with a smooth L1 loss function, defined as: [formula not reproduced in the source text].
6. The method for measuring the volume of an ore stockpile based on binocular stereo vision according to claim 5, wherein in step three the two-stage training strategy is as follows: first, the MS-BGNet stereo matching model is pre-trained on the open-source large-scale stereo matching data sets Scene Flow and KITTI to learn general stereo matching prior knowledge; then, the stereo matching network with the pre-trained weights is fine-tuned on the self-built ore stockpile stereo matching data set, so as to optimize the model's perception of the surface texture and geometric characteristics of the ore stockpile and to improve its matching precision and robustness in this specific application scenario.
7. The method for measuring the volume of an ore stockpile based on binocular stereo vision according to claim 6, wherein in step four, calculating the volume of the ore stockpile from the disparity map generated in step three by the triangular prism micro-element splitting method specifically comprises the following steps:
Step 1: based on the disparity map generated by the MS-BGNet stereo matching model, mapping two-dimensional image points into three-dimensional space by coordinate conversion, wherein the disparity value d of each pixel can be converted into three-dimensional coordinates, the disparity value being calculated as: $d = x_l - x_r$; where $x_l$ and $x_r$ are the distances of the imaging point from the left edge of the left imaging plane and of the right imaging plane, respectively; the three-dimensional coordinates $(X_w, Y_w, Z_w)$ are calculated as: [formula not reproduced in the source text]; where b is the baseline of the binocular camera, f is the camera focal length, d is the disparity value, (x, y) are coordinates in the imaging plane coordinate system, (u, v) are coordinates in the pixel coordinate system, and α and β are the focal lengths in pixels along the x-axis and y-axis of the imaging plane;
Step 2: constructing triangular prism micro-elements from the obtained three-dimensional coordinates, wherein each triangular prism micro-element is built from the three-dimensional space points corresponding to three adjacent pixels, and each pixel corresponds to an upper vertex on the surface of the stockpile and a lower vertex on a reference plane, so that an irregular triangular prism with six vertices is formed;
Step 3: splitting each triangular prism micro-element into six irregular tetrahedra by connecting the center point G of the triangular prism to its six vertices, wherein the three-dimensional coordinates of the center point G are calculated as: $X_G = \frac{1}{6}\sum_{i=1}^{3}(X_{ni} + X_{mi})$, $Y_G = \frac{1}{6}\sum_{i=1}^{3}(Y_{ni} + Y_{mi})$, $Z_G = \frac{1}{6}\sum_{i=1}^{3}(Z_{ni} + Z_{mi})$; where $(X_{n1}, Y_{n1}, Z_{n1})$, $(X_{n2}, Y_{n2}, Z_{n2})$ and $(X_{n3}, Y_{n3}, Z_{n3})$ are the three-dimensional coordinates of the three vertices of the lower base of the triangular prism, and $(X_{m1}, Y_{m1}, Z_{m1})$, $(X_{m2}, Y_{m2}, Z_{m2})$ and $(X_{m3}, Y_{m3}, Z_{m3})$ are the three-dimensional coordinates of the three vertices of its upper base;
Step 4: calculating the volume of each tetrahedron by the determinant method, wherein the volume V of the tetrahedron formed by the four vertices $P_1(x_1, y_1, z_1)$, $P_2(x_2, y_2, z_2)$, $P_3(x_3, y_3, z_3)$ and $P_4(x_4, y_4, z_4)$ is: $V = \frac{1}{6}\left|\det\begin{pmatrix} x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\ x_3 - x_1 & y_3 - y_1 & z_3 - z_1 \\ x_4 - x_1 & y_4 - y_1 & z_4 - z_1 \end{pmatrix}\right|$;
Step 5: adding the volumes of the six tetrahedra to obtain the volume of a single triangular prism micro-element;
Step 6: accumulating the volumes of all the triangular prism micro-elements to obtain the total volume of the ore stockpile.
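As an illustration of the volume computation in claim 7, the following Python sketch back-projects a disparity map with the pinhole relation Z = b·f/d and sums triangular prism micro-elements. It is a minimal sketch under stated assumptions, not the patented implementation: the function names, the principal point (cx, cy), the use of one focal length f for both axes, and the requirement that the surface points passed to `pile_volume` are already expressed in a frame whose z-axis is height above the reference plane are all assumptions. Each prism is split here into eight tetrahedra by joining the centroid G to the triangulated boundary faces (the two triangular ends plus three side quadrilaterals cut along a diagonal). This is a substituted decomposition chosen because it exactly tiles the prism; the claim itself speaks of six tetrahedra.

```python
import numpy as np

def disparity_to_points(disp, b, f, cx, cy):
    """Back-project a disparity map to camera-frame points (assumed pinhole model)."""
    disp = np.asarray(disp, dtype=float)
    v, u = np.indices(disp.shape)                 # v: row index, u: column index
    with np.errstate(divide="ignore", invalid="ignore"):
        z = np.where(disp > 0, b * f / disp, np.nan)   # Z = b*f/d, NaN where d <= 0
    x = (u - cx) * z / f
    y = (v - cy) * z / f
    return np.stack([x, y, z], axis=-1)

def tetra_volume(p1, p2, p3, p4):
    """Tetrahedron volume by the determinant method: |det| / 6."""
    m = np.array([p2 - p1, p3 - p1, p4 - p1], dtype=float)
    return abs(np.linalg.det(m)) / 6.0

def prism_volume(bottom, top):
    """Volume of an irregular triangular prism from its 3 lower and 3 upper vertices."""
    bottom = np.asarray(bottom, dtype=float)
    top = np.asarray(top, dtype=float)
    g = np.vstack([bottom, top]).mean(axis=0)     # centroid G of the six vertices
    n1, n2, n3 = bottom
    m1, m2, m3 = top
    faces = [
        (n1, n2, n3), (m1, m2, m3),               # lower and upper triangles
        (n1, n2, m2), (n1, m2, m1),               # side quadrilateral n1-n2-m2-m1
        (n2, n3, m3), (n2, m3, m2),               # side quadrilateral n2-n3-m3-m2
        (n3, n1, m1), (n3, m1, m3),               # side quadrilateral n3-n1-m1-m3
    ]
    return sum(tetra_volume(g, a, b_, c) for a, b_, c in faces)

def pile_volume(points, z_ref=0.0):
    """Sum prism volumes over a grid of surface points of shape (H, W, 3).

    Assumes z is height above the reference plane z = z_ref.
    """
    pts = np.asarray(points, dtype=float)
    h, w, _ = pts.shape
    total = 0.0
    for i in range(h - 1):
        for j in range(w - 1):
            quad = [pts[i, j], pts[i, j + 1], pts[i + 1, j], pts[i + 1, j + 1]]
            for tri in ((0, 1, 2), (1, 3, 2)):    # two triangles per grid cell
                top = np.array([quad[k] for k in tri])
                bottom = top.copy()
                bottom[:, 2] = z_ref              # drop each vertex onto the plane
                total += prism_volume(bottom, top)
    return total
```

For a prism over a right triangle with unit legs and vertex heights 1, 2 and 3, `prism_volume` returns approximately 1.0, which matches base area 0.5 times mean height 2.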

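The preprocessing described in claims 2 to 4 can be sketched in a few lines of NumPy. The sketch below shows the weighted-average grayscale conversion with the standard BT.601 luma weights (0.299, 0.587, 0.114), the clip-and-redistribute step applied to one local histogram, and a cropping window taken as the intersection of the valid-area bounding boxes of the two rectified images. All function names are hypothetical, the clipping threshold is passed in directly because its formula is not given in this text, and the bounding-box intersection is one plausible reading of the adaptive ROI step rather than the patent's confirmed formula.

```python
import numpy as np

# Standard BT.601 luma weights; they sum to 1.
BT601_WEIGHTS = np.array([0.299, 0.587, 0.114])

def to_gray(rgb):
    """Weighted-average grayscale conversion of an (H, W, 3) RGB image."""
    rgb = np.asarray(rgb, dtype=float)
    return rgb[..., :3] @ BT601_WEIGHTS

def clip_histogram(hist, t_clip):
    """Clip a local histogram at t_clip and spread the excess evenly over all bins.

    A single redistribution pass, so a bin can end up slightly above t_clip.
    """
    hist = np.asarray(hist, dtype=float)
    excess = np.clip(hist - t_clip, 0.0, None).sum()
    return np.minimum(hist, t_clip) + excess / hist.size

def valid_bbox(img, tau=0):
    """Bounding box (x_min, y_min, x_max, y_max) of pixels with intensity > tau."""
    ys, xs = np.nonzero(np.asarray(img) > tau)
    if ys.size == 0:
        return None
    return xs.min(), ys.min(), xs.max(), ys.max()

def crop_window(left, right, tau=0):
    """Common cropping window: intersection of the two valid-area bounding boxes."""
    bl, br = valid_bbox(left, tau), valid_bbox(right, tau)
    if bl is None or br is None:
        return None
    x_min, y_min = max(bl[0], br[0]), max(bl[1], br[1])
    x_max, y_max = min(bl[2], br[2]), min(bl[3], br[3])
    if x_min > x_max or y_min > y_max:
        return None                                # no overlap between valid areas
    return int(x_min), int(y_min), int(x_max), int(y_max)
```

Clipping preserves the total pixel count of the histogram, which keeps the cumulative distribution function used in the later equalization step normalized to the size of the context area.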
Description

Ore stacking volume measurement method based on binocular stereoscopic vision

Technical Field

The invention relates to the technical field of ore stockpile volume measurement, and in particular to an ore stockpile volume measurement method based on binocular stereo vision.

Background

In the operation and management of mining enterprises, measuring the volume of ore stockpiles is a key step in cost accounting, benefit assessment and production planning, and its precision and efficiency directly affect the economic benefit and management level of the enterprise. Traditional stockpile volume measurement mainly relies on contact methods, such as manual measurement with a tape measure. These methods are time-consuming, labor-intensive and costly, are prone to large manual measurement errors, and pose obvious safety hazards in harsh working environments with heavy dust or high temperature. Non-contact measurement methods were developed for this reason and have become the mainstream trend. Among them, vision-based measurement is widely used because it is safe and can operate at long range.

Vision-based measurement mainly comprises monocular, binocular and multi-camera schemes. A monocular method can only acquire two-dimensional image information and cannot obtain depth data directly; when a large stockpile is measured, it is difficult to cover the whole object in one view, so the measurement error is large. A multi-camera method can improve accuracy, but its hardware cost is high and it involves complex multi-camera calibration, synchronization and data fusion, which significantly increases implementation difficulty and cost. Binocular vision strikes a good balance between cost, accuracy and technical maturity and is regarded as an ideal solution for automatic measurement of stockpile volume.

However, existing binocular stereo methods still face challenges when applied to the specific scenario of ore stockpiles. The surface texture of an ore stockpile is weak, its features are sparse, and the ambient illumination is complex, so the generated disparity map often contains many holes and distortions, which seriously affects the completeness of the subsequent three-dimensional reconstruction and the accuracy of the volume calculation. A binocular vision measurement method that can generate an accurate disparity map and compute the volume from it with high precision therefore has important practical significance and application value.

Disclosure of Invention

In view of the defects of the prior art, the invention provides an ore stockpile volume measurement method based on binocular stereo vision. To achieve this object, the technical scheme provided is as follows: an ore stockpile volume measurement method based on binocular stereo vision, characterized by comprising the following steps:
Step one: collecting left and right views of the ore stockpile with a binocular camera and constructing a stereo matching data set, wherein the stereo matching data of the large-granularity ore stockpile and of the small-granularity ore stockpile each comprise 300 pairs;
Step two: rectifying the left and right views acquired in step one with an improved multi-scale adaptive Bouguet stereo rectification algorithm to eliminate lens distortion and align the pixel rows of the left and right views, thereby obtaining a rectified stereo image pair; then converting the rectified stereo image pair to grayscale by the weighted average method; and finally applying contrast-limited adaptive histogram equalization to the grayscale images to highlight the edge and texture information of the ore stockpile;
Step three: constructing an MS-BGNet stereo matching model based on fused transfer learning and training it in two stages, first pre-training on an open-source large-scale stereo matching data set and then fine-tuning the pre-trained stereo matching network on the self-built ore stockpile data set; after fine-tuning is completed, inputting the left and right views of the ore stockpile obtained in step two into the model to generate a disparity map;
Step four: calculating the volume of the ore stockpile from the disparity map generated in step three by a triangular prism micro-element splitting method.

Preferably, in step two, the improved multi-scale adaptive Bouguet stereo rectification algorithm comprises the following steps: Step 1: after the basic Bouguet stereo rectification is completed, in order to balance large-scale distortion against local detail, a multi-resolution fusion rectification module is introduced to perform multi-scale decomposit