
CN-121982620-A - Real-time analysis method for shield slag hole based on image stabilization enhancement and interaction detection driving

CN121982620A

Abstract

The invention discloses a real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving. A slag hole video is obtained and separated into a video sequence according to preset parameters, and the video sequence is processed to obtain a stabilized image sequence. Any designated key frame is processed through interactive polygon drawing to obtain a binary mask. A target detection model performs detection within the target region to obtain an initial candidate box set, and the candidate boxes are pre-screened by an area threshold to obtain the candidate box set. A large unsupervised image segmentation model generates the corresponding instance-level segmentation masks, which are fused by pixel-by-pixel maximization into a frame-level segmentation mask. A monochrome highlight layer is built from the segmentation mask and superposed with weights on the original image frame to produce a video file, completing the slag hole video segmentation visualization. The invention preserves the visual continuity of the video and the integrity of the picture, and improves the viewing experience.
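The last two steps of the abstract, pixel-by-pixel maximization fusion and weighted superposition of a monochrome highlight layer, can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the patented implementation: the function names `fuse_masks` and `overlay`, the highlight colour, and the weights 0.6 and 0.4 are all assumptions chosen for the example.

```python
import numpy as np

def fuse_masks(instance_masks, shape):
    """Fuse instance-level masks into one frame-level binary mask
    by taking the pixel-wise maximum (1 = foreground, 0 = background)."""
    fused = np.zeros(shape, dtype=np.uint8)
    for mask in instance_masks:
        fused = np.maximum(fused, (np.asarray(mask) > 0).astype(np.uint8))
    return fused

def overlay(frame, mask, color=(255, 0, 0), alpha=0.6, beta=0.4):
    """Blend a single-colour highlight layer into an H x W x 3 frame.

    alpha weights the input frame and beta weights the colour layer;
    both values and the colour are assumptions for illustration.
    The colour is given in the same channel order as `frame`.
    """
    # Expand the H x W mask into an H x W x 3 colour layer.
    layer = mask[..., None].astype(np.float64) * np.asarray(color, dtype=np.float64)
    blended = alpha * frame.astype(np.float64) + beta * layer
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)
```

Applied frame by frame, `overlay(frame, fuse_masks(masks, frame.shape[:2]))` yields the visualization frames that would then be written to the output video.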

Inventors

  • ZHANG XUHUI
  • HU WEIDONG
  • ZHANG HAITAO
  • LI CHAORAN
  • FANG XIANGDONG
  • ZHAN TIANMING
  • ZHANG YUQING
  • QI JIAQIANG

Assignees

  • 中铁十四局集团大盾构工程有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-04-03

Claims (10)

  1. A real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving, characterized by comprising the following steps: obtaining a slag hole video, separating it into original image frames according to preset parameters, and ordering the original image frames to obtain a video sequence; performing image stabilization enhancement on the video sequence by performing ORB feature extraction on adjacent original image frames to obtain feature images, calculating the mean displacement of the inlier points in the feature images by combining Lowe ratio matching with RANSAC inlier screening, and using the mean displacement for translation stabilization and for output compositing with clipping repair, to obtain a stabilized image sequence; processing any designated key frame through interactive polygon drawing to obtain a binary mask, the binary mask restricting subsequent target detection and segmentation to the ROI region; constructing a target detection model and performing target detection on each frame of the stabilized image sequence to obtain an initial candidate box set, performing preliminary screening of the candidate boxes according to a candidate box area threshold, and performing secondary screening by judging whether the centroid of each candidate box falls within the binary mask, to obtain a candidate box set; building a large unsupervised image segmentation model and taking the candidate boxes in the candidate box set as prompt input to obtain the corresponding instance-level segmentation masks, then fusing the instance-level segmentation masks by pixel-by-pixel maximization to obtain a frame-level segmentation mask; building a monochrome highlight layer based on the segmentation mask and superposing it with weights on the original image frame to generate a visualization frame; and writing the visualization frames to an output video file and outputting a frame-by-frame mask image sequence, completing the slag hole video segmentation visualization.
  2. The real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving according to claim 1, wherein the feature image extraction process is as follows: given the grayscale images of two adjacent frames, frame t-1 and frame t, a keypoint set and binary descriptors are extracted from each with the ORB algorithm; applying the ORB feature extraction and descriptor generation operator to the grayscale image of frame t-1 yields the keypoint set of frame t-1 and its corresponding binary descriptor matrix, and applying the same operator to the grayscale image of frame t yields the keypoint set of frame t and its corresponding binary descriptor matrix.
  3. The real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving according to claim 1, wherein the key frame processing process is as follows: KNN matching is performed on the binary descriptor matrices using the Hamming distance, and the matches are then screened with the Lowe ratio criterion to obtain a matching pair set; a re-projection error threshold is preset, and a geometric consistency test is performed on the matching pair set with a RANSAC homography algorithm, a matching pair being judged an inlier when its projection error under the homography is less than or equal to the re-projection error threshold and an outlier otherwise; the inliers are selected through the inlier mask to build an inlier set, the inlier mask being a Boolean array equal in length to the matching pair set; a lower limit on the number of inliers is preset, and when the number of inliers in the inlier set is smaller than this lower limit, the method falls back to the previous stable frame, the stable frame being obtained by translation registration of the original frame with the mean displacement from the most recent inlier mask and cropping of the effective field of view; in the formulas, the nearest-neighbor distance and the second-nearest-neighbor distance are those of the same keypoint in KNN matching, the Lowe ratio threshold is the threshold applied in the Lowe ratio criterion, the inlier mask set is the one obtained by the random sample consensus screening, the number of inliers is the number of elements in that inlier mask, and the re-projection error threshold and the lower limit on the number of inliers are as preset above; the global mean displacement is estimated from the mean displacement of the inliers and a translation affine matrix is constructed; the current frame undergoes a two-dimensional affine transformation with this matrix to obtain a transformed frame, and the same translation affine matrix is applied to an all-ones mask, which is a binary mask of the same size as the current frame with every pixel set to 1 to indicate that all pixels are valid before the transformation; the transformed all-ones mask is the effective-domain binary mask of the current frame.
  4. The real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving according to claim 3, wherein the two-dimensional affine transformation of the current frame into a transformed frame is calculated as follows: the horizontal and vertical components of the mean displacement of frame t relative to frame t-1 are the averages, over the inliers, of the differences between the horizontal and vertical coordinates of the matched inlier points in frame t and those of the corresponding matched inlier points in frame t-1; the two-dimensional translation affine matrix of frame t is built from these two components; and the stabilized image of frame t is obtained by applying the two-dimensional affine transformation function, with this matrix, to the original image of frame t; after the transformed frame is obtained, the effective-domain binary mask of the current frame is calculated as follows: three regions are defined, namely the overlap region of the current frame and the previous frame, the part of the previous frame's effective region that has been clipped in the current frame, and the part of the current frame's effective region that extends beyond the boundary of the previous frame; taking the previous stable frame as the reference, the current affine-transformed frame and the previous stable frame are composited region by region and the stable frame of frame t is output, where the effective-domain binary mask of frame t is the all-ones mask transformed by the affine matrix of frame t.
  5. The real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving according to claim 3, wherein the matching pair set is calculated as follows: first, the nearest-neighbor distance and the second-nearest-neighbor distance of the same keypoint in KNN matching are calculated and compared under the Lowe ratio threshold; after this calculation is completed, a random sample consensus algorithm is executed with a homography model to obtain an inlier mask and an inlier set, where R denotes the inlier mask set output by the random sample consensus algorithm, the number of inliers is the number of elements in that inlier mask, and the re-projection error threshold and the lower limit on the number of inliers are preset parameters.
  6. The real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving according to claim 1, wherein the process of obtaining the binary mask from any designated key frame through interactive polygon drawing is as follows: on the first frame image, vertices are added one by one by clicking with the mouse to obtain a polygon vertex set, each element of which holds the coordinates of one vertex, and the vertices are connected in order and closed end to end to form a closed polygon curve; during interactive drawing, a small dot is drawn in real time for each vertex on a copy of the first frame image and connected to its adjacent vertices by line segments that form the polygon boundary, and a closed polygon is formed when drawing is finished, so that the user can visually confirm and dynamically adjust the shape of the ROI; the closed polygon is then projected onto a binary mask map of the same size as the image, in which the region enclosed by the polygon marks the ROI.
  7. The real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving according to claim 1, wherein the candidate box set is screened as follows: the target detection model performs target detection on each original frame or on each stabilized frame to obtain an initial candidate box set, the initial candidate box set output on the t-th frame consisting of candidate boxes each described by its top-left corner coordinates, its bottom-right corner coordinates, and the confidence score given by the target detection model, together with the number of candidate boxes detected in the t-th frame; a confidence threshold is set to screen out low-confidence candidate boxes, and the candidate boxes whose confidence is greater than or equal to the threshold form the confidence-screened subset; the area of each remaining candidate box is then calculated, an upper area threshold is set, and the candidate boxes whose area satisfies the upper threshold are retained as the area-screened candidate box set; after this screening, the centroid of each retained candidate box is calculated from its top-left and bottom-right corner coordinates, and the value of the binary mask at the centroid position is read; when the centroid lies in the effective area of the binary mask, the candidate box is retained, and otherwise it is removed; the retained candidate boxes form the final candidate box set of the t-th frame within the ROI.
  8. The real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving according to claim 1, wherein the candidate boxes extracted by the target detection model are combined with the image segmentation model to form a serial end-to-end segmentation model, instance-level segmentation is performed on the candidate boxes detected in the ROI, and the results are fused into a frame-level binary mask, with the following specific steps: the candidate box set is used as prompt input; before being input to the large unsupervised image segmentation model, each candidate box is mapped into the internal image coordinate system of the model by a geometric transformation that represents the normalization and resolution adjustment operations in the model's predictor, giving, for each of the candidate boxes in the t-th frame's candidate box set, a transformed candidate box in the model's internal coordinate system; the transformed candidate box set is input to the image unsupervised segmentation model for segmentation prediction, the image segmentation function producing one segmentation mask for each candidate box of frame t; the segmentation masks are fused by pixel-by-pixel maximization to obtain the frame-level binary mask, whose value at each pixel position is the maximum over all candidate boxes of the corresponding segmentation mask values at that position; a value of 1 in the frame-level binary mask indicates foreground and a value of 0 indicates background, and the fused binary mask covers the foreground regions corresponding to all candidate boxes in frame t.
  9. The real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving according to claim 8, wherein the steps of superposing the highlight layer with weights on the original image frame to generate a visualization frame are as follows: the frame-level mask is acquired and expanded into a three-channel color image, the three components at each pixel corresponding in turn to the (R, G, B) channels, with 255 representing the maximum intensity of a channel, and the expanded three-channel mask map having size H×W×3; the original frame or the stabilized frame is taken as the input frame, and the color mask and the input frame are linearly superposed to obtain the visualization output frame, whose value at each pixel coordinate is the weighted sum of the input image of frame t, multiplied by the weighting coefficient of the original frame image, and the three-channel color mask map of frame t, which highlights the target area, multiplied by the weighting coefficient of the color mask; the visualization frame sequence is obtained by processing the input video frame by frame: the input video is decoded into image frames arranged in time order, a frame-level mask is generated for each frame and expanded into a color mask, the visualization output frame is calculated by the weighted superposition, and the output frames for t=1 to T are assembled in order into a sequence and written to an output file in a video coding format, the output file being the visualized segmentation video file.
  10. The real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving according to claim 9, wherein the specific steps of outputting the frame-by-frame mask map sequence are as follows: the frame-by-frame visualization results are obtained and written to the output video file in time order, completing the slag hole video segmentation visualization; the output video file is generated by a video write function that encodes the image sequence into a video file, taking as parameters the video encoding format, the frame rate of the output video, and the output resolution, which is consistent with that of the input frames.
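The candidate box screening described in claim 7 (confidence threshold, upper area threshold, then a centroid-in-ROI test) can be illustrated with the sketch below. It is a minimal example under stated assumptions rather than the claimed implementation: the function name `screen_boxes`, the default threshold values, the use of "less than or equal" for the area test, and the box midpoint as centroid are assumptions, since the original formulas were lost in extraction.

```python
import numpy as np

def screen_boxes(boxes, roi_mask, conf_thresh=0.25, max_area=5000.0):
    """Screen detector output against confidence, area, and an ROI mask.

    boxes    : iterable of (x1, y1, x2, y2, score), top-left and bottom-right corners
    roi_mask : H x W binary array, 1 inside the user-drawn ROI
    conf_thresh, max_area : hypothetical values chosen for illustration
    """
    height, width = roi_mask.shape
    kept = []
    for x1, y1, x2, y2, score in boxes:
        # Stage 1: drop low-confidence boxes.
        if score < conf_thresh:
            continue
        # Stage 2: drop boxes whose area exceeds the upper limit (assumed <=).
        if (x2 - x1) * (y2 - y1) > max_area:
            continue
        # Stage 3: keep the box only if its centroid lies inside the ROI.
        col, row = int((x1 + x2) / 2), int((y1 + y2) / 2)
        if 0 <= row < height and 0 <= col < width and roi_mask[row, col] == 1:
            kept.append((x1, y1, x2, y2, score))
    return kept
```

The surviving boxes would then serve as prompts for the segmentation model, so that no segmentation work is spent outside the ROI.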

Description

Real-time analysis method for shield slag hole based on image stabilization enhancement and interaction detection driving

Technical Field

The invention relates to a real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving.

Background

Video stabilization, object detection, and image segmentation are fundamental tasks in video understanding and visual analysis, and have long played a central role in academic research and engineering applications. Existing image stabilization methods are mainly built on feature matching and RANSAC: key points and descriptors are extracted with operators such as ORB and SIFT, outliers are removed through matching and robust estimation, and the inter-frame transformation is estimated to compensate for camera shake. However, if a higher-order homography is simply adopted, or geometric correction is forced on every frame, black edges and deformation are often introduced at the image borders, so a matching region patching strategy is needed to maintain temporal and visual continuity.

On the segmentation side, the detection-driven paradigm is widely used. Typically, a target detection model generates candidate boxes, which are then passed to an instance segmentation model as prompt information, improving both the focus on the target region and overall efficiency. This strategy works well when the scene is stable or the target moves gently, but once the video shakes noticeably or the target moves rapidly across frames, the temporal consistency between the detection and segmentation results is easily broken, causing jittering segmentation boundaries or missed targets.

Human-machine collaborative interactive ROIs are also commonly employed in engineering practice. When the user marks the region of interest on a frame, false detections and redundant computation caused by irrelevant background are significantly reduced and computation is concentrated on the local range the task cares about, which improves precision and stability to a certain extent. However, relying on any one of these modules alone still cannot fundamentally cope with complex video scenes. When jitter is obvious, the target scale is small, and the output must look stable, the shortcomings of these methods are amplified: the detection boxes drift under strong jitter, the segmentation result is unstable, affine or perspective correction over the whole frame changes the field of view and introduces black edges, and false triggers outside the ROI waste computational resources and reduce overall accuracy.

Disclosure of Invention

The invention aims to provide a real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving, which solves the problems existing in the prior art.
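As an illustration of how a translation-only correction with patching from the previous stable frame avoids the black edges mentioned in the background, the sketch below estimates the mean displacement of the inlier matches, shifts the current frame by whole pixels, and fills the uncovered border from the previous stable frame. It is a simplified sketch, not the claimed procedure: the names `shift_image` and `stabilize`, the integer rounding of the shift, the sign convention (shifting by the negative of the mean displacement), and the default inlier lower limit are assumptions, and the ORB matching and RANSAC steps that would produce the points and the inlier mask are not shown.

```python
import numpy as np

def shift_image(img, tx, ty, fill=0):
    """Translate img by integer (tx, ty) pixels; uncovered pixels get `fill`."""
    out = np.full_like(img, fill)
    h, w = img.shape[:2]
    if abs(tx) >= w or abs(ty) >= h:
        return out  # shifted completely out of view
    src_x, dst_x = slice(max(0, -tx), min(w, w - tx)), slice(max(0, tx), min(w, w + tx))
    src_y, dst_y = slice(max(0, -ty), min(h, h - ty)), slice(max(0, ty), min(h, h + ty))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

def stabilize(curr, prev_stable, pts_prev, pts_curr, inlier_mask, min_inliers=10):
    """Translation-only stabilization with border repair from the previous stable frame.

    pts_prev, pts_curr : N x 2 arrays of matched (x, y) points in frames t-1 and t
    inlier_mask        : length-N Boolean array (e.g. from a RANSAC homography test)
    min_inliers        : hypothetical lower limit on the number of inliers
    """
    inliers = np.asarray(inlier_mask, dtype=bool)
    if inliers.sum() < min_inliers:
        return prev_stable.copy()  # too few inliers: fall back to the previous stable frame
    # Mean displacement of frame t relative to frame t-1 over the inliers.
    disp = (np.asarray(pts_curr, dtype=float)[inliers]
            - np.asarray(pts_prev, dtype=float)[inliers]).mean(axis=0)
    # Assumed sign convention: undo the motion by shifting the opposite way.
    tx, ty = int(np.rint(-disp[0])), int(np.rint(-disp[1]))
    warped = shift_image(curr, tx, ty)
    # Apply the same shift to an all-ones mask to get the valid-pixel domain.
    valid = shift_image(np.ones(curr.shape[:2], dtype=np.uint8), tx, ty)
    if curr.ndim == 3:
        valid = valid[..., None]
    # Valid pixels come from the shifted frame, the rest from the previous stable frame.
    return np.where(valid == 1, warped, prev_stable)
```

Because the uncovered strip is filled from the previous stable frame rather than left black, the output keeps a full field of view even when the shift is several pixels.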
The technical scheme is as follows: a real-time analysis method for a shield slag hole based on image stabilization enhancement and interaction detection driving comprises the following steps: obtaining a slag hole video, separating it into original image frames according to preset parameters, and ordering the original image frames to obtain a video sequence; performing image stabilization enhancement on the video sequence by performing ORB feature extraction on adjacent original image frames to obtain feature images, calculating the mean displacement of the inlier points in the feature images by combining Lowe ratio matching with RANSAC inlier screening, and using the mean displacement for translation stabilization and for output compositing with clipping repair, to obtain a stabilized image sequence; processing any designated key frame through interactive polygon drawing to obtain a binary mask, the binary mask restricting subsequent target detection and segmentation to the ROI region; constructing a target detection model and invoking it on each frame of the stabilized image sequence to perform target detection and obtain an initial candidate box set, performing preliminary screening of the candidate boxes according to a candidate box area threshold, and performing secondary screening by judging whether the centroid of each candidate box falls within the binary mask, to obtain a candidate box set; building a large unsupervised image segmentation model and taking the candidate boxes in the candidate box set as input to generate the corresponding instance-level segmentation masks, then fusing the instance-level segmentation masks by pixel-by-pixel maximization to obtain a frame-level segmentation mask; building a monochrome highlight layer based on the segmentation mask, and weighting and superposing the highlight layer and an original imag