CN-122024003-A - Pavement crack segmentation processing method and device based on domain self-adaption
Abstract
The invention discloses a pavement crack segmentation processing method and device based on domain adaptation. Multi-scene crack images are collected, and a mixed-domain sample library is built through multi-modal preprocessing such as adaptive equalization and directional gradient enhancement. A crack teacher large model is built by LoRA fine-tuning of SAM. A lightweight student network containing a parameter collapse mechanism and a continuity enhancement module is designed. A multi-level knowledge distillation framework with output alignment, pseudo-label self-training and feature alignment is built, the student model is trained with a progressive strategy, and adaptive edge deployment is achieved through a dynamic inference engine. Through domain adaptation and targeted distillation, the method greatly reduces the model parameter count and computational cost while strengthening the modeling of crack continuity and improving cross-domain generalization. It meets the real-time inspection requirements of edge devices and is suitable for accurate segmentation of road cracks across multiple scenes.
Inventors
- HU SHUGUANG
- GUO YUANHAO
- CAO JIANKUN
- ZHANG JIE
- WU NAIMING
- GONG XIANGYU
Assignees
- 中公高科(霸州)养护科技产业有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2025-12-25
Claims (10)
- 1. A pavement crack segmentation processing method based on domain adaptation, characterized by comprising the following steps: (1) acquiring crack images of designated scenes, performing multi-modal collaborative data preprocessing, and constructing a mixed-domain sample library through a directional gradient enhancement operation; (2) constructing a crack teacher model with cross-scene generalization capability by fine-tuning the parameters of a general-purpose large vision model used as the base architecture, thereby refining the domain knowledge of crack segmentation; (3) constructing a student network model with an encoder-decoder architecture and introducing a structured parameter successive collapse mechanism, in which the channel numbers of the encoder and the decoder are uniformly and structurally scaled by a global collapse coefficient to generate a series of models with a consistent architecture and decreasing parameter counts; (4) constructing a multi-level knowledge distillation framework comprising output-space structure alignment, continuity pseudo-label self-training and feature statistics alignment, and transferring the domain knowledge and continuity prior knowledge of the crack teacher model to the student network model; (5) training the student network model with a progressive distillation training strategy, using an AdamW optimizer combined with linear learning-rate warmup and a cosine annealing schedule, with mixed-precision training enabled and dynamic gradient clipping set; (6) realizing adaptive deployment of the model on edge devices through a dynamic inference engine that supports heterogeneous computing and precision switching.
- 2. The pavement crack segmentation processing method based on domain adaptation according to claim 1, wherein in step (1), the multi-modal collaborative data preprocessing comprises adaptive equalization, directional gradient enhancement, illumination perturbation and perspective transformation, and the mapping of the adaptive equalization is: $I_{\mathrm{eq}}(x, y) = \mathrm{CDF}_{w \times h}\big(I(x, y)\big)$; where $\mathrm{CDF}_{w \times h}$ denotes the cumulative distribution function of the local window, $I(x, y)$ denotes the gray value of the input image at coordinates $(x, y)$, and $w$ and $h$ denote the length and width of the local window, which control the statistical range of the local histogram; in step (1), the specific logic of the directional gradient enhancement is as follows: the adaptively equalized image is convolved with the horizontal and vertical Sobel kernels $S_x$ and $S_y$ to obtain the horizontal gradient $G_x = I_{\mathrm{eq}} * S_x$ and the vertical gradient $G_y = I_{\mathrm{eq}} * S_y$, and the directional gradient magnitude is then calculated as $G = \sqrt{G_x^2 + G_y^2}$, where $S_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$, $S_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$, "$*$" denotes the convolution operation, and $G$ is used to improve the distinguishability of slender cracks against complex backgrounds.
- 3. The pavement crack segmentation processing method based on domain adaptation according to claim 2, wherein in step (1), the illumination perturbation is expressed as: $I'(x, y) = u \cdot I(x, y) + b$; where $u$ is the brightness scaling factor, used to adjust the brightness scale of the image, and $b$ is the brightness offset, used to adjust the overall brightness shift of the image; the perspective transformation is realized by a perspective projection matrix $H$, with the transformation relation: $w\,[x', y', 1]^{\mathsf{T}} = H\,[x, y, 1]^{\mathsf{T}}$; where $(x, y)$ are the pixel coordinates of the original image, $(x', y')$ are the pixel coordinates after the perspective transformation, and $w$ is the homogeneous coordinate coefficient, used to simulate changes in viewing angle and distortion.
- 4. The pavement crack segmentation processing method based on domain adaptation according to claim 1, wherein in step (2), the general-purpose large vision model is SAM, the parameter fine-tuning uses the LoRA low-rank fine-tuning method, and the formula is: $W' = W + BA$; where $W$ is an original Q, K or V projection matrix in the SAM image encoder, and $B$ and $A$ are the low-rank decomposition matrices of LoRA; the fine-tuning formulas of the projection matrices corresponding to Q, K and V are respectively: $W_Q' = W_Q + B_Q A_Q$; $W_K' = W_K + B_K A_K$; $W_V' = W_V + B_V A_V$; where $W_Q$, $W_K$, $W_V$ are the original Q, K, V projection matrices, and $B_Q$, $A_Q$, $B_K$, $A_K$, $B_V$, $A_V$ are the LoRA low-rank decomposition matrices corresponding to each projection matrix; in step (2), the attention computation of the SAM image encoder is as follows: denoting the input feature sequence as $X$, the query matrix $Q$, the key matrix $K$ and the value matrix $V$ are obtained through linear projection, and the attention feature is calculated as: $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\mathsf{T}}}{\sqrt{d_k}}\right)V$; where $K^{\mathsf{T}}$ denotes the transpose of the key matrix $K$, $QK^{\mathsf{T}}$ denotes the product of $Q$ and the transpose of $K$, $\sqrt{d_k}$ is the scaling factor, and $\mathrm{softmax}$ is the normalization function.
- 5. The pavement crack segmentation processing method based on domain adaptation according to claim 1, wherein in step (3), the logic of the structured parameter successive collapse mechanism is as follows: a global collapse coefficient $\alpha$ is introduced, and the original channel number $C$ of each layer of the encoder and decoder of the student network model is uniformly and structurally scaled to obtain the lightweight channel number $C' = \alpha \cdot C$, where the values of $\alpha$ include 1.0, 0.75, 0.5 and 0.35, which are used to generate a series of lightweight models with a consistent architecture and decreasing parameter counts.
- 6. The pavement crack segmentation processing method based on domain adaptation according to claim 1, wherein in step (3), the implementation logic of the crack continuity enhancement module comprises: (31) performing global average pooling on the input feature map $F$ in the horizontal direction and the vertical direction respectively, where the horizontal pooling formula is: $h(i) = \frac{1}{W}\sum_{j=1}^{W} F(i, j)$; and the vertical pooling formula is: $v(j) = \frac{1}{H}\sum_{i=1}^{H} F(i, j)$; where $H$ denotes the height of the feature map $F$, $W$ denotes the width of the feature map $F$, and $F(i, j)$ denotes the feature value of $F$ at coordinates $(i, j)$; (32) generating, through an outer product, a saliency mask that reflects the linear connectivity prior of cracks: $M(i, j) = h(i) \cdot v(j)$; where $M(i, j)$ is the weight of the mask at coordinates $(i, j)$; (33) using the mask as a spatial attention weight to enhance the feature map $F$ by weighting, with the formula: $F' = F \odot \hat{M}$; where $\odot$ denotes element-wise multiplication and $\hat{M}$ is the normalized mask, used to strengthen potential crack paths and suppress background noise; (34) feeding the weighted and enhanced feature map $F'$ into a lightweight convolutional attention module, whose output is: $Y = \sigma\big(\mathrm{Conv}_{1 \times 1}(\mathrm{DWConv}(F'))\big)$; where $\sigma$ denotes the activation function, $\mathrm{Conv}_{1 \times 1}$ denotes a 1 x 1 convolution operation, and $\mathrm{DWConv}$ denotes a depthwise convolution operation.
- 7. The pavement crack segmentation processing method based on domain adaptation according to claim 1, wherein in step (4), the output-space structure alignment distillation comprises soft-label distillation, adversarial learning and an entropy minimization constraint; the loss function of the soft-label distillation is: $L_{\mathrm{KD}} = \mathrm{KL}\big(\mathrm{softmax}(z_t / T) \,\|\, \mathrm{softmax}(z_s / T)\big)$; where $z_t$ is the logits output of the teacher model, $z_s$ is the logits output of the student model, $T$ is the temperature coefficient, used to smooth the probability distributions, $\mathrm{KL}$ denotes the KL divergence, and $\mathrm{softmax}$ is the normalization function; the adversarial loss function is: $L_{\mathrm{adv}} = -\mathbb{E}\big[\log D(P_s)\big]$; where $D$ is the discriminator, $P_s$ is the crack probability map output by the student model, and $\mathbb{E}$ denotes the expectation over all pixels, used to align the student output with the teacher in spatial morphology, connectivity structure and class boundaries; the entropy minimization constraint loss function is: $L_{\mathrm{ent}} = -\sum_{i} p_i \log p_i$; where $i$ denotes the pixel position and $p_i$ is the prediction probability of the student model at pixel $i$, used so that the student model produces clear and continuous predictions in the target domain.
- 8. The pavement crack segmentation processing method based on domain adaptation according to claim 7, wherein in step (4), the specific logic of the continuity pseudo-label self-training is as follows: target-domain pseudo labels are generated by the teacher model, and low-quality pseudo labels are filtered out in combination with crack continuity detection, with the filtering condition: $\hat{Y}(i) = \mathbb{1}\big[P_t(i) > \tau_p \wedge C(i) > \tau_c\big]$; where $\hat{Y}$ is the high-confidence pseudo label after filtering, $P_t(i)$ is the probability output by the teacher model at pixel $i$, $\tau_p$ is the probability threshold, $C(i)$ is the crack continuity metric at pixel $i$, $\tau_c$ is the continuity score threshold, and "$\wedge$" denotes the logical AND operation; the student network model is trained with the high-confidence pseudo labels as additional supervision, and the training loss is: $L_{\mathrm{PL}} = \mathrm{CE}(P_s, \hat{Y})$; where $\mathrm{CE}$ denotes the cross-entropy loss function; in step (4), the feature statistics alignment adopts a MixStyle/BN statistics alignment mechanism, with the formula: $F_{\mathrm{mix}} = \lambda F + (1 - \lambda)\tilde{F}$; where $F$ is the feature map of the current domain, $\tilde{F}$ is the mixed feature map sampled from other domains, and $\lambda$ is a random mixing coefficient; in step (4), the multi-level knowledge distillation framework specifically includes: constraining the connectivity consistency of the student model along pixel-level paths through the loss function: $L_{\mathrm{conn}} = \sum_{(i, j) \in \mathcal{N}} \| P_i - P_j \|_1$; where $\mathcal{N}$ denotes the set of adjacent crack pixel pairs, $P$ is the crack probability map output by the student network model, $P_i$ and $P_j$ are the prediction probabilities of pixels $i$ and $j$ in an adjacent pair, and $\|\cdot\|_1$ is the L1 norm, used to measure the difference between the predictions of adjacent pixels; distilling the local direction field $d_t(x, y) = (\cos\theta, \sin\theta)$ of the teacher model into the direction field $d_s(x, y)$ of the student network model, and constraining the directional continuity of the student model through the loss function $L_{\mathrm{dir}} = \| d_t - d_s \|_2^2$, where $\theta$ is the local direction angle of the crack predicted by the teacher model at coordinates $(x, y)$, $\cos\theta$ and $\sin\theta$ are the horizontal and vertical components corresponding to the direction angle, $d_s$ is the direction field output by the student network model, and $\|\cdot\|_2^2$ is the squared L2 norm, used to measure the difference between the teacher and student direction fields and to avoid breaks and jagged gaps in the crack prediction; constraining the overall connectivity structure of the cracks output by the student network model with topological statistics on the number of connected branches and the distribution of endpoints; the training process of the multi-level knowledge distillation framework is driven by a weighted composite loss function: $L = \lambda_1 L_{\mathrm{KD}} + \lambda_2 L_{\mathrm{feat}} + \lambda_3 L_{\mathrm{struct}} + \lambda_4 L_{\mathrm{adv}} + \lambda_5 L_{\mathrm{PL}}$; where $\lambda_1$, $\lambda_2$, $\lambda_3$, $\lambda_4$, $\lambda_5$ are the weight coefficients of the respective loss terms, $L_{\mathrm{KD}}$ is the soft-label distillation loss, $L_{\mathrm{feat}}$ is the feature alignment loss, $L_{\mathrm{struct}}$ is the continuity structure distillation loss, $L_{\mathrm{adv}}$ is the adversarial loss, and $L_{\mathrm{PL}}$ is the pseudo-label self-training loss.
- 9. The pavement crack segmentation processing method based on domain adaptation according to claim 1, wherein in step (5), the progressive distillation training strategy specifically comprises: first, performing fully supervised pre-training on the reference student network defined by its global collapse coefficient; then, keeping the parameters of the teacher model frozen, performing distillation training separately on the student network model for each value of the global collapse coefficient; an AdamW optimizer is used during training, combined with linear learning-rate warmup and a cosine annealing schedule, mixed-precision training is enabled, dynamic gradient clipping is set, and the training metrics are monitored in real time through TensorBoard to ensure that the model fully converges; in step (6), the dynamic inference engine supports heterogeneous CPU/GPU/NPU hardware, and the quantization precision supports seamless dynamic switching from FP32 to INT8, where FP32 denotes 32-bit floating-point precision and INT8 denotes 8-bit integer precision; the dynamic inference engine integrates a runtime optimization mechanism that senses device load and power budget in real time and adjusts the execution strategy accordingly, and it provides a standardized incremental learning interface that supports streaming processing of newly acquired data and periodic mini-batch updates at the edge, so as to cope with data drift.
- 10. A pavement crack segmentation processing device based on domain adaptation, which adopts the pavement crack segmentation processing method based on domain adaptation according to any one of claims 1 to 9, characterized by comprising: a data acquisition and preprocessing module, configured to acquire crack images of designated scenes, perform multi-modal collaborative data preprocessing, and construct a mixed-domain sample library through a directional gradient enhancement operation; a teacher model construction module, configured to construct a crack teacher model with cross-scene generalization capability by fine-tuning the parameters of a general-purpose large vision model used as the base architecture, thereby refining the domain knowledge of crack segmentation; a student network construction module, configured to construct a student network model with an encoder-decoder architecture and introduce a structured parameter successive collapse mechanism, in which the channel numbers of the encoder and the decoder are uniformly and structurally scaled by a global collapse coefficient to generate a series of models with a consistent architecture and decreasing parameter counts; a knowledge transfer module, configured to construct a multi-level knowledge distillation framework comprising output-space structure alignment, continuity pseudo-label self-training and feature statistics alignment, and to transfer the domain knowledge and continuity prior knowledge of the crack teacher model to the student network model; a student network model training module, configured to train the student network model with a progressive distillation training strategy, using an AdamW optimizer combined with linear learning-rate warmup and a cosine annealing schedule, with mixed-precision training enabled and dynamic gradient clipping set; and an edge deployment module, configured to realize adaptive deployment of the model on edge devices through a dynamic inference engine that supports heterogeneous computing and precision switching.
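The directional gradient enhancement described in claim 2 can be illustrated with a minimal pure-Python sketch. It assumes the standard 3 x 3 Sobel kernels and a Euclidean gradient magnitude; the replicated-border handling, the function names and the toy image are illustrative choices, not details taken from the patent.

```python
import math

# Standard 3x3 Sobel kernels for the horizontal and vertical derivatives.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter3x3(image, kernel):
    """Slide a 3x3 kernel over the image, replicating border pixels.

    This is cross-correlation; true convolution flips the kernel, which only
    changes the sign of a Sobel response and leaves the magnitude unchanged.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy = min(max(y + ky - 1, 0), h - 1)
                    xx = min(max(x + kx - 1, 0), w - 1)
                    acc += kernel[ky][kx] * image[yy][xx]
            out[y][x] = acc
    return out


def gradient_magnitude(image):
    """Combine horizontal and vertical responses as sqrt(Gx^2 + Gy^2)."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


# Toy 5x5 image: bright pavement with a dark vertical "crack" in column 2.
image = [[200, 200, 50, 200, 200] for _ in range(5)]
magnitude = gradient_magnitude(image)
```

On this toy image the response peaks on the two columns flanking the dark crack and is zero elsewhere, which is the edge emphasis the claim attributes to this step.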
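The illumination perturbation and perspective transformation of claim 3 can be sketched as below. The clamping to the 0 to 255 range, the symbol name `offset` for the brightness offset, and the sample matrix are assumptions made for illustration only.

```python
def perturb_illumination(image, u, offset):
    """Apply u * I + offset per pixel; clamping to [0, 255] is an assumption."""
    return [[min(255.0, max(0.0, u * px + offset)) for px in row] for row in image]


def warp_point(matrix, x, y):
    """Map (x, y) through a 3x3 perspective matrix using homogeneous coordinates."""
    xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    return xh / w, yh / w  # divide by the homogeneous coefficient


brighter = perturb_illumination([[100, 200]], u=1.2, offset=10)

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Hypothetical matrix whose last row scales the homogeneous coefficient to 2.
shrink = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
same_point = warp_point(identity, 4, 6)
half_point = warp_point(shrink, 4, 6)
```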
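The LoRA update and attention computation referred to in claim 4 can be sketched with small matrices. This is a toy illustration, not the SAM image encoder: the sizes `d` and `r`, the zero initialization of the low-rank factor `B` (a common LoRA convention, not stated in the patent) and all values are assumptions.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def lora_merge(w, b, a):
    """Return W' = W + B A; W stays frozen and only B and A would be trained."""
    delta = matmul(b, a)
    return [[w[i][j] + delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]


def softmax(row):
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)


d, r = 4, 1  # hypothetical embedding width and LoRA rank
w_q = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
a_q = [[0.5, 0.0, 0.0, 0.0]]                   # r x d factor
b_q_init = [[0.0] for _ in range(d)]           # d x r factor, zero at start
merged_at_init = lora_merge(w_q, b_q_init, a_q)

b_q_trained = [[1.0], [0.0], [0.0], [0.0]]     # pretend training changed B
merged_trained = lora_merge(w_q, b_q_trained, a_q)

full_params, lora_params = d * d, 2 * d * r    # 16 frozen vs. 8 trainable

x = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # two toy tokens
q = matmul(x, merged_trained)
out = attention(q, x, [[1.0, 1.0], [1.0, 1.0]])
```

With `B` at zero the merged matrix equals the frozen one, so fine-tuning starts from the pretrained behaviour; the trainable parameter count grows with `2 * d * r` rather than `d * d`.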
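The structured collapse mechanism of claim 5 amounts to scaling every layer width by one coefficient. The sketch below uses the four coefficient values listed in the claim; the base channel widths, the rounding rule and the parameter-count estimate are hypothetical choices for illustration.

```python
COLLAPSE_COEFFICIENTS = (1.0, 0.75, 0.5, 0.35)  # values listed in claim 5


def scale_channels(base_channels, alpha):
    """Scale every layer width by alpha; rounding to an integer is an assumption."""
    return [max(1, round(c * alpha)) for c in base_channels]


def conv_params(channels, kernel=3):
    """Rough weight count of a chain of kernel x kernel convolutions (no biases)."""
    return sum(c_in * c_out * kernel * kernel
               for c_in, c_out in zip(channels, channels[1:]))


base = [32, 64, 128, 256]  # hypothetical encoder widths
family = {alpha: scale_channels(base, alpha) for alpha in COLLAPSE_COEFFICIENTS}
param_counts = {alpha: conv_params(ch) for alpha, ch in family.items()}
```

Because both the input and output widths of each convolution shrink, the weight count falls roughly with the square of the coefficient, which is why a single coefficient yields a family of progressively lighter models with the same layout.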
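Steps (31) to (33) of claim 6 can be sketched for a single-channel feature map. The patent does not say how the mask is normalized, so dividing by the maximum is an assumption here, and the lightweight convolutional attention of step (34) is omitted.

```python
def continuity_mask(feature):
    """Outer product of row-wise and column-wise averages, scaled to a peak of 1."""
    h, w = len(feature), len(feature[0])
    row_mean = [sum(row) / w for row in feature]                 # one value per row
    col_mean = [sum(feature[y][x] for y in range(h)) / h for x in range(w)]
    mask = [[r * c for c in col_mean] for r in row_mean]         # outer product
    peak = max(max(row) for row in mask)
    if peak > 0:  # max-normalization is an illustrative assumption
        mask = [[m / peak for m in row] for row in mask]
    return mask


def enhance(feature, mask):
    """Element-wise weighting of the feature map by the mask."""
    return [[f * m for f, m in zip(f_row, m_row)]
            for f_row, m_row in zip(feature, mask)]


# Toy 4x4 response: a vertical crack in column 1 with a gap in row 2.
feature = [
    [0.1, 1.0, 0.1, 0.1],
    [0.1, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 1.0, 0.1, 0.1],
]
mask = continuity_mask(feature)
enhanced = enhance(feature, mask)
```

In the gap row the mask still weights the crack column more heavily than the background columns, because the column average carries evidence from the rows above and below.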
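The soft-label distillation and entropy minimization terms of claim 7 can be sketched as follows. The temperature value is hypothetical, the entropy term is written in its binary form averaged over pixels (one common variant, chosen here as an assumption), and some implementations additionally multiply the KL term by the squared temperature, which this sketch does not do.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a smoother distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def binary_entropy(prob, eps=1e-7):
    p = min(max(prob, eps), 1 - eps)  # clip to keep the logarithms finite
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


def entropy_loss(prob_map):
    """Mean binary entropy of a crack probability map."""
    flat = [p for row in prob_map for p in row]
    return sum(binary_entropy(p) for p in flat) / len(flat)


matched = kd_loss([2.0, 0.0], [2.0, 0.0])
mismatched = kd_loss([2.0, 0.0], [0.0, 2.0])
confident = entropy_loss([[0.99, 0.01]])
hesitant = entropy_loss([[0.6, 0.4]])
```

The distillation term vanishes when the student reproduces the teacher's logits, and the entropy term is smaller for confident maps than for hesitant ones, which is the behaviour the claim relies on.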
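The pseudo-label filtering rule of claim 8 keeps a pixel only when both the teacher probability and a continuity score exceed their thresholds. The patent does not define the continuity metric in the text available here, so the sketch below substitutes a hypothetical one (the fraction of in-bounds neighbours that are also confident); the thresholds are likewise illustrative.

```python
def continuity_score(prob_map, y, x, prob_threshold):
    """Hypothetical metric: share of in-bounds 8-neighbours above the threshold."""
    h, w = len(prob_map), len(prob_map[0])
    neighbours = [(yy, xx)
                  for yy in (y - 1, y, y + 1)
                  for xx in (x - 1, x, x + 1)
                  if (yy, xx) != (y, x) and 0 <= yy < h and 0 <= xx < w]
    hits = sum(prob_map[yy][xx] > prob_threshold for yy, xx in neighbours)
    return hits / len(neighbours)


def filter_pseudo_labels(prob_map, prob_threshold=0.8, continuity_threshold=0.15):
    """Keep a pixel only if it is confident AND sits on a continuous structure."""
    return [[1 if (p > prob_threshold and
                   continuity_score(prob_map, y, x, prob_threshold) > continuity_threshold)
             else 0
             for x, p in enumerate(row)]
            for y, row in enumerate(prob_map)]


# Teacher output: a vertical crack in column 1 plus an isolated false alarm at (0, 3).
teacher_probs = [
    [0.1, 0.9, 0.1, 0.95],
    [0.1, 0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1, 0.1],
]
pseudo_labels = filter_pseudo_labels(teacher_probs)
```

The isolated high-probability pixel is rejected even though it passes the probability test, while the connected column survives; the surviving labels would then supervise the student through a cross-entropy loss.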
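The learning-rate policy named in claims 1 and 9 (linear warmup followed by cosine annealing) can be sketched as a plain function of the step index. The step counts, base rate and floor are hypothetical; the patent gives no numeric values for them.

```python
import math


def learning_rate(step, total_steps, warmup_steps, base_lr, min_lr=0.0):
    """Linear warmup to base_lr, then cosine decay towards min_lr."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


# Hypothetical schedule: 100 steps, 10 of them warmup, peak rate 1e-3.
schedule = [learning_rate(s, total_steps=100, warmup_steps=10, base_lr=1e-3)
            for s in range(101)]
```

In a training loop each value would be assigned to the optimizer before the corresponding update; the warmup avoids large early steps, and the cosine tail lets the student settle as distillation converges.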
Description
Pavement crack segmentation processing method and device based on domain self-adaption
Technical Field
The invention belongs to the technical field of pavement crack processing, and particularly relates to a pavement crack segmentation processing method and device based on domain adaptation.
Background
Pavement cracks are the main manifestation of early road distress, and their accurate segmentation is key to road maintenance and safety early warning. Existing crack segmentation techniques rely mainly on deep convolutional neural networks or general-purpose large vision models, but have the following problems.
First, differences in pavement material, illumination conditions and shooting angle shift the distribution of image features, so the segmentation accuracy of a model trained on a single scene drops sharply in a new scene.
Second, high-accuracy models have large parameter counts and heavy computation and cannot be deployed on compute-limited devices such as vehicle-mounted terminals, while lightweight models, because of their simplified feature modeling, tend to produce fragmented, broken and discontinuous crack predictions.
Third, traditional distillation focuses only on knowledge transfer at the output layer and does not make full use of the "slender and connected" geometry of cracks, so the core prior knowledge of the teacher model cannot be transferred effectively to the student model.
Fourth, existing models lack capabilities such as dynamic precision switching and incremental learning, are difficult to adapt to edge devices with different computing power, and cope poorly with data drift.
Therefore, a technical solution that combines cross-domain generalization, lightweight deployment and crack segmentation continuity is needed to meet the engineering requirements of real-time road inspection.
Disclosure of Invention
The invention therefore provides a pavement crack segmentation processing method and device based on domain adaptation, which solve or partially solve the problems mentioned in the background. To achieve this purpose, the invention provides the following technical solution. In a first aspect, a pavement crack segmentation processing method based on domain adaptation is provided, comprising the following steps:
(1) acquiring crack images of designated scenes, performing multi-modal collaborative data preprocessing, and constructing a mixed-domain sample library through a directional gradient enhancement operation;
(2) constructing a crack teacher model with cross-scene generalization capability by fine-tuning the parameters of a general-purpose large vision model used as the base architecture, thereby refining the domain knowledge of crack segmentation;
(3) constructing a student network model with an encoder-decoder architecture and introducing a structured parameter successive collapse mechanism, in which the channel numbers of the encoder and the decoder are uniformly and structurally scaled by a global collapse coefficient to generate a series of models with a consistent architecture and decreasing parameter counts;
(4) constructing a multi-level knowledge distillation framework comprising output-space structure alignment, continuity pseudo-label self-training and feature statistics alignment, and transferring the domain knowledge and continuity prior knowledge of the crack teacher model to the student network model;
(5) training the student network model with a progressive distillation training strategy, using an AdamW optimizer combined with linear learning-rate warmup and a cosine annealing schedule, with mixed-precision training enabled and dynamic gradient clipping set;
(6) realizing adaptive deployment of the model on edge devices through a dynamic inference engine that supports heterogeneous computing and precision switching.
In step (1), the multi-modal collaborative data preprocessing comprises adaptive equalization, directional gradient enhancement, illumination perturbation and perspective transformation, where the mapping of the adaptive equalization is: $I_{\mathrm{eq}}(x, y) = \mathrm{CDF}_{w \times h}\big(I(x, y)\big)$. Here $\mathrm{CDF}_{w \times h}$ denotes the cumulative distribution function of the local window, $I(x, y)$ denotes the gray value of the input image at coordinates $(x, y)$, and $w$ and $h$ denote the length and width of the local window, which control the statistical range of the local histogram.
As a preferred scheme of the pavement crack segmentation processing method based on domain adaptation, in step (1), the specific logic of the directional gradient enhancement is as follows: the adaptively equalized image is convolved with the horizontal and vertical Sobel kernels $S_x$ and $S_y$ to obtain the horizontal gradient $G_x$ and the vertical gradient $G_y$. Through agai