CN-121982526-A - Fully automatic method for identifying and classifying rail surface damage based on a multimodal large model

Abstract

The invention relates to the technical field of intelligent inspection of railway infrastructure, and in particular to a fully automatic method for identifying and classifying rail surface damage based on a multimodal large model. The method comprises: synchronously collecting raw data in three modalities within the same time window during rail operation; extracting features from the three-modality raw data and outputting multimodal features; fusing the multimodal features element by element according to a fusion formula to generate a fused feature tensor; performing self-attention-based large-model inference with the fused feature tensor as input and outputting class confidences and severity scores; calculating an evolution rate from historical data and calculating a risk score according to a risk-scoring formula; outputting maintenance advice; and sending the detection result, the risk assessment, and the maintenance advice to an operation and maintenance system or a display terminal.

Inventors

LIU HONGMEI
MENG SIMING
CHEN WEIXUN
JIANG YUELONG
QI MIAO
YU QIANZI

Assignees

Guangzhou Railway Polytechnic (Guangzhou Railway Machinery School)

Dates

Publication Date: 2026-05-05
Application Date: 2026-01-20

Claims (10)

1. A fully automatic method for identifying and classifying rail surface damage based on a multimodal large model, characterized by comprising the following steps: synchronously collecting raw data in three modalities within the same time window during rail operation, namely visual image data, laser point cloud data, and a sound/vibration time-series signal; extracting features from the three-modality raw data and outputting multimodal features; fusing the multimodal features element by element according to a fusion formula to generate a fused feature tensor; performing self-attention-based large-model inference with the fused feature tensor as input and outputting class confidences and severity scores; calculating an evolution rate from historical data, calculating a risk score according to a risk-scoring formula, and outputting maintenance advice; and sending the detection result, the risk assessment, and the maintenance advice to an operation and maintenance system or a display terminal.
2. The fully automatic method for identifying and classifying rail surface damage based on a multimodal large model according to claim 1, wherein extracting features from the three-modality raw data and outputting multimodal features comprises: for the visual image data, detecting target candidate regions using an object detection network, extracting an image feature tensor from the convolutional features of a selected layer, and averaging and normalizing it along the channel dimension to obtain the image feature tensor; for the laser point cloud data, extracting three-dimensional geometric features and mapping them to two-dimensional/three-dimensional laser point cloud features of the same scale, the laser point cloud features including but not limited to normal vector estimation, curvature/relief calculation, and voxel-based or deep-learning-based local geometric descriptors, and being mapped by interpolation or pooling to a common spatial resolution; for the sound/vibration time-series signal, performing event detection to obtain, for each detected event, its time window, its peak amplitude, and the corresponding class probability vector; and mapping each detected event into the image/geometric feature space by spatio-temporal quantization, wherein, given the total time window, a time mapping index is defined for each event and a sound feature tensor is constructed from it, each element of the sound feature tensor being defined over the set of rows associated with each detected event.
3. The fully automatic method for identifying and classifying rail surface damage based on a multimodal large model according to claim 2, wherein fusing the multimodal features element by element according to the fusion formula to generate the fused feature tensor comprises: constructing a position mask tensor in which, for each candidate target detected by YOLO, the region inside its bounding box is set to 1 in the corresponding channel and all remaining positions are set to 0; and pre-fusing, element by element, the image feature tensor, the sound feature tensor, and the position mask tensor to obtain the multimodal fused feature tensor, wherein the fusion formula uses an all-ones tensor of matching dimensions and a symbol denoting element-wise multiplication.
4. The fully automatic method for identifying and classifying rail surface damage based on a multimodal large model according to claim 3, wherein performing self-attention-based large-model inference on the input and outputting class confidences and severity scores comprises: inputting the fused feature tensor into a large model based on a self-attention mechanism, the large model outputting the class confidence and the initial damage severity score of each candidate target.
5. The fully automatic method for identifying and classifying rail surface damage based on a multimodal large model according to claim 4, wherein calculating the evolution rate from historical data, calculating the risk score according to the risk-scoring formula, and outputting maintenance advice comprises: calculating a damage evolution rate estimate and a risk score based on historical inspection data, prior damage detection records for the location, and vehicle load/operation information, wherein the risk-scoring formula includes an operation exposure factor related to the track position and weight coefficients that are predetermined or obtained through data-driven learning; dividing the detected damage into different risk levels according to the risk score and a preset threshold set, and outputting maintenance priority suggestions; wherein the risk score defines the maintenance level by comparison against successive threshold conditions: under the first condition the damage is judged to require "emergency maintenance"; under the second, "high-priority maintenance"; under the third, "medium priority"; and in all other cases, "routine monitoring".
6. The fully automatic method for identifying and classifying rail surface damage based on a multimodal large model according to claim 5, wherein, for the visual image data, detecting target candidate regions using an object detection network, extracting the image feature tensor from the convolutional features of the selected layer, and averaging and normalizing along the channel dimension to obtain the image feature tensor comprises: obtaining the image feature tensor by arithmetically averaging the extracted convolutional features over the channel dimension, normalizing the result to a fixed interval, and replicating it a set number of times.
7. The fully automatic method for identifying and classifying rail surface damage based on a multimodal large model according to claim 6, wherein inputting the fused feature tensor into the large model based on a self-attention mechanism and outputting the class confidence and the initial damage severity score of each candidate target comprises a dynamic weight adaptation mechanism of the large model with the following sub-steps: estimating online the quality/environment indices of the input modalities to obtain a visual environment quality index, a laser point cloud quality index, and a sound quality index, the visual environment quality index comprising brightness, contrast, and occlusion/rain-and-snow detection indicators, the laser point cloud quality index comprising point cloud density and reflectivity anomalies, and the sound quality index comprising the signal-to-noise ratio (SNR) relative to background noise and impact prominence; calculating a confidence weight for each modality based on the quality indices and the model's historical prediction confidence, the confidence weights satisfying the stated constraints; the confidence weights being obtainable from a data-driven or regularized expression that includes a trainable or preset scale parameter; integrating the weights into the upstream fusion or into an attention module inside the large model; further combining the image feature tensor, the laser point cloud features, and the sound feature tensor by weighting; and then substituting the weighted combination for the corresponding term in the fusion formula.
8. The fully automatic method for identifying and classifying rail surface damage based on a multimodal large model according to claim 7, wherein calculating the damage evolution rate estimate and the risk score based on historical inspection data, prior damage detection records for the location, and vehicle load/operation information comprises: obtaining the damage evolution rate estimate by regression fitting of a historical discrete-time observation sequence, the regression fitting comprising linear model fitting and exponential model fitting; when the regression fit is a linear model fit, taking the estimated evolution rate from the fitted linear model; and when the regression fit is an exponential model fit, taking the estimated evolution rate from the fitted exponential model.
9. The fully automatic method for identifying and classifying rail surface damage based on a multimodal large model according to claim 8, wherein performing event detection on the sound/vibration time-series signal to obtain, for each detected event, its time window, its peak amplitude, and the corresponding class probability vector comprises: generating the peak of each detected event as a random peak according to a formula whose parameter interval can be set per category, the generated peak being used to construct the associated feature and the probability vector.
10. The fully automatic method for identifying and classifying rail surface damage based on a multimodal large model according to claim 9, wherein, after dividing the detected damage into different risk levels according to the risk score and the preset threshold set and outputting maintenance priority suggestions, the method further comprises: outputting, for each candidate target, final information that includes at least the target category, the category confidence, the associated computed scores, the location coordinates on the track, the suggested maintenance priority, and the recommended maintenance time window.
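The mask construction and element-wise pre-fusion described in claim 3 can be sketched as follows. The fusion formula itself is not reproduced in this text, so the form used here, image × (1 + sound) × mask, is an assumption chosen only because the claim mentions an all-ones tensor and element-wise multiplication. The function names `box_mask` and `fuse` and the box layout (row0, col0, row1, col1) are hypothetical.

```python
def box_mask(num_targets, height, width, boxes):
    """Build a K x H x W mask: 1 inside candidate k's box in channel k, else 0.

    boxes[k] = (row0, col0, row1, col1), end-exclusive (hypothetical layout).
    """
    mask = [[[0.0] * width for _ in range(height)] for _ in range(num_targets)]
    for k, (r0, c0, r1, c1) in enumerate(boxes):
        for r in range(max(r0, 0), min(r1, height)):
            for c in range(max(c0, 0), min(c1, width)):
                mask[k][r][c] = 1.0
    return mask


def fuse(image, sound, mask):
    """Element-wise fusion of three K x H x W tensors.

    ASSUMED form: image * (1 + sound) * mask. The patent's actual formula
    is not legible in this text; this is an illustration only.
    """
    return [
        [
            [i * (1.0 + s) * m for i, s, m in zip(img_row, snd_row, msk_row)]
            for img_row, snd_row, msk_row in zip(img_ch, snd_ch, msk_ch)
        ]
        for img_ch, snd_ch, msk_ch in zip(image, sound, mask)
    ]


mask = box_mask(1, 2, 3, [(0, 1, 2, 3)])
image = [[[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]]
sound = [[[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]]
fused = fuse(image, sound, mask)
```

Under this assumed form, positions outside every bounding box are zeroed, and positions that coincide with a detected sound event are amplified.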
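The risk scoring and maintenance grading of claim 5 can be illustrated with a minimal sketch. The scoring formula and the threshold values are not reproduced in this text, so the weighted sum, the weights, the thresholds, and the use of "greater than or equal" comparisons below are all assumptions; `risk_score` and `maintenance_level` are hypothetical names.

```python
def risk_score(severity, evolution_rate, exposure, weights=(0.5, 0.3, 0.2)):
    """ASSUMED linear form: a*severity + b*evolution_rate + c*exposure.

    The weights stand in for the claim's predetermined or learned
    coefficients; the values here are placeholders.
    """
    a, b, c = weights
    return a * severity + b * evolution_rate + c * exposure


def maintenance_level(score, thresholds=(0.8, 0.6, 0.4)):
    """Map a risk score to one of the four levels named in claim 5.

    The threshold values are hypothetical placeholders.
    """
    t_emergency, t_high, t_medium = thresholds
    if score >= t_emergency:
        return "emergency maintenance"
    if score >= t_high:
        return "high-priority maintenance"
    if score >= t_medium:
        return "medium priority"
    return "routine monitoring"


score = risk_score(severity=0.9, evolution_rate=0.8, exposure=1.0)
level = maintenance_level(score)
```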
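The three operations named in claim 6 (channel-wise arithmetic mean, normalization, replication) can be sketched in plain Python. The normalization interval and the replication count are not legible in this text; the sketch assumes min-max scaling to [0, 1] and takes the copy count as a parameter. The function name `channel_mean_normalize` is hypothetical.

```python
def channel_mean_normalize(features, copies):
    """features: C channels, each an H x W nested list of floats.

    Returns `copies` identical H x W maps: the channel mean, min-max
    normalized to [0, 1] (assumed interval).
    """
    num_channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    # Arithmetic mean over the channel dimension gives one H x W map.
    mean_map = [
        [sum(ch[r][c] for ch in features) / num_channels for c in range(width)]
        for r in range(height)
    ]
    lo = min(min(row) for row in mean_map)
    hi = max(max(row) for row in mean_map)
    span = hi - lo
    # A constant map has zero span; it is mapped to all zeros.
    norm = [[(v - lo) / span if span > 0 else 0.0 for v in row] for row in mean_map]
    # Replicate the normalized map along a new leading channel axis.
    return [[row[:] for row in norm] for _ in range(copies)]


feats = [
    [[0.0, 2.0], [4.0, 6.0]],
    [[2.0, 4.0], [6.0, 8.0]],
]
out = channel_mean_normalize(feats, copies=3)
```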
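The dynamic weighting of claim 7 can be sketched as below. The weight expression and its constraints are not reproduced in this text. The sketch assumes a softmax over the per-modality quality indices with a scale parameter, which yields non-negative weights that sum to 1, and then forms a weighted sum of the three feature vectors. The names `modality_weights` and `weighted_sum`, and the use of flat lists for features, are hypothetical simplifications.

```python
import math


def modality_weights(quality, scale=1.0):
    """quality: dict modality -> quality index. Returns softmax weights.

    ASSUMPTION: softmax with a preset scale; the patent's expression is
    not legible in this text.
    """
    peak = max(quality.values())  # subtract the max for numerical stability
    exps = {m: math.exp(scale * (q - peak)) for m, q in quality.items()}
    total = sum(exps.values())
    return {m: e / total for m, e in exps.items()}


def weighted_sum(features, weights):
    """features: dict modality -> flat list (equal lengths)."""
    length = len(next(iter(features.values())))
    return [sum(weights[m] * features[m][i] for m in features) for i in range(length)]


w = modality_weights({"image": 0.9, "laser": 0.5, "sound": 0.5}, scale=2.0)
combined = weighted_sum(
    {"image": [1.0, 2.0], "laser": [0.0, 4.0], "sound": [4.0, 0.0]},
    {"image": 0.5, "laser": 0.25, "sound": 0.25},
)
```

With this choice, a modality whose quality index drops (for example, images in heavy rain) receives a smaller share of the combined feature.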
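The two regression options in claim 8 can be illustrated with ordinary least squares. The closed-form expressions in the claim are not reproduced in this text, so the sketch assumes that the linear rate is the fitted slope and that the exponential rate is the slope of a linear fit to the logarithm of the observations. `linear_rate` and `exponential_rate` are hypothetical names.

```python
import math


def linear_rate(times, values):
    """Least-squares slope of values against times (assumed linear rate)."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def exponential_rate(times, values):
    """Growth rate k of s(t) = s0 * exp(k * t), fitted on log(values).

    Requires strictly positive observations.
    """
    return linear_rate(times, [math.log(v) for v in values])


times = [0.0, 1.0, 2.0, 3.0]
lin = linear_rate(times, [1.0, 3.0, 5.0, 7.0])
exp_rate = exponential_rate(times, [2.0 * math.exp(0.3 * t) for t in times])
```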
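The category-dependent random peak of claim 9 can be sketched as a draw from a per-category interval. The generating formula is not reproduced in this text, so a uniform distribution is assumed. The category names and interval bounds in `PEAK_INTERVALS` are hypothetical placeholders.

```python
import random

# Hypothetical per-category parameter intervals (low, high).
PEAK_INTERVALS = {
    "crack": (0.6, 1.0),
    "spalling": (0.4, 0.8),
    "corrugation": (0.2, 0.6),
}


def sample_peak(category, rng):
    """Draw a peak amplitude uniformly from the category's interval.

    ASSUMPTION: uniform sampling; the patent's formula is not legible here.
    """
    low, high = PEAK_INTERVALS[category]
    return rng.uniform(low, high)


rng = random.Random(0)  # seeded for reproducibility
peak = sample_peak("crack", rng)
```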

Description

Fully automatic method for identifying and classifying rail surface damage based on a multimodal large model

Technical Field

The invention relates to the technical field of intelligent inspection of railway infrastructure, and in particular to a fully automatic method for identifying and classifying rail surface damage based on a multimodal large model.

Background

Steel rails are key load-bearing components of a rail transit system. Surface damage such as cracks, spalling, corrugation, and crushing directly affects operational safety and the service life of the track structure. At present, railway departments rely mainly on camera and laser measurement systems mounted on locomotives or inspection vehicles, together with manual inspection, to monitor rail condition. However, existing rail damage identification technology still has the following main shortcomings.

Single-modality detection lacks robustness. Rail inspection systems usually identify damage using a single modality, either image-based vision or laser geometric scanning. Images are highly susceptible to the external environment, such as strong reflections, rain and snow cover, oil contamination, insufficient illumination at night, and dust raised by trains, so image detection accuracy drops sharply. Laser point clouds in high-speed operation are likewise prone to data loss caused by sparsity, occlusion, or uneven reflection. Single-modality detection therefore has poor robustness in complex line scenarios and struggles to cover the various damage types and operating conditions.
Multimodal data has been introduced, but the fusion approach is limited. Existing inspection systems have begun to combine images and point clouds, but they generally rely on simple feature concatenation, weighted averaging, or rule-based decisions. The fusion is shallow and cannot effectively exploit the correlations between modalities. For example, images provide texture variation while point clouds provide geometry, but problems with synchronization, scale matching, and differing noise characteristics between the two often lead to poor fusion. Meanwhile, sound/vibration data is typically used for overall monitoring of track condition and is not effectively integrated into the damage recognition model.

Environmental change is not incorporated into the model's adaptive mechanism. The quality of multimodal data varies markedly across environments (such as rain, snow, oil contamination, dark tunnels, or highly reflective sections), but traditional algorithms usually set fixed weights or thresholds and cannot adjust the recognition strategy automatically according to the environmental state. Because the weights cannot be adjusted dynamically among the modalities, the final recognition result fluctuates widely in complex scenes and can include false detections or missed detections.

Damage development trends are not analyzed, so no risk grading or prediction is provided. Most rail damage detection systems remain at the stage of "identifying the current damage" and output only the damage type or position. Yet rail damage evolves in characteristic ways; for example, crack propagation speed varies markedly with traffic volume, axle load, and accumulated fatigue.
Without a prediction of how fast damage develops, current systems cannot grade risk or assign maintenance priority, and cannot meet the requirements of intelligent operation and maintenance for "predictive maintenance" and "on-demand maintenance".

A large-model integration scheme suitable for multi-source data is lacking. Deep learning models perform well in visual recognition, but most are trained for a single modality and cannot process heterogeneous multimodal data such as images, point clouds, and sound within a unified architecture. Existing algorithms also tend to build a separate model for each modality and then perform rule-based late fusion, which makes the overall system complex, insufficiently real-time, and weak in transferability.

In summary, the prior art cannot simultaneously meet the following requirements: accurately identifying rail damage by jointly using images, point clouds, and sound; automatically adjusting modality weights under different environmental conditions to improve recognition robustness; predicting the development trend of damage and providing a quantifiable risk assessment; and using a unified large-model architecture to achieve deep fusion and end-to-e