CN-120336825-A - Bearing voiceprint fault detection and diagnosis method for multi-motor operation site


Abstract

The invention discloses a bearing voiceprint fault detection and diagnosis method for a multi-motor operation site, belonging to the field of motor bearing voiceprint fault detection and diagnosis. The method first decomposes the recorded voiceprint signal by variational mode decomposition (VMD) to construct components centered on different frequencies; second, it screens the components using energy entropy, removing ambient sound components and load abnormal sound components, and reconstructs the voiceprint signal; third, it decomposes the reconstructed signal by number of sound sources using an improved Conv-tasnet network and outputs the decomposed signals; it then extracts features from each decomposed signal using an Fbank filter bank and applies adaptive soft-thresholding weighting to the features using an EMAtrs network; finally, it performs grouped weighted convolution on the differently weighted features to obtain features usable for classification and determines the working condition of the equipment. The invention better separates and extracts the aliased sound features of multiple motors, thereby improving the accuracy of fault monitoring and diagnosis.
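
The energy-entropy screening step described in the abstract can be illustrated with a minimal Python sketch. It assumes the mode components have already been produced by a decomposition such as VMD (the decomposition itself is not implemented here) and uses the common definition of energy entropy, H = -Σ p_i ln p_i, where p_i is each component's share of the total energy. The selection rule (keep components whose energy share is at least a hypothetical `min_share`) and all function names are assumptions made for illustration; the text above does not state the actual screening criterion.

```python
import numpy as np

def energy_entropy(components):
    """Return each component's energy share p_i and H = -sum(p_i * ln p_i)."""
    comps = np.asarray(components, dtype=float)
    energies = np.sum(comps ** 2, axis=1)       # energy of each component
    shares = energies / energies.sum()          # normalised energy shares
    positive = shares[shares > 0]               # skip zero shares (0 * ln 0 := 0)
    entropy = -np.sum(positive * np.log(positive))
    return shares, entropy

def screen_and_reconstruct(components, min_share=0.05):
    """Keep components with energy share >= min_share and sum them.

    The threshold rule is a hypothetical stand-in for the screening
    criterion, which is not specified in the source text.
    """
    comps = np.asarray(components, dtype=float)
    shares, _ = energy_entropy(comps)
    keep = shares >= min_share
    return comps[keep].sum(axis=0), keep

# Synthetic stand-ins for decomposed components (not real VMD output).
t = np.arange(1000) / 1000.0
modes = np.stack([
    1.00 * np.sin(2 * np.pi * 50 * t),
    0.80 * np.sin(2 * np.pi * 120 * t),
    0.05 * np.sin(2 * np.pi * 300 * t),   # weak component, screened out
])
shares, entropy = energy_entropy(modes)
reconstructed, keep = screen_and_reconstruct(modes, min_share=0.05)
```

In this toy input the third component carries well under 1% of the total energy, so it is dropped and the reconstructed signal is the sum of the first two components.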

Inventors

ZHANG WENMING
TIAN ZHENJIANG
ZHANG XINGJIAN
LI YAQIAN
SONG TAO

Assignees

UNIV YANSHAN

Dates

Publication Date: 20250718
Application Date: 20250422
Priority Date: 20250422

Claims (8)

1. A bearing voiceprint fault detection and diagnosis method for a multi-motor operation site, characterized by comprising the following steps:
S1, inputting a voiceprint signal recorded by acquisition equipment, and decomposing the recorded voiceprint signal by variational mode decomposition (VMD) to construct components centered on different frequencies;
S2, screening the components using energy entropy, removing ambient sound components and load abnormal sound components, and reconstructing the voiceprint signal;
S3, performing source number decomposition on the reconstructed signal using an improved Conv-tasnet network, and outputting decomposed signals;
S4, extracting features from each decomposed signal using an Fbank filter bank;
S5, performing adaptive soft-thresholding weighting on the features using an EMAtrs network;
S6, performing grouped weighted convolution on the differently weighted features using a SplitCAM block to obtain features usable for classification;
S7, comparing the features of different faults with the features of each sound source to determine the working condition of the motor bearing.
2. The method for detecting and diagnosing bearing voiceprint faults on a multi-motor operation site according to claim 1, wherein the main structure of the Conv-tasnet network in step S3 comprises an encoding layer, a temporal convolutional network layer with an improved convolution order, and a decoding layer; the specific improvement in the temporal convolutional network layer is that the serial convolution in which one group of dilated convolutions is connected in order of increasing dilation rate is changed to a multi-scale serial convolution comprising several groups with different maximum dilation rates, and within each group the dilated convolutions are connected with dilation rates running from small to large and back to small.
3. The method for detecting and diagnosing bearing voiceprint faults on a multi-motor operation site according to claim 1, wherein the Fbank filter bank in step S4 processes the input signal through, in order, pre-emphasis, Hamming windowing, Fourier transform, power spectrum calculation, and Mel filter bank filtering, thereby extracting features from each decomposed signal.
4. The method for detecting and diagnosing bearing voiceprint faults on a multi-motor operation site as claimed in claim 3, wherein pre-emphasis attenuates the low-frequency components of the signal, improves the high-frequency resolution, and improves the signal-to-noise ratio, the pre-emphasis formula being:
$$y(t) = x(t) - \alpha\, x(t-1)$$
where $\alpha$ is the pre-emphasis coefficient, typically 0.95 or 0.97, and $x(t)$ and $x(t-1)$ are the signal values at the current and previous sample points respectively; the operation is essentially a high-pass filter;
the Hamming window is applied as:
$$x_w(n) = y(n) \cdot w(n), \qquad w(n) = (1 - a) - a \cos\!\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1$$
where $x_w(n)$ is the windowed signal, $y(n)$ is the original signal, $w(n)$ is the value of the Hamming window function, $N$ is the window length, and $a$ is taken as 0.46;
the signal is converted from the time domain to the frequency domain so that the audio signal can be examined through its energy distribution over the spectrum, the FFT being calculated as:
$$S(k) = \sum_{n=0}^{N-1} x_w(n)\, e^{-j 2\pi k n / N}$$
where $S(k)$ is the frequency-domain signal, and $k$ and $n$ are the frequency-domain and time-domain indices respectively;
the power spectrum is calculated as:
$$P(k) = \frac{|S(k)|^2}{N}$$
finally, Mel filtering is performed, the conversion between Mel frequency $m$ and linear frequency $f$ being:
$$m = 2595 \log_{10}\!\left(1 + \frac{f}{700}\right)$$
the Mel filter bank is a series of triangular filters whose frequency response is determined by center frequency and bandwidth; the response of each Mel triangular filter is 1 at its center frequency and decreases linearly to 0 on both sides, the specific formula being:
$$H_m(k) = \begin{cases} 0, & k < f(m-1) \\ \dfrac{k - f(m-1)}{f(m) - f(m-1)}, & f(m-1) \le k \le f(m) \\ \dfrac{f(m+1) - k}{f(m+1) - f(m)}, & f(m) < k \le f(m+1) \\ 0, & k > f(m+1) \end{cases}$$
where $f(m)$ is the center frequency of the $m$-th filter;
the energy spectrum is multiplied by the frequency response of each filter, accumulated, and the logarithm is taken to obtain $E_m$, namely the Fbank features of the audio data:
$$E_m = \ln\!\left(\sum_{k} P(k)\, H_m(k)\right)$$
5. The method for detecting and diagnosing bearing voiceprint faults on a multi-motor operation site according to claim 1, wherein in step S5 the EMAtrs network performs adaptive soft-thresholding weighting on the features on the basis of an ordinary ResNet network; its specific structure comprises a trunk structure, a Channel Gate module, and an EMA module; the trunk structure is formed by two groups of convolution structures connected in series, each group comprising a 3×3 convolution layer, a batch normalization layer, and a ReLU activation function layer, with the trunk input and the trunk output joined by a skip connection; the Channel Gate module is located between the second group of convolution structures and the trunk output, its main structure being, in order, a global average pooling layer, a fully connected layer, a Sigmoid activation function, and a weighting layer; the EMA module occupies a position similar to that of the Channel Gate module, its main structure being, in order, an absolute value processing layer, a horizontal pooling layer, a vertical pooling layer, a feature concatenation and convolution layer, and a cross-space learning layer.
6. The method for detecting and diagnosing bearing voiceprint faults on a multi-motor operation site according to claim 5, wherein the fully connected layer in the Channel Gate module maps the global average value to a new space to generate a channel weight $W$, and the weighting layer multiplies the channel weight with the original feature map to realize adaptive weighting, the calculation formula being:
$$F_{weighted} = F \otimes \sigma(W)$$
where $F$ is the network feature map, $\otimes$ denotes element-by-element multiplication, $W$ is the channel weight, and $\sigma(W)$ is the Sigmoid activation function;
the EMA module enhances feature expression capability through multi-scale feature recombination, the absolute value processing layer being calculated as:
$$X_{abs} = |X|$$
where $X$ is a feature vector;
the horizontal and vertical pooling layers pool the output of the absolute value processing layer in the horizontal and vertical directions respectively, calculated as:
$$X_{h\text{-}pool} = \mathrm{Pool}(X_{abs},\ \mathrm{direction} = \mathrm{horizontal})$$
$$X_{v\text{-}pool} = \mathrm{Pool}(X_{abs},\ \mathrm{direction} = \mathrm{vertical})$$
where $\mathrm{Pool}(\cdot)$ is a pooling function;
the feature concatenation and convolution layer concatenates the horizontally and vertically pooled features and recombines them through a 1×1 convolution, calculated as:
$$X_{conv} = \mathrm{Conv}(\mathrm{Concat}(X_{h\text{-}pool},\ X_{v\text{-}pool}))$$
where $\mathrm{Concat}(\cdot)$ is a feature concatenation function and $\mathrm{Conv}(\cdot)$ is a 1×1 convolution;
the cross-space learning layer combines $X_{conv}$ with the trunk 3×3 convolution feature map to realize cross-space feature interaction and obtain a recombined feature map, calculated as:
$$F_{final} = F_{conv3\times3} + X_{conv}$$
where $F_{conv3\times3}$ is the trunk output;
finally, the skip connection of the backbone network adds the input feature map, the weighted feature map, and the recombined feature map to obtain the complete adaptive soft thresholding, calculated as:
$$F_{output} = F_{input} + F_{weighted} + F_{final}$$
7. The method for detecting and diagnosing bearing voiceprint faults on a multi-motor operation site according to claim 1, wherein in step S6 the SplitCAM block splits the original input features into groups, each group uses a separate convolution kernel for feature extraction, and a CAM attention mechanism corrects the weight vectors of the split weights through a dense connection operation and a softmax operation respectively and then combines the features by distributed weighting, which can be expressed as: where $S_{j,i}$ represents the attention weight, $\sigma$ is the activation function, $W_z$ is the learnable weight, $b_z$ is the bias, and $y_i$ is the output feature.
8. The method for detecting and diagnosing bearing voiceprint faults on a multi-motor operation site according to claim 1, wherein the network is trained using a supervised learning strategy so that, through the above network structure, it learns the features of seven working conditions: normal motor bearing, outer ring fracture, inner ring fracture, rolling element fracture, insufficient oil, coal cinder intrusion, and assembly misalignment; during production and use, the unknown on-site working condition is processed through steps S1 to S6 to separate and extract its working condition features, these features are compared with the learned features, and the working condition label with the highest similarity is output for each sound source, completing the monitoring of the bearing working condition.
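
The Fbank processing chain named in claims 3 and 4 (pre-emphasis, Hamming windowing, Fourier transform, power spectrum, Mel filter bank, logarithm) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions rather than the patented implementation: the frame length, hop size, FFT size, number of filters, and 16 kHz sample rate are hypothetical defaults; the Mel conversion uses the widely used 2595·log10(1 + f/700) form; the power spectrum is normalised by the FFT size; and the natural logarithm is used. Only the pre-emphasis coefficient (0.95 or 0.97) and the Hamming coefficient 0.46 come from the claim text.

```python
import numpy as np

def hz_to_mel(f):
    # Common Mel conversion (assumed; the claim text does not show the formula).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters: response 1 at the centre bin, falling linearly to 0."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, centre):            # rising edge
            fbank[m - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):           # peak and falling edge
            fbank[m - 1, k] = (right - k) / (right - centre)
    return fbank

def fbank_features(signal, sample_rate, frame_len=400, hop=160, n_fft=512,
                   n_filters=26, alpha=0.97, a=0.46):
    """Log Mel filter-bank energies, one row per frame (defaults are assumptions)."""
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis: y(t) = x(t) - alpha * x(t-1)
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # Split into overlapping frames (signal must be at least frame_len long).
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx]
    # Hamming window: w(n) = (1 - a) - a * cos(2*pi*n / (N - 1)), a = 0.46
    n = np.arange(frame_len)
    window = (1.0 - a) - a * np.cos(2.0 * np.pi * n / (frame_len - 1))
    # FFT and power spectrum (1/n_fft normalisation is an assumption).
    spectrum = np.fft.rfft(frames * window, n=n_fft, axis=1)
    power = np.abs(spectrum) ** 2 / n_fft
    # Weight by each triangular filter, accumulate, and take the logarithm.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return np.log(np.maximum(energies, np.finfo(float).eps))

# Usage on one second of synthetic noise at an assumed 16 kHz sample rate.
rng = np.random.default_rng(0)
features = fbank_features(rng.standard_normal(16000), 16000)
```

Each row of `features` holds the log filter-bank energies of one frame; in the claimed method, feature maps of this kind would be computed per decomposed signal before the weighting stage of step S5.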

Description

Bearing voiceprint fault detection and diagnosis method for multi-motor operation site

Technical Field

The invention belongs to the field of voiceprint fault detection and diagnosis of motor bearings, and particularly relates to voiceprint fault detection and diagnosis at a multi-motor operation site.

Background

Industrial equipment operates under varied and complex working conditions, and its stable operation directly affects working efficiency and safety. The motor is a widely used component of industrial equipment; once it fails, the equipment stops, production is delayed, and safety accidents may even result, so motor fault detection and prediction are extremely important. Motor faults are mainly classified into electrical faults and mechanical faults. Electrical faults are easily detected as circuit faults, whereas mechanical faults occur inside the motor's protective housing and usually require a specific detection method. The bearing is the area of the motor most prone to mechanical failure, with a mechanical failure occurrence rate of up to 40%. For motor bearing fault diagnosis, the most common method is detection based on vibration signals. Compared with vibration signal detection, voiceprint feature detection requires no contact and, at production sites that already have monitoring equipment, offers convenient low-cost online monitoring. Shan S proposed the Mel-CNN (Mel Convolutional Neural Network) model for fault diagnosis, which processes motor noise signals with variational mode decomposition and combines Mel spectrum feature extraction with a deep CNN model. Luo Y et al. proposed an enhanced feature extraction network that uses convolution filters of different sizes in combination with the context-aware masking (CAM) attention mechanism.
Wang H et al. proposed CAM++, a fast and efficient network for speaker verification using context-aware masking, which takes a densely connected time delay neural network (D-TDNN) as its core architecture, strengthens the feature extraction capability of the model through dense connections, and adopts strategies such as context-aware masking and multi-granularity pooling, so that the efficiency of fault diagnosis is greatly improved. However, the above solutions remain far from deployable at industrial sites. The difficulty of industrial voiceprint recognition is that field production conditions are more complex, including environmental noise, human voices, and interference from multiple devices on site. In addition, differences in recording distance make the voiceprint signals recorded by the acquisition equipment difficult to unify.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a bearing voiceprint fault detection and diagnosis method for a multi-motor operation site. It provides an audio source separation and full-distance fault diagnosis algorithm for production sites with multiple devices: the mixed voiceprints of several production devices can be separated, and the method is suitable for different recording distances. The method therefore has strong adaptability, reduces monitoring cost, improves the efficiency of fault detection and diagnosis, and meets the requirements of production sites.
The technical scheme adopted by the invention is a bearing voiceprint fault detection and diagnosis method for a multi-motor operation site that uses an improved Conv-tasnet (Convolutional Time-domain Audio Separation Network) network and an improved CAM++ network and comprises the following steps:
S1, inputting a voiceprint signal recorded by acquisition equipment, and decomposing the recorded voiceprint signal by variational mode decomposition (VMD) to construct components centered on different frequencies;
S2, screening the components using energy entropy, removing ambient sound components and load abnormal sound components, and reconstructing the voiceprint signal;
S3, performing source number decomposition on the reconstructed signal using the improved Conv-tasnet network, and outputting decomposed signals;
S4, extracting features from each decomposed signal using an Fbank filter bank;
S5, performing adaptive soft-thresholding weighting on the features using an EMAtrs network;
S6, performing grouped weighted convolution on the differently weighted features using a SplitCAM block to obtain features usable for classification;
S7, comparing the features of different faults with the features of each sound source to determine the working condition of the motor bearing.
The technical scheme of the invention is further improved in that the step S3 improves the connection mode of the time domain convolution layers in the Conv-tasnet net