CN-122024743-A - Power transformer fault voiceprint recognition method based on double-branch attention mechanism


Abstract

The invention discloses a power transformer fault voiceprint recognition method based on a dual-branch attention mechanism. First, each frame of the acquired transformer audio signal is converted into Mel-frequency cepstral coefficients (MFCCs) to obtain the voiceprint characteristics of the transformer under different working conditions. Second, the MFCC feature matrix is reduced in dimension according to the energy contribution rate, and 10 consecutive frames of dimension-reduced MFCCs from the same sample form a reduced MFCC matrix that serves as the input to the subsequent recognition model. Then, to reduce the sensitivity of the recognition model to environmental noise, an improved CNN model is proposed: a dual-branch attention mechanism module is added to the CNN, and a spectral attention mechanism and a channel attention mechanism adjust the frequency-domain and channel weights, effectively improving the model's expression of global and local effective features. Experimental results show that the fault voiceprint recognition model based on the dual-branch attention mechanism can effectively overcome interference from environmental background noise and, compared with the plain CNN model, improves the fault recognition accuracy by about 5.27%.

Inventors

  • Shen Mingwei
  • Jiao Caiying
  • Xing Haiyun

Assignees

  • Hohai University (河海大学)

Dates

Publication Date
2026-05-12
Application Date
2024-11-12

Claims (9)

  1. A power transformer fault voiceprint recognition method based on a dual-branch attention mechanism, characterized by comprising the following steps: S1, voiceprint feature extraction: preprocess the transformer audio data set and extract the Mel-frequency cepstral coefficient (MFCC) feature matrix of the transformer audio under different working conditions; S2, feature dimension reduction: obtain the ratio of each MFCC dimension's energy to the total energy across all dimensions, and reduce the dimension of the feature matrix according to the energy contribution rate so as to reduce the number of model training parameters; S3, improved model construction: improve the CNN model by proposing a fault voiceprint recognition model, Channel-spectra-CNN (CS-CNN), based on a dual-branch attention mechanism; divide the preprocessed data set into a training set and a validation set, input them into the improved model, adjust the model parameters, and train the model until convergence; S4, voiceprint recognition testing: input a test set acquired and processed in real time into the trained recognition model based on the dual-branch attention mechanism, realizing recognition of the transformer's different working conditions in a complex environment.
  2. The method for identifying the fault voiceprint of the power transformer based on the dual-branch attention mechanism according to claim 1, wherein in step S1 the original transformer audio is preprocessed and the Mel-frequency cepstral coefficient feature matrix is extracted, specifically: S1.1, the data set consists of transformer audio under M different working conditions, with N audio samples per condition and a sampling frequency of 48 kHz; the acquired audio signals are first divided into frames of length 8192 with an overlap rate of 50% (to avoid too large a span between adjacent frames), so each frame corresponds to about 170 ms; the N audio signals under each condition are divided into frames {x: x_1, x_2, ..., x_i, ..., x_l}, where l is the number of frames and x_i is the i-th frame signal; a Hamming window is selected as the window function to window each frame sequence, and a fast Fourier transform is applied to each windowed frame to obtain the frequency-domain signal X_i(k) over the k frequency components; S1.2, the frequency-domain signal X_i(k) is filtered by a bank of triangular band-pass filters H_n(k), defined as H_n(k) = (k − f(n−1)) / (f(n) − f(n−1)) for f(n−1) ≤ k ≤ f(n), H_n(k) = (f(n+1) − k) / (f(n+1) − f(n)) for f(n) < k ≤ f(n+1), and H_n(k) = 0 otherwise, where f(n) is the center frequency of the n-th filter; the logarithm of the filtering result is taken: s(n) = ln( Σ_k |X_i(k)|² · H_n(k) ), 0 ≤ n < D, where D = 40 is the number of Mel filters; S1.3, a discrete cosine transform (DCT) is applied to s(n) to obtain the Mel cepstral coefficients: C(j) = Σ_{n=1}^{D} s(n) · cos( π·j·(n − 0.5) / D ); stacking the obtained Mel cepstral coefficients in time order and taking 10 frames as one sample yields, under the different working conditions, m Mel cepstral coefficient matrices {y: y_1, y_2, ..., y_m} of size 10×40×1, with m = l/10.
  3. The method for identifying a fault voiceprint of a power transformer based on a dual-branch attention mechanism according to claim 1, wherein in step S2 the MFCC feature matrix is subjected to dimension reduction: based on the energy contribution rate, the weight of each dimension's energy within the total energy of all dimensions is calculated for each frame signal, with ω_n the percentage energy contribution rate of the n-th dimension: ω_n = f_n² / ( Σ_{j=1}^{k} f_j² ) × 100%, where f_n is the n-th-dimension MFCC value and k is the MFCC dimensionality; since the energy of the first 6 dimensions of the Mel cepstral coefficients accounts for about 95% of the total energy and the remaining dimensions contain almost no voiceprint information, the first 6 MFCC dimensions are screened by ω_n as the effective features, yielding a 10×6×1 MFCC feature matrix {z: z_1, z_2, ..., z_m} as the input under the different working conditions for model training and recognition.
  4. The method for identifying the fault voiceprint of the power transformer based on the dual-branch attention mechanism according to claim 1, wherein in step S3 the preprocessed data set is divided: the samples z obtained after feature extraction and dimension reduction of the audio sequences are randomly divided 8:2 into two subsets z_train and z_validation, each feature sample z corresponding to one fault-type label; z_train is the training set used for model training, and z_validation is the validation set used for hyperparameter adjustment and model selection.
  5. The method for identifying the fault voiceprint of the power transformer based on the dual-branch attention mechanism according to claim 1, wherein in step S3 the transformer fault recognition model CS-CNN based on the dual-branch attention mechanism consists of 3 convolution modules, 1 downsampling layer, 3 dual-branch attention mechanism modules, 3 dropout layers, 1 flattening layer and 2 fully connected layers: S3.1, each convolution module consists of a convolution layer and an activation layer; the convolution kernel size is 3×3, the stride is 1, and "same" zero padding is adopted; the activation layer uses Leaky ReLU as the activation function, which speeds up the model and, compared with ReLU, alleviates the dying-ReLU problem for negative inputs; S3.2, the downsampling layer follows the convolution module and uses max pooling, compressing the feature size to 5×3×8 while retaining texture features and removing redundant information; S3.3, the dual-branch attention mechanism module comes next: the first branch uses a spectral attention module to assign weight coefficients to the different dimensions, and the second branch is a channel attention module that assigns different weights to the channels by learning the global information of each feature layer; the two branches run in parallel and their results are summed and output; S3.4, the dropout layer follows the dual-branch attention mechanism module and prevents the overfitting caused by the small data volume; the model stacks the convolution module and the dual-branch attention mechanism module three times to extract features, performs three dropout operations, then flattens the feature map into a vector through the flattening layer, and finally outputs the classification result through the two fully connected layers and a softmax classifier.
  6. The method for identifying the fault voiceprint of the power transformer based on the dual-branch attention mechanism according to claim 5, wherein the spectral attention module in step S3.3 captures the importance of different frequencies in the MFCC sequence: the input feature is U ∈ R^(T×F×C), where T is the time dimension, F the frequency dimension and C the number of channels; the input is passed through a 1×1 convolution to obtain the cross-channel spectral feature V_T ∈ R^(T×F×1); a global max pooling operation is performed along the frequency direction to extract the important texture information W_T ∈ R^(1×F×1) of the feature map; finally, the input feature matrix U is multiplied by the activated matrix W_T to obtain the output feature map U_F.
  7. The method for identifying a fault voiceprint of a power transformer based on a dual-branch attention mechanism according to claim 5, wherein the channel attention module in step S3.3 has the input feature U ∈ R^(T×F×C), where T is the time dimension, F the frequency dimension and C the number of channels; global average pooling is performed on the feature U_c ∈ R^(T×F) of each channel: S = (1/(T·F)) · Σ_{t=1}^{T} Σ_{f=1}^{F} U_c(t, f), giving the information S ∈ R^(1×1) of each channel; stacking the channels yields the feature S_C ∈ R^(1×1×C); the weight coefficient T_C ∈ R^(1×1×C) of each channel is then obtained through two fully connected layers, the first using the ReLU function and the second the Sigmoid function as activation; and the input feature U is multiplied by the weight coefficient T_C to obtain the output matrix U_C.
  8. The method for identifying the fault voiceprint of the power transformer based on the dual-branch attention mechanism according to claim 1, characterized in that in step S3 the divided data set is input into the improved model using a TensorFlow-based deep learning framework; the model parameters are adjusted, with the batch size set to 16 and the number of epochs set to 50; comparison models are set up and training is carried out until convergence; and, to verify the effectiveness of the model, accuracy (Acc) is adopted as the evaluation index.
  9. The method for identifying the fault voiceprint of the power transformer based on the dual-branch attention mechanism according to claim 1, wherein in step S4 the transformer fault type is identified from the MFCC features: the test set collected in real time undergoes feature extraction and dimension reduction and is then input into the trained transformer fault recognition model, which outputs the label corresponding to each sample to obtain the fault type.
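The frame/window/FFT/Mel-filter/DCT pipeline of step S1 (claim 2) can be sketched in NumPy. This is an illustrative reimplementation under stated assumptions, not the claimed implementation: the helper names, the simplified placement of the filter-bank bins, and the unnormalized type-II DCT are all choices made here for clarity.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters H_n(k) spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for n in range(1, n_filters + 1):
        lo, c, hi = bins[n - 1], bins[n], bins[n + 1]
        for k in range(lo, c):       # rising edge of the triangle
            fb[n - 1, k] = (k - lo) / max(c - lo, 1)
        for k in range(c, hi):       # falling edge of the triangle
            fb[n - 1, k] = (hi - k) / max(hi - c, 1)
    return fb

def mfcc_frames(signal, sr=48000, frame_len=8192, hop=4096, n_mels=40):
    """Frame -> Hamming window -> FFT -> mel filter -> log -> DCT, per claim 2."""
    fb = mel_filterbank(n_mels, frame_len, sr)
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop   # 50% overlap when hop = frame_len/2
    feats = []
    n = np.arange(n_mels)
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2       # |X_i(k)|^2
        logmel = np.log(fb @ power + 1e-10)           # s(n), D = 40 filters
        cep = np.array([(logmel * np.cos(np.pi * j * (n + 0.5) / n_mels)).sum()
                        for j in range(n_mels)])      # type-II DCT
        feats.append(cep)
    return np.stack(feats)                            # shape (n_frames, 40)
```

One second of 48 kHz audio gives exactly 10 frames here, i.e. one 10×40 sample in the claim's terms.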
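The energy-contribution screening of step S2 (claim 3) amounts to ranking MFCC dimensions by their share of total energy and keeping 6 of them. In the claim the first 6 dimensions are kept because they are observed to carry about 95% of the energy; the sketch below instead selects the top 6 by measured contribution, and the function name and toy data are assumptions.

```python
import numpy as np

def energy_contribution(mfcc, keep=6):
    """mfcc: (frames, dims). Returns kept dimension indices and omega_n in percent."""
    energy = np.sum(mfcc ** 2, axis=0)        # f_n^2 summed over frames, per dimension
    omega = energy / energy.sum() * 100.0     # contribution rate omega_n (percent)
    order = np.argsort(omega)[::-1][:keep]    # dimensions with the largest share
    return np.sort(order), omega

# Toy 10-frame, 40-dimension sample whose energy falls off with dimension index.
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((10, 40)) * np.linspace(3.0, 0.1, 40)
dims, omega = energy_contribution(mfcc)
reduced = mfcc[:, dims]                       # one 10x6 reduced input sample
```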
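The 8:2 division of claim 4 is a plain random partition of samples and their labels; a minimal sketch (the function name and fixed seed are illustrative, not from the claim):

```python
import numpy as np

def split_8_2(samples, labels, seed=0):
    """Randomly split into z_train / z_validation at an 8:2 ratio."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))       # shuffle sample indices
    cut = int(0.8 * len(samples))
    tr, va = idx[:cut], idx[cut:]
    return (samples[tr], labels[tr]), (samples[va], labels[va])

z = np.random.randn(100, 10, 6, 1)            # 100 reduced MFCC samples
y = np.random.randint(0, 5, size=100)         # one working-condition label per sample
(z_train, y_train), (z_val, y_val) = split_8_2(z, y)
```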
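The spectral attention branch of claim 6 can be approximated in NumPy: a 1×1 convolution is a per-position weighted sum over channels, max pooling collapses the time axis to leave one value per frequency, and a sigmoid-activated weight rescales every frequency bin of U. The fixed averaging weights below stand in for the learned 1×1 convolution parameters and are an assumption.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spectral_attention(U, w=None):
    """U: (T, F, C) -> U_F: (T, F, C), reweighted per frequency bin."""
    T, F, C = U.shape
    if w is None:
        w = np.full(C, 1.0 / C)        # stand-in for learned 1x1 conv weights
    V = U @ w                          # (T, F): cross-channel spectral feature V_T
    W = sigmoid(V.max(axis=0))         # (F,): activated per-frequency weights W_T
    return U * W[None, :, None]        # multiply U by W_T, broadcast over T and C

U = np.random.randn(10, 6, 8)          # toy feature map: T=10, F=6, C=8
U_F = spectral_attention(U)
```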
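The channel attention branch of claim 7 is a squeeze-and-excitation-style gate: global average pooling per channel, two fully connected layers (ReLU then Sigmoid), and a per-channel rescale of the input. In this sketch the randomly initialized weight matrices stand in for learned parameters, and the reduction ratio r is an assumption.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(U, W1, W2):
    """U: (T, F, C) -> U_C: (T, F, C), rescaled by per-channel weights T_C."""
    s = U.mean(axis=(0, 1))            # S_C: global average pool, shape (C,)
    h = np.maximum(W1 @ s, 0.0)        # first FC layer with ReLU
    t = sigmoid(W2 @ h)                # T_C: per-channel weights in (0, 1)
    return U * t[None, None, :]        # multiply U by T_C, broadcast over T and F

C, r = 8, 2                            # channel count and reduction ratio (assumed)
rng = np.random.default_rng(0)
U = rng.standard_normal((10, 6, C))
W1 = rng.standard_normal((C // r, C))  # stand-ins for learned FC weights
W2 = rng.standard_normal((C, C // r))
U_C = channel_attention(U, W1, W2)
```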

Description

Power transformer fault voiceprint recognition method based on double-branch attention mechanism

Technical Field

The invention relates to voiceprint recognition technology, in particular to a power transformer fault voiceprint recognition method based on a dual-branch attention mechanism.

Background

With the development of power systems, power transformers play an important role in daily life and industrial production. Transformer faults not only damage the equipment itself but can also cause large-scale power failures, fires and other serious consequences. Quickly and accurately identifying transformer faults is therefore of great significance for ensuring the stable operation of equipment. Conventional transformer fault identification mainly relies on the physical characteristics of the light, heat and vibration signals caused by faults, using methods such as infrared temperature measurement, the pulse current method and vibration analysis. However, these methods have poor anti-interference capability and require inspection personnel to directly contact the equipment, which affects its normal operation. The transformer audio signal, by contrast, contains rich voiceprint information, and different fault types have unique voiceprint characteristics. Since voiceprint recognition does not require contact with the equipment, it greatly improves detection safety and makes measurement more convenient and rapid. Voiceprint recognition research mainly covers voiceprint feature extraction and model construction. In the field of feature extraction, researchers at home and abroad have proposed methods such as the short-time Fourier transform (STFT), the wavelet transform and Mel-frequency cepstral coefficients (MFCC).
These methods convert the sound signal into a two-dimensional matrix containing the time- and frequency-domain information of the voiceprint signal. Regarding model construction, the commonly used voiceprint recognition models include Hidden Markov Models (HMM), Support Vector Machines (SVM) and Convolutional Neural Networks (CNN), which have important reference value for transformer fault recognition. However, conventional machine learning methods require manual parameter tuning to obtain optimal performance, which demands extensive experimental experience, and CNNs generalize poorly and are limited when recognizing in complex environments. In actual transformer operating scenarios, audio acquisition is disturbed by environmental factors such as birdsong and whistling. Improving the recognition model to reduce its sensitivity to environmental noise is therefore of great significance for raising the fault recognition rate of power transformers in complex environments.

Disclosure of the Invention

To solve the problems of the prior art, the invention provides a transformer audio fault identification method based on a dual-branch attention mechanism. Feature dimension reduction according to the energy contribution rate effectively reduces the number of model training parameters and improves training efficiency; the dual-branch attention mechanism improves the anti-interference capability of the model, so that the transformer fault type can be effectively identified in a complex noise environment.
To achieve the above purpose, the invention adopts the following technical scheme. A transformer fault identification method based on an attention mechanism comprises the following steps: S1, voiceprint feature extraction: preprocess the transformer audio data set and extract the Mel-frequency cepstral coefficient feature matrix of the transformer audio under different working conditions; S2, feature dimension reduction: obtain the ratio of each MFCC dimension's energy to the total energy across all dimensions, and reduce the dimension of the feature matrix according to the energy contribution rate so as to reduce the number of model training parameters; S3, improved model construction: improve the CNN model by proposing a fault voiceprint recognition model, Channel-spectra-CNN (CS-CNN), based on a dual-branch attention mechanism; divide the preprocessed data set into a training set and a validation set, input them into the improved model, adjust the model parameters, and train the model until convergence; S4, voiceprint recognition testing: input a test set acquired and processed in real time into the trained recognition model based on the dual-branch attention mechanism, realizing recognition of the transformer's different working conditions in a complex environment.