CN-121983087-A - Method and system for collecting voiceprint signals of transformer main equipment

CN121983087A

Abstract

The invention discloses a method and a system for collecting voiceprint signals of transformer main equipment. The method comprises the following steps: multichannel original voiceprint signals are synchronously acquired through a voiceprint sensing array, and phase correction is performed using spatial azimuth codes to align the signals. Voiceprint envelope features are then extracted and matched against standard templates to obtain operating-condition matching degrees; dominant channels are screened according to the matching degrees and fused by weighting to generate an enhanced voiceprint signal. Finally, the enhanced signal is separated, through time-frequency domain joint decomposition, into intrinsic mode components reflecting the vibration characteristics of different parts. The method can improve the signal-to-noise ratio of the voiceprint signal, clearly separate the vibration characteristics of the parts inside the equipment, and provide a more accurate data basis for state monitoring and fault diagnosis of the transformer main equipment.

Inventors

HAN SHUAI
ZHANG BOWEN
LI LIHUA
SHAO MENGYU
WANG YANDA
LIAO SIZHUO
ZHAO XIN
GAO FEI
YANG NING
WANG KUANGYAN
YANG YANG
CHEN MEI
SHANG WENTONG

Assignees

Beijing Disheng Technology Co., Ltd. (北京谛声科技有限责任公司)
China Electric Power Research Institute Co., Ltd. (中国电力科学研究院有限公司)

Dates

Publication Date: 20260505
Application Date: 20260402

Claims (10)

1. A voiceprint signal acquisition method for transformer main equipment, characterized by comprising the following steps: acquiring a multichannel original voiceprint signal set synchronously collected, under a plurality of preset operating conditions of the transformer main equipment, by a voiceprint sensing array arranged at different spatial azimuths on the housing surface, wherein each channel signal in the multichannel original voiceprint signal set carries corresponding spatial azimuth coding information and timestamp information of the acquisition time; performing, on the multichannel original voiceprint signal set, phase correction processing based on signal propagation path differences, generating phase compensation factors from the phase offsets that arise between channel signals corresponding to different spatial azimuth coding information due to differences in acoustic arrival time, and performing a phase alignment operation on each channel signal according to the phase compensation factors to generate a spatially synchronized multichannel voiceprint signal set; extracting the voiceprint signal envelope features of each channel signal in the spatially synchronized multichannel voiceprint signal set, and performing dynamic time warping matching between the voiceprint signal envelope features and pre-stored standard voiceprint envelope templates for each operating condition of the transformer main equipment to obtain a current operating condition matching degree sequence corresponding to each channel signal; performing channel screening and weighted fusion processing on the spatially synchronized multichannel voiceprint signal set based on the current operating condition matching degree sequence, screening out dominant channel signals whose matching degree is higher than a preset dynamic threshold, assigning to each dominant channel signal a fusion weight coefficient that is dynamically adjusted with the matching degree, and generating an enhanced voiceprint signal that fuses multi-spatial-azimuth information; and performing time-frequency domain joint decomposition processing on the enhanced voiceprint signal, decomposing the enhanced voiceprint signal into a plurality of intrinsic mode components comprising an iron core excitation vibration frequency component, a winding load vibration frequency component and a cooling device mechanical vibration frequency component, classifying and combining the plurality of intrinsic mode components according to their respective frequency ranges, and generating a target separated voiceprint signal acquisition result reflecting the vibration characteristics of different parts inside the transformer main equipment.
2. The method of claim 1, wherein acquiring the multichannel original voiceprint signal set synchronously collected, under a plurality of preset operating conditions of the transformer main equipment, by the voiceprint sensing array arranged at different spatial azimuths on the housing surface comprises: receiving the multichannel original voiceprint signals synchronously collected by the voiceprint sensing array under the no-load operating condition, the load operating condition and the independent operating condition of the cooling device, respectively; assigning a corresponding spatial azimuth code to the voiceprint signal of each channel under each operating condition, and associating with it timestamp information of the acquisition start and stop moments; arranging the multichannel voiceprint signals under each operating condition in the order of their spatial azimuth codes to form a signal set for each operating condition; and associating and combining the signal sets according to their operating condition identifiers to generate the multichannel original voiceprint signal set, wherein each signal unit in the multichannel original voiceprint signal set has a unique operating condition label, spatial azimuth coding information and timestamp information.
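The labelling and grouping described in claim 2 can be sketched as a small data structure. This is only an illustration: the names `SignalUnit` and `build_signal_set`, and the field layout, are hypothetical and do not come from the patent text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalUnit:
    """One channel recording (hypothetical layout, not from the patent)."""
    condition: str       # operating condition label, e.g. "no_load"
    azimuth_code: int    # spatial azimuth code of the sensor
    start_time: float    # acquisition start timestamp, seconds
    stop_time: float     # acquisition stop timestamp, seconds
    samples: tuple       # raw voiceprint samples


def build_signal_set(units):
    """Group units by operating condition, ordered by azimuth code."""
    by_condition = defaultdict(list)
    for unit in units:
        by_condition[unit.condition].append(unit)
    return {
        condition: sorted(group, key=lambda u: u.azimuth_code)
        for condition, group in by_condition.items()
    }
```

Keeping the condition label, azimuth code and timestamps on every unit lets later steps select "same condition, same timestamp" groups without a separate index.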
3. The method according to claim 1, wherein performing, on the multichannel original voiceprint signal set, phase correction processing based on signal propagation path differences, generating phase compensation factors from the phase offsets that arise between channel signals corresponding to different spatial azimuth coding information due to differences in acoustic arrival time, and performing a phase alignment operation on each channel signal according to the phase compensation factors to generate a spatially synchronized multichannel voiceprint signal set comprises: extracting, from the multichannel original voiceprint signal set, a plurality of channel signals having the same operating condition label and the same acquisition timestamp information to form a synchronous acquisition signal group, taking the channel signal located at a preset center reference position in the synchronous acquisition signal group as the phase reference, and calculating the acoustic arrival time difference of each other channel signal in the group relative to the phase reference; determining the phase offset of each channel signal according to the acoustic arrival time difference, and generating a corresponding phase compensation factor; performing a phase alignment operation on each channel signal in the synchronous acquisition signal group using the phase compensation factors to obtain a spatially synchronized multichannel voiceprint signal subset; and executing the phase correction processing on all synchronous acquisition signal groups in the multichannel original voiceprint signal set, and combining the spatially synchronized multichannel voiceprint signal subsets in their original time and operating condition order to generate the complete spatially synchronized multichannel voiceprint signal set.
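The phase correction of claim 3 can be illustrated with a minimal sketch. It assumes details the patent does not specify: the arrival time difference is estimated as the integer-sample lag that maximizes the cross-correlation with the reference channel, and the phase compensation factor is applied per DFT bin (a circular shift). A naive DFT is used to keep the example dependency-free; all function names are hypothetical.

```python
import cmath
import math


def estimate_delay(ref, sig, max_lag):
    """Integer lag (samples) maximizing the cross-correlation of sig with ref.

    A positive result means sig arrives later than ref.
    """
    n = len(ref)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(ref[i] * sig[i + lag] for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def dft(x):
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * i / n)
                for k in range(n)).real / n for i in range(n)]


def compensate(sig, delay):
    """Advance sig by `delay` samples using per-bin phase compensation factors."""
    n = len(sig)
    factors = [cmath.exp(2j * math.pi * k * delay / n) for k in range(n)]
    return idft([s * f for s, f in zip(dft(sig), factors)])


def align_group(channels, reference, max_lag=8):
    """Align every channel of one synchronous group to the reference channel."""
    ref = channels[reference]
    return {name: compensate(sig, estimate_delay(ref, sig, max_lag))
            for name, sig in channels.items()}
```

Applying the compensation in the frequency domain makes the "phase compensation factor" explicit; for integer lags it is equivalent to a circular shift in the time domain.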
4. The method of claim 1, wherein extracting the voiceprint signal envelope features of each channel signal in the spatially synchronized multichannel voiceprint signal set, and performing dynamic time warping matching between the voiceprint signal envelope features and pre-stored standard voiceprint envelope templates for each operating condition of the transformer main equipment to obtain a current operating condition matching degree sequence corresponding to each channel signal comprises: extracting the voiceprint signal envelope features of each channel signal in the spatially synchronized multichannel voiceprint signal set; performing a dynamic time warping matching operation between the voiceprint signal envelope features of each channel signal and the pre-stored standard voiceprint envelope templates for the no-load operating condition, the load operating condition and the independent operating condition of the cooling device, respectively, to obtain an accumulated distance metric value with respect to each standard template; determining a preliminary matching operating condition and a matching confidence score for each channel signal according to the accumulated distance metric values; and generating the current operating condition matching degree sequence corresponding to each channel signal based on the preliminary matching operating condition and the matching confidence score.
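A minimal sketch of the envelope extraction and dynamic time warping matching in claim 4 follows. Several choices are assumptions, since the patent does not fix them: the envelope is approximated by rectification plus a moving average, the local DTW cost is the absolute difference, and the confidence score is a normalized inverse distance. Function names are hypothetical.

```python
def envelope(signal, window=5):
    """Rectify the signal and smooth it with a centered moving average."""
    rectified = [abs(v) for v in signal]
    half = window // 2
    smoothed = []
    for i in range(len(rectified)):
        segment = rectified[max(0, i - half): i + half + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def dtw_distance(a, b):
    """Accumulated DTW distance between two sequences (absolute-difference cost)."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[-1][-1]


def match_conditions(env, templates):
    """Return (best condition, {condition: confidence}) for one channel envelope."""
    distances = {name: dtw_distance(env, tpl) for name, tpl in templates.items()}
    inverse = {name: 1.0 / (1.0 + d) for name, d in distances.items()}
    total = sum(inverse.values())
    scores = {name: value / total for name, value in inverse.items()}
    return max(scores, key=scores.get), scores
```

The per-condition scores returned by `match_conditions` play the role of the matching degree sequence for one channel: a smaller accumulated distance gives a larger score.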
5. The method according to claim 1, wherein performing channel screening and weighted fusion processing on the spatially synchronized multichannel voiceprint signal set based on the current operating condition matching degree sequence, screening out dominant channel signals whose matching degree is higher than a preset dynamic threshold, assigning to each dominant channel signal a fusion weight coefficient that is dynamically adjusted with the matching degree, and generating an enhanced voiceprint signal that fuses multi-spatial-azimuth information comprises: determining, according to the current operating condition matching degree sequence, the corresponding preset dynamic threshold for each of the no-load matching degree index, the load matching degree index and the cooling device matching degree index; comparing each matching degree index of each channel signal in the spatially synchronized multichannel voiceprint signal set with the corresponding preset dynamic threshold, and screening out the channel signals having at least one matching degree index higher than the corresponding threshold to form a dominant channel signal set; calculating an initial fusion weight coefficient for each channel signal in the dominant channel signal set according to the sum of its matching degree indexes and its spatial azimuth coding information; correcting the initial fusion weight coefficients based on the azimuth correlation reflected by the spatial azimuth coding information, and normalizing the corrected weight coefficients to obtain a final normalized fusion weight coefficient for each channel signal; and performing a weighted summation of all channel signals in the dominant channel signal set using the normalized fusion weight coefficients to generate the enhanced voiceprint signal that fuses multi-spatial-azimuth information.
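The screening and fusion of claim 5 can be sketched as below. The patent does not say how the dynamic thresholds or the azimuth correction are computed, so this example assumes the threshold for each condition is the mean matching degree across channels and takes the azimuth correction as a caller-supplied gain per channel (default 1.0). `fuse_channels` and `azimuth_gain` are hypothetical names.

```python
def fuse_channels(signals, match, azimuth_gain=None):
    """Screen dominant channels and fuse them by normalized weights.

    signals: {channel: [samples]}
    match:   {channel: {condition: matching degree}}
    azimuth_gain: optional {channel: correction factor} (assumed form)
    """
    conditions = list(next(iter(match.values())))
    # Assumed threshold rule: mean matching degree of each condition.
    thresholds = {c: sum(m[c] for m in match.values()) / len(match)
                  for c in conditions}
    # Keep channels with at least one index above its threshold.
    dominant = [ch for ch, m in match.items()
                if any(m[c] > thresholds[c] for c in conditions)]
    if not dominant:
        raise ValueError("no channel exceeds any threshold")
    azimuth_gain = azimuth_gain or {}
    # Initial weight: sum of indexes, corrected by the azimuth gain.
    raw = {ch: sum(match[ch].values()) * azimuth_gain.get(ch, 1.0)
           for ch in dominant}
    total = sum(raw.values())
    weights = {ch: w / total for ch, w in raw.items()}
    length = len(signals[dominant[0]])
    fused = [sum(weights[ch] * signals[ch][i] for ch in dominant)
             for i in range(length)]
    return fused, weights
```

Normalizing the weights so they sum to one keeps the fused signal on the same amplitude scale as the individual channels.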
6. The method according to claim 1, wherein performing time-frequency domain joint decomposition processing on the enhanced voiceprint signal, decomposing the enhanced voiceprint signal into a plurality of intrinsic mode components comprising an iron core excitation vibration frequency component, a winding load vibration frequency component and a cooling device mechanical vibration frequency component, classifying and combining the plurality of intrinsic mode components according to their respective frequency ranges, and generating a target separated voiceprint signal acquisition result reflecting the vibration characteristics of different parts inside the transformer main equipment comprises: inputting the enhanced voiceprint signal into a signal processing module implementing an empirical mode decomposition algorithm, performing iterative sifting on the enhanced voiceprint signal through the empirical mode decomposition algorithm, and sequentially extracting from the enhanced voiceprint signal a plurality of intrinsic mode function components ordered from high frequency to low frequency, wherein each intrinsic mode function component represents a local characteristic oscillation mode of the enhanced voiceprint signal at a different time scale; after each intrinsic mode function component is extracted, calculating the correlation coefficient between that component and the original enhanced voiceprint signal, stopping the iterative sifting when the correlation coefficient falls below a preset decomposition termination threshold, and forming an initial intrinsic mode component set of the enhanced voiceprint signal from all intrinsic mode function components extracted up to that point together with the residual signal component remaining after the last iteration; performing a Hilbert transform on each intrinsic mode function component in the initial intrinsic mode component set to obtain an instantaneous frequency parameter and an instantaneous amplitude parameter of each intrinsic mode function component, and plotting the variation of the instantaneous frequency parameter over time as the instantaneous frequency curve of that component; scanning the instantaneous frequency curves of all intrinsic mode function components in the initial intrinsic mode component set against a preset fundamental frequency range of the iron core excitation vibration of the transformer main equipment, and identifying the intrinsic mode function components whose instantaneous frequency curves remain within that preset fundamental frequency range along the entire time axis as intrinsic mode components dominated by iron core excitation vibration; scanning the instantaneous frequency curves of the remaining intrinsic mode function components in the initial intrinsic mode component set against a preset characteristic frequency range of the winding load vibration of the transformer main equipment, and identifying the intrinsic mode function components whose instantaneous frequency curves contain, along the time axis, frequency components matching that preset characteristic frequency range as intrinsic mode components dominated by winding load vibration; scanning the instantaneous frequency curves of the intrinsic mode function components not yet classified in the initial intrinsic mode component set against a preset characteristic frequency range of the mechanical vibration of the cooling device of the transformer main equipment, and identifying the intrinsic mode function components whose instantaneous frequency curves contain, along the time axis, frequency components matching that preset characteristic frequency range as intrinsic mode components dominated by cooling device mechanical vibration; performing accumulation reconstruction on all identified intrinsic mode components dominated by iron core excitation vibration to generate an iron core excitation vibration separated voiceprint signal containing the complete time-domain waveform of the iron core excitation vibration; performing accumulation reconstruction on all identified intrinsic mode components dominated by winding load vibration to generate a winding load vibration separated voiceprint signal containing the complete time-domain waveform of the winding load vibration; performing accumulation reconstruction on all identified intrinsic mode components dominated by cooling device mechanical vibration to generate a cooling device mechanical vibration separated voiceprint signal containing the complete time-domain waveform of the cooling device mechanical vibration; and associating and combining the iron core excitation vibration separated voiceprint signal, the winding load vibration separated voiceprint signal and the cooling device mechanical vibration separated voiceprint signal according to their corresponding vibration source types to generate the target separated voiceprint signal acquisition result reflecting the vibration characteristics of different parts inside the transformer main equipment.
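The classification stage of claim 6 can be sketched as follows, assuming the intrinsic mode function components have already been produced by an empirical mode decomposition step (the sifting itself is omitted here). The frequency bands in `BANDS` are illustrative placeholders and not values from the patent; the 90 % in-band ratio applied to every class is likewise an assumption, and the Hilbert transform is computed with a naive DFT to stay dependency-free.

```python
import cmath
import math

# Hypothetical preset frequency ranges in Hz (placeholders, not from the patent).
BANDS = {"core": (90.0, 110.0), "winding": (190.0, 210.0), "cooling": (20.0, 60.0)}


def analytic_signal(x):
    """Analytic signal via DFT: double positive bins, zero negative bins."""
    n = len(x)
    spectrum = [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
                for k in range(n)]
    for k in range(n):
        if 0 < k < n / 2:
            spectrum[k] *= 2
        elif k > n / 2:
            spectrum[k] = 0
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * i / n)
                for k in range(n)) / n for i in range(n)]


def instantaneous_frequency(x, fs):
    """Instantaneous frequency (Hz) from the phase step between samples."""
    z = analytic_signal(x)
    return [cmath.phase(cur * prev.conjugate()) * fs / (2 * math.pi)
            for prev, cur in zip(z, z[1:])]


def classify_components(components, fs, bands=BANDS, min_ratio=0.9):
    """Assign each component to the first band holding most of its frequency curve."""
    groups = {name: [] for name in bands}
    for comp in components:
        curve = instantaneous_frequency(comp, fs)
        for name, (low, high) in bands.items():
            inside = sum(low <= f <= high for f in curve)
            if inside >= min_ratio * len(curve):
                groups[name].append(comp)
                break
    return groups


def reconstruct(group, length):
    """Accumulation reconstruction: sample-wise sum of a group's components."""
    return [sum(comp[i] for comp in group) for i in range(length)]
```

Because `BANDS` is scanned in insertion order, the core band is tested first, then the winding band on what remains, then the cooling band, mirroring the order of the scanning steps in the claim.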
7. The method of claim 1, further comprising, after generating the target separated voiceprint signal acquisition result: inputting the iron core excitation vibration separated voiceprint signal, the winding load vibration separated voiceprint signal and the cooling device mechanical vibration separated voiceprint signal into their respective pre-constructed feature extraction networks to generate an iron core voiceprint feature vector, a winding voiceprint feature vector and a cooling device voiceprint feature vector, respectively; concatenating the iron core voiceprint feature vector, the winding voiceprint feature vector and the cooling device voiceprint feature vector to generate a joint state feature tensor containing state information of multiple parts of the transformer main equipment; and inputting the joint state feature tensor into a pre-constructed multi-task classification output layer to synchronously output a first probability distribution that the iron core excitation vibration state belongs to each preset iron core state category, a second probability distribution that the winding load vibration state belongs to each preset winding state category, and a third probability distribution that the cooling device mechanical vibration state belongs to each preset cooling device state category.
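The multi-task output layer of claim 7 can be illustrated with a bare-bones sketch. The patent does not describe the network architecture, so this example assumes one linear layer followed by a softmax for each part, operating on the concatenated feature vectors; the weights and the name `multitask_head` are hypothetical, and the feature extraction networks are taken as given.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def multitask_head(core_vec, winding_vec, cooling_vec, heads):
    """Concatenate the three feature vectors and apply one softmax head per part.

    heads: {part name: (weight matrix, bias vector)} (assumed parameter layout)
    """
    joint = list(core_vec) + list(winding_vec) + list(cooling_vec)
    return {name: softmax(linear(weights, bias, joint))
            for name, (weights, bias) in heads.items()}
```

Sharing one joint feature vector across the three heads is what lets each part's classification draw on information from the other parts.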
8. The method of claim 1, wherein performing the time-frequency domain joint decomposition processing on the enhanced voiceprint signal further comprises: grouping all channel signals in the spatially synchronized multichannel voiceprint signal set according to the spatial azimuth coding information corresponding to each channel signal to generate a plurality of spatial azimuth voiceprint signal subsets identified by spatial azimuth; performing superposition averaging on all channel signals in each spatial azimuth voiceprint signal subset by calculating the arithmetic mean of the channel signals of the same spatial azimuth at each common sampling point, to generate an azimuth-averaged voiceprint signal for each spatial azimuth; extracting short-time energy features of the azimuth-averaged voiceprint signal for each spatial azimuth by dividing each azimuth-averaged voiceprint signal into consecutive short-time frames and calculating the sum of squares of all sampling points in each short-time frame as the short-time energy value of that frame, to generate a short-time energy curve for each spatial azimuth; performing peak detection on the short-time energy curve of each spatial azimuth, identifying the local peak points that exceed a preset energy threshold together with their time positions, and generating an energy peak time sequence for each spatial azimuth; performing cross-azimuth comparison of the energy peak time sequences of different spatial azimuths, finding the pairs of energy peak points on different spatial azimuths whose occurrence times are closest, and calculating the absolute value of the time difference between each pair of energy peak points as the time delay parameter with which the voiceprint signal reaches the different spatial azimuths; calculating an estimated propagation speed of the voiceprint signal in the housing structure of the transformer main equipment according to the time delay parameters and the geometric distances between the different spatial azimuths on the housing surface of the transformer main equipment; performing preliminary localization of the sound source of subsequently acquired voiceprint signals using the estimated propagation speed, by simultaneously measuring the differences in arrival time of the voiceprint signal at a plurality of spatial azimuths and combining them with the geometric coordinates of those azimuths, to calculate an estimated spatial coordinate of the main vibration source that generates the current voiceprint signal inside or on the surface of the transformer main equipment; comparing the estimated spatial coordinate with the standard spatial coordinate ranges of the iron core, the winding and the cooling device inside the transformer main equipment, and determining the source component attribution probability of the main vibration component in the current enhanced voiceprint signal; dynamically adjusting, according to the source component attribution probability, the frequency search range parameters used in the subsequent time-frequency domain joint decomposition processing to identify the iron core excitation vibration frequency component, the winding load vibration frequency component and the cooling device mechanical vibration frequency component; and inputting the frequency search range parameters together with the enhanced voiceprint signal into a time-frequency domain joint decomposition processing module, so that the subsequent decomposition can identify frequency components adaptively according to the spatial position of the main vibration source of the current voiceprint signal.
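The short-time energy, peak detection and propagation speed steps of claim 8 can be sketched as below. This covers only the delay and speed estimation, not the source localization. Non-overlapping frames and a simple "higher than both neighbors" peak rule are assumptions, and all function names are hypothetical.

```python
def short_time_energy(signal, frame_len):
    """Sum of squares over consecutive non-overlapping frames."""
    return [sum(v * v for v in signal[start:start + frame_len])
            for start in range(0, len(signal) - frame_len + 1, frame_len)]


def energy_peaks(energy, threshold):
    """Frame indexes of local maxima that exceed the energy threshold."""
    return [i for i in range(1, len(energy) - 1)
            if energy[i] > threshold
            and energy[i] >= energy[i - 1]
            and energy[i] > energy[i + 1]]


def delay_between(peaks_a, peaks_b, frame_len, fs):
    """Absolute time difference (s) of the closest peak pair of two azimuths."""
    frame_a, frame_b = min(((a, b) for a in peaks_a for b in peaks_b),
                           key=lambda pair: abs(pair[0] - pair[1]))
    return abs(frame_a - frame_b) * frame_len / fs


def propagation_speed(distance_m, delay_s):
    """Speed estimate from sensor spacing and measured delay."""
    if delay_s <= 0:
        raise ValueError("delay must be positive to estimate a speed")
    return distance_m / delay_s
```

The time resolution of the delay is one frame, so shorter frames give a finer speed estimate at the cost of noisier energy values.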
9. The method of claim 1, further comprising, after generating the target separated voiceprint signal acquisition result reflecting the vibration characteristics of different parts inside the transformer main equipment: dividing the iron core excitation vibration separated voiceprint signal, the winding load vibration separated voiceprint signal and the cooling device mechanical vibration separated voiceprint signal according to a preset fixed time window length to generate, for each signal, a plurality of consecutive time window segments; performing a fast Fourier transform on each time window segment to obtain the frequency-domain power spectrum curve of each time window segment; extracting the power spectrum peak amplitudes at the fundamental frequency and odd harmonic frequency positions from the frequency-domain power spectrum curve of the iron core excitation vibration separated voiceprint signal to generate an iron core fundamental and harmonic amplitude feature vector; extracting the integrated power spectrum energy values at the preset characteristic frequency and sideband positions from the frequency-domain power spectrum curve of the winding load vibration separated voiceprint signal to generate a winding characteristic band energy feature vector; extracting the power spectrum peak values and corresponding frequency values at a plurality of main frequency component positions from the frequency-domain power spectrum curve of the cooling device mechanical vibration separated voiceprint signal to generate a cooling device multi-frequency peak feature vector; and combining the iron core fundamental and harmonic amplitude feature vector, the winding characteristic band energy feature vector and the cooling device multi-frequency peak feature vector that correspond to the same time window, to generate the comprehensive state voiceprint fingerprint vector of the transformer main equipment for that time window.
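The windowed spectral feature extraction of claim 9 can be illustrated for the iron core branch. This sketch makes several assumptions: a naive DFT stands in for the fast Fourier transform to avoid dependencies, windows do not overlap, the power is read at the bin nearest each harmonic rather than searched for a peak, and the harmonic orders (1, 3, 5) are illustrative. Function names are hypothetical.

```python
import cmath
import math


def power_spectrum(window):
    """One-sided power spectrum |X[k]|^2 / n for k = 0 .. n/2."""
    n = len(window)
    return [abs(sum(window[i] * cmath.exp(-2j * math.pi * k * i / n)
                    for i in range(n))) ** 2 / n
            for k in range(n // 2 + 1)]


def split_windows(signal, window_len):
    """Consecutive non-overlapping windows; a trailing partial window is dropped."""
    return [signal[start:start + window_len]
            for start in range(0, len(signal) - window_len + 1, window_len)]


def harmonic_features(signal, fs, window_len, f0, orders=(1, 3, 5)):
    """Per-window power at the fundamental f0 and the given odd harmonics."""
    features = []
    for window in split_windows(signal, window_len):
        spectrum = power_spectrum(window)
        row = []
        for order in orders:
            k = round(order * f0 * window_len / fs)  # nearest DFT bin
            row.append(spectrum[k] if k < len(spectrum) else 0.0)
        features.append(row)
    return features
```

Each row is the feature vector of one time window; the winding and cooling device vectors for the same window would be appended to it to form the combined fingerprint vector.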
10. A server system comprising a server for performing the method of any of claims 1-9.

Description

Method and system for collecting voiceprint signals of transformer main equipment

Technical Field

The invention relates to the technical field of voiceprint recognition, and in particular to a method and a system for collecting voiceprint signals of transformer main equipment.

Background

At present, voiceprint monitoring of transformer main equipment mainly relies on single-point acquisition, which makes it difficult to reflect comprehensively the vibration state of the multiple parts inside the equipment. Multichannel voiceprint signals suffer from phase mismatch caused by differences in propagation path, which degrades the accuracy of subsequent feature extraction and fault diagnosis. Meanwhile, existing methods lack an effective fusion mechanism for multichannel signals and cannot make full use of spatial azimuth information, so the signal-to-noise ratio of the voiceprint signals is low and the features are not clearly separated.

Disclosure of Invention

The invention aims to provide a method and a system for collecting voiceprint signals of transformer main equipment.
In a first aspect, an embodiment of the present invention provides a method for collecting voiceprint signals of transformer main equipment, including: acquiring a multichannel original voiceprint signal set synchronously collected, under a plurality of preset operating conditions of the transformer main equipment, by a voiceprint sensing array arranged at different spatial azimuths on the housing surface, wherein each channel signal in the multichannel original voiceprint signal set carries corresponding spatial azimuth coding information and timestamp information of the acquisition time; performing, on the multichannel original voiceprint signal set, phase correction processing based on signal propagation path differences, generating phase compensation factors from the phase offsets that arise between channel signals corresponding to different spatial azimuth coding information due to differences in acoustic arrival time, and performing a phase alignment operation on each channel signal according to the phase compensation factors to generate a spatially synchronized multichannel voiceprint signal set; extracting the voiceprint signal envelope features of each channel signal in the spatially synchronized multichannel voiceprint signal set, and performing dynamic time warping matching between the voiceprint signal envelope features and pre-stored standard voiceprint envelope templates for each operating condition of the transformer main equipment to obtain a current operating condition matching degree sequence corresponding to each channel signal; performing channel screening and weighted fusion processing on the spatially synchronized multichannel voiceprint signal set based on the current operating condition matching degree sequence, screening out dominant channel signals whose matching degree is higher than a preset dynamic threshold, assigning to each dominant channel signal a fusion weight coefficient that is dynamically adjusted with the matching degree, and generating an enhanced voiceprint signal that fuses multi-spatial-azimuth information; and performing time-frequency domain joint decomposition processing on the enhanced voiceprint signal, decomposing the enhanced voiceprint signal into a plurality of intrinsic mode components comprising an iron core excitation vibration frequency component, a winding load vibration frequency component and a cooling device mechanical vibration frequency component, classifying and combining the plurality of intrinsic mode components according to their respective frequency ranges, and generating a target separated voiceprint signal acquisition result reflecting the vibration characteristics of different parts inside the transformer main equipment. In a second aspect, an embodiment of the present invention provides a server system, including a server, where the server is configured to perform the method described in the first aspect. Compared with the prior art, the method and the system for collecting voiceprint signals of transformer main equipment provided by the invention have the following beneficial effects: multichannel original voiceprint signals are synchronously acquired through a voiceprint sensing array, and phase correction is performed using spatial azimuth codes to align the signals. Voiceprint envelope features are then extracted and matched against standard templates to obtain operating-condition matching degrees; dominant channels are screened according to the matching degrees and fused by weighting to generate an enhanced voiceprint signal. Finally, the enhanced signal is separated, through time-frequency domain joint decomposition, into intrinsic mode components reflecting the vibration characteristics of different parts. The method can improve the signal-to-noise ratio of the voiceprint