CN-122024762-A - Earphone quality test data processing method, device and medium applying deep learning

Abstract

The invention discloses a method, device and medium for processing earphone quality test data using deep learning, and relates to the technical field of deep learning applications. The method comprises: constructing a dual-stream attention deep neural network that extracts and fuses, in parallel, the time-domain transient features and the frequency-domain texture features of an audio signal; capturing deep associations between the two modalities with a cross-attention mechanism; constructing a quality confidence index containing an uncertainty penalty term, based on the Mahalanobis distance in feature space between the sample under test and the golden sample and on the model's prediction entropy, so as to achieve fine-grained graded control of earphone quality; and further generating a visual heat map with a class activation mapping algorithm to accurately locate the specific frequency band that causes a sound quality anomaly. The invention addresses the insufficient ability of traditional test methods to identify complex nonlinear defects and the lack of interpretability of their results, and improves both the level of automation of earphone production-line quality inspection and the efficiency of defect diagnosis.

Inventors

Sheng Chenglong
Sheng Zihao
Wang Tianci
Wei Kaifa
Cheng Kaike
Xu Guohai
Sheng Xinyue

Assignees

深圳市盛佳丽电子股份有限公司

Dates

Publication Date: 2026-05-12
Application Date: 2026-01-24

Claims (10)

1. A method for processing earphone quality test data using deep learning, the method comprising the following steps: step S1, acquiring acoustic response data of an earphone under test in response to a standard excitation signal, and splitting the acoustic response data into a time-domain waveform sequence and a frequency-domain spectrogram matrix; step S2, constructing a pre-trained dual-stream attention deep neural network, wherein the dual-stream attention deep neural network comprises a time-domain feature extraction branch and a frequency-domain feature extraction branch; step S3, inputting the time-domain waveform sequence and the frequency-domain spectrogram matrix synchronously into the dual-stream attention deep neural network, and generating a comprehensive quality feature vector of the earphone under test through a multimodal feature fusion mechanism; step S4, calculating a quality confidence index of the earphone under test from the comprehensive quality feature vector using a nonlinear mapping function; and step S5, comparing the quality confidence index with preset grading control thresholds, outputting a quality grade determination result for the earphone under test, and generating a heat-map localization marker for abnormal data.
2. The method for processing earphone quality test data using deep learning according to claim 1, wherein step S3 comprises the following sub-steps: step S301, inputting the time-domain waveform sequence into the time-domain feature extraction branch, and extracting transient time-sequence features using a one-dimensional convolutional layer; step S302, inputting the frequency-domain spectrogram matrix into the frequency-domain feature extraction branch, and extracting spectral texture features using a two-dimensional residual convolutional layer; step S303, calculating an association weight matrix between the transient time-sequence features and the spectral texture features using a cross-attention module; and step S304, performing weighted concatenation of the transient time-sequence features and the spectral texture features according to the association weight matrix, and outputting the comprehensive quality feature vector.
3. The method for processing earphone quality test data using deep learning according to claim 1, wherein in step S4 the quality confidence index quantifies the feature distribution distance between the earphone under test and a standard golden sample, and is calculated by the following formula: [formula missing in source], wherein the terms of the formula respectively represent: the quality confidence index; the Mahalanobis distance between the comprehensive quality feature vector and a pre-stored golden sample feature center; the prediction entropy of the dual-stream attention deep neural network during inference, which characterizes the uncertainty of the model; a preset distance tolerance threshold, corresponding to the distribution boundary of qualified products; the slope control factor of a Sigmoid activation function, which adjusts the sensitivity to changes in distance; and a normalization scaling factor; and wherein, by combining the distance measure with the uncertainty penalty, the formula lowers the score of ambiguous samples located at the classification boundary.
4. The method for processing earphone quality test data using deep learning according to claim 1, wherein step S1 comprises: performing silent-segment trimming and amplitude normalization on the acquired raw audio signal to obtain the time-domain waveform sequence with a fixed length; converting the time-domain waveform sequence into a two-dimensional time-frequency distribution map using a short-time Fourier transform; and converting the time-frequency distribution map into a logarithmic energy spectrum on the Mel scale to obtain the frequency-domain spectrogram matrix.
5. The method for processing earphone quality test data using deep learning according to claim 3, wherein obtaining the Mahalanobis distance comprises: reading the inverse of the covariance matrix of a standard golden sample set from a feature database; calculating the difference vector between the comprehensive quality feature vector and the mean vector of the standard golden sample set; and computing the quadratic form of the difference vector with the inverse matrix and taking its square root to obtain the Mahalanobis distance, which removes the interference caused by correlations among different feature dimensions.
6. The method for processing earphone quality test data using deep learning according to claim 1, wherein in step S2 the training strategy of the dual-stream attention deep neural network comprises: constructing a mixed training set comprising clean samples, simulated defect samples and boundary-ambiguous samples; performing metric learning on the dual-stream attention deep neural network with a contrastive loss function, so that samples of the same class cluster together in the feature space and samples of different classes are pushed apart; and introducing a random masking mechanism during training to randomly mask part of the frequency-domain spectrogram matrix.
7. The method for processing earphone quality test data using deep learning according to claim 1, wherein step S5 comprises setting a first threshold and a second threshold, the first threshold being greater than the second threshold; determining that the earphone under test is a first-grade good product if the quality confidence index is greater than or equal to the first threshold; determining that the earphone under test is a second-grade good product if the quality confidence index is less than the first threshold and greater than the second threshold; and determining that the earphone under test is a defective product and triggering an anomaly analysis flow if the quality confidence index is less than or equal to the second threshold.
8. The method for processing earphone quality test data using deep learning according to claim 1, wherein in step S5 generating the heat-map localization marker for abnormal data comprises: tracing back to the last convolutional layer of the dual-stream attention deep neural network using a class activation mapping algorithm; calculating the gradient contribution values of the comprehensive quality feature vector with respect to the frequency-domain spectrogram matrix; generating a visual heat map from the gradient contribution values; and overlaying the heat map on the frequency-domain spectrogram matrix to highlight the frequency-band region that causes the quality anomaly.
9. A device for processing earphone quality test data using deep learning, the device comprising: a data acquisition and preprocessing module, configured to acquire acoustic response data of an earphone under test and to generate a time-domain waveform sequence and a frequency-domain spectrogram matrix; a deep network inference module, configured to invoke a pre-trained dual-stream attention deep neural network, extract and fuse features of the time-domain waveform sequence and the frequency-domain spectrogram matrix, and output a comprehensive quality feature vector; a quality index calculation module, configured to calculate a quality confidence index using a quality confidence formula, based on the comprehensive quality feature vector and pre-stored golden sample data; and a result determination and visualization module, configured to compare the quality confidence index with grading control thresholds to output a quality grade, and to generate a heat map identifying the abnormal region using a class activation mapping algorithm.
10. A computer-readable storage medium storing computer program instructions which, when read and executed by one or more processors, cause the one or more processors to perform the steps of the method for processing earphone quality test data using deep learning according to any one of claims 1 to 8.
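
The quality scoring described in claims 3, 5 and 7 can be illustrated with a minimal sketch. The formula of claim 3 is missing from this text, so the form used below, Q = scale * sigmoid(-k * (D_M - tau)) * exp(-H), is only an assumption chosen to be consistent with the listed terms (Mahalanobis distance, prediction entropy, distance tolerance threshold, Sigmoid slope factor, normalization scaling factor). The function names, the default scale of 100 and the grade labels are hypothetical and do not come from the patent.

```python
import math


def mahalanobis(x, mean, cov_inv):
    """Mahalanobis distance sqrt((x - mean)^T * cov_inv * (x - mean)), as in claim 5."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    n = len(diff)
    # Quadratic form of the difference vector with the inverse covariance matrix.
    quad = sum(diff[i] * cov_inv[i][j] * diff[j] for i in range(n) for j in range(n))
    return math.sqrt(quad)


def prediction_entropy(probs):
    """Shannon entropy (in nats) of the model's class probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def quality_confidence(d_m, entropy, tau, k, scale=100.0):
    """ASSUMED form of the index; the patent's own formula is not available here.

    The sigmoid term drops towards 0 as the distance d_m exceeds the
    tolerance tau (k controls how sharply), and exp(-entropy) acts as the
    uncertainty penalty.
    """
    z = min(k * (d_m - tau), 700.0)  # clamp to avoid overflow in exp
    gate = 1.0 / (1.0 + math.exp(z))
    return scale * gate * math.exp(-entropy)


def grade(q, first_threshold, second_threshold):
    """Three-tier decision of claim 7; the returned labels are hypothetical."""
    if first_threshold <= second_threshold:
        raise ValueError("first_threshold must be greater than second_threshold")
    if q >= first_threshold:
        return "grade_1"
    if q > second_threshold:
        return "grade_2"
    return "defective"  # claim 7 also triggers an anomaly analysis flow here


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    d = mahalanobis([3.0, 4.0], [0.0, 0.0], identity)
    h = prediction_entropy([0.9, 0.1])
    q = quality_confidence(d, h, tau=5.0, k=2.0)
    print(d, h, q, grade(q, first_threshold=80.0, second_threshold=40.0))
```

Under this assumed form, a sample sitting exactly on the tolerance boundary with a fully confident prediction scores half of the scale, and any prediction entropy lowers the score further, which matches the stated intent of penalizing ambiguous samples near the classification boundary.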

Description

Earphone quality test data processing method, device and medium applying deep learning

Technical Field

The invention relates to the field of deep learning applications, and in particular to a method, device and medium for processing earphone quality test data using deep learning.

Background

In earphone manufacturing, traditional quality control (QC) relies mainly on manual listening or on acoustic parameter tests against fixed thresholds, such as frequency response curve tolerance bands. Although traditional parameter tests are objective, they cover only basic indicators such as frequency response and distortion, and lack effective means of detecting complex, nonlinear audible defects such as faint noise, transient abnormal sounds or impure timbre. In recent years, some studies have attempted to introduce deep learning for fault diagnosis, but most of them convert the audio into a single spectrogram for image classification and ignore the rich transient information that the audio signal carries in the time domain. Moreover, such models generally output only a binary pass/fail result and lack both an assessment of the model's prediction uncertainty and a visual explanation of the specific frequency band in which the defect occurs, so production staff find it difficult to quickly locate the source of a problem from the test result.

Disclosure of Invention

In view of the above problems in the prior art, an object of the invention is to provide a method for processing earphone quality test data using deep learning, the method comprising the following steps:

Step S1, acquiring acoustic response data of an earphone under test in response to a standard excitation signal, and splitting the acoustic response data into a time-domain waveform sequence and a frequency-domain spectrogram matrix.
Step S2, constructing a pre-trained dual-stream attention deep neural network, wherein the dual-stream attention deep neural network comprises a time-domain feature extraction branch and a frequency-domain feature extraction branch.

Step S3, inputting the time-domain waveform sequence and the frequency-domain spectrogram matrix synchronously into the dual-stream attention deep neural network, and generating a comprehensive quality feature vector of the earphone under test through a multimodal feature fusion mechanism.

Step S4, calculating a quality confidence index of the earphone under test from the comprehensive quality feature vector using a nonlinear mapping function.

Step S5, comparing the quality confidence index with preset grading control thresholds, outputting a quality grade determination result for the earphone under test, and generating a heat-map localization marker for abnormal data.

Preferably, step S3 comprises the following sub-steps:

Step S301, inputting the time-domain waveform sequence into the time-domain feature extraction branch, and extracting transient time-sequence features using a one-dimensional convolutional layer.

Step S302, inputting the frequency-domain spectrogram matrix into the frequency-domain feature extraction branch, and extracting spectral texture features using a two-dimensional residual convolutional layer.

Step S303, calculating an association weight matrix between the transient time-sequence features and the spectral texture features using a cross-attention module.

Step S304, performing weighted concatenation of the transient time-sequence features and the spectral texture features according to the association weight matrix, and outputting the comprehensive quality feature vector.
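
Sub-steps S303 and S304 can be illustrated with a minimal sketch. It assumes scaled dot-product attention with the transient time-sequence features as queries and the spectral texture features as keys and values, followed by concatenation and mean pooling; the text does not specify these details, and the learned projection layers of a real network are omitted. All function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_weights(time_feats, freq_feats):
    """Association weight matrix (S303): one row per time feature,
    one column per frequency feature; each row sums to 1.

    ASSUMPTION: scaled dot-product scores, no learned projections.
    """
    d = len(time_feats[0])
    scores = [
        [sum(q * k for q, k in zip(t, f)) / math.sqrt(d) for f in freq_feats]
        for t in time_feats
    ]
    return [softmax(row) for row in scores]


def fuse(time_feats, freq_feats):
    """Weighted concatenation (S304), returning one fused feature vector."""
    weights = cross_attention_weights(time_feats, freq_feats)
    d = len(freq_feats[0])
    # Each time feature gathers a weighted sum of the frequency features.
    attended = [
        [sum(w * f[j] for w, f in zip(row, freq_feats)) for j in range(d)]
        for row in weights
    ]
    # Concatenate each time feature with its attended frequency context.
    joined = [t + a for t, a in zip(time_feats, attended)]
    # Mean-pool over time steps to obtain a fixed-length vector.
    n = len(joined)
    return [sum(col) / n for col in zip(*joined)]


if __name__ == "__main__":
    time_feats = [[1.0, 0.0], [0.0, 1.0]]              # toy 1-D conv outputs
    freq_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 2-D conv outputs
    print(cross_attention_weights(time_feats, freq_feats))
    print(fuse(time_feats, freq_feats))
```

In a trained network the two feature sets would come from the one-dimensional and two-dimensional convolutional branches and would usually pass through learned query, key and value projections first; the toy lists here only show how an association weight matrix drives the fusion.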
Preferably, in step S4 the quality confidence index quantifies the feature distribution distance between the earphone under test and a standard golden sample, and is calculated by the following formula: [formula missing in source], wherein the terms of the formula respectively represent: the quality confidence index; the Mahalanobis distance between the comprehensive quality feature vector and a pre-stored golden sample feature center; the prediction entropy of the dual-stream attention deep neural network during inference, which characterizes the uncertainty of the model; a preset distance tolerance threshold, corresponding to the distribution boundary of qualified products; the slope control factor of a Sigmoid activation function, which adjusts the sensitivity to changes in distance; and a normalization scaling factor. By combining the distance measure with the uncertainty penalty, the formula lowers the score of ambiguous samples located at the classification boundary.

Preferably, step S1 comprises performing silent-segment trimming and amplitude normalization on the acquired raw audio signal to obtain the time-domain waveform sequence with a fixed length, converting the time-domain waveform