
CN-121983090-A - Breath sound classification detection method based on improved convolutional neural network

CN121983090A

Abstract

The invention discloses a breath sound classification detection method based on an improved convolutional neural network. The method first preprocesses breath sound signals acquired in real time and extracts features to construct a cepstrogram feature matrix; it then feeds the breath sound features into an improved convolutional neural network comprising a convolution module, a channel attention mechanism module and a lightweight feature extraction module for training and classification, thereby achieving accurate recognition of breath sound categories. The invention realizes real-time, efficient and high-precision classification detection of breath sounds, offers low computational complexity, strong real-time performance and high recognition accuracy, and is suitable for the field of biomedical signal processing.

Inventors

  • LUO LINBAO
  • SUN YE
  • ZHOU XU
  • LUO MIN
  • LIANG FENGXIA

Assignees

  • 合肥纳立讯智能科技有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-02-02

Claims (7)

  1. A breath sound classification detection method based on an improved convolutional neural network, characterized by comprising the following steps:
     S1, collecting breath sound signals in real time to form a breath sound data set X = {x_1, x_2, ..., x_N}, wherein x_i represents the i-th breath sound sample and N represents the total number of breath sound samples;
     S2, preprocessing X and extracting features to obtain a cepstrogram feature matrix data set M = {M_1, M_2, ..., M_N}, wherein M_i represents the cepstrogram feature matrix corresponding to x_i; let the real class label set corresponding to X be Y = {y_1, y_2, ..., y_N}, wherein y_i represents the real class label corresponding to x_i;
     S3, constructing an improved convolutional neural network that sequentially comprises a first convolution block, a channel attention mechanism module, a first lightweight module, a second lightweight module, a second convolution block and a classification module, and using it to process M to obtain a prediction category set Y' = {y'_1, y'_2, ..., y'_N}, wherein y'_i represents the prediction category corresponding to x_i;
     S4, constructing a cross-entropy loss based on Y and Y', training the convolutional neural network with the Adam optimizer, computing the cross-entropy loss to update the network parameters, and finishing training when the iteration count reaches the maximum number of iterations or the cross-entropy loss converges, thereby obtaining a trained optimal rapid breath sound classification detection model; deploying the model to a ZYNQ processing system to judge the breath sound type, and displaying the breath sound classification result on an HDMI display in real time.
  2. The breath sound classification detection method based on an improved convolutional neural network of claim 1, wherein S2 comprises:
     S2.1, normalizing, framing and windowing x_i to obtain a set of continuous, partially overlapping short-time frames {x_{i,f} | f = 1, 2, ..., F_i}, wherein x_{i,f} is the f-th short-time frame of x_i and F_i represents the total number of frames;
     S2.2, performing an N-point fast Fourier transform on each short-time frame x_{i,f} to obtain the frequency-domain feature data X_{i,f} of x_i; taking the modulus of X_{i,f} and squaring it to obtain the power spectrum P_{i,f} of the f-th frame, thereby obtaining the i-th power spectrum sequence P_i = [P_{i,1}, P_{i,2}, ..., P_{i,F_i}]^T, wherein P_{i,f} represents the power spectrum of the f-th frame and T denotes transposition;
     S2.3, constructing a triangular filter bank comprising Q filters and filtering P_i to obtain the band energies {E_{i,f,q}} output by the filters, wherein E_{i,f,q} represents the f-th-frame band energy output by the q-th filter; calculating the corrected f-th-frame band energy E_{i,f,q} + ε output by the q-th filter and applying the natural logarithm to it to obtain the f-th-frame log-energy feature L_{i,f,q} = ln(E_{i,f,q} + ε) output by the q-th filter, wherein ε represents a preset non-zero value;
     S2.4, performing a discrete cosine transform according to formula (1) to obtain the f-th-frame cepstral coefficient vector c_{i,f} = [c_{i,f}(1), ..., c_{i,f}(D)], thereby obtaining the cepstrogram feature matrix M_i corresponding to the i-th breath sound sample x_i, wherein c_{i,f}(d) represents the d-th cepstral coefficient of the f-th short-time frame of x_i and D is the dimension of the cepstral coefficient vector:
        c_{i,f}(d) = Σ_{q=1}^{Q} L_{i,f,q} · cos( π · d · (q − 0.5) / Q )    (1)
     In formula (1), cos(·) represents the cosine computation, Q represents the number of filters, and π represents the circumference ratio.
  3. The breath sound classification detection method based on an improved convolutional neural network of claim 1, wherein S3 comprises:
     S3.1, the first convolution block consists of a convolution layer, a batch normalization layer, a maximum pooling layer and a ReLU activation layer, and processes M_i to obtain the i-th initial breath sound feature U_i;
     S3.2, the channel attention mechanism module processes U_i to obtain the i-th weighted breath sound feature V_i;
     S3.3, the first lightweight module processes V_i to obtain the i-th breath sound mid-layer feature G_i;
     S3.4, the second lightweight module processes G_i following the procedure of S3.3 to obtain the i-th breath sound high-layer convolution feature H_i;
     S3.5, the second convolution block comprises a maximum pooling layer, a convolution layer and a ReLU activation layer, and sequentially processes H_i to obtain the i-th deep breath sound feature Z_i;
     S3.6, the classification module comprises a fully connected layer and a Softmax layer, and sequentially processes Z_i to obtain the prediction category y'_i corresponding to x_i.
  4. The breath sound classification detection method based on an improved convolutional neural network of claim 3, wherein S3.2 comprises:
     S3.2.1, performing a global feature aggregation operation on U_i according to formula (2) to obtain the i-th channel description feature s_i:
        s_i = λ_1 · AveragePooling(U_i) + λ_2 · MaxPooling(U_i)    (2)
     In formula (2), AveragePooling and MaxPooling represent the global average pooling operation and the global maximum pooling operation respectively, + represents element-wise addition, and λ_1 and λ_2 represent two adaptive fusion weights satisfying formula (3):
        λ_k = Softmax( W_k · Concat( AveragePooling(U_i), MaxPooling(U_i) ) + b_k ), k = 1, 2    (3)
     In formula (3), λ_k represents either adaptive fusion weight, W_k and b_k represent the corresponding weight and bias parameters, and Concat represents the concatenation operation;
     S3.2.2, sequentially processing s_i through a first mapping layer, a ReLU activation layer and a second mapping layer to obtain the i-th pre-weighted channel feature r_i;
     S3.2.3, performing a nonlinear mapping on r_i to generate the i-th channel attention weight coefficient a_i;
     S3.2.4, expanding a_i to the same size as U_i to obtain the i-th expanded channel attention weight coefficient A_i;
     S3.2.5, multiplying A_i with U_i to obtain the i-th weighted feature V_i.
  5. The breath sound classification detection method based on an improved convolutional neural network of claim 3, wherein S3.3 comprises:
     S3.3.1, processing V_i with a channel compression unit to obtain the i-th compressed feature C_i;
     S3.3.2, processing C_i with feature expansion units having different convolution kernel sizes to obtain the i-th first expanded breath sound feature E_i^(1) and the i-th second expanded breath sound feature E_i^(2);
     S3.3.3, fusing E_i^(1) and E_i^(2) to obtain the i-th breath sound mid-layer feature G_i.
  6. An electronic device comprising a memory and a processor, wherein the memory is configured to store a program that supports the processor in performing the improved-convolutional-neural-network-based breath sound classification detection method of any one of claims 1-5, and the processor is configured to execute the program stored in the memory.
  7. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, performs the steps of the improved-convolutional-neural-network-based breath sound classification detection method of any one of claims 1-5.
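Step S4 of claim 1 trains the network with a cross-entropy loss and the Adam optimizer. The following minimal numpy sketch illustrates that training loop on a toy softmax classifier standing in for the full network; the synthetic data, dimensions, learning rate and all variable names are illustrative assumptions, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for S4: train a softmax classifier on hypothetical flattened
# cepstral features with cross-entropy loss and a hand-rolled Adam optimizer.
N, D, K = 64, 20, 4                      # samples, feature dim, breath-sound classes (assumed)
X = rng.standard_normal((N, D))
y = rng.integers(0, K, N)
X[np.arange(N), y] += 3.0                # make the synthetic classes separable

W = np.zeros((D, K)); b = np.zeros(K)
mW = np.zeros_like(W); vW = np.zeros_like(W)
mb = np.zeros_like(b); vb = np.zeros_like(b)
beta1, beta2, lr, eps = 0.9, 0.999, 0.05, 1e-8

def forward(X):
    # numerically stable softmax over the class logits
    z = X @ W + b
    z -= z.max(axis=1, keepdims=True)
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

losses = []
for t in range(1, 201):
    P = forward(X)
    losses.append(-np.log(P[np.arange(N), y] + 1e-12).mean())   # cross-entropy
    G = P.copy(); G[np.arange(N), y] -= 1.0; G /= N             # dL/dlogits
    gW, gb = X.T @ G, G.sum(axis=0)
    # Adam: bias-corrected first and second moment estimates, in-place update
    for g, m, v, p_ in ((gW, mW, vW, W), (gb, mb, vb, b)):
        m[...] = beta1 * m + (1 - beta1) * g
        v[...] = beta2 * v + (1 - beta2) * g * g
        mhat = m / (1 - beta1 ** t)
        vhat = v / (1 - beta2 ** t)
        p_[...] -= lr * mhat / (np.sqrt(vhat) + eps)
```

In the patent this loop would run over the improved CNN's parameters instead of a linear layer, with the same stopping rule (maximum iterations or loss convergence).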
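Step S2 of claim 2 is essentially the standard mel-cepstral (MFCC-style) pipeline: normalize, frame and window, FFT power spectrum, triangular filter-bank energies, logarithm with a small offset, then a discrete cosine transform. A minimal numpy sketch of that pipeline follows, assuming a mel-spaced triangular filter bank and a DCT-II; the sample rate, frame length, hop size and coefficient count are illustrative assumptions.

```python
import numpy as np

def cepstrogram(signal, sr=4000, frame_len=256, hop=128, n_filters=26, n_ceps=13, eps=1e-10):
    """Sketch of S2: normalize, frame+window, FFT power spectrum,
    triangular (mel-spaced) filter bank, log, DCT -> cepstral matrix."""
    # S2.1: normalize, then split into overlapping, Hamming-windowed frames
    x = signal / (np.max(np.abs(signal)) + eps)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[t * hop: t * hop + frame_len] for t in range(n_frames)])
    frames *= np.hamming(frame_len)
    # S2.2: N-point FFT, modulus squared -> per-frame power spectrum
    power = np.abs(np.fft.rfft(frames, frame_len)) ** 2   # (n_frames, frame_len//2 + 1)
    # S2.3: triangular filter bank (mel spacing assumed), band energies, log with offset
    def hz2mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel2hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz2mel(0), hz2mel(sr / 2), n_filters + 2)
    bins = np.floor((frame_len + 1) * mel2hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, frame_len // 2 + 1))
    for q in range(1, n_filters + 1):
        l, c, r = bins[q - 1], bins[q], bins[q + 1]
        for k in range(l, c):
            fbank[q - 1, k] = (k - l) / max(c - l, 1)   # rising edge
        for k in range(c, r):
            fbank[q - 1, k] = (r - k) / max(r - c, 1)   # falling edge
    log_energy = np.log(power @ fbank.T + eps)            # (n_frames, n_filters)
    # S2.4: DCT-II over the filter index -> first n_ceps cepstral coefficients
    q_idx = np.arange(1, n_filters + 1)
    d_idx = np.arange(n_ceps)[:, None]
    dct = np.cos(np.pi * d_idx * (q_idx - 0.5) / n_filters)  # (n_ceps, n_filters)
    return log_energy @ dct.T                              # cepstrogram matrix M_i
```

Each row of the returned matrix is one frame's cepstral coefficient vector, matching the per-frame structure of M_i in the claim.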
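The channel attention mechanism of claim 4 can be sketched as follows for a single feature map. The adaptively weighted fusion of global average and max pooling, the two mapping layers with a ReLU between them, the sigmoid nonlinearity and the broadcast reweighting follow the claim's wording, but every shape, weight matrix and the reduction ratio here is a hypothetical stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)

def channel_attention(U, W1, b1, W2, b2, Wf, bf):
    """Sketch of S3.2 for one sample. U: (C, H, W) feature map.
    All weight/bias parameters are hypothetical, randomly initialized below."""
    avg = U.mean(axis=(1, 2))            # global average pooling -> (C,)
    mx = U.max(axis=(1, 2))              # global max pooling -> (C,)
    # S3.2.1: two adaptive fusion weights from the concatenated descriptors (cf. eq. 3)
    logits = Wf @ np.concatenate([avg, mx]) + bf          # (2,)
    lam = np.exp(logits) / np.exp(logits).sum()           # softmax -> lambda_1, lambda_2
    s = lam[0] * avg + lam[1] * mx                        # eq. 2: channel descriptor (C,)
    # S3.2.2: first mapping layer -> ReLU -> second mapping layer
    h = np.maximum(W1 @ s + b1, 0.0)
    pre = W2 @ h + b2                                     # pre-weighted channel feature (C,)
    # S3.2.3: sigmoid gives attention coefficients in (0, 1)
    a = 1.0 / (1.0 + np.exp(-pre))
    # S3.2.4-S3.2.5: broadcast to U's size and reweight each channel
    return U * a[:, None, None]

C, H, W, r = 8, 4, 4, 2                  # channels, height, width, reduction ratio (assumed)
U = rng.standard_normal((C, H, W))
params = (rng.standard_normal((C // r, C)), np.zeros(C // r),
          rng.standard_normal((C, C // r)), np.zeros(C),
          rng.standard_normal((2, 2 * C)), np.zeros(2))
V = channel_attention(U, *params)
```

Because the sigmoid keeps every coefficient strictly between 0 and 1, the output V never exceeds U in magnitude; channels judged informative are attenuated less.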
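The lightweight module of claim 5 (channel compression, then parallel expansion branches with different kernel sizes, then fusion) resembles a SqueezeNet-style fire module. Below is a naive numpy sketch under that reading; the 1x1/3x3/5x5 kernel sizes and fusion by channel concatenation are assumptions, since the claim does not fix them, and the loop-based convolution is for clarity rather than speed.

```python
import numpy as np

rng = np.random.default_rng(2)

def conv2d(x, w):
    """Naive 'same'-padded 2D convolution. x: (Cin, H, W), w: (Cout, Cin, k, k)."""
    Cout, Cin, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    H, W = x.shape[1:]
    out = np.zeros((Cout, H, W))
    for i in range(H):
        for j in range(W):
            patch = xp[:, i:i + k, j:j + k]
            out[:, i, j] = np.tensordot(w, patch, axes=3)  # contract (Cin, k, k)
    return out

def lightweight_block(x, w_sq, w3, w5):
    # S3.3.1: channel compression unit (1x1 convolution assumed)
    z = np.maximum(conv2d(x, w_sq), 0.0)
    # S3.3.2: two expansion branches with different kernel sizes (3x3 and 5x5 assumed)
    e1 = np.maximum(conv2d(z, w3), 0.0)
    e2 = np.maximum(conv2d(z, w5), 0.0)
    # S3.3.3: fuse the two expanded features (channel concatenation assumed)
    return np.concatenate([e1, e2], axis=0)

Cin, Cs, Ce, H, W = 8, 4, 8, 6, 6        # all sizes are illustrative
x = rng.standard_normal((Cin, H, W))
y = lightweight_block(x,
                      rng.standard_normal((Cs, Cin, 1, 1)) * 0.1,
                      rng.standard_normal((Ce, Cs, 3, 3)) * 0.1,
                      rng.standard_normal((Ce, Cs, 5, 5)) * 0.1)
```

Compressing channels before the larger-kernel branches is what keeps the parameter count and multiply count low, which matches the patent's emphasis on low computational complexity for edge deployment.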

Description

Breath sound classification detection method based on improved convolutional neural network

Technical Field

The invention belongs to the technical field of audio signal processing, and particularly relates to a breath sound classification detection method based on an improved convolutional neural network.

Background

Research shows that respiratory sounds are closely related to the physiological state and pathological changes of the respiratory system, and the acquisition and analysis of respiratory sound signals can provide an important auxiliary basis for early screening, risk assessment and disease-course follow-up of respiratory diseases. In clinical practice, abnormal respiratory sounds such as rhonchi and wheezes often appear in the chest auscultation signals of patients with certain respiratory diseases; their occurrence is usually associated with abnormal physiological conditions such as airway narrowing and secretion retention, and they can serve as early indicators of diseases such as asthma. Therefore, building an efficient and accurate automatic classification and recognition technology for breath sounds has practical value for improving the efficiency of auxiliary diagnosis of respiratory diseases. At present, clinical respiratory auscultation is widely used for early screening of diseases such as bronchitis, pneumonia and bronchial asthma because it is simple to perform and inexpensive. However, the traditional auscultation method depends mainly on the physician's empirical judgment and suffers from strong subjectivity, easy missed diagnosis, difficulty of quantification and inconvenience of recording; moreover, auscultation can be hindered when medical staff wear protective equipment.
To reduce human subjectivity and improve the quantifiability and traceability of diagnosis, researchers have proposed automatic breath sound recognition methods based on signal processing and machine learning. In recent years, deep learning has been widely applied in medical audio analysis, and combining it with transfer learning can effectively alleviate the problem of insufficient annotated data. For example, fine-tuning a pre-trained audio or vision Transformer model (e.g., AST, ViT) can serve breath sound classification tasks. However, as model scale increases, full fine-tuning generally incurs higher training and storage overhead and is prone to overfitting in small-sample medical scenarios, which harms deployment efficiency and practical effectiveness. A patent search found that publication number CN111640439A discloses a breath sound classification method based on deep learning. In that method, breath sounds are resampled, high-pass filtered and periodically segmented, acoustic combination features are extracted with multiple data augmentation strategies, and automatic recognition is finally realized with a deep learning classification model. However, this prior art still has the following limitations: the model structure, the feature engineering that training depends on, and the complex data preprocessing flow make the overall processing chain long; the deep learning model is not optimized for deployment on edge devices, so inference demands substantial computing resources, which hampers use in portable or real-time equipment; and no hardware acceleration strategy is involved, making it difficult to meet the latency and computational-efficiency requirements of clinical real-time auscultation scenarios.
Although prior studies have applied deep learning methods to the automatic classification of breath sounds, the related art still has several shortcomings. First, some methods rely on relatively complex feature engineering and data preprocessing, making the overall processing chain long and the real-time requirement hard to meet. Second, common deep learning models are large and computationally expensive, so they are difficult to run efficiently on resource-constrained embedded or wearable devices, which hinders adoption in scenarios such as clinical auscultation and home monitoring.

Disclosure of the Invention

The invention aims to overcome the defects of the prior art and provides a breath sound classification detection method based on an improved convolutional neural network that can acquire, process and accurately classify different types of breath sounds in real time, thereby improving both the real-time performance and the recognition accuracy of detection, and can be widely applied in the technical field of biomedical signal processing. To achieve the aim of the invention, the following technical scheme is adopted: The invention discloses a breath