KR-20260065589-A - Swallowing disorder prediction method with improved accuracy and system thereof

KR 20260065589 A

Abstract

The present disclosure relates to a method for determining a swallowing disorder based on a cough sound. The method comprises acquiring a target spectrogram corresponding to a user's cough sound, inputting the target spectrogram into a swallowing disorder prediction model to obtain a predicted value regarding the presence or absence of a swallowing disorder, and determining whether the user has a swallowing disorder based on the predicted value. The swallowing disorder prediction model is trained by fine-tuning a cough judgment model so as to receive data in the form of a spectrogram and output a value regarding the presence or absence of a swallowing disorder, and the cough judgment model is trained to receive data in the form of a spectrogram and output a value regarding the presence or absence of a cough.
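The pipeline in the abstract begins by converting the cough sound into a spectrogram. As an illustration only, a magnitude spectrogram of a short clip can be computed with NumPy as sketched below; the patent does not specify frame size, hop length, windowing function, or sample rate, so all numeric parameters here are assumptions:

```python
import numpy as np

def to_spectrogram(audio, frame_len=256, hop=128):
    """Convert a 1-D audio signal into a magnitude spectrogram.

    Frames the signal, applies a Hann window, and takes the magnitude
    of the real FFT of each frame. frame_len and hop are illustrative
    values, not parameters from the patent.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(audio) - frame_len) // hop
    frames = np.stack([audio[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Shape: (n_frames, frame_len // 2 + 1)
    return np.abs(np.fft.rfft(frames, axis=1))

# A 0.1 s clip at an assumed 16 kHz sample rate (1600 samples);
# a 440 Hz tone stands in for a cough sound.
clip = np.sin(2 * np.pi * 440 * np.arange(1600) / 16000)
spec = to_spectrogram(clip)
```

With these assumed parameters, the resulting spectrogram has 11 time frames and 129 frequency bins, and the energy of the 440 Hz tone concentrates near bin 7 (440 × 256 / 16000 ≈ 7).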

Inventors

  • 이기욱
  • 정지영
  • 오병모
  • 서한길

Assignees

  • 사운더블헬스코리아 주식회사
  • 서울대학교병원

Dates

Publication Date
2026-05-08
Application Date
2026-01-30

Claims (8)

  1. A method for generating a swallowing disorder judgment model that predicts whether a user has a swallowing disorder, the method comprising: obtaining a pre-trained cough judgment model, wherein the pre-trained cough judgment model is a Transformer-based artificial intelligence model having a head and encoder layers, and is trained using a cough training dataset composed of spectrograms of a preset time length of sounds other than coughs labeled with a label value indicating that they are not coughs, and spectrograms of the preset time length of cough sounds labeled with a label value indicating that they are coughs; acquiring a swallowing disorder training dataset composed of spectrograms of the preset time length of cough sounds of persons without a swallowing disorder labeled with a label value indicating no swallowing disorder, and spectrograms of the preset time length of cough sounds of persons with a swallowing disorder labeled with a label value indicating a swallowing disorder; and generating the swallowing disorder judgment model using the acquired swallowing disorder training dataset and the pre-trained cough judgment model, wherein the swallowing disorder judgment model is generated by fine-tuning the head and the encoder layers of the pre-trained cough judgment model with the acquired swallowing disorder training dataset for at least one epoch.
  2. The method of claim 1, wherein the swallowing disorder judgment model is generated by training the head for at least one epoch using the acquired swallowing disorder training dataset, fixing the parameters of the head, and then training at least one of the encoder layers for at least one epoch using the acquired swallowing disorder training dataset.
  3. The method of claim 2, wherein the number of epochs for which at least one of the encoder layers is trained using the acquired swallowing disorder training dataset is greater than the number of epochs for which the head is trained using the acquired swallowing disorder training dataset.
  4. The method of claim 3, wherein the swallowing disorder judgment model is generated by training the head for one epoch with the acquired swallowing disorder training dataset and then training at least one of the encoder layers for 50 epochs.
  5. The method of claim 1, wherein the spectrogram of the preset time length for the cough sound is obtained by: acquiring audio data including a cough sound; determining an onset point by analyzing the audio data; acquiring cough sound data of the preset time length within the audio data using the onset point; and converting the cough sound data of the preset time length into a spectrogram of the preset time length.
  6. The method of claim 5, wherein the onset point is a point within the audio data at which the magnitude of the sound signal, having been below a threshold, rises above the threshold.
  7. The method of claim 5, wherein the cough sound data of the preset time length includes the onset point and, relative to the onset point, includes both sound data from before the onset point and sound data from after the onset point.
  8. The method of claim 1, wherein the preset time length is a time length selected between 0.05 seconds and 0.1 seconds.
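Claims 1 to 4 describe a two-phase fine-tuning schedule: the head is trained first (one epoch in claim 4), its parameters are then fixed, and at least one encoder layer is trained for more epochs (50 in claim 4). The NumPy toy below is only a sketch of that freeze-then-train schedule; the model, loss, data, and learning rate are invented stand-ins, not the Transformer of the claims:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the pre-trained model's two parameter groups.
params = {"encoder": rng.normal(size=4), "head": rng.normal(size=4)}

# Hypothetical training data: features x, binary dysphagia labels y.
x = rng.normal(size=(32, 4))
y = (x @ np.ones(4) > 0).astype(float)

def forward(p, x):
    hidden = np.tanh(x * p["encoder"])                # "encoder" transform
    return 1 / (1 + np.exp(-(hidden @ p["head"])))    # "head" logistic output

def train(p, trainable, epochs, lr=0.1):
    """Gradient descent that updates only the groups listed in `trainable`."""
    for _ in range(epochs):
        pred = forward(p, x)
        err = pred - y                                # dBCE/dlogit
        hidden = np.tanh(x * p["encoder"])
        grads = {
            "head": hidden.T @ err / len(x),
            "encoder": (x * (err[:, None] * p["head"])
                        * (1 - hidden ** 2)).mean(axis=0),
        }
        for name in trainable:                        # frozen groups are skipped
            p[name] -= lr * grads[name]
    return p

# Phase 1 (claim 2): train only the head, here for one epoch (claim 4).
params = train(params, trainable=["head"], epochs=1)
frozen_head = params["head"].copy()
# Phase 2 (claims 2-4): fix the head, train the encoder for 50 epochs.
params = train(params, trainable=["encoder"], epochs=50)
assert np.allclose(params["head"], frozen_head)       # head stayed frozen
```

The point of the sketch is only the schedule itself: the second phase runs many more epochs than the first (claim 3), and the head's parameters are untouched while the encoder trains (claim 2).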
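Claims 5 to 7 describe locating an onset point where the signal magnitude rises through a threshold and then cutting a fixed-length window that contains samples from both before and after that point. A minimal sketch follows, assuming a NumPy signal and an invented pre/post split (`pre_fraction`), since the claims do not specify how the window is positioned around the onset:

```python
import numpy as np

def find_onset(signal, threshold):
    """Return the first index where |signal| crosses from below the
    threshold to at or above it (the rising crossing of claim 6).
    Returns None if no such crossing exists."""
    above = np.abs(signal) >= threshold
    crossings = np.flatnonzero(~above[:-1] & above[1:]) + 1
    return int(crossings[0]) if crossings.size else None

def extract_window(signal, onset, length, pre_fraction=0.25):
    """Cut a fixed-length window containing the onset, with samples
    before and after it (claim 7). pre_fraction is an assumed split."""
    start = max(0, onset - int(length * pre_fraction))
    return signal[start : start + length]

# Synthetic example: 100 samples of silence, a burst, then silence.
sig = np.concatenate([np.zeros(100), 0.9 * np.ones(50), np.zeros(100)])
onset = find_onset(sig, threshold=0.5)
win = extract_window(sig, onset, length=40)
```

Here the burst begins at sample 100, so the detected onset is 100 and the 40-sample window starts at sample 90, keeping 10 pre-onset samples and 30 from the onset onward.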

Description

Swallowing disorder prediction method with improved accuracy and system thereof

The present specification relates to a method and system for predicting dysphagia with improved accuracy, and more specifically, to a method and system for determining whether a cough sound is the cough of a person with dysphagia using a dysphagia prediction model generated by fine-tuning a cough judgment model.

Previously, video fluoroscopic swallowing study (VFSS) and/or fiberoptic endoscopic evaluation of swallowing (FEES) were used to diagnose swallowing disorders. Because VFSS requires X-ray imaging and FEES requires endoscopy, specialized medical equipment and medical personnel were needed to determine whether a swallowing disorder was present. The inventors of the present application sought to predict the presence of a swallowing disorder from cough sounds so that the determination could be made conveniently in daily life. Specifically, the inventors intended to derive feature data from cough sounds and to train a swallowing disorder prediction model that receives this data as input and predicts the presence or absence of a swallowing disorder. However, a swallowing disorder prediction model trained solely on feature data derived from cough sounds and labeled with the presence or absence of a swallowing disorder did not achieve sufficient predictive performance.

FIG. 1 is a drawing of a swallowing disorder prediction system according to one embodiment. FIG. 2 is a drawing for explaining feature data according to one embodiment. FIG. 3 is a block diagram showing the configuration of a user terminal according to one embodiment. FIG. 4 is a block diagram showing the configuration of a server according to one embodiment. FIG. 5 is a diagram illustrating a swallowing disorder prediction model according to one embodiment. FIG. 6 is a diagram illustrating a cough judgment model according to one embodiment. FIG. 7 is a diagram showing the results of a Mann-Whitney U test comparing a swallowing disorder prediction model generated by fine-tuning a cough judgment model according to one embodiment with a swallowing disorder prediction model generated without fine-tuning. FIG. 8 is a flowchart illustrating a method for predicting dysphagia according to one embodiment.

The embodiments described in this specification are intended to clearly explain the concept of the invention to those skilled in the art to which the invention pertains; the invention is therefore not limited by the embodiments described herein, and its scope should be interpreted to include modifications and variations that do not depart from that concept. The terms used in this specification have been selected to be as widely used as possible in view of their functions in the present invention, but they may vary depending on the intent of those skilled in the art, custom, or the emergence of new technologies. Where a specific term is defined and used with an arbitrary meaning, the meaning of that term is described separately. Accordingly, the terms used in this specification should be interpreted based on their actual meaning and the content of this specification as a whole, rather than merely their names. Numbers used in this specification (e.g., first, second, etc.) are merely identifiers to distinguish one component from another. Furthermore, the suffixes "module" and "part" for components used in the following embodiments are assigned or used interchangeably solely for ease of drafting the specification, and do not themselves carry distinct meanings or roles. In the following embodiments, singular expressions include plural expressions unless the context clearly indicates otherwise.
In the following embodiments, terms such as "comprising" or "having" mean that the features or components described in the specification are present, and do not preclude the addition of one or more other features or components. The drawings attached to this specification are intended to facilitate the explanation of the present disclosure; the shapes depicted in the drawings may be exaggerated as necessary to aid understanding, and the present disclosure is therefore not limited by the drawings. Where an embodiment can be implemented differently, a specific process sequence may be performed in an order different from that described. For example, two processes described consecutively may be performed substantially simultaneously or in the reverse order. In this specification, where a detailed description of a known configuration or function related to the present invention could obscure the essence of the present invention, that description may be omitted as necessary. Accordin