CN-121983092-A - Model training method, snore identification method and related equipment
Abstract
The application provides a model training method, a snore identification method and related equipment, applied to the technical field of information processing. The model training method comprises: obtaining a first training data set comprising N sound samples and N first labels; training a preset model based on the first training data set to obtain a first model, and obtaining N detection results output by the preset model when detecting snores in the N sound samples during training; comparing the N detection results with the N first labels to obtain T detection results in which the preset model made a recognition error; determining a second training data set based on the T detection results; and training the first model based on the second training data set to obtain a snore recognition model. The snore recognition model trained in this way has a strong recognition capability for various types of sound samples, so identifying snores with it can improve the accuracy of snore identification.
Inventors
- CHEN XIAOLIANG
- YU XIN
- CHANG LE
- HUANG BINHE
- JING TENG
Assignees
- 北京中科声智科技有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20260310
Claims (12)
- 1. A method of model training, the method comprising: acquiring a first training data set, wherein the first training data set comprises N sound samples and N first labels in one-to-one correspondence with the N sound samples, each first label indicates whether the corresponding sound sample is a snore, and N is an integer greater than 1; training a preset model based on the first training data set to obtain a first model, and acquiring N detection results output by the preset model when performing snore detection on the N sound samples during training; comparing the N detection results with the N first labels to obtain T detection results in which the preset model made a recognition error, wherein T is a positive integer less than or equal to N; determining a second training data set based on the T detection results, wherein the proportion of the sound samples corresponding to the T detection results in the second training data set is greater than a preset proportion; and training the first model based on the second training data set to obtain a snore identification model.
- 2. The method of claim 1, wherein comparing the N detection results with the N first labels to obtain T detection results in which the preset model made a recognition error comprises: dividing the N sound samples into M frequency intervals according to the frequencies corresponding to the N sound samples, wherein M is a positive integer less than or equal to N; and determining at least one hard negative sample interval from the M frequency intervals based on the N detection results and the N first labels, wherein the hard negative sample interval is an interval, among the M frequency intervals, in which the detection error rate of the preset model on the sound samples in the interval is greater than a preset error rate threshold, the detection error rate is determined based on the detection results and the first labels corresponding to the sound samples in the corresponding interval, and the T detection results are the detection results corresponding to the sound samples included in the hard negative sample interval.
- 3. The method of claim 1, wherein the determining at least one hard negative sample interval from the M frequency intervals based on the N detection results and the N first labels comprises: determining L detection results and L first labels corresponding to a target frequency interval from the N detection results and the N first labels, wherein the target frequency interval is any one of the M frequency intervals, the L detection results are the detection results corresponding to the L sound samples contained in the target frequency interval, the L first labels are the first labels corresponding to the L sound samples contained in the target frequency interval, and L is a positive integer smaller than N; determining the ratio of the number of detection results, among the L detection results, that are inconsistent with the L first labels to the number L of detection results as the detection error rate corresponding to the target frequency interval; and determining the target frequency interval as the hard negative sample interval when the detection error rate corresponding to the target frequency interval is greater than the preset error rate threshold.
- 4. A method for identifying snoring, the method comprising: Collecting a sound sample to be detected; Inputting the sound sample to be detected into a snore recognition model for snore recognition, and obtaining a snore recognition result output by the snore recognition model, wherein the snore recognition model is obtained according to the model training method of any one of claims 1 to 3.
- 5. The method of claim 4, wherein the collecting the sound sample to be detected comprises: collecting background noise energy of an environment to be detected; determining an environment-aware dynamic threshold based on the background noise energy, wherein the magnitude of the environment-aware dynamic threshold is inversely proportional to the magnitude of the background noise energy; and determining a sound sample in the environment to be detected whose energy is greater than the environment-aware dynamic threshold as the sound sample to be detected.
- 6. The method of claim 5, wherein the determining an environment-aware dynamic threshold based on the background noise energy comprises: obtaining P preset noise energy intervals, wherein the P preset noise energy intervals are in one-to-one correspondence with P preset basic thresholds and in one-to-one correspondence with P dynamic coefficients; determining, based on which of the P preset noise energy intervals contains the background noise energy, a target threshold and a target coefficient corresponding to the background noise energy from the P preset basic thresholds and the P dynamic coefficients, respectively; acquiring the noise energy starting value of the interval, among the P preset noise energy intervals, in which the background noise energy is located; determining the difference between the background noise energy and the noise energy starting value as a first parameter; determining the product of the first parameter and the target coefficient as a second parameter; and determining the sum of the second parameter and the target threshold as the environment-aware dynamic threshold.
- 7. The method of claim 5, wherein the collecting background noise energy of the environment to be detected comprises: continuously collecting multiple segments of first background noise of the environment to be detected according to a preset time window; removing equipment background noise from each segment of the first background noise to obtain multiple segments of second background noise, wherein the equipment background noise is the background noise of the equipment used to collect the multiple segments of first background noise; removing first noise and second noise from each segment of the second background noise to obtain multiple segments of third background noise, wherein the first noise is noise in the second background noise whose decibel level is below a first decibel threshold, and the second noise is noise in the second background noise whose decibel level is above a second decibel threshold; and determining the average noise energy of the multiple segments of third background noise as the background noise energy.
- 8. A model training apparatus, the apparatus comprising: an acquisition module, configured to acquire a first training data set, wherein the first training data set comprises N sound samples and N first labels in one-to-one correspondence with the N sound samples, each first label indicates whether the corresponding sound sample is a snore, and N is an integer greater than 1; a first training module, configured to train a preset model based on the first training data set to obtain a first model, and to acquire N detection results output by the preset model when performing snore detection on the N sound samples during training; a comparison module, configured to compare the N detection results with the N first labels to obtain T detection results in which the preset model made a recognition error, wherein T is a positive integer less than or equal to N; a first determining module, configured to determine a second training data set based on the T detection results, wherein the proportion of the sound samples corresponding to the T detection results in the second training data set is greater than a preset proportion; and a second training module, configured to train the first model based on the second training data set to obtain a snore identification model.
- 9. A snore identification device, the device comprising: The acquisition module is used for acquiring a sound sample to be detected; The recognition module is used for inputting the sound sample to be detected into a snore recognition model for snore recognition, so as to obtain a snore recognition result output by the snore recognition model, wherein the snore recognition model is a model obtained according to the model training method of any one of claims 1 to 3.
- 10. An electronic device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, which when executed by the processor implements the steps of the model training method according to any one of claims 1 to 3 or the snore identification method according to any one of claims 4 to 7.
- 11. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the model training method according to any one of claims 1 to 3 or implements the steps of the snore identification method according to any one of claims 4 to 7.
- 12. A computer program product comprising computer instructions which, when executed by a processor, implement the steps of the model training method according to any one of claims 1 to 3 or the snore recognition method according to any one of claims 4 to 7.
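The threshold computation recited in claim 6 amounts to a piecewise-linear lookup over the noise energy intervals. The following is a minimal Python sketch of that arithmetic, not the applicant's implementation. The names `NoiseInterval`, `INTERVALS` and `dynamic_threshold`, and every numeric value in the table, are hypothetical and do not come from the application. The coefficients are chosen negative here only so that the threshold falls as background noise rises, in line with claim 5; the application does not state their sign.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class NoiseInterval:
    start: float  # noise energy starting value of the interval
    base: float   # preset basic threshold for the interval
    coeff: float  # dynamic coefficient for the interval


# Hypothetical table of P = 3 intervals, sorted by starting value.
# All numbers are illustrative placeholders.
INTERVALS = [
    NoiseInterval(start=0.0, base=60.0, coeff=-0.5),
    NoiseInterval(start=20.0, base=50.0, coeff=-0.25),
    NoiseInterval(start=40.0, base=45.0, coeff=-0.125),
]


def dynamic_threshold(background_energy, intervals=INTERVALS):
    """Return the environment-aware dynamic threshold for a noise energy."""
    starts = [iv.start for iv in intervals]
    # Locate the interval whose starting value is the largest one
    # not exceeding the measured background noise energy.
    idx = bisect_right(starts, background_energy) - 1
    if idx < 0:
        raise ValueError("background energy is below the first interval")
    iv = intervals[idx]
    first_param = background_energy - iv.start  # difference from interval start
    second_param = first_param * iv.coeff       # scaled by the target coefficient
    return second_param + iv.base               # added to the target threshold
```

With this illustrative table the threshold is continuous across interval boundaries, which is a design choice of the sketch rather than something the claims require.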
Description
Model training method, snore identification method and related equipment
Technical Field
The application relates to the technical field of information processing, in particular to a model training method, a snore identification method and related equipment.
Background
Snoring is a coarse breathing sound produced after a person falls asleep. During sleep, the throat muscles relax, and the upper respiratory tract is narrowed or blocked by factors such as the collapse of throat tissue. When the respiratory airflow is obstructed, the soft tissues of the respiratory tract vibrate and produce sound. In scenarios such as disease diagnosis and sleep environment adjustment, recognizing a user's snoring is an important step. However, existing snore recognition methods suffer from low recognition accuracy.
Disclosure of Invention
The embodiments of the application provide a model training method, a snore identification method and related equipment, which can address the low accuracy of existing snore identification methods.
In a first aspect, an embodiment of the present application provides a model training method, including: acquiring a first training data set, wherein the first training data set comprises N sound samples and N first labels in one-to-one correspondence with the N sound samples, each first label indicates whether the corresponding sound sample is a snore, and N is an integer greater than 1; training a preset model based on the first training data set to obtain a first model, and acquiring N detection results output by the preset model when performing snore detection on the N sound samples during training; comparing the N detection results with the N first labels to obtain T detection results in which the preset model made a recognition error, wherein T is a positive integer less than or equal to N; determining a second training data set based on the T detection results, wherein the proportion of the sound samples corresponding to the T detection results in the second training data set is greater than a preset proportion; and training the first model based on the second training data set to obtain a snore identification model.
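The comparison and data set construction steps of the first aspect can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the method as implemented by the applicant: detection results and first labels are assumed to be booleans, and the second training data set is assumed to consist of all T misrecognized samples plus as many randomly chosen correctly recognized samples as the preset proportion allows. The application only requires that the proportion exceed a preset proportion and does not say how the set is assembled. The function names, `preset_proportion` and `seed` are hypothetical, and the training steps themselves are outside the sketch.

```python
import random


def misrecognized_indices(detections, labels):
    """Indices whose detection result disagrees with the first label."""
    if len(detections) != len(labels):
        raise ValueError("detections and labels must have the same length")
    return [i for i, (d, y) in enumerate(zip(detections, labels)) if d != y]


def build_second_set(samples, labels, detections,
                     preset_proportion=0.5, seed=0):
    """Assemble a second training set dominated by misrecognized samples.

    The share of misrecognized samples in the result is strictly greater
    than preset_proportion (assumed to lie strictly between 0 and 1).
    """
    if not 0.0 < preset_proportion < 1.0:
        raise ValueError("preset_proportion must be in (0, 1)")
    hard = misrecognized_indices(detections, labels)
    if not hard:
        raise ValueError("no misrecognized samples: T must be at least 1")
    hard_set = set(hard)
    easy = [i for i in range(len(samples)) if i not in hard_set]

    # Upper bound on how many correctly recognized samples may be added,
    # then tightened until the share is strictly above the preset proportion.
    k = min(len(easy),
            int(len(hard) * (1.0 - preset_proportion) / preset_proportion))
    while k > 0 and len(hard) <= preset_proportion * (len(hard) + k):
        k -= 1

    rng = random.Random(seed)
    chosen = hard + rng.sample(easy, k)
    rng.shuffle(chosen)
    return [samples[i] for i in chosen], [labels[i] for i in chosen]
```

The first model would then be trained further on the returned samples and labels to obtain the snore identification model.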
Optionally, the comparing the N detection results with the N first labels to obtain T detection results in which the preset model made a recognition error includes: dividing the N sound samples into M frequency intervals according to the frequencies corresponding to the N sound samples, wherein M is a positive integer less than or equal to N; and determining at least one hard negative sample interval from the M frequency intervals based on the N detection results and the N first labels, wherein the hard negative sample interval is an interval, among the M frequency intervals, in which the detection error rate of the preset model on the sound samples in the interval is greater than a preset error rate threshold, the detection error rate is determined based on the detection results and the first labels corresponding to the sound samples in the corresponding interval, and the T detection results are the detection results corresponding to the sound samples included in the hard negative sample interval.
Optionally, the determining at least one hard negative sample interval from the M frequency intervals based on the N detection results and the N first labels includes: determining L detection results and L first labels corresponding to a target frequency interval from the N detection results and the N first labels, wherein the target frequency interval is any one of the M frequency intervals, the L detection results are the detection results corresponding to the L sound samples contained in the target frequency interval, the L first labels are the first labels corresponding to the L sound samples contained in the target frequency interval, and L is a positive integer smaller than N; determining the ratio of the number of detection results, among the L detection results, that are inconsistent with the L first labels to the number L of detection results as the detection error rate corresponding to the target frequency interval; and determining the target frequency interval as the hard negative sample interval when the detection error rate corresponding to the target frequency interval is greater than the preset error rate threshold.
In a second aspect, an embodiment of the present application further provides a method for identifying snores, where the method includes: collecting a sound sample to be detected; and inputting the sound sample to be detected into a snore recognition model for snore recognition to obtain a snore recognition result output by the snore recognition model, wherein the snore recognition model is obtained according to the model training method in the first aspect.
Optionally, the collecting the sound sample to be detected includes: collecting background noise energy of an