CN-122020096-A - Learner cognitive state evaluation method, system, equipment and medium based on voice

Abstract

The invention discloses a voice-based method, system, equipment and medium for evaluating a learner's cognitive state, belonging to the technical field of cognitive evaluation in remote learning. The method comprises: collecting reference voice, reading voice and answering voice; separating voice micro-vibration features reflecting voice tremor from the reference voice through LPC and ICA analysis; reconstructing sounding rhythm control features from the reading voice through the Hilbert transform and nonlinear dynamics methods; extracting breathing voice cooperative efficiency features from the answering voice through energy envelope analysis and phase coordination calculation; mapping the extracted features into initial scores on four cognitive dimensions (attention, fatigue, psychological load and stress level); and making a comprehensive decision to obtain the learner's final scores on the four cognitive dimensions. The invention can deeply mine biomarkers related to cognitive state from the collected voice data, thereby improving the accuracy of evaluating the learner's cognitive state in remote learning.

Inventors

WANG SONGLIN
ZHAI JIAQI
HUI JUAN
MU XU
HU JIAO
Lu Yumen

Assignees

江苏开放大学（江苏城市职业学院）

Dates

Publication Date: 20260512
Application Date: 20260409

Claims (10)

1. A method for evaluating the cognitive state of a learner based on voice, comprising the steps of: collecting and preprocessing voice data under different sounding tasks, wherein the voice data comprises reference voice, reading voice and answering voice; separating voice micro-vibration features reflecting voice tremor from the reference voice through LPC and ICA analysis, reconstructing sounding rhythm control features from the reading voice through the Hilbert transform and nonlinear dynamics methods, and extracting breathing voice cooperative efficiency features from the answering voice through energy envelope analysis and phase coordination calculation; and mapping the voice micro-vibration features, the sounding rhythm control features and the breathing voice cooperative efficiency features respectively into initial scores on four cognitive dimensions, namely attention, fatigue, psychological load and stress level, and making a comprehensive decision through weighted fusion to obtain the learner's final scores on the four cognitive dimensions.
2. The method of claim 1, further comprising the steps of: performing real-time validity analysis on each type of collected voice data; if a preset validity condition is satisfied, retaining the current voice data; and if the preset validity condition is not satisfied, deleting the current voice data and starting re-collection of the voice data.
3. The method for evaluating the cognitive state of a learner based on voice according to claim 1, wherein separating the voice micro-vibration features reflecting voice tremor from the reference voice through LPC and ICA analysis specifically comprises: performing LPC analysis on each frame of the reference voice to obtain a single-channel residual signal; constructing a multi-channel observation signal from the single-channel residual signal through time-delay embedding; performing ICA on the multi-channel observation signal to obtain a plurality of independent components; selecting, according to the physiological characteristics of voice micro-tremor, the component with the most concentrated energy from all the independent components as the micro-tremor signal; and extracting the voice micro-vibration features of the micro-tremor signal, wherein the voice micro-vibration features comprise tremor energy, frequency band energy ratio and sample entropy.
4. The method for evaluating the cognitive state of a learner based on voice according to claim 1, wherein reconstructing the sounding rhythm control features from the reading voice through the Hilbert transform and nonlinear dynamics methods specifically comprises: extracting a voice envelope signal from the reading voice through the Hilbert transform; extracting the scaling exponent of the voice envelope signal by detrended fluctuation analysis; plotting the multi-scale entropy (MSE) curve of the voice envelope signal by multi-scale entropy analysis, and extracting the area under the curve and the complexity index; computing a recurrence plot of the voice envelope signal by phase space reconstruction, and extracting the determinism and laminarity of the recurrence plot by recurrence quantification analysis; computing the multifractal spectrum of the voice envelope signal by multifractal analysis, and then extracting the width of the multifractal spectrum; and integrating the scaling exponent, the width of the multifractal spectrum, the area under the MSE curve, the complexity index, and the determinism and laminarity of the recurrence plot into the sounding rhythm control features.
5. The method for evaluating the cognitive state of a learner based on voice according to claim 1, wherein extracting the breathing voice cooperative efficiency features from the answering voice through energy envelope analysis and phase coordination calculation specifically comprises: detecting respiratory cycles in the answering voice through energy envelope analysis, and recording the inspiration duration, expiration duration, respiratory cycle length, inspiration-to-expiration ratio and expiration peak energy of each respiratory cycle; detecting the number of syllables in each expiration segment, and calculating the syllable rate, the voice sound ratio and the coefficient of variation of the inspiration-to-expiration ratio; mapping the time axis of each respiratory cycle onto a phase ring, and obtaining the phase consistency index over all respiratory cycles from the phase value of each syllable onset time on the phase ring; comparing the voice energy envelope of the expiration phases with a preset ideal envelope and calculating the ratio between the two to obtain an energy output ratio, wherein the ideal envelope is the rectangle formed by holding the expiration peak energy throughout the expiration phases; calculating the comprehensive stress index of the current learner according to the phase consistency index, the energy output ratio and the respiratory frequency baseline value of the current learner; and integrating the inspiration-to-expiration ratio, the syllable rate, the voice sound ratio, the energy output ratio, the coefficient of variation of the inspiration-to-expiration ratio, the phase consistency index and the comprehensive stress index into the breathing voice cooperative efficiency features.
6. The method for evaluating the cognitive state of a learner based on voice according to claim 1, wherein mapping the voice micro-vibration features, the sounding rhythm control features and the breathing voice cooperative efficiency features respectively into initial scores on the four cognitive dimensions of attention, fatigue, psychological load and stress level, and making a comprehensive decision through weighted fusion to obtain the learner's final scores on the four cognitive dimensions, specifically comprises: preprocessing the extracted voice micro-vibration features, sounding rhythm control features and breathing voice cooperative efficiency features respectively; predicting the initial scores of each feature group on the four cognitive dimensions of attention, fatigue, psychological load and stress level using three independent random forest models respectively; and calculating and outputting the final score of each cognitive dimension according to the following formula: $S_i = w_{1,i} s_{1,i} + w_{2,i} s_{2,i} + w_{3,i} s_{3,i}$; wherein $S_i$ is the learner's final score on the $i$-th cognitive dimension; $w_{1,i}$, $w_{2,i}$ and $w_{3,i}$ are the weight coefficients of the voice micro-vibration features, the sounding rhythm control features and the breathing voice cooperative efficiency features, respectively, on the $i$-th cognitive dimension; and $s_{1,i}$, $s_{2,i}$ and $s_{3,i}$ are the initial scores on the $i$-th cognitive dimension predicted by the random forest models from the voice micro-vibration features, the sounding rhythm control features and the breathing voice cooperative efficiency features, respectively.
7. The method for evaluating the cognitive state of a learner based on voice according to claim 1, further comprising the step of determining the cognitive state level of the learner based on the learner's final scores on the four cognitive dimensions, which specifically comprises: comparing the final score of each cognitive dimension with the corresponding preset abnormal interval of that cognitive dimension to obtain the abnormal grade of each cognitive dimension, and outputting the cognitive state level of the learner based on the abnormal grades of the cognitive dimensions and a preset rule.
8. A voice-based learner cognitive state evaluation system, comprising: a voice acquisition module, used for collecting and preprocessing voice data under different sounding tasks, wherein the voice data comprises reference voice, reading voice and answering voice; a feature extraction module, used for separating voice micro-vibration features reflecting voice tremor from the reference voice through LPC and ICA analysis, reconstructing sounding rhythm control features from the reading voice through the Hilbert transform and nonlinear dynamics methods, and extracting breathing voice cooperative efficiency features from the answering voice through energy envelope analysis and phase coordination calculation; and an evaluation module, used for mapping the voice micro-vibration features, the sounding rhythm control features and the breathing voice cooperative efficiency features respectively into initial scores on four cognitive dimensions, namely attention, fatigue, psychological load and stress level, and making a comprehensive decision through weighted fusion to obtain the learner's final scores on the four cognitive dimensions.
9. A computer device comprising a processor and a memory, wherein the processor, when executing a computer program stored in the memory, implements the steps of the speech based learner cognitive state evaluation method according to any one of claims 1-7.
10. A computer readable storage medium for storing a computer program which when executed by a processor carries out the steps of the speech based learner cognitive state evaluation method according to any one of claims 1 to 7.
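
The weighted fusion of claim 6 and the grading of claim 7 can be illustrated with a minimal Python sketch. It assumes that the fusion is a plain weighted sum of the three per-group initial scores for each cognitive dimension. The group names, scores, weights, abnormal intervals and the overall-level rule are hypothetical placeholders, not values taken from the patent.

```python
from typing import Dict, Tuple

DIMENSIONS = ("attention", "fatigue", "psychological_load", "stress_level")

Scores = Dict[str, Dict[str, float]]  # feature group -> dimension -> value


def fuse_scores(initial: Scores, weights: Scores) -> Dict[str, float]:
    """Assumed rule: final[d] = sum over groups g of weights[g][d] * initial[g][d]."""
    return {
        d: sum(weights[g][d] * initial[g][d] for g in initial)
        for d in DIMENSIONS
    }


def grade_dimension(score: float, intervals: Dict[str, Tuple[float, float]]) -> str:
    """Return the label of the first abnormal interval [low, high) containing score."""
    for label, (low, high) in intervals.items():
        if low <= score < high:
            return label
    return "normal"


def overall_level(grades: Dict[str, str]) -> str:
    """Hypothetical preset rule: the worst per-dimension grade decides the level."""
    if "severe" in grades.values():
        return "alert"
    if "mild" in grades.values():
        return "watch"
    return "normal"


# Hypothetical initial scores (0-100), one set per feature-group model.
initial = {
    "micro_vibration": {"attention": 80.0, "fatigue": 20.0,
                        "psychological_load": 40.0, "stress_level": 60.0},
    "rhythm_control": {"attention": 60.0, "fatigue": 40.0,
                       "psychological_load": 40.0, "stress_level": 80.0},
    "breath_synergy": {"attention": 40.0, "fatigue": 60.0,
                       "psychological_load": 80.0, "stress_level": 100.0},
}

# Hypothetical weights; for each dimension they sum to 1 across the three groups.
group_weight = {"micro_vibration": 0.5, "rhythm_control": 0.25, "breath_synergy": 0.25}
weights = {g: {d: w for d in DIMENSIONS} for g, w in group_weight.items()}

# Hypothetical abnormal intervals: low attention is abnormal, while high
# fatigue, psychological load and stress are abnormal.
high_is_abnormal = {"mild": (60.0, 80.0), "severe": (80.0, 101.0)}
intervals = {
    "attention": {"mild": (40.0, 60.0), "severe": (0.0, 40.0)},
    "fatigue": high_is_abnormal,
    "psychological_load": high_is_abnormal,
    "stress_level": high_is_abnormal,
}

final = fuse_scores(initial, weights)
grades = {d: grade_dimension(final[d], intervals[d]) for d in DIMENSIONS}
level = overall_level(grades)
print(final)
print(grades, level)
```

Keeping the weights indexed by both feature group and dimension mirrors the claim wording, in which each feature group has its own weight coefficient on each cognitive dimension.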

Description

Learner cognitive state evaluation method, system, equipment and medium based on voice

Technical Field

The invention belongs to the technical field of remote learning cognitive assessment, and particularly relates to a voice-based learner cognitive state assessment method, system, equipment and medium.

Background

As remote learning and online education become increasingly popular, timely and accurate evaluation of students' cognitive state has become a key challenge. Traditional assessment of students' learning state relies mainly on teacher observation, self-report questionnaires or periodic examinations. These methods are subjective and poorly suited to remote learning scenarios. Existing voice- or video-based analysis technologies usually attend to only a single dimension, such as voice intonation or facial expression, and fail to deeply mine the subtle biomarkers behind these signals, which are generated by the neuromuscular and central nervous control systems and are closely related to cognitive state. As a result, the accuracy of the evaluation results is low, and accurate early warning and intervention cannot be achieved. In addition, because the evaluation is performed in a data black-box manner, the evaluation results are poorly interpretable.

Disclosure of Invention

Aiming at the defects in the prior art, the invention provides a voice-based learner cognitive state evaluation method, system, equipment and medium, which can deeply mine biomarkers related to cognitive state from collected voice data, thereby improving the accuracy and interpretability of learner cognitive state evaluation in remote learning.
The invention provides the following technical scheme: In a first aspect, a method for evaluating the cognitive state of a learner based on voice is provided, comprising the steps of: collecting and preprocessing voice data under different sounding tasks, wherein the voice data comprises reference voice, reading voice and answering voice; separating voice micro-vibration features reflecting voice tremor from the reference voice through LPC and ICA analysis, reconstructing sounding rhythm control features from the reading voice through the Hilbert transform and nonlinear dynamics methods, and extracting breathing voice cooperative efficiency features from the answering voice through energy envelope analysis and phase coordination calculation; and mapping the voice micro-vibration features, the sounding rhythm control features and the breathing voice cooperative efficiency features respectively into initial scores on four cognitive dimensions, namely attention, fatigue, psychological load and stress level, and making a comprehensive decision through weighted fusion to obtain the learner's final scores on the four cognitive dimensions.
Optionally, the method further comprises the steps of: performing real-time validity analysis on each type of collected voice data; if a preset validity condition is met, retaining the current voice data; and if the preset validity condition is not met, deleting the current voice data and starting re-collection of the voice data.
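
The optional validity analysis (retain the recording if it passes, otherwise delete it and collect again) can be sketched as follows. This passage does not state the preset validity condition, so the checks below (minimum duration, minimum RMS energy, maximum clipped fraction), their thresholds, the attempt limit and the names `is_valid` and `collect_valid` are all hypothetical assumptions for illustration only.

```python
import math
from typing import Callable, List, Optional, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude; 0.0 for an empty recording."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def is_valid(
    samples: Sequence[float],
    sample_rate: int,
    min_seconds: float = 2.0,         # assumed threshold
    min_rms: float = 0.01,            # assumed threshold
    max_clip_fraction: float = 0.01,  # assumed threshold
) -> bool:
    """Hypothetical validity condition: long enough, loud enough, not clipped."""
    if len(samples) < min_seconds * sample_rate:
        return False
    if rms(samples) < min_rms:
        return False
    clipped = sum(1 for x in samples if abs(x) >= 0.999)
    return clipped / len(samples) <= max_clip_fraction


def collect_valid(
    record: Callable[[], List[float]],
    sample_rate: int,
    max_attempts: int = 3,
) -> Optional[List[float]]:
    """Record until a take passes the validity check or attempts run out."""
    for _ in range(max_attempts):
        samples = record()
        if is_valid(samples, sample_rate):
            return samples  # retain the current voice data
        # Otherwise the take is discarded and collection starts again.
    return None


# Demo with a fake recorder: a silent take, a too-short take, then a usable tone.
SAMPLE_RATE = 100  # deliberately tiny so the demo stays fast
tone = [0.5 * math.sin(2 * math.pi * 5 * n / SAMPLE_RATE) for n in range(300)]
takes = iter([[0.0] * 300, tone[:50], tone])
kept = collect_valid(lambda: next(takes), SAMPLE_RATE)
print("retained samples:", None if kept is None else len(kept))
```

Passing the recorder in as a callable keeps the retry logic independent of any particular audio capture library.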
Optionally, separating the voice micro-vibration features reflecting voice tremor from the reference voice through LPC and ICA analysis specifically comprises the following steps: performing LPC analysis on each frame of the reference voice to obtain a single-channel residual signal; constructing a multi-channel observation signal from the single-channel residual signal through time-delay embedding; performing ICA on the multi-channel observation signal to obtain a plurality of independent components; selecting, according to the physiological characteristics of voice micro-tremor, the component with the most concentrated energy from all the independent components as the micro-tremor signal; and extracting the voice micro-vibration features of the micro-tremor signal, wherein the voice micro-vibration features comprise tremor energy, frequency band energy ratio and sample entropy.
Optionally, reconstructing the sounding rhythm control features from the reading voice through the Hilbert transform and nonlinear dynamics methods specifically comprises the following steps: extracting a voice envelope signal from the reading voice through the Hilbert transform; extracting the scaling exponent of the voice envelope signal by detrended fluctuation analysis; plotting the multi-scale entropy (MSE) curve of the voice envelope signal by multi-scale entropy analysis, and extracting the area under the curve and the complexity index; computing a recurrence plot of the voice envelope signal by phase space reconstruction, and extracting the determinism and laminarity of the recurrence plot by recurrence quantification analysis; Ca