CN-122023081-A - Adaptive assessment method and system for English reading comprehension ability
Abstract
The application relates to the technical field of educational assessment, and in particular to an adaptive assessment method and system for English reading comprehension ability. It addresses the following technical problems: traditional fixed-difficulty tests tend to produce inaccurate ability estimates, conventional adaptive tests characterize item difficulty only coarsely, diagnosis of reading sub-skills is vague, and item bank security receives insufficient consideration. The technical scheme comprises: extracting multidimensional difficulty features of a reading material (vocabulary, syntax, discourse and topic) by using a pre-trained language model; calibrating item parameters by combining an item response theory model with response data; performing dynamic assessment with an adaptive test engine based on Bayesian ability estimation and maximum-information item selection; outputting fine-grained skill mastery levels through a cognitive diagnosis model with a Q-matrix constraint; and integrating an item exposure control strategy to balance measurement precision and item bank security. The system finally outputs the ability level, a skill diagnosis radar chart and personalized reading recommendations.
Inventors
- ZHENG XIAOZHEN
- Chen Yuanyue
- PAN YUZHI
Assignees
- Wenzhou Medical University (温州医科大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260206
Claims (10)
- 1. An adaptive assessment method for English reading comprehension ability, characterized by comprising the following steps: S1, extracting multidimensional difficulty features of the reading materials of reading comprehension test items by using a pre-trained language model, and calibrating a difficulty parameter and a discrimination parameter for each item based on an item response theory model by combining the multidimensional difficulty features with large-scale student response data; S2, initializing a prior distribution of the examinee's ability and entering an assessment loop: presenting an item and recording the response, dynamically updating the posterior distribution of the examinee's ability according to the Bayesian updating rule, and selecting from the item bank, according to the updated posterior distribution and an item exposure control strategy, the item that provides the maximum information for the current ability estimate as the next item; S3, during step S2, synchronously estimating, based on a preset classification system of reading comprehension cognitive skills, the examinee's mastery level in at least two sub-skill dimensions by analyzing the examinee's response patterns on items carrying different skill labels; and S4, outputting an overall ability level based on the final estimate from the ability posterior distribution, generating a skill diagnosis report by incorporating the output of step S3, and recommending personalized reading materials according to the overall ability level and the skill diagnosis result.
- 2. The adaptive assessment method for English reading comprehension ability according to claim 1, wherein in step S1 a linear logistic test model is adopted as the item response theory model, with the multidimensional difficulty features as covariates, and the basic difficulty parameters, the discrimination parameters and the weight coefficients of each difficulty feature on item difficulty are jointly estimated; and the multidimensional difficulty features comprise at least vocabulary complexity, syntactic complexity, discourse structure complexity and topic familiarity.
- 3. The adaptive assessment method for English reading comprehension ability according to claim 2, wherein, among the multidimensional difficulty features, the vocabulary complexity is determined based on the proportion of words that fall outside a target vocabulary list; the syntactic complexity is determined based on the average sentence length or the number of clauses obtained by dependency parsing; the discourse structure complexity is determined based on the density of logical connectives or a discourse semantic coherence score; and the topic familiarity is determined based on the similarity between the topic vector of the text and the topic vector of a reference corpus, computed with a latent Dirichlet allocation model.
- 4. The adaptive assessment method for English reading comprehension ability according to claim 1, wherein in step S2 the maximum information is computed on the basis of Fisher information, and the item exposure control strategy uses an exponential penalty method in which the original information of candidate item i is corrected by a formula (not reproduced in this text) whose quantities are the current ability estimate, the historical exposure count of the item, and a preset penalty factor.
- 5. The adaptive assessment method for English reading comprehension ability according to claim 1, wherein the preset classification system of reading comprehension cognitive skills comprises at least four dimensions: literal comprehension, inference and judgment, main-idea summarization, and viewpoint evaluation; and the fine-grained cognitive skill diagnosis step estimates the mastery level through a diagnostic model with a Q-matrix constraint, the Q-matrix identifying the association between items and cognitive skill dimensions.
- 6. The adaptive assessment method for English reading comprehension ability according to claim 1, wherein the preset termination condition is that the standard error of the ability estimate falls below a first threshold, or that the number of items answered reaches a second threshold.
- 7. The adaptive assessment method for English reading comprehension ability according to claim 1, wherein in step S4 the skill diagnosis report is presented as a radar chart; and the recommendation of personalized reading materials performs difficulty matching based on the multidimensional difficulty features of the candidate materials and the examinee's overall ability level, and performs skill-reinforcement matching based on the skill labels attached to the materials and the examinee's skill weaknesses.
- 8. An adaptive assessment system for English reading comprehension ability, for implementing the method according to any one of claims 1 to 7, comprising: a multidimensional item bank management module for storing items, reading materials, item parameters and the multidimensional difficulty feature vectors extracted by a pre-trained language model; an adaptive test engine module for performing ability estimation and dynamic item selection during the test, comprising a Bayesian ability estimation unit and a maximum-information item selection unit; a fine-grained diagnostic analysis module for running the cognitive diagnosis model and outputting the examinee's estimated mastery level in each sub-skill dimension; a security and exposure control module for monitoring the usage frequency of items and executing the item exposure control strategy; and a report generation and recommendation module for synthesizing the assessment results and generating a visual report and personalized recommendations.
- 9. The adaptive assessment system for English reading comprehension ability according to claim 8, wherein the multidimensional item bank management module integrates a pre-trained language model and a feature extraction unit for automatically extracting and computing the multidimensional difficulty features.
- 10. The adaptive assessment system for English reading comprehension ability according to claim 8, wherein the security and exposure control module is configured to implement an exponential penalty strategy and to interact with the maximum-information item selection unit in the adaptive test engine module so as to impose an exposure penalty constraint in the item selection calculation.
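The assessment loop recited in claims 1, 4 and 6 (Bayesian posterior update, Fisher-information item selection with an exposure penalty, and the two termination conditions) can be illustrated with a minimal Python sketch. Several points are assumptions and not taken from the claims: the response function is a two-parameter logistic (2PL) model, the posterior is approximated on a fixed grid with a standard normal prior, the posterior standard deviation stands in for the standard error, and the penalty takes the form I_i(theta) * exp(-lam * n_i), since the claimed formula is not reproduced in this text. All function names, field names and default thresholds are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def penalized_info(theta, item, lam):
    # ASSUMPTION: exponential penalty on the historical exposure count.
    return fisher_info(theta, item["a"], item["b"]) * math.exp(-lam * item["exposures"])

def select_item(theta, bank, used, lam):
    """Index of the unused item with the largest penalized information."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return max(candidates, key=lambda i: penalized_info(theta, bank[i], lam))

def update_posterior(grid, posterior, item, correct):
    """Bayes rule on a discrete ability grid: posterior is prior times likelihood."""
    unnorm = []
    for theta, weight in zip(grid, posterior):
        p = p_correct(theta, item["a"], item["b"])
        unnorm.append(weight * (p if correct else 1.0 - p))
    total = sum(unnorm)
    return [u / total for u in unnorm]

def posterior_mean_sd(grid, posterior):
    """Posterior mean (ability estimate) and standard deviation (error proxy)."""
    mean = sum(t * w for t, w in zip(grid, posterior))
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, posterior))
    return mean, math.sqrt(var)

def run_cat(bank, answer_fn, lam=0.1, se_threshold=0.3, max_items=20):
    """Run the loop until the error proxy or the item count hits its threshold."""
    grid = [-4.0 + 0.1 * k for k in range(81)]
    prior = [math.exp(-0.5 * t * t) for t in grid]  # assumed N(0, 1) prior
    norm = sum(prior)
    posterior = [w / norm for w in prior]
    used = []
    theta, sd = posterior_mean_sd(grid, posterior)
    while len(used) < min(max_items, len(bank)) and sd >= se_threshold:
        i = select_item(theta, bank, used, lam)
        correct = answer_fn(bank[i])      # present the item, record the response
        bank[i]["exposures"] += 1         # feed the exposure control statistics
        posterior = update_posterior(grid, posterior, bank[i], correct)
        used.append(i)
        theta, sd = posterior_mean_sd(grid, posterior)
    return theta, sd, used
```

Here `answer_fn` is a placeholder for whatever component presents an item and returns whether the examinee answered correctly. A larger `lam` shifts selection away from frequently used items at some cost in information per item, which is the precision versus security trade-off the claims describe.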
Description
Adaptive assessment method and system for English reading comprehension ability
Technical Field
The application relates to the technical field of educational assessment, and in particular to an adaptive assessment method and system for English reading comprehension ability that combines item response theory, a pre-trained language model and fine-grained cognitive diagnosis.
Background
Computerized adaptive testing (CAT) dynamically selects the items that best match the examinee's ability and thereby achieves efficient and accurate ability assessment with fewer items; it has become an important direction of development in educational assessment. In the field of English reading comprehension assessment, existing technical schemes mainly follow two paths.
The first path centers on psychometric models, for example constructing an adaptive test based on the Rasch model or on item response theory (IRT). Related research shows that such models can effectively improve testing efficiency. One study reports that an adaptive test using the Rasch model completes an assessment with only 22.25 items on average and converges rapidly. However, such methods generally rely on human experience or classical statistical methods to calibrate item parameters (e.g., difficulty and discrimination), and lack fine-grained, automated quantitative analysis of the multidimensional linguistic features of the reading material itself (e.g., discourse structure and topic depth), so that item difficulty is characterized neither thoroughly nor objectively. In addition, the output is typically a single ability score, which makes it difficult to diagnose specific reading comprehension sub-skills such as "inference" and "summarization".
The second path attempts to introduce artificial intelligence techniques to deepen the dimensions of analysis.
Chinese invention patent application CN120509406A (published in 2025) discloses an "English reading comprehension analysis system based on deep learning". The system evaluates the degree of comprehension and reading efficiency by analyzing students' answer texts and monitoring their behavior during reading (e.g., duration of attention). This enriches the assessment dimensions, but the assessment model is not tightly integrated with a standard psychometric framework, the ability estimation process may lack rigorous psychometric reliability guarantees, and the approach is difficult to combine with a large-scale, parameterized, standardized item bank for efficient adaptive item selection. In addition, CAT technology has shown significant advantages in assessing basic abilities such as vocabulary recognition; for example, one adaptive test system needs only about 3 minutes to reach validity comparable to that of a conventional long test. However, comprehensive reading comprehension involves abilities at multiple levels, including vocabulary, syntax, discourse and logic, and is far more complex to assess than vocabulary recognition; existing technical schemes still fall short of accurate, fast and secure assessment of such multidimensional abilities.
In summary, although process data analysis has been introduced in the prior art, the following technical problems remain to be solved: 1) item difficulty is modeled along a single dimension, and the deep linguistic features of texts are not systematically integrated to perform automatic, fine-grained calibration of item parameters; 2) the adaptive assessment process is not fully combined with multidimensional, fine-grained cognitive skill diagnosis, so the feedback offers limited guidance for teaching; and 3) an effective control strategy for item exposure in large-scale application scenarios is lacking, making it difficult to balance measurement accuracy against the long-term security of the item bank.
The above background is disclosed only to aid understanding of the inventive concept and technical scheme of the application. It does not necessarily belong to the prior art of the application, and, in the absence of clear evidence that the above content had been disclosed by the filing date of the application, it shall not be used to assess the novelty and inventiveness of the application.
Disclosure of Invention
The application aims to overcome the defects of the prior art and provides an adaptive assessment method and system for English reading comprehension ability. It aims to solve the technical problems that traditional fixed-difficulty tests tend to cause ceiling effects and frustration, and that conventional adaptive tests characterize item difficulty only coarsely, diagnose reading sub-skills vaguely and give insufficient consideration to item bank security, so that the overall ability and sub-skills of students are accurately estimated b