CN-122023074-A - Multi-modal fusion-based programming education home-school feedback generation method and system
Abstract
The application relates to the field of education information technology, and in particular to a multi-modal fusion-based programming education home-school feedback generation method and system. The method first collects multi-modal data from students during the programming learning process, comprising at least time-stamped expression image data and behavior log data. It then extracts expression feature vectors and behavior-time association feature vectors, dynamically assigns fusion weights based on scene judgment, and performs weighted fusion to generate multi-modal fusion features. From these fusion features, it determines the student's technical short boards (skill weaknesses), learning state, and time efficiency, matches specific suggestions from a feedback suggestion library, and generates a personalized feedback report that is pushed to a terminal. By fusing multidimensional process data such as expression, behavior, and time, and by applying a scene-based weight assignment and conflict correction mechanism, the application achieves accurate learning-condition diagnosis and personalized guidance, effectively addressing the feedback lag, generic content, and insufficient precision of the prior art.
Inventors
- LU XIAOYU
- MIAO XINYU
- ZHANG SHUYUAN
- ZHANG YANCHAO
Assignees
- 丽水学院 (Lishui University)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-27
Claims (10)
- 1. A multi-modal fusion-based programming education home-school feedback generation method, characterized by comprising the following steps: S1, acquiring multi-modal data of students in a programming learning process, the multi-modal data comprising at least expression image data, behavior log data and time dimension index data, wherein the expression image data and the behavior log data carry corresponding time stamps; S2, processing the expression image data to extract expression feature vectors, and performing association analysis on the behavior log data and the time dimension index data to extract behavior-time association feature vectors; S3, determining a current programming learning scene based on a preset scene judgment rule, and assigning fusion weights to the expression feature vectors and the behavior-time association feature vectors according to the scene; S4, performing weighted fusion of the expression feature vectors and the behavior-time association feature vectors according to the fusion weights to generate multi-modal fusion features; S5, determining at least one of a technical short board, a learning state and a time efficiency of the student according to the multi-modal fusion features, and generating a personalized feedback report by matching from a preset feedback suggestion library according to the at least one of the technical short board, the learning state and the time efficiency; and S6, pushing the personalized feedback report to a parent terminal.
- 2. The method according to claim 1, wherein in step S1: the expression image data are collected through a camera and comprise at least facial image frames and corresponding time stamps; the behavior log data are obtained through a programming software interface and comprise at least a module drag type, a parameter setting value, a control button click event, and a feedback state of the hardware device after program execution; and the time dimension index data comprise at least one of average time consumed per single-step operation, average debugging interval, total task time, total pause duration, and total number of parameter modifications.
- 3. The method of claim 1, wherein the multi-modal data further comprise code quality dimension data, and the method further comprises: determining the technical short board of the student based on the code quality dimension data.
- 4. The method according to claim 1, wherein in step S2: the expression image data are processed through a convolutional neural network model to extract the expression feature vectors; and the behavior log data and the time dimension index data are processed through a multi-layer perceptron model to extract the behavior-time association feature vectors.
- 5. The method according to claim 1, wherein in step S3 the assigning of fusion weights according to the scene specifically comprises: if the scene is a programming debugging scene, assigning the weight of the behavior-time association feature vector as a and the weight of the expression feature vector as 1-a, wherein a ranges from 0.4 to 0.6; and if the scene is a problem help-seeking scene, assigning the weight of the expression feature vector as b and the weight of the behavior-time association feature vector as 1-b, wherein b ranges from 0.5 to 0.7.
- 6. The method of claim 5, wherein step S3 further comprises a conflict correction step: when assigning the fusion weights, if the expression feature vector and the behavior-time association feature vector conflict under the same time stamp, the weights of the behavior-time association feature vector and the expression feature vector under that time stamp are reassigned based on a preset conflict weight adjustment strategy; the conflict includes the case where the learning state indicated by the expression feature vector conflicts with the operation efficiency indicated by the behavior-time association feature vector under the same time stamp.
- 7. The method according to claim 1, wherein in step S5 the feedback suggestion library stores mapping relations between a plurality of technical short boards and specific training method suggestions, and generating the personalized feedback report by matching comprises: according to the determined technical short board, retrieving the corresponding specific training method suggestion from the mapping relations and, in combination with the learning state and/or the time efficiency, generating a report text and a visual chart.
- 8. A multi-modal fusion-based programming education home-school feedback generation system for implementing the method of any one of claims 1 to 7, the system comprising: a data acquisition module for acquiring multi-modal data of students in the programming learning process, the multi-modal data comprising at least expression image data, behavior log data and time dimension index data, wherein the expression image data and the behavior log data carry corresponding time stamps; a feature extraction module connected with the data acquisition module, for processing the expression image data to extract expression feature vectors and for performing association analysis on the behavior log data and the time dimension index data to extract behavior-time association feature vectors; a multi-modal feature fusion module connected with the feature extraction module, for determining a current programming learning scene based on a preset scene judgment rule, assigning fusion weights to the expression feature vectors and the behavior-time association feature vectors according to the scene, and performing weighted fusion to generate multi-modal fusion features; a feedback report generation module connected with the multi-modal feature fusion module, for determining at least one of a technical short board, a learning state and a time efficiency of the student according to the multi-modal fusion features, and generating a personalized feedback report by matching from a preset feedback suggestion library according to the at least one of the technical short board, the learning state and the time efficiency; and a report pushing module connected with the feedback report generation module, for pushing the personalized feedback report to the parent terminal.
- 9. The system of claim 8, wherein the data acquisition module comprises: an expression acquisition unit for acquiring facial image frames through the camera; and a behavior acquisition unit for acquiring operation logs and hardware feedback data through the programming software interface; and wherein the feature extraction module comprises: an expression feature extraction unit comprising a convolutional neural network model for processing the facial image frames; and a behavior-time feature extraction unit comprising a multi-layer perceptron model for jointly processing the operation logs, the hardware feedback data and the time dimension index data.
- 10. An electronic device, comprising a processor and a memory storing a program executable by the processor, wherein the processor is configured to implement the multi-modal fusion-based programming education home-school feedback generation method according to any one of claims 1 to 7 by running the program in the memory.
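As an illustration of claims 4 through 6, the sketch below shows how scene-based weight assignment, conflict correction, and weighted fusion could fit together. This is not the patent's actual implementation: the scene labels, the midpoint weight choices within the claimed ranges, the conflict test, and the 0.1 adjustment step are all assumptions made for demonstration.

```python
# Minimal sketch of scene-based fusion weighting (claims 5-6) and weighted
# fusion (claim 4). Scene names, weight midpoints, and the conflict
# adjustment strategy are illustrative assumptions, not from the patent.

def assign_fusion_weights(scene: str) -> tuple[float, float]:
    """Return (expression_weight, behavior_time_weight) for a scene.

    Claim 5: in a debugging scene the behavior-time weight a lies in
    [0.4, 0.6]; in a help-seeking scene the expression weight b lies in
    [0.5, 0.7]. The midpoint values below are illustrative choices.
    """
    if scene == "debugging":
        a = 0.5                      # behavior-time weight, a in [0.4, 0.6]
        return 1.0 - a, a
    if scene == "help_seeking":
        b = 0.6                      # expression weight, b in [0.5, 0.7]
        return b, 1.0 - b
    return 0.5, 0.5                  # assumed default for unlisted scenes

def fuse(expr_vec, behav_vec, w_expr, w_behav):
    """Weighted fusion of two equal-length feature vectors (claim 4 style)."""
    return [w_expr * e + w_behav * b for e, b in zip(expr_vec, behav_vec)]

def resolve_conflict(w_expr, w_behav, expr_state, behav_efficiency):
    """Claim-6-style conflict correction under an assumed strategy: if the
    expression modality indicates a negative state while behavior indicates
    high efficiency (or vice versa), shift weight toward the behavior
    modality by a fixed step."""
    if (expr_state == "negative") != (behav_efficiency == "low"):
        shift = 0.1                  # assumed adjustment step
        return max(w_expr - shift, 0.0), min(w_behav + shift, 1.0)
    return w_expr, w_behav
```

For example, `assign_fusion_weights("debugging")` yields `(0.5, 0.5)` with the midpoint choice above, and `resolve_conflict(0.6, 0.4, "negative", "high")` shifts the pair to `(0.5, 0.5)` because a negative expression state contradicts high operating efficiency at the same time stamp.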
Description
Multi-modal fusion-based programming education home-school feedback generation method and system

Technical Field

The application relates to the field of education information technology, and in particular to a multi-modal fusion-based programming education home-school feedback generation method and system.

Background

With the growing popularity of programming education for teenagers, parents increasingly need insight into their children's learning process. However, current mainstream home-school communication methods, such as delivering contest results and assignment scores, suffer from information lag and one-sided content. Parents are unaware of their children's specific difficulties in programming (e.g., recurrent logic errors, emotional anxiety during debugging), operating habits (e.g., frequently modifying parameters), and time efficiency (e.g., single-step operations taking too long), and thus find it difficult to provide targeted assistance. Some prior-art schemes attempt to analyze students' learning state, but they have obvious limitations. Systems based on code text analysis generate reports by analyzing code submitted by students, detecting syntax errors, counting code complexity, and the like. Their drawback is reliance on the single modality of code: they cannot capture students' emotional states (such as confusion or anxiety) or behavior sequences during programming, and therefore cannot judge whether the root cause of an error is a logic-understanding problem, a careless slip, or emotional disturbance; the feedback stays at the result level and lacks process insight. Systems based on classroom expression recognition capture students' facial expressions through cameras and judge concentration, confusion, and the like using image recognition algorithms.
Their drawback is that only the single modality of expression is analyzed, without being related to behavior data such as programming operations and code execution results. For example, such a system may recognize that a student is frowning, but cannot distinguish whether the cause is failing to follow the programming logic, failing to follow the teacher's explanation, or something unrelated, resulting in low feedback accuracy and limited guidance value. In summary, the prior art fails to effectively integrate the multidimensional, heterogeneous process data that students generate during programming learning, so the generated home-school feedback suffers from lag, generality, and insufficient accuracy, and cannot meet the core requirement of process-oriented, precise parenting support.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a multi-modal fusion-based programming education home-school feedback generation method and system. Its core purpose is to deeply mine students' programming capability short boards, real learning states, and behavior habits through fusion analysis of multidimensional process data such as expressions, behaviors, and time, so as to generate personalized feedback reports that are data-supported, concrete in content, and highly actionable, thereby improving the pertinence and effectiveness of home-school cooperative education.
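The matching step that turns a diagnosed short board into a concrete suggestion (claim 7) can be pictured as a simple lookup against the feedback suggestion library. The sketch below is illustrative only: the library entries, short-board labels, and report wording are invented placeholders, not data from the patent, and the visual chart mentioned in claim 7 is omitted.

```python
# Illustrative sketch of claim 7's suggestion-library matching. All labels
# and library contents are hypothetical placeholders.

SUGGESTION_LIBRARY = {
    "loop_logic": "Practice tracing loop variables step by step on paper "
                  "before running the program.",
    "parameter_tuning": "Change one parameter at a time and observe the "
                        "effect of each change before the next edit.",
}

def generate_report(short_board: str, learning_state: str,
                    time_efficiency: str) -> str:
    """Match a detected technical short board to a training suggestion and
    combine it with the learning state and time efficiency into report text."""
    suggestion = SUGGESTION_LIBRARY.get(
        short_board, "No specific suggestion found for this short board.")
    return (f"Weak point: {short_board}. "
            f"Learning state: {learning_state}. "
            f"Time efficiency: {time_efficiency}. "
            f"Suggested training: {suggestion}")
```

A call such as `generate_report("loop_logic", "focused but anxious", "slow single-step operations")` would produce one report line combining the three diagnosed dimensions with the matched suggestion, which the report pushing module would then deliver to the parent terminal.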
In order to achieve the above purpose, the invention adopts the following technical scheme. A multi-modal fusion-based programming education home-school feedback generation method comprises the following steps: S1, acquiring multi-modal data of students in a programming learning process, the multi-modal data comprising at least expression image data, behavior log data and time dimension index data, wherein the expression image data and the behavior log data carry corresponding time stamps; S2, processing the expression image data to extract expression feature vectors, and performing association analysis on the behavior log data and the time dimension index data to extract behavior-time association feature vectors; S3, determining a current programming learning scene based on a preset scene judgment rule, and assigning fusion weights to the expression feature vectors and the behavior-time association feature vectors according to the scene; S4, performing weighted fusion of the expression feature vectors and the behavior-time association feature vectors according to the fusion weights to generate multi-modal fusion features; S5, determining at least one of a technical short board, a learning state and a time efficiency of the student according to the multi-modal fusion features, and generating a personalized feedback report by matching from a preset feedback suggestion library according to the at least one of the technical short board, the learning state and the time efficiency; and S6, pushing the personalized feedback report to a parent terminal. Optionally, in step S1: the expression image data are collected through a camera and comprise at least facial image frames and corresponding time stamps.