CN-122022342-A - Learning path real-time self-adaptive adjustment method and system based on multi-modal behavior perception


Abstract

The invention discloses a method and system for real-time adaptive adjustment of learning paths based on multi-modal behavior perception, belonging to the intersection of educational technology and artificial intelligence. In the method, a terminal layer collects multi-modal data of a learner in real time, such as visual, voice, physiological and interactive-behavior data, and applies privacy processing locally. An edge layer adaptively fuses the multi-modal features through a dynamic attention mechanism and uses a lightweight meta reinforcement learning model, combined with a local knowledge graph, to generate millisecond-level path adjustment suggestions. A cloud layer aggregates global information through federated learning to optimize the model, uses a meta reinforcement learning framework to optimize a multi-objective path strategy, and periodically and securely synchronizes the optimized parameters to the edge layer. Through this terminal-edge-cloud three-level collaborative architecture, the invention addresses the shortcomings of conventional systems in real-time performance, multi-modal fusion accuracy and privacy protection, achieves dynamic, personalized learning path adjustment with low latency, high accuracy and strong security, and significantly improves learning efficiency and user experience.

Inventors

HU BIN
Lv juan

Assignees

关保网络安全技术(广东)有限公司

Dates

Publication Date: 20260512
Application Date: 20260131

Claims (10)

1. A method for real-time adaptive adjustment of a learning path based on multi-modal behavior perception, characterized by comprising the following steps: S1, acquiring multi-modal raw data of a learner in real time through a terminal device, and performing preprocessing, feature extraction and privacy processing locally on the terminal device to obtain a desensitized feature vector for each modality, wherein the multi-modal raw data comprises visual data, voice data, physiological signal data and learning interaction behavior data; S2, transmitting the desensitized feature vectors to an edge computing node, wherein the edge computing node uses a dynamic attention multi-modal fusion network to perform weighted fusion of the desensitized feature vectors and generate a comprehensive state representation reflecting the learner's current cognitive and emotional state; S3, the edge computing node performs real-time inference through a lightweight meta reinforcement learning decision model based on the comprehensive state representation and a preset local knowledge graph, generates an instant path adjustment suggestion, and feeds it back to the terminal device for execution; S4, a cloud server aggregates model updates from the edge computing nodes through a federated learning framework, and uses a meta reinforcement learning framework to iteratively optimize a global learning path adjustment strategy, taking as the optimization target a multi-objective reward function that combines learning efficiency, knowledge mastery, user satisfaction and cognitive load; and S5, the cloud server encrypts the optimized global model parameters and transmits them to each edge computing node, and each edge computing node updates its local model with the transmitted parameters, forming a closed-loop optimization system of "edge real-time decision, cloud global evolution, secure parameter synchronization".
2. The method according to claim 1, wherein in step S1 the privacy processing specifically comprises: applying a local differential privacy mechanism to raw biometric data, including face images and raw audio, by adding Laplace noise or Gaussian noise satisfying ε-differential privacy as a perturbation to generate the desensitized modal feature vectors; and applying generalization and k-anonymization to the learning interaction behavior log data.
3. The method according to claim 1, wherein the dynamic attention multi-modal fusion network in step S2 comprises: a feature encoding branch for each modality, which maps the input desensitized feature vector to a high-dimensional representation space; a dynamic attention weight calculation module, which takes the outputs of the feature encoding branches as input and, through a small neural network consisting of a fully connected layer and an activation function, calculates a set of attention weight coefficients $\alpha_i$ that depend on the input at the current moment, wherein $i$ denotes the modality index and $\sum_i \alpha_i = 1$; and a feature fusion layer, which performs the weighted summation $F = \sum_i \alpha_i h_i$, wherein $h_i$ is the encoded feature output of the $i$-th modality and $F$ is the final comprehensive state representation.
4. The method according to claim 3, wherein the input of the dynamic attention weight calculation module further comprises a context state vector encoding one or more of a learning phase, an activity type and a historical state sequence, which steers the attention weights toward the modalities more relevant to the current learning context.
5. The method according to claim 1, wherein the lightweight meta reinforcement learning decision model in step S3 is obtained by compressing a standard deep Q-network or policy gradient network, the compression comprising: knowledge distillation, in which a teacher network trained in the cloud is used to train an edge student network; parameter quantization, in which network weights and activation values are converted from 32-bit floating-point numbers to 8-bit integers; and network pruning, in which connections or entire neuron channels with absolute weights below a threshold are removed.
6. The method according to claim 1, wherein in step S4 the federated learning framework adopts a performance-adaptive federated averaging algorithm which, during parameter aggregation, dynamically calculates and assigns aggregation weights according to the local data volume reported by each edge computing node, a model update quality index and the availability and reliability of the node, so as to improve the robustness and convergence efficiency of the global model.
7. The method according to claim 1, wherein the multi-objective reward function $R$ in step S4 is expressed as a weighted combination $R = w_1 R_{eff} + w_2 R_{know} + w_3 R_{sat} + w_4 R_{load}$, wherein $R_{eff}$ is calculated from the number of knowledge points mastered per unit time, $R_{know}$ is calculated from the accuracy and consolidation shown in exercises and tests, $R_{sat}$ is calculated from positive interaction events and negative feedback, and $R_{load}$ is calculated from a cognitive load value estimated from physiological signals and a behavior stagnation index; the weight parameters $w_1$ to $w_4$ may be preset based on educational expert experience or dynamically adjusted through online Bayesian optimization.
8. The method according to claim 1, wherein in step S5 the parameters are transmitted between the cloud server and the edge computing nodes by compressing the parameter tensors using a binary encoding format based on Google Protocol Buffers and sending them over an ultra-reliable low-latency communication (URLLC) slice of a 5G network or over a dedicated time-sensitive networking (TSN) channel, and the parameters are encrypted using an RLWE-based homomorphic encryption algorithm.
9. The method according to claim 1, wherein the types of instant path adjustment suggestion generated in step S3 include: recommending learning resources of a specific format, adjusting the difficulty level of subsequent learning tasks, suggesting a change in learning pace, planning review intervals, and triggering a prompt signal for collaboration or teacher intervention.
10. A system for real-time adaptive adjustment of a learning path based on multi-modal behavior perception, for implementing the method of any one of claims 1-9, employing a terminal-edge-cloud three-level collaborative architecture and comprising: a terminal layer, composed of learning terminals integrating multi-modal data acquisition sensors and a local processing unit, for executing step S1; an edge layer, composed of network edge servers or intelligent gateways deployed close to the terminal layer and hosting the dynamic attention multi-modal fusion network, the lightweight meta reinforcement learning decision model and a local knowledge graph database, for executing steps S2 and S3; and a cloud layer, composed of a high-performance computing cluster deployed in a data center and comprising a federated learning aggregator and a meta reinforcement learning optimizer, for executing the cloud functions of steps S4 and S5; wherein the terminal layer, the edge layer and the cloud layer are connected through wired and wireless communication networks to form a closed-loop system in which data, control and parameter flows work cooperatively.
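
Illustrative sketch (not part of the claims): the weighted fusion of claim 3 can be pictured as attention weights that sum to one, multiplied into the per-modality encodings. The claim text does not name the activation function; Softmax is assumed here because it yields weights that sum to one. The function names, the scoring vectors and the toy inputs are all hypothetical.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Normalize raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(
    encoded: Dict[str, List[float]],
    score_vectors: Dict[str, List[float]],
) -> Tuple[Dict[str, float], List[float]]:
    """Return (attention weights per modality, fused state vector).

    encoded:       encoded feature output of each modality, all of equal length
    score_vectors: hypothetical learned vectors standing in for the fully
                   connected layer; one scalar score is produced per modality
    """
    names = sorted(encoded)
    # Fully connected layer reduced to one dot product per modality (assumption).
    scores = [
        sum(w * x for w, x in zip(score_vectors[n], encoded[n])) for n in names
    ]
    alphas = softmax(scores)  # assumed activation
    dim = len(encoded[names[0]])
    # Weighted summation of the modality encodings.
    fused = [
        sum(a * encoded[n][d] for a, n in zip(alphas, names)) for d in range(dim)
    ]
    return dict(zip(names, alphas)), fused


if __name__ == "__main__":
    toy_encoded = {
        "vision": [1.0, 0.0],
        "voice": [0.0, 1.0],
        "physio": [1.0, 1.0],
        "interaction": [0.0, 0.0],
    }
    toy_scores = {name: [0.5, 0.5] for name in toy_encoded}
    weights, state = fuse_modalities(toy_encoded, toy_scores)
    print(weights, state)
```

Because the weights are recomputed from the current input, a modality whose encoding scores higher at a given moment contributes more to the fused state, which is the behavior the claim describes as dynamic attention.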

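Illustrative sketch (not part of the claims): the Laplace perturbation named in claim 2 could look roughly like the following on a terminal device. The clipping bound, the function names and the choice to calibrate noise to the L1 sensitivity of the clipped vector are assumptions made for this example; the claims do not specify them.

```python
import random
from typing import List, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def privatize_features(
    features: List[float],
    epsilon: float,
    clip: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Perturb a feature vector locally before it leaves the device.

    Each coordinate is clipped to [-clip, clip], so two arbitrary inputs differ
    by at most 2 * clip * len(features) in L1 norm. Laplace noise with scale
    sensitivity / epsilon then gives epsilon-differential privacy for the vector.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    clipped = [max(-clip, min(clip, x)) for x in features]
    sensitivity = 2.0 * clip * len(clipped)  # assumed L1 sensitivity bound
    scale = sensitivity / epsilon
    return [x + laplace_noise(scale, rng) for x in clipped]


if __name__ == "__main__":
    raw = [0.2, -0.7, 1.5]  # hypothetical face-embedding coordinates
    print(privatize_features(raw, epsilon=1.0, rng=random.Random(0)))
```

A smaller epsilon gives stronger privacy but noisier features, so in practice the budget would have to be balanced against the recognition accuracy needed at the edge layer.
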
Description

Learning path real-time self-adaptive adjustment method and system based on multi-modal behavior perception

Technical Field

The invention relates to the intersection of educational technology and artificial intelligence, and in particular to personalized learning support technology. Specifically, the invention provides a method and system for real-time adaptive adjustment of learning paths based on multi-modal behavior perception. By acquiring and fusing a learner's multi-modal data (visual, voice, physiological and behavioral) in real time, and by using a terminal-edge-cloud collaborative computing architecture together with meta reinforcement learning, the invention achieves millisecond-level dynamic, personalized adjustment of the learning path while protecting privacy.

Background

With the development of educational informatization, personalized learning has become key to improving the quality of education. Its core is to dynamically adjust learning content, sequence and difficulty according to the learner's real-time state, that is, to make the learning path adaptive. However, existing adaptive learning path systems face three significant challenges in practice.

First, real-time performance is severely inadequate. Most systems use centralized cloud processing: learner state data (e.g., video, audio) is uploaded to a remote server for analysis, and the decision results are then returned to the terminal, which typically incurs delays of hundreds of milliseconds or even seconds. Studies have shown that feedback delays exceeding 200 milliseconds can significantly distract the learner and undermine learning immersion, thereby reducing learning efficiency.

Second, multi-modal data fusion is inefficient.
The learning state is a combined expression of multidimensional factors such as cognition, emotion and behavior, so judging it requires fusing multi-modal information such as facial expression, voice intonation, physiological signals and interactive behavior. Existing methods mostly rely on simple feature concatenation or late decision fusion, and cannot effectively capture the complex nonlinear associations and complementarity among modalities. For heterogeneous data with different feature scales and sampling rates, conventional methods often fuse poorly and achieve limited state recognition accuracy because of mismatched feature spaces, sensitivity to noise and similar problems.

Third, there are privacy leakage risks and behavioral interference. Multi-modal data, in particular biometric features such as face and voice, is sensitive personal information, and uploading the raw data to the cloud carries a risk of leakage. In addition, ubiquitous sensing devices can easily cause learners "anxiety about being observed", making their behavior unnatural, so that the collected data loses ecological validity and the accuracy of path adjustment suffers.

In recent years, edge computing and federated learning have offered new ideas for addressing these problems. Edge computing moves computation tasks down to the network edge, which can greatly reduce latency. Federated learning allows models to be trained collaboratively without local data leaving the device, which helps protect privacy.
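
Illustrative sketch (not from the original text): the federated idea described above can be pictured as each node sending only model parameters and a sample count, which a server then averages. This is plain sample-weighted federated averaging with hypothetical names and toy numbers; it is not the adaptive weighting scheme of the invention.

```python
from typing import List, Sequence, Tuple

# Each update is (local model parameters, number of local training samples).
NodeUpdate = Tuple[List[float], int]


def federated_average(updates: Sequence[NodeUpdate]) -> List[float]:
    """Average node parameters, weighting each node by its sample count.

    Only parameters and counts are exchanged; the raw learner data that
    produced them stays on the node.
    """
    total_samples = sum(count for _, count in updates)
    if total_samples <= 0:
        raise ValueError("at least one node must report training samples")
    dim = len(updates[0][0])
    return [
        sum(params[d] * count for params, count in updates) / total_samples
        for d in range(dim)
    ]


if __name__ == "__main__":
    node_a = ([1.0, 2.0], 1)  # hypothetical edge node with 1 sample
    node_b = ([3.0, 4.0], 3)  # hypothetical edge node with 3 samples
    print(federated_average([node_a, node_b]))
```

Weighting by sample count means a node that trained on more data pulls the global model further toward its own parameters.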
However, existing educational application schemes that combine these techniques still have obvious limitations: (1) the computing capacity of edge devices is limited, and there is a lack of efficient, lightweight multi-modal fusion algorithms that can run in real time; (2) cloud-side global model optimization and edge-side local instant decision-making are not closely coordinated, so policy updates lag and it is difficult to adapt quickly to individual differences and dynamic scenarios; and (3) the system optimization target is narrow, often focusing only on knowledge mastery while ignoring the balance among multi-dimensional targets such as learning efficiency, emotional experience and cognitive load. There is therefore an urgent need for a new adaptive learning path method and system that deeply integrates multi-modal sensing, edge intelligence, privacy-preserving computation and multi-objective optimization, so as to deliver low-latency, high-precision personalized learning guidance while ensuring privacy and security.

Disclosure of Invention

Object of the invention

The invention aims to overcome the shortcomings of the prior art by providing a method and system for real-time adaptive adjustment of learning paths based on multi-modal behavior perception. The method aims to achieve millisecond-level response, high-precision judgment and strong privacy protection for learning path adjustment through an innovative terminal-edge-cloud three-level collaborative architecture, a dynamic attention multi-modal fusion mechanism and a global optimization framework fusing federated learning and meta reinforcemen