CN-122023077-A - Multi-source teaching content automatic explanation method and system based on explanation structure generation

Abstract

The invention discloses a method and system for automatically explaining multi-source teaching content based on explanation structure generation, belonging to the technical field of computer information processing and intelligent teaching. The method comprises: receiving multi-source teaching content such as documents, web pages, or questions; adaptively parsing the content according to its source and structural features to generate a unified content representation; performing semantic analysis on the content representation to construct a teaching explanation structure comprising explanation units with different teaching role attributes; generating explanation text based on the structure; generating explanation audio and visual content using differentiated strategies according to the role attributes of the explanation units; and finally aligning the audio and visual content with the subtitle timeline to synthesize teaching audio and video. In addition, based on the existing explanation context, the system can generate and output interactive extended explanation content in response to user follow-up questions. The invention achieves intelligent, structured, and interactive automatic generation of teaching content, and improves teaching efficiency and experience.
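
The role-driven handling of explanation units summarized above, together with the visual reuse rule of claim 1, step (6), can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `ExplanationUnit` class, the role identifiers `"explain"` and `"question"`, the `VISUAL_ROLES` set, and the placeholder visual names. It only shows one possible way to let a role attribute decide whether a unit gets new visual content or reuses earlier content.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExplanationUnit:
    """One unit of the explanation structure (field names are illustrative)."""
    role: str                     # assumed role identifiers: "explain" or "question"
    text: str                     # explanation text for this unit
    visual: Optional[str] = None  # identifier of the visual shown with this unit


# Assumption: only units with the "explain" role trigger new visual content.
VISUAL_ROLES = {"explain"}


def assign_visuals(units: List[ExplanationUnit],
                   default_visual: str = "title_card") -> List[ExplanationUnit]:
    """Generate a visual for preset roles; otherwise reuse the previous one.

    Units that appear before any generated visual fall back to a
    pre-generated default, mirroring the "preset visual content" option.
    """
    previous = default_visual
    for index, unit in enumerate(units):
        if unit.role in VISUAL_ROLES:
            unit.visual = f"slide_{index}"  # placeholder for a real generator
            previous = unit.visual
        else:
            unit.visual = previous          # reuse instead of regenerating
    return units


structure = assign_visuals([
    ExplanationUnit("question", "What do you already know about fractions?"),
    ExplanationUnit("explain", "A fraction describes a part of a whole."),
    ExplanationUnit("question", "Can you name a fraction you use every day?"),
])
for unit in structure:
    print(unit.role, unit.visual)
```

In this sketch the opening question is shown over the default card, and the closing question keeps the slide generated for the preceding explanation unit.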

Inventors

BIAN NING
WANG YU

Assignees

北京游知智学科技有限公司

Dates

Publication Date: 20260512
Application Date: 20260202

Claims (8)

1. An automatic explanation method for multi-source teaching content based on explanation structure generation, characterized by comprising the following steps:
(1) receiving teaching content input, wherein the teaching content input comprises at least one of a document file, a web page link, or question content entered by a user;
(2) judging the parsability of the teaching content according to the source type and structural features of the input teaching content, selecting a corresponding content parsing strategy according to the judgment result, parsing the teaching content, and generating a unified teaching content representation;
(3) performing semantic analysis and content understanding on the teaching content using a content understanding and generation module, and generating a teaching explanation structure for organizing and driving the expression of the teaching content, wherein the teaching explanation structure comprises at least two types of explanation units, the explanation units comprise at least a first explanation unit and a second explanation unit, and different explanation units correspond to different teaching role attributes;
(4) using the teaching explanation structure as a control structure, generating corresponding explanation text content according to the order of the explanation units;
(5) according to the teaching role attributes of the explanation units, adopting different audio generation strategies for different explanation units and generating corresponding explanation audio;
(6) according to the teaching role attributes of the explanation units, generating corresponding visual content only for explanation units with preset teaching role attributes, and, for explanation units for which no visual content is generated, reusing the visual content of a preceding explanation unit or pre-generated visual content;
(7) performing time sequence alignment on the explanation audio, the visual content, and the explanation text, and, driven by the teaching explanation structure, generating a multi-modal explanation output result for teaching display.
2. The method of claim 1, wherein the structural features include at least the number of pages of a document, the file size, and the text extractability ratio, and wherein, based on the structural features, it is determined whether the teaching content is image content (e.g., a scanned document or picture) consisting mainly of text that cannot be directly edited, whereby a text parsing strategy or an image parsing strategy is selected.
3. The method of claim 1, wherein the first explanation unit is used to represent explanation content and the second explanation unit is used to represent question or interactive content.
4. The method of claim 1, wherein the teaching explanation structure is represented as structured data, in which each explanation unit is populated with at least role identification information and the corresponding explanation text content.
5. The method of claim 1, wherein the display duration of each subtitle is dynamically calculated according to the number of characters of the corresponding semantic unit in the explanation text, so that the subtitles are presented in synchronization with the explanation audio.
6. The method of claim 1, wherein the audio-video output result is a continuous audio-video file for online teaching presentation or offline playback.
7. An interactive teaching explanation extension method based on an explanation structure, characterized by comprising the following steps:
(1) acquiring a teaching explanation structure and a multi-modal teaching explanation output result generated according to the method of any one of claims 1-6;
(2) receiving inquiry information raised by a user regarding the output result;
(3) constructing context information based on the teaching explanation structure and the generated explanation content, and generating extended explanation content corresponding to the inquiry information;
(4) performing audio generation, visual generation, and time sequence alignment on the extended explanation content, and generating and outputting supplementary teaching explanation audio and video.
8. An automatic teaching explanation generation and interaction system, comprising a content receiving module, a content parsing module, an explanation structure generation module, an audio generation module, a visual generation module, a synthesis output module, and an interaction processing module, wherein the interaction processing module is used to implement the method of any one of claims 1 to 7.
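
The strategy selection of claim 2 can be illustrated with a minimal sketch. The claims name the structural features (page count, file size, text extractability ratio) but give no thresholds or decision rule, so the `StructuralFeatures` class, the `select_parsing_strategy` function, and the 0.5 cutoff below are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class StructuralFeatures:
    """Structural features named in claim 2 (field names are illustrative)."""
    page_count: int
    file_size_bytes: int
    text_extractable_ratio: float  # share of content with extractable text, 0.0 to 1.0


def select_parsing_strategy(features: StructuralFeatures,
                            min_text_ratio: float = 0.5) -> str:
    """Return "text" or "image" as the parsing strategy for a document.

    The 0.5 cutoff is an assumed value; the claims do not specify one.
    """
    if features.page_count <= 0 or features.file_size_bytes <= 0:
        raise ValueError("content is empty and cannot be parsed")
    # Little extractable text suggests a scanned document or picture,
    # so an image-based (for example OCR-style) strategy is chosen.
    if features.text_extractable_ratio < min_text_ratio:
        return "image"
    return "text"


scanned = StructuralFeatures(page_count=12, file_size_bytes=8_000_000,
                             text_extractable_ratio=0.05)
native = StructuralFeatures(page_count=12, file_size_bytes=400_000,
                            text_extractable_ratio=0.97)
print(select_parsing_strategy(scanned))  # image
print(select_parsing_strategy(native))   # text
```

A real system would likely combine the features (for example, a large file size per page together with a low text ratio), but a single cutoff is enough to show the branching.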

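The character-count-based subtitle timing of claim 5 can be sketched as follows. The claim states only that the display duration depends on the number of characters of each semantic unit; splitting a known audio duration in proportion to those counts is an assumption of this sketch, and `subtitle_timings` is a hypothetical helper name.

```python
from typing import List, Tuple


def subtitle_timings(segments: List[str],
                     audio_duration: float) -> List[Tuple[float, float, str]]:
    """Split audio_duration across segments in proportion to character count.

    Returns (start_seconds, end_seconds, text) for each segment.
    Whitespace is ignored when counting characters (an assumed convention).
    """
    counts = [len("".join(segment.split())) for segment in segments]
    total = sum(counts)
    if total == 0:
        return []
    timings = []
    start = 0.0
    for text, count in zip(segments, counts):
        end = start + audio_duration * count / total
        timings.append((round(start, 3), round(end, 3), text))
        start = end
    return timings


for start, end, text in subtitle_timings(
        ["A fraction describes a part of a whole.",
         "The top number is the numerator.",
         "The bottom number is the denominator."],
        audio_duration=9.0):
    print(f"{start:6.2f} -> {end:6.2f}  {text}")
```

Because each subtitle starts where the previous one ends, the subtitles cover the whole audio clip without gaps or overlaps.
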
Description

Multi-source teaching content automatic explanation method and system based on explanation structure generation

Technical Field

The invention relates to the technical field of computer information processing, in particular to a method and system for automatically understanding content, constructing an explanation structure, and synthesizing and outputting audio and video for teaching scenarios, and belongs to the technical fields of computer information processing, multimedia information processing, and online education.

Background

With the rapid development of online education, enterprise training, and paid-knowledge scenarios, the production of teaching content is gradually evolving from manual recording toward automation and intelligence. In the prior art, teaching content is usually generated by means of text summarization, speech synthesis, or video template stitching. However, such schemes generally suffer from the following disadvantages:
(1) Low degree of intelligence in content generation: most schemes can only perform mechanical summary extraction or one-way reading of the input text, lack a deep understanding of the internal logic of the knowledge and its teaching context, and cannot automatically construct an explanation structure that conforms to the rules of cognition; a large amount of manual intervention is still needed, and production efficiency is low.
(2) Single teaching role and form: the generated explanation flow is a flat, linear narration that cannot distinguish and simulate the different roles in the teaching process (such as teacher explanation, question-and-answer interaction, and summary emphasis), so the generated content is monotonous and lacks interactivity and guidance.
(3) Fragmented and unsynchronized multi-modal output: audio, visual content (such as slides and pictures), and subtitles are usually generated independently by different modules and then simply spliced together; unified coordination and control based on content logic is lacking, problems such as audio and video being out of sync, pictures being irrelevant to the text, or abrupt transitions often occur, and the continuity and experience of teaching are seriously impaired.
(4) Insufficient adaptability to multi-source content: it is difficult to uniformly process teaching content sources with different formats and structures, such as documents, web pages, and pictures; adaptive parsing and structure extraction capabilities are lacking, which limits the generality of such schemes.

Therefore, an intelligent technical scheme that deeply integrates content understanding with teaching logic and automatically generates high-quality, interactive, multi-modal teaching explanations is urgently needed.

Disclosure of Invention

To solve the problems of the prior art in teaching content generation, such as low degree of intelligence, single role and form, fragmented multi-modal output, and insufficient multi-source adaptability, the invention provides a multi-source teaching content automatic explanation method and system based on an explanation structure. The invention comprises (a) an automatic teaching explanation generation and audio/video output method based on multi-source content, (b) an interactive teaching explanation extension method based on an explanation structure, and (c) an automatic teaching explanation generation and interaction system.

The technical scheme adopted by the invention is summarized as follows:

In a first aspect, the invention provides an automatic teaching explanation generation and audio/video output method based on multi-source content, as shown in Fig. 1, comprising the following steps:
(1) a content receiving step: receiving teaching content input by a user, wherein the teaching content comprises at least one or more of a document file, a web page link, or a user question;
(2) a content parsing step: automatically judging the parsability of the content according to the source type and structural features of the teaching content, selecting a corresponding parsing strategy, and generating a unified teaching content representation;
(3) an explanation structure construction step: performing semantic analysis on the teaching content representation and constructing a teaching explanation structure containing explanation units with at least two different types of teaching role attributes, wherein the explanation units comprise at least a first explanation unit for representing explanation content and a second explanation unit for representing question or interactive content;
(4) generating explanation text according to the order of the explanation units, based on the teaching explanation structure;
(5) generating corresponding explanation audio by adopting different voice generation strategies according to the teaching role attribute