US-20260129084-A1 - Augmentation Of Hybrid Cyber-Physical Environments
Abstract
A system for AI-enhanced educational telepresence integrates artificial intelligence intermediary capabilities into video conferencing platforms to monitor, analyze, and intelligently modify communications between remote participants and classroom environments in real-time. The system comprises audio-video interfaces connected through a communication network, with a central AI intermediary that monitors classroom dynamics and participant engagement. Key capabilities include real-time correction of misinformation, addition of contextual information, deliberate introduction of educational stimuli, translation services, and generation of deepfaked content to modify participant appearance or speech. The system provides engagement monitoring that responds through instructor alerts, generated deepfake questions, or visual stimuli injection. Advanced embodiments incorporate surrogate avatar functionality with private communication channels and coaching capabilities. The system enables AI entities to gain physical world presence through surrogate avatars, allowing artificial intelligence systems to interact with physical environments through human intermediaries.
Inventors
- David R. Bruce
- Neil D.B. Bruce
Assignees
- UNIVERSITY OF OTTAWA
Dates
- Publication Date
- May 7, 2026
- Application Date
- Nov. 6, 2025
Claims (20)
- 1 . A system for AI-enhanced educational telepresence over a video-conferencing system, comprising: a remote audio video interface used by a remote participant at a remote location to communicate over a communication network on which the video-conferencing system operates; at least one student audio video interface used by at least one student in a classroom setting to communicate over the communication network during a session led by an instructor over the communication network; an artificial intelligence (AI) intermediary configured to monitor and process communications between the remote participant, the instructor, and the at least one student over the communication network; wherein the AI intermediary is configured to analyze classroom dynamics and participant engagement in real-time by analyzing audio and video images communicated on the communication network; wherein the AI intermediary is configured to modify at least one of audio or video content transmitted over the communication network based on the analyzed classroom dynamics to facilitate interaction between the remote participant and others of the at least one student in the classroom setting.
- 2 . The system of claim 1 , further comprising an instructor audio video interface used by an instructor in a classroom setting to communicate over the communication network.
- 3 . The system of claim 1 , further comprising an audio video interface used by a surrogate avatar.
- 4 . The system of claim 1 , wherein the AI intermediary is configured to correct misinformation in real-time during communications over the communication network.
- 5 . The system of claim 1 , wherein the AI intermediary is configured to add contextual information to enhance understanding of educational content being discussed over the communication network.
- 6 . The system of claim 1 , wherein the AI intermediary is configured to deliberately introduce errors into communications over the communication network to stimulate discussion and engagement among participants.
- 7 . The system of claim 1 , wherein the AI intermediary is configured to provide translation services between different languages spoken by the participants over the communication network.
- 8 . The system of claim 1 , wherein the AI intermediary is configured to generate deepfaked audio or video content to modify the appearance or speech of the remote participant.
- 9 . The system of claim 8 , wherein the deepfake content masks emotional discomfort or social anxiety of the remote participant.
- 10 . The system of claim 1 , wherein the AI intermediary is configured to monitor engagement levels of the remote participant and send alerts to the instructor when disengagement is detected.
- 11 . The system of claim 10 , wherein the alerts comprise sending a direct message to the instructor with a suggested stimulus.
- 12 . The system of claim 10 , wherein the AI intermediary is configured to generate a deepfake question from an idle remote participant to prompt interaction with the class.
- 13 . The system of claim 10 , wherein the AI intermediary is configured to inject a visual stimulus into the video feed transmitted to the remote participant.
- 14 . The system of claim 1 , wherein the AI intermediary is configured to automatically augment the video feed with informative messages and graphics based on classroom content.
- 15 . The system of claim 1 , wherein the AI intermediary is configured to provide real-time correction of information in the video feed transmitted to the remote participant.
- 16 . The system of claim 1 , wherein the AI intermediary is configured to enable replay of video segments for the remote participant.
- 17 . The system of claim 1 , wherein the AI intermediary is configured to provide content summarization for the remote participant.
- 18 . The system of claim 1 , wherein the AI intermediary is configured to provide real-time translation of content for the remote participant.
- 19 . The system of claim 3 , further comprising an internal communication channel separate from public classroom audio, wherein the AI intermediary is configured to connect the remote participant to the surrogate avatar on the internal communication channel.
- 20 . The system of claim 19 , wherein the AI intermediary is configured to paste a non-moving mouth image on the remote participant's video while the remote participant communicates privately with the surrogate avatar.
Description
CLAIM OF PRIORITY

This application claims priority to U.S. Provisional Patent Application Ser. No. 63/716,984, titled “Enhancing remote laboratory teaching practice using the surrogate avatar experience,” which was filed on Nov. 6, 2024, and whose contents are incorporated by reference.

BACKGROUND

The field of remote education and telepresence has evolved significantly, particularly following the widespread adoption of hybrid learning models during the COVID-19 pandemic. Traditional telepresence solutions, such as videoconferencing platforms like Zoom, Microsoft Teams, and Cisco Webex, have provided basic connectivity between remote participants and physical learning environments. However, these conventional approaches suffer from significant limitations that impede effective educational engagement.

One of the primary challenges in current hybrid learning systems is the lack of physical embodiment for remote participants. Research has demonstrated that physical presence and embodied learning experiences are crucial for effective education, particularly in laboratory settings and interactive classroom environments. Remote students often experience feelings of isolation, disengagement, and disconnection from their peers and instructors when participating through traditional video conferencing alone.

Existing telepresence technologies have attempted to address these limitations through various approaches. Robotic telepresence systems, such as those developed for healthcare and workplace applications, provide mobile platforms that can be remotely controlled to navigate physical spaces. However, these systems are typically expensive, require specialized programming for each environment, and often create barriers to natural social interaction due to their mechanical nature. Human surrogate avatar systems have emerged as an alternative approach, in which volunteer participants act as physical representatives for remote users.
The Surrogate Avatar Experience (SuAvE) has been explored in educational contexts, demonstrating improved engagement and interaction compared to traditional video conferencing. However, existing surrogate avatar implementations lack intelligent intermediary systems that can enhance and optimize the communication between remote participants and their physical representatives.

Current telepresence solutions also fail to address several critical aspects of educational interaction. They do not provide mechanisms for real-time correction of misinformation, contextual enhancement of educational content, or intelligent monitoring of classroom dynamics and student engagement. Additionally, existing systems do not offer capabilities for deliberate introduction of educational stimuli to promote discussion and critical thinking.

Furthermore, there is a growing need for artificial intelligence entities to have meaningful presence and interaction capabilities in physical environments. Current AI systems are limited to virtual interactions and lack the ability to engage with the physical world through embodied presence, which restricts their potential for learning, adaptation, and real-world application.

The limitations of existing telepresence and educational technologies create a significant gap in the ability to provide truly integrated, intelligent, and engaging remote learning experiences. There remains a need for a system that combines the benefits of human surrogate representation with advanced artificial intelligence capabilities to create enhanced educational telepresence that can monitor, analyze, and intelligently modify interactions in real-time.

SUMMARY

As described in more detail below, a system is provided that addresses the limitations of conventional video conferencing in educational settings.
The system integrates an artificial intelligence intermediary into video conferencing platforms to monitor, analyze, and intelligently modify communications between remote participants and classroom participants in real-time.

The core system comprises audio-video interfaces for remote participants, instructors, and students, all connected through a communication network. A central AI intermediary monitors classroom dynamics and participant engagement by analyzing audio and video content, then modifies transmitted content to facilitate better interaction between remote and in-person participants.

Key capabilities of the AI intermediary include real-time correction of misinformation, addition of contextual information, deliberate introduction of educational stimuli to promote discussion, and provision of translation services. The system can generate deepfake audio or video content to modify participant appearance or speech, particularly to mask emotional discomfort or social anxiety of remote participants. Deepfake audio or video content can also prevent discomfort or conflict in the educational environment.

The system provides sophisticated engagement monitoring, detecting when remote participants become disengaged.
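The engagement-monitoring behavior recited in claims 10-13 (score engagement, then respond with an instructor alert, an injected visual stimulus, or a generated question) can be sketched as a simple decision loop. This is a minimal illustrative sketch, not the patented implementation: the signal names, weights, and thresholds below are hypothetical placeholders, and a real system would derive such signals from the audio/video streams via perception models.

```python
from dataclasses import dataclass

@dataclass
class ParticipantSignals:
    # Hypothetical per-participant features the AI intermediary might
    # extract from audio/video streams (names and units are illustrative).
    gaze_on_screen_ratio: float        # fraction of recent frames with gaze on screen
    seconds_since_last_utterance: float
    head_movement_variance: float

def engagement_score(s: ParticipantSignals) -> float:
    """Collapse the signals into a 0..1 engagement estimate (toy heuristic)."""
    gaze = max(0.0, min(1.0, s.gaze_on_screen_ratio))
    speech = 1.0 / (1.0 + s.seconds_since_last_utterance / 120.0)
    motion = max(0.0, min(1.0, s.head_movement_variance / 0.5))
    return 0.5 * gaze + 0.3 * speech + 0.2 * motion

def choose_intervention(score: float) -> str:
    """Map an engagement score to the responses recited in claims 10-13.

    Thresholds are illustrative; escalation goes from mild (visual
    stimulus in the remote feed) to strong (deepfake question from the
    idle participant).
    """
    if score >= 0.6:
        return "none"
    if score >= 0.4:
        return "inject_visual_stimulus"   # claim 13
    if score >= 0.2:
        return "alert_instructor"         # claims 10-11
    return "generate_deepfake_question"   # claim 12

# An attentive participant triggers no intervention; a long-idle one escalates.
attentive = ParticipantSignals(0.9, 10.0, 0.4)
idle = ParticipantSignals(0.05, 600.0, 0.01)
print(choose_intervention(engagement_score(attentive)))  # -> none
print(choose_intervention(engagement_score(idle)))       # -> generate_deepfake_question
```

The point of the sketch is the separation of concerns implied by the claims: perception (signal extraction), assessment (a scalar engagement estimate), and policy (which intervention to apply) are independent stages, so any one of them can be replaced without changing the others.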