
CN-122018690-A - Virtual reality content generation and gaze interaction system based on artificial intelligence

CN 122018690 A

Abstract

The invention relates to the technical field of the fusion of virtual reality and artificial intelligence, and discloses a virtual reality content generation and gaze interaction system based on artificial intelligence, comprising: a multimodal sensing module; a gaze intention analysis module; an AI dynamic content generation module; a virtual reality rendering module; a gaze interaction control module; a user preference modeling module; a rule constraint engine; a data storage module; an edge computing scheduling module; and a security verification module. The invention innovatively realizes multidimensional analysis of gaze intention, breaking through the limitation that existing systems can only recognize simple gaze behaviors: by fusing gaze data with physiological characteristics it accurately distinguishes three types of gaze intention, and by combining confidence verification with correction from historical data it markedly improves analysis accuracy and reduces the probability of false interactions. The invention further provides dynamic personalized content generation, ensuring the individuation of the content, guaranteeing logical consistency and solving the homogenization problem.

Inventors

  • Liu Huxia

Assignees

  • Shanghai Zhongqiao Vocational and Technical University (上海中侨职业技术大学)

Dates

Publication Date
2026-05-12
Application Date
2026-01-31

Claims (10)

  1. An artificial-intelligence-based virtual reality content generation and gaze interaction system, comprising: a multimodal sensing module for collecting a user's gaze data, physiological characteristic data and virtual reality environment state data in real time, wherein the gaze data comprise gaze point coordinates, gaze duration and eyeball motion trajectory, the physiological characteristic data comprise pupil dilation and facial micro-expression features, and the environment state data comprise current virtual scene element types and scene rendering parameters; a gaze intention analysis module, communicatively connected to the multimodal sensing module and provided with a pre-trained intention recognition model, for performing fusion analysis on the acquired gaze data and physiological characteristic data and outputting a user gaze intention result, the intention result comprising three types: interest reinforcement, content switching and interaction triggering; an AI dynamic content generation module, communicatively connected to the gaze intention analysis module and integrating a diffusion model and a large language model, for generating a compliant 3D virtual scene, character actions and interactive content according to the gaze intention result, user preference data and virtual world rules, and supporting real-time iterative updating of the content; a virtual reality rendering module, communicatively connected to the AI dynamic content generation module and the gaze interaction control module, applying dynamic gaze-point (foveated) rendering to perform layered rendering of the generated virtual content, preferentially raising the rendering precision of the user's gaze area, and synchronously outputting an immersive picture adapted to the VR device; a gaze interaction control module, communicatively connected to the gaze intention analysis module and the virtual reality rendering module, constructing a hierarchical gaze interaction mechanism and executing corresponding interaction operations according to the gaze intention result, the operations comprising gaze reinforcement, gaze switching and gaze triggering; a user preference modeling module, communicatively connected to the multimodal sensing module and the AI dynamic content generation module, for constructing a personalized preference model from the user's historical gaze data and interaction records and updating preference labels and weights in real time; a rule constraint engine, communicatively connected to the AI dynamic content generation module and provided with virtual-world physical rules, content logic rules and interaction security rules, for performing compliance verification on AI-generated content and eliminating logically contradictory and unsafe content; a data storage module for storing multimodal sensing data, user preference model data, AI-generated content data and interaction log data, supporting real-time reading and writing and incremental updating; an edge computing scheduling module, communicatively connected to the functional modules, for dynamically scheduling the computing tasks of AI content generation, gaze intention analysis and rendering, and allocating the computing resources of edge nodes to reduce processing delay; and a security verification module for verifying user identity information, interaction instructions and generated content, preventing unauthorized access and malicious content generation.
  2. The artificial-intelligence-based virtual reality content generation and gaze interaction system according to claim 1, wherein the multimodal sensing module comprises an infrared eye-tracking unit, a physiological sensing unit and an environment acquisition unit, wherein: the infrared eye-tracking unit captures the user's eyeball movement data using infrared LEDs and a 120 frame/second high-speed camera arranged at the edge of the VR headset lens, and outputs gaze point coordinates and an eyeball trajectory with a precision of 0.3 degrees or better; the physiological sensing unit integrates miniature physiological sensors, acquires pupil dilation data (sampling frequency 50 Hz) and facial micro-expression feature point data in real time, and obtains emotion-related features through a feature extraction algorithm; and the environment acquisition unit synchronously acquires the element list, rendering resolution and frame rate parameters of the current virtual scene, providing data support for content generation and rendering optimization.
  3. The artificial-intelligence-based virtual reality content generation and gaze interaction system according to claim 1, wherein the gaze intention analysis module operates as follows: step S11, receiving the gaze data and physiological characteristic data output by the multimodal sensing module, performing data preprocessing, removing abnormal data points and standardizing the data; step S12, inputting the preprocessed data into the pre-trained intention recognition model, which adopts a fused CNN and Transformer architecture, and extracting the associated features of the gaze and physiological characteristics; and step S13, outputting an intention classification result based on the associated features and calculating an intention confidence, the result being output directly when the confidence is 85% or higher, and secondary verification being triggered when the confidence is below 85%, correcting the result with the user's historical interaction data.
  4. The artificial-intelligence-based virtual reality content generation and gaze interaction system according to claim 1, wherein the AI dynamic content generation module comprises a scene generation unit, a character action generation unit and an interactive content generation unit, wherein: the scene generation unit adopts an improved Stable Diffusion model taking the gaze intention, preference labels and rule constraint parameters as input, generates a 3D virtual scene at a resolution of 4K or higher, and supports incremental enhancement of scene details as well as whole-scene switching; the character action generation unit generates action description instructions with the large language model and converts them into coherent virtual character actions through a motion capture transfer algorithm, with an action delay of 100 ms or less; and the interactive content generation unit generates interaction options and dialogue content adapted to the current scene according to the gaze trigger intention, ensuring that the interaction logic is consistent with the virtual world rules.
  5. The artificial-intelligence-based virtual reality content generation and gaze interaction system according to claim 1, wherein the layered rendering strategy of the virtual reality rendering module comprises: dividing the picture into a gaze core area, a transition area and an edge area, wherein the gaze core area is rendered at 8K resolution and highest image quality, the transition area at 4K resolution and medium image quality, and the edge area at 2K resolution and basic image quality; and predicting the gaze position from the user's eye movement trajectory and pre-rendering the predicted area 50 ms in advance, ensuring the picture remains free of stutter and blur when the line of sight moves.
  6. The artificial-intelligence-based virtual reality content generation and gaze interaction system according to claim 1, wherein the hierarchical interaction mechanism of the gaze interaction control module is specifically: gaze reinforcement interaction, wherein when the gaze intention is interest reinforcement, the display time of the corresponding virtual element is prolonged, element detail rendering and dynamic effects are increased, and associated element recommendations are generated; gaze switching interaction, wherein when the gaze intention is content switching, alternative content is generated based on the user preference model and switched smoothly through a fade-in/fade-out effect, with a switching duration of 300 ms or less; and gaze trigger interaction, wherein when the gaze intention is interaction triggering, holding the gaze on the target element for 300-500 ms triggers the preset interaction and generates interaction feedback animation and content.
  7. The artificial-intelligence-based virtual reality content generation and gaze interaction system according to claim 1, wherein the user preference modeling module builds the preference model by: step S21, extracting the interest elements, gaze duration and interaction frequency from the user's historical gaze data to generate initial preference labels; step S22, assigning weights to the preference labels based on the analytic hierarchy process, the weight of each interest element being calculated from gaze duration and interaction frequency with a value range of 0-1; and step S23, receiving new interaction data in real time and updating the preference labels and weights after each valid interaction, ensuring the model stays consistent with the user's current preferences.
  8. The artificial-intelligence-based virtual reality content generation and gaze interaction system according to claim 1, wherein the rule constraint engine comprises a physical rule base, a logical rule base and a security rule base, wherein: the physical rule base stores virtual-world physical rules such as gravity, collision, and light and shadow, ensuring the generated content conforms to physical logic; the logical rule base stores the association relations of scene elements and character behavior logic, preventing contradictory content from being generated; and the security rule base stores a forbidden content list and interaction permission rules, filtering illegal content and unauthorized interactions.
  9. The artificial-intelligence-based virtual reality content generation and gaze interaction system according to claim 1, wherein the scheduling policy of the edge computing scheduling module comprises: preferentially assigning gaze intention analysis and rendering preprocessing tasks to local edge nodes to reduce transmission delay; dynamically invoking cloud edge node resources when an AI content generation task is highly complex, accelerating content generation through distributed computing; and monitoring the computational load of each node in real time, migrating part of the tasks to low-load nodes when the load of a single node reaches 70% or more.
  10. The artificial-intelligence-based virtual reality content generation and gaze interaction system according to claim 1, wherein the security verification module comprises an identity verification unit, an instruction verification unit and a content verification unit, wherein: the identity verification unit verifies the user's identity through biometric features bound to the VR device; the instruction verification unit checks the legitimacy of interaction instructions and rejects operation instructions exceeding the user's permissions; and the content verification unit adopts a multimodal content auditing model to audit AI-generated text and image content in real time and reject illegal content.
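The two-stage decision of claim 3 (direct output at 85% confidence or higher, otherwise secondary verification against historical interaction data) can be sketched as below. This is a minimal illustration, not the patent's implementation: the function names, the three intent labels' spellings, and the 60/40 blending rule used for the historical correction are all assumptions; only the 85% threshold and the three intent types come from the claim.

```python
# Illustrative sketch of the confidence-gated intent resolution in claim 3.
# The blending rule for "secondary verification" is an assumption.

INTENTS = ("interest_reinforcement", "content_switching", "interaction_trigger")
CONFIDENCE_THRESHOLD = 0.85  # claim 3: output directly at >= 85% confidence

def resolve_intent(model_probs, history_prior):
    """model_probs / history_prior: dicts mapping each intent in INTENTS
    to a probability. Returns (intent, confidence)."""
    best = max(model_probs, key=model_probs.get)
    if model_probs[best] >= CONFIDENCE_THRESHOLD:
        return best, model_probs[best]          # confident: output directly
    # Secondary verification: correct the low-confidence output with the
    # user's historical interaction distribution (one possible blend).
    blended = {i: 0.6 * model_probs[i] + 0.4 * history_prior[i] for i in INTENTS}
    best = max(blended, key=blended.get)
    return best, blended[best]
```

A strong history prior can flip a borderline classification, which is the "correction by historical interaction data" the claim describes.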
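The three-tier foveated split of claim 5 amounts to classifying each pixel by its angular distance from the gaze point. The sketch below assumes illustrative boundaries of 5 and 15 degrees; the patent specifies only the three tiers and their resolutions (8K core, 4K transition, 2K edge), not the boundary angles.

```python
# Sketch of the layered (foveated) rendering split in claim 5.
# The 5-degree and 15-degree boundaries are assumptions for illustration.
import math

TIERS = [
    (5.0,      "core",       "8K"),  # gaze core area: highest image quality
    (15.0,     "transition", "4K"),  # transition area: medium image quality
    (math.inf, "edge",       "2K"),  # edge area: basic image quality
]

def render_tier(gaze_deg, pixel_deg):
    """Return (region, resolution) for a pixel, given the gaze point and
    pixel position as (x, y) angular offsets in degrees."""
    offset = math.dist(gaze_deg, pixel_deg)  # angular distance from gaze
    for limit, region, resolution in TIERS:
        if offset <= limit:
            return region, resolution
```

The claim's 50 ms look-ahead would then pre-run this classification for the predicted gaze point rather than the measured one.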
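The gaze-trigger interaction in claim 6 (holding the gaze on a target for 300-500 ms fires the preset interaction) is essentially a dwell timer that resets whenever the gazed element changes. A minimal sketch, with class and method names invented for illustration:

```python
# Minimal dwell-time trigger for the gaze-trigger interaction in claim 6.
# The 400 ms default sits inside the claim's 300-500 ms window.

class DwellTrigger:
    def __init__(self, dwell_ms=400):
        self.dwell_ms = dwell_ms
        self.target = None      # element currently being dwelt on
        self.start_ms = 0       # timestamp when the dwell began

    def update(self, target, now_ms):
        """Feed the currently gazed element each frame; returns the
        triggered element once the dwell threshold is reached, else None."""
        if target != self.target:                 # gaze moved: restart timer
            self.target, self.start_ms = target, now_ms
            return None
        if target is not None and now_ms - self.start_ms >= self.dwell_ms:
            fired, self.target = self.target, None  # reset after firing
            return fired
        return None
```

Resetting after firing prevents the same fixation from triggering the interaction repeatedly, which is the usual way dwell-based selection avoids the "Midas touch" problem.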
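Claim 7 requires label weights in the 0-1 range, computed from gaze duration and interaction frequency and updated after every valid interaction. One simple way to satisfy those constraints, shown purely as a sketch (the normalization caps, the equal 50/50 signal mix and the smoothing factor are all assumptions, not from the patent):

```python
# One possible per-interaction weight update for the preference model in
# claim 7: an exponential moving average of a normalized dwell/frequency
# score, clamped to the 0-1 range the claim requires.

def update_weight(old_weight, gaze_s, interactions, alpha=0.3,
                  max_gaze_s=10.0, max_interactions=20):
    """Blend the stored label weight with a fresh observation."""
    # Normalize both signals to [0, 1] before combining them equally.
    dwell = min(gaze_s / max_gaze_s, 1.0)
    freq = min(interactions / max_interactions, 1.0)
    observation = 0.5 * dwell + 0.5 * freq
    new_weight = (1 - alpha) * old_weight + alpha * observation
    return min(max(new_weight, 0.0), 1.0)
```

The claim's analytic-hierarchy-process weighting would replace the fixed 50/50 mix with pairwise-comparison-derived coefficients; the update-after-each-interaction loop stays the same.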
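The rule constraint engine of claim 8 checks generated content against a forbidden-content list and stored element associations. A toy sketch of that compliance check, with every rule entry invented for illustration:

```python
# Toy compliance check in the spirit of claim 8's rule constraint engine.
# FORBIDDEN and LOGIC_RULES contents are invented placeholders.

FORBIDDEN = {"violence", "malware"}
LOGIC_RULES = {("underwater", "campfire")}  # incompatible element pairs

def verify(elements):
    """Return (ok, reasons) for a set of generated scene elements."""
    reasons = [f"forbidden element: {e}"
               for e in sorted(set(elements) & FORBIDDEN)]
    for a, b in LOGIC_RULES:
        if a in elements and b in elements:
            reasons.append(f"logical contradiction: {a} + {b}")
    return (not reasons), reasons
```

A real engine would also evaluate the physical rules (gravity, collision, lighting) the claim lists, which need scene geometry rather than set membership.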
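The migration rule of claim 9 (move tasks off any node whose load reaches 70%) can be sketched as a single rebalancing pass. The data shapes, the one-task-per-pass policy, and the guard that keeps the target below the threshold are assumptions; only the 70% figure comes from the claim.

```python
# Sketch of the load-threshold migration rule in claim 9: when a node's
# utilization reaches 70%, a task moves to the least-loaded node.

LOAD_THRESHOLD = 0.70  # claim 9: migrate when load >= 70%

def rebalance(loads, task_cost):
    """loads: dict of node -> utilization in [0, 1], mutated in place.
    Returns a list of (overloaded_node, target_node) migrations,
    moving at most one task per overloaded node per pass."""
    migrations = []
    for node, load in sorted(loads.items()):
        if load >= LOAD_THRESHOLD:
            target = min(loads, key=loads.get)  # least-loaded node
            if target != node and loads[target] + task_cost < LOAD_THRESHOLD:
                loads[node] -= task_cost
                loads[target] += task_cost
                migrations.append((node, target))
    return migrations
```

Running the pass periodically approximates the claim's real-time monitoring; the guard prevents a migration from simply pushing the target node over the same threshold.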

Description

Virtual reality content generation and gaze interaction system based on artificial intelligence

Technical Field

The invention belongs to the technical field of the fusion of virtual reality and artificial intelligence, and particularly relates to a virtual reality content generation and gaze interaction system based on artificial intelligence.

Background

With the rapid development of VR and AI technology, the generation efficiency and interactive experience of virtual reality content have become core pursuits of the industry. As a natural and intuitive interaction means, gaze interaction is gradually being applied in VR devices, but existing systems have low gaze interaction precision: they can only realize a simple gaze trigger function, cannot accurately analyze the user's gaze intention, and lack deep linkage between content generation and gaze interaction, so the user's immersion is insufficient. Part of the prior art attempts to combine AI with gaze interaction: for example, a Meta smart-glasses patent adjusts content presentation through eye-tracking data, but performs only simple content optimization based on interest level and lacks multidimensional analysis of gaze intention; an Apple XR interaction patent triggers a digital assistant through gaze, but its content generation depends on preset scripts and cannot achieve dynamic personalized generation. Meanwhile, existing systems still fall short in rendering optimization, computation delay control and content logic consistency, and can hardly meet the requirements of high-end VR application scenarios. A system that can precisely analyze gaze intention, dynamically generate personalized content and deeply link content generation with gaze interaction is therefore needed to remedy these defects.
On this basis, an artificial-intelligence-based virtual reality content generation and gaze interaction system is designed.

Disclosure of Invention

In view of the above, to overcome the defects of the prior art, the invention provides a virtual reality content generation and gaze interaction system based on artificial intelligence, which effectively solves the problems raised in the background. To this end, the technical scheme provided by the invention is as follows: the system comprises a multimodal perception module for collecting the user's gaze data, physiological characteristic data and virtual reality environment state data in real time, wherein the gaze data comprise gaze point coordinates, gaze duration and eyeball motion trajectory, the physiological characteristic data comprise pupil dilation and facial micro-expression features, and the environment state data comprise current virtual scene element types and scene rendering parameters; a gaze intention analysis module, communicatively connected to the multimodal perception module and provided with a pre-trained intention recognition model, for performing fusion analysis on the acquired gaze data and physiological characteristic data and outputting a user gaze intention result, the intention result comprising three types: interest reinforcement, content switching and interaction triggering; an AI dynamic content generation module, communicatively connected to the gaze intention analysis module and integrating a diffusion model and a large language model, for generating a compliant 3D virtual scene, character actions and interactive content according to the gaze intention result, user preference data and virtual world rules, and supporting real-time iterative updating of the content; a virtual reality rendering module, communicatively connected to the AI dynamic content generation module and the gaze interaction control module, applying dynamic gaze-point (foveated) rendering to perform layered rendering of the generated virtual content, preferentially raising the rendering precision of the user's gaze area, and synchronously outputting an immersive picture adapted to the VR device; a gaze interaction control module, communicatively connected to the gaze intention analysis module and the virtual reality rendering module, constructing a hierarchical gaze interaction mechanism and executing corresponding interaction operations according to the gaze intention result, the operations comprising gaze reinforcement, gaze switching and gaze triggering; and a user preference modeling module, communicatively connected to the multimodal sensing module and the AI dynamic content generation module