US-12626421-B1 - System, method, and computer program for automatically generating branch scenes for a branching video
Abstract
This disclosure pertains to a system, method, and computer program for automatically creating branching scenes for videos within a video production workspace. The workspace offers a call-to-action feature that enables a user to initiate the automated creation of branching scenes using a natural language request. In response to receiving a request for branching scenes, the system identifies the video's current state within the workspace, discerning the scenes, assets, and timelines within the video. System-defined attributes of the assets in the current state, as well as user-defined attributes for the branch, guide the branching process. These attributes are input into a generative AI model, which generates a branching narrative for the video and outputs structured data files describing the branching scenes. The branching scenes are rendered in the workspace for user review and refinement. Users can refine the scenes further by providing additional instructions to the AI model.
Inventors
- Matthew Harney
Assignees
- GoAnimate, Inc.
Dates
- Publication Date: 2026-05-12
- Application Date: 2023-11-17
Claims (17)
- 1 . A method, performed by a computer system, for automatically creating branching scenes for branching videos, the method comprising: providing a multimedia video production workspace for creating scenes for a video, wherein the multimedia video production workspace includes a video timeline that illustrates an order and time in which scenes in the video appear; providing a call-to-action in the video production workspace to enable a user to initiate automated creation of scenes for a branching narrative in a video being created within the video production workspace; in response to receiving user input to create scenes for a branching narrative in a video being created in the video production workspace, performing the following: identifying a current state of the video in the video production workspace, including identifying all the assets in the current state of the video; obtaining metadata related to assets in the current state of the video; inputting said metadata into a generative AI model configured to generate a branching narrative for the video and output data describing a plurality of branching scenes for the branching narrative; and rendering the branching scenes in the video production workspace in accordance with the data outputted by the generative AI model, wherein the branching scenes are incorporated into the video timeline such that child branching scenes follow a parent branch scene.
- 2 . The method of claim 1 , further comprising receiving natural language input from the user with user-defined attributes for the branching narrative, and inputting said user input into the generative AI model in addition to the metadata related to the current state of the video to obtain a branching narrative and corresponding branching scenes that comply with the user's natural language input.
- 3 . The method of claim 1 , wherein the output data from the generative AI model describes at least two child branching scenes.
- 4 . The method of claim 3 , wherein the output data further describes a parent branch scene.
- 5 . The method of claim 1 , further comprising enabling a user to iteratively transform the branching scenes via a natural language interface.
- 6 . The method of claim 1 , wherein metadata for an asset includes metadata tags for the asset, one or more scenes associated with the asset, an asset position, an asset size, and timeline data associated with the asset.
- 7 . The method of claim 6 , wherein identifying attributes of the assets in the current state of the video comprises using a computer vision model to classify a visual asset in the video with one or more attributes.
- 8 . A video production system for automatically creating branching scenes for branching videos comprising: a video workspace module (VWM) that provides a multimedia video production workspace for creating scenes for a video, wherein the VWM enables a user to input a branch generation request, and wherein the multimedia video production workspace includes a video timeline that illustrates an order and time in which scenes in the video appear; a branch creation module (BCM) configured to receive both the branch generation request and metadata from the VWM related to video scenes in the video production workspace and configured to generate a branching narrative for the video, including a parent branch and two or more child branches; and a generative AI model integrated within the BCM, wherein the BCM utilizes the generative AI model to produce data describing a plurality of branching scenes for the branching narrative.
- 9 . The system of claim 8 , wherein the BCM utilizes the generative AI model to produce a question and multiple choice answers related to the branching narrative, with each answer leading to a subsequent scene described in the data outputted by the generative AI model.
- 10 . The system of claim 9 , wherein the output of the BCM is a structured data file that is readable by the VWM to render the branching video scenes in the video production workspace.
- 11 . A non-transitory computer-readable medium comprising a computer program, that, when executed by a computer system, enables the computer system to perform the following method for automatically creating branching scenes for branching videos, the method comprising: providing a multimedia video production workspace for creating scenes for a video, wherein the multimedia video production workspace includes a video timeline that illustrates an order and time in which scenes in the video appear; providing a call-to-action in the video production workspace to enable a user to initiate automated creation of scenes for a branching narrative in a video being created within the video production workspace; in response to receiving user input to create scenes for a branching narrative in a video being created in the video production workspace, performing the following: identifying a current state of the video in the video production workspace, including identifying all the assets in the current state of the video; obtaining metadata related to assets in the current state of the video; inputting said metadata into a generative AI model configured to generate a branching narrative for the video and output data describing a plurality of branching scenes for the branching narrative; and rendering the branching scenes in the video production workspace in accordance with the data outputted by the generative AI model, wherein the branching scenes are incorporated into the video timeline such that child branching scenes follow a parent branch scene.
- 12 . The non-transitory computer-readable medium of claim 11 , further comprising receiving natural language input from the user with user-defined attributes for the branching narrative, and inputting said user input into the generative AI model in addition to the metadata related to the current state of the video to obtain a branching narrative and corresponding branching scenes that comply with the user's natural language input.
- 13 . The non-transitory computer-readable medium of claim 11 , wherein the output data from the generative AI model describes at least two child branching scenes.
- 14 . The non-transitory computer-readable medium of claim 13 , wherein the output data further describes a parent branch scene.
- 15 . The non-transitory computer-readable medium of claim 11 , further comprising enabling a user to iteratively transform the branching scenes via a natural language interface.
- 16 . The non-transitory computer-readable medium of claim 11 , wherein metadata for an asset includes metadata tags for the asset, one or more scenes associated with the asset, an asset position, an asset size, and timeline data associated with the asset.
- 17 . The non-transitory computer-readable medium of claim 16 , wherein identifying attributes of the assets in the current state of the video comprises using a computer vision model to classify a visual asset in the video with one or more attributes.
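Claims 8 through 10 describe a branch creation module (BCM) that outputs a structured data file, readable by the video workspace module (VWM), describing a parent branch scene with a multiple-choice question and two or more child branch scenes. The patent does not specify a schema for this file, so the following is a minimal, hypothetical sketch of what such structured output might look like, with illustrative field names, along with a small check that every answer choice points to a described child scene:

```python
import json

# Hypothetical structured-data output of the BCM, readable by the VWM.
# All field names are assumptions for illustration; the patent does not
# define the file format.
branch_output = {
    "parent_scene": {
        "id": "scene-1",
        "prompt": "Open Locked Door?",
        "choices": [
            {"answer": "YES", "next_scene": "scene-2a"},
            {"answer": "NO", "next_scene": "scene-2b"},
        ],
    },
    "child_scenes": [
        {"id": "scene-2a", "description": "The door opens."},
        {"id": "scene-2b", "description": "The actor needs a key."},
    ],
}


def validate(data):
    """Check that every choice in the parent scene leads to a child scene."""
    child_ids = {scene["id"] for scene in data["child_scenes"]}
    return all(
        choice["next_scene"] in child_ids
        for choice in data["parent_scene"]["choices"]
    )


print(json.dumps(branch_output, indent=2))
print(validate(branch_output))  # True
```

A file in this spirit satisfies the structure recited in the claims: one parent branch scene, at least two child scenes, and a question whose answers each lead to a subsequent scene.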
Description
RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application No. 63/541,724, filed on Sep. 29, 2023, and titled "Interactive Branching Generation," the contents of which are incorporated by reference herein as if fully disclosed herein.
FIELD OF THE INVENTION
This invention relates generally to video generation using artificial intelligence and, more specifically, to automatically generating branching scenes for a branching narrative video based on a current state of a video and any user input for the branching narrative.
BACKGROUND
Branching video is a common way to use the medium of digital video to create interactive stories, courses, and quizzes. Unlike traditional videos, which are a linear collection of individual scenes played in a predetermined sequence, branching videos give viewers the ability to influence the narrative by making choices. These choices determine which scene is displayed next, providing an interactive experience. For example, a branching video might pause playback and allow the viewer to pick one of two or more options as a way of choosing what happens next in their viewing journey. The next scene shown would depend on the option chosen. This allows a linear, non-interactive medium such as video to appear interactive and gives viewers the ability to choose their own story.
As an example, imagine a video scene of an actor standing next to a door. A text prompt may ask the viewer, "Open Locked Door? [YES] or [NO]." Two follow-up scenes are available: one where the door opens and another where the actor is unsuccessful and requires a key. The follow-up scene played depends on the viewer's response to the question of whether the actor will be able to open the locked door. While the overarching narrative remains largely similar across the two follow-up (child) scenes, small yet crucial variations exist, making the production of such videos intricate.
Traditional video production is inherently labor-intensive, necessitating the manual selection and sequencing of video clips, foley sounds, audio tracks, and other media elements. Crafting high-quality videos demands significant time and expertise. The introduction of branching paths complicates this process exponentially; it requires much more planning than a linear video due to the non-linear and combinatorial effects of working with branching video. A graph-like production structure is needed, in which the states of props, environments, actors, and costumes are mapped out ahead of time and pre-recorded. Every potential narrative permutation based on viewer choices must be planned, recorded, and produced.
A major difference between linear and branching videos lies in the repetition of various elements across the video production. Scenes in branching videos often mirror one another with only minor dialogue changes or other small differences. Presently, similar scenes must be filmed or animated separately, leading to tremendous duplication of effort. Given the myriad combinations stemming from settings, actors, props, and dialogue for each narrative branch, the production process can quickly become overwhelming. Additionally, the inherently iterative nature of the creative process means that changes to early scenes can necessitate re-shooting or re-animating all subsequent scenes, including every branching path. Therefore, there is strong demand for a solution that automates and streamlines the management and production of branching videos, significantly reducing the time and effort required by current methodologies.
SUMMARY OF THE INVENTION
This disclosure relates to a system, method, and computer program for automatically creating branching scenes for branching narrative videos based on a current state of the video and any user input regarding the branching narrative and/or scenes.
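The graph-like production structure described in the background, where each viewer choice determines the next scene played, can be sketched as a simple scene graph. The scene names and the traversal helper below are illustrative assumptions, not part of the patent, using the locked-door example from the background:

```python
# A minimal sketch of a graph-like branching structure: each scene maps
# viewer answer choices to the id of the next scene. Scene ids and the
# helper function are hypothetical, for illustration only.
scenes = {
    "door": {
        "question": "Open Locked Door?",
        "choices": {"YES": "door_opens", "NO": "needs_key"},
    },
    "door_opens": {"question": None, "choices": {}},
    "needs_key": {"question": None, "choices": {}},
}


def next_scene(current_id, answer):
    """Return the id of the scene that follows the viewer's choice.

    An unrecognized answer leaves the viewer on the current scene.
    """
    return scenes[current_id]["choices"].get(answer, current_id)


print(next_scene("door", "YES"))  # door_opens
print(next_scene("door", "NO"))   # needs_key
```

Even this toy graph shows the combinatorial pressure the background describes: every leaf that later gains its own choices multiplies the number of scenes that must be planned and produced.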
The system provides a video production workspace where users can create scenes for a video by adding multimedia assets to the scenes such as text, images, video clips, animations, characters, backgrounds, props, and the like. The workspace includes a call-to-action feature (e.g., a clickable button or a command menu option) that enables users to initiate the creation of branching narratives within their video. Users can provide natural language instructions for the branching video or, in some embodiments, even voice requests to guide the branching process. The user input for the branching scenes may be very specific or a high-level directive. The user may add assets to a scene before requesting that the video branch from that scene or the user may create a blank scene and have the system create both the parent and child branch scenes. The user may also enter a branch creation request without creating any scenes and have the system generate an entire narrative and all the scenes for the video based on some user-defined instruction (“create a branching video with questions and answer choices how to add fr