EP-4216038-B1 - VIDEO CLIP OBJECT TRACKING
Inventors
- HARE, SAMUEL EDWARD
- MCPHEE, ANDREW JAMES
- MATHEW, TONY
Dates
- Publication Date: 2026-05-06
- Application Date: 2019-08-28
Claims (14)
- A method comprising: capturing (702), using a camera-enabled device, video content of a real-world scene; storing (704) the captured video content; after the captured video content is stored, in response to receiving a request to augment the stored captured video content with a virtual object: generating (708) an interactive augmented reality display that adds a virtual object to the stored captured video content to create augmented video content comprising the real-world scene and the virtual object; characterised by: in response to receiving a first type of input which presses and holds the virtual object for at least a first threshold amount of time, temporarily increasing a size of the virtual object to indicate a change to a state enabling a user to manipulate the virtual object in two-dimensional space; and in response to receiving a second type of input which touches and then drags the virtual object within the first threshold amount of time, adding a grid underneath the virtual object to indicate a change to a state enabling a user to manipulate the virtual object in three-dimensional space.
- The method of claim 1, comprising: capturing (702), using the camera-enabled device, movement information collected by the camera-enabled device during capture of the video content; storing (704, 716) the captured movement information, the movement information that is stored comprising a plurality of inertial measurement unit, IMU, frames associated with respective timestamps; after the captured video content and the captured movement information are stored, in response to receiving a request to augment the stored captured video content with a virtual object: processing (706) the stored captured video content to identify a real-world object in the scene; and generating (708) the interactive augmented reality display; and adjusting (710), during playback of the augmented video content, an on-screen position of the virtual object within the augmented video content based at least in part on the stored movement information.
- The method of claim 2, further comprising: after the captured video content and the captured movement information are stored, in response to receiving a request to augment the stored captured video content with a virtual object: retrieving (717) the plurality of IMU frames collected during capture of the video content; and for each frame timestamp of the stored captured video content, generating a paired IMU frame by performing bilinear interpolation of the two closest IMU frames; and wherein said adjusting comprises adjusting (710), during playback of the augmented video content, the on-screen position of the virtual object within the augmented video content based at least in part on the information from the generated IMU frames paired to the frames of the stored captured video content.
- The method of claim 3, further comprising: correlating information from the plurality of IMU frames with the information from the stored captured video content based on the generated IMU frames paired to the frames of the stored captured video content.
- The method of any of claims 2-4, wherein the virtual object comprises a virtual three-dimensional object and wherein adding the virtual object to the stored captured video content comprises: attaching the virtual three-dimensional object to a real-world object identified in the scene by processing (706) the stored captured video content; determining, based on the stored movement information, that a position of the camera-enabled device has moved during capture of the video content resulting in the real-world object being moved from a first position to a second position on a screen during playback of the video content; and maintaining the position of the virtual three-dimensional object in fixed association with a position of the real-world object by moving the position of the virtual three-dimensional object from a third position to a fourth position on the screen as the real-world object moves from the first position to the second position during playback.
- The method of any preceding claim, wherein the virtual object comprises a virtual three-dimensional object and wherein adding the virtual object comprises: attaching the virtual three-dimensional object to a real-world object identified in the scene by processing (706) the stored captured video content; determining, based on an image analysis of the video content, that a real-world position of the real-world object has moved during capture of the video from a first position to a second position while the camera remained in a fixed position; and maintaining the position of the virtual three-dimensional object in fixed association with a position of the real-world object by moving the position of the virtual three-dimensional object on a screen from a third position to a fourth position as the real-world object moves from the first position to the second position during playback.
- The method of any preceding claim, further comprising: playing the stored video content before adding the virtual object; displaying a plurality of icons representing virtual objects while the video content plays; receiving a user selection of the virtual object from the plurality of icons while the video content plays; in response to receiving the user selection: pausing the stored video content at a given frame; and performing the adding of the virtual object to the given frame; preferably, further comprising resuming playback of the augmented video content in response to determining that further user input has not been received within a given time interval after selection of the virtual object.
- The method of any preceding claim, further comprising: determining that the virtual object overlaps boundaries of first and second real-world objects depicted in the video content; computing an overlap amount of each boundary of the first and second real-world objects that the virtual object overlaps; determining that the overlap amount of the second real-world object exceeds the overlap amount of the first real-world object; and in response to determining that the overlap amount of the second real-world object exceeds the overlap amount of the first real-world object, selecting the second real-world object as a target real-world object to track; preferably, further comprising: attaching the virtual object to the target real-world object in response to receiving a user input that presses and holds the virtual object on a given frame for a third threshold amount of time; and attaching the virtual object to a second real-world object in response to: receiving a first user input that touches the virtual object; and receiving, within the third threshold amount of time, a second user input that drags the virtual object to a position of the second real-world object.
- The method of any preceding claim, wherein adding the virtual object to the real-world scene in the stored captured video content comprises presenting a plurality of icons each having a circular ground shadow positioned relative to the respective icon and receiving a user selection of one of the icons.
- The method of any preceding claim, further comprising replacing the virtual object with another virtual object in response to receiving a user selection of the another virtual object, wherein the another virtual object retains properties of the virtual object that is replaced and is attached to a same real-world object to which the virtual object that is replaced was attached.
- The method of any preceding claim, further comprising modifying a visual attribute of the virtual object after the virtual object is added to the real-world scene, preferably wherein modifying the visual attribute comprises adding at least one of a geofilter, a two-dimensional sticker, a caption, and paint.
- The method of any preceding claim, further comprising automatically rewinding the augmented video content at a predetermined rate to a beginning of the augmented video content after adding the virtual object to the stored captured video content.
- A system comprising: a processor configured to perform operations comprising: capturing (702), using a camera-enabled device, video content of a real-world scene; storing (704) the captured video content; after the captured video content is stored, in response to receiving a request to augment the stored captured video content with a virtual object: generating (708) an interactive augmented reality display that adds a virtual object to the stored captured video content to create augmented video content comprising the real-world scene and the virtual object; characterised by the operations further comprising: temporarily increasing a size of the virtual object in response to receiving a first type of input that presses and holds the virtual object for at least a first threshold amount of time to indicate a change to a state enabling a user to manipulate the virtual object in two-dimensional space; and adding a grid underneath the virtual object in response to receiving a second type of input that touches and then drags the virtual object within the first threshold amount of time to indicate a change to a state enabling a user to manipulate the virtual object in three-dimensional space.
- A non-transitory machine-readable storage medium including an augmented reality system that includes instructions that, when executed by one or more processors of a machine, cause the machine to perform operations comprising: capturing (702), using a camera-enabled device, video content of a real-world scene; storing (704) the captured video content; after the captured video content is stored, in response to receiving a request to augment the stored captured video content with a virtual object: generating (708) an interactive augmented reality display that adds a virtual object to the stored captured video content to create augmented video content comprising the real-world scene and the virtual object; characterised by the operations further comprising: temporarily increasing a size of the virtual object in response to receiving a first type of input that presses and holds the virtual object for at least a first threshold amount of time to indicate a change to a state enabling a user to manipulate the virtual object in two-dimensional space; and adding a grid underneath the virtual object in response to receiving a second type of input that touches and then drags the virtual object within the first threshold amount of time to indicate a change to a state enabling a user to manipulate the virtual object in three-dimensional space.
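The gesture distinction in claim 1 hinges on one timing threshold: a press held for at least the threshold switches the object into a two-dimensional manipulation state (and temporarily enlarges it), while a touch followed by a drag within the threshold switches it into a three-dimensional manipulation state (and shows a grid beneath it). The following is a minimal sketch of that state logic; the class name, method names, and the 0.4-second threshold value are illustrative assumptions, not part of the claimed disclosure.

```python
import time

HOLD_THRESHOLD_S = 0.4  # hypothetical value for the "first threshold amount of time"


class VirtualObjectGestures:
    """Distinguishes the two input types described in claim 1."""

    def __init__(self):
        self.touch_start = None
        self.mode = None  # None, "2d", or "3d"

    def on_touch_down(self, now=None):
        # Record when the finger first contacts the virtual object.
        self.touch_start = now if now is not None else time.monotonic()

    def on_drag(self, now=None):
        now = now if now is not None else time.monotonic()
        # Second type of input: touch then drag within the threshold
        # -> add a grid underneath the object; enable 3D manipulation.
        if self.touch_start is not None and now - self.touch_start < HOLD_THRESHOLD_S:
            self.mode = "3d"
        return self.mode

    def on_hold_tick(self, now=None):
        now = now if now is not None else time.monotonic()
        # First type of input: press and hold for at least the threshold
        # -> temporarily enlarge the object; enable 2D manipulation.
        if self.touch_start is not None and now - self.touch_start >= HOLD_THRESHOLD_S:
            self.mode = "2d"
        return self.mode
```

In a real touch pipeline these handlers would be driven by the platform's touch-event callbacks rather than explicit timestamps; the explicit `now` parameter here only makes the timing behavior easy to exercise.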
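Claim 3 pairs each video-frame timestamp with an IMU frame generated from the two closest stored IMU frames. For a single time axis this amounts to interpolating between the two samples that bracket the frame timestamp (the claim calls this bilinear interpolation; along one axis it reduces to linear weighting). The sketch below is an assumed illustration using scalar IMU values; real IMU frames carry multi-axis accelerometer/gyroscope data, and orientation quaternions would need spherical interpolation instead.

```python
from bisect import bisect_left


def pair_imu_frame(imu_frames, t):
    """Generate a paired IMU sample for video-frame timestamp t.

    imu_frames: list of (timestamp, value) tuples sorted by timestamp.
    Interpolates between the two closest IMU frames bracketing t;
    clamps to the first/last sample outside the recorded range.
    """
    times = [ts for ts, _ in imu_frames]
    i = bisect_left(times, t)
    if i == 0:
        return imu_frames[0][1]          # before first IMU sample
    if i >= len(imu_frames):
        return imu_frames[-1][1]         # after last IMU sample
    (t0, v0), (t1, v1) = imu_frames[i - 1], imu_frames[i]
    w = (t - t0) / (t1 - t0)             # fractional position between samples
    return v0 + w * (v1 - v0)
```

Because IMU data is typically sampled at a higher, unsynchronized rate relative to video frames, this per-frame pairing is what lets the playback-time adjustment in claim 2 use movement information recorded during capture.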
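Claim 8 selects a tracking target by comparing how much of each real-world object's boundary the virtual object overlaps and picking the object with the larger overlap. A minimal sketch of that comparison, assuming axis-aligned bounding boxes in screen coordinates (the box representation and function names are assumptions for illustration, not the patent's stated implementation):

```python
def overlap_area(a, b):
    """Intersection area of two axis-aligned boxes given as (x0, y0, x1, y1)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)


def select_target(virtual_box, object_boxes):
    """Return the index of the real-world object the virtual object
    overlaps most, or None if it overlaps none of them (claim 8)."""
    overlaps = [(overlap_area(virtual_box, box), idx)
                for idx, box in enumerate(object_boxes)]
    best_overlap, best_idx = max(overlaps)
    return best_idx if best_overlap > 0 else None
```

With this rule, dragging the virtual object so that it straddles two detected objects deterministically attaches it to whichever object it covers more.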
Description
TECHNICAL FIELD

The present disclosure relates generally to visual presentations and more particularly to rendering virtual modifications to surfaces in real-world environments.

BACKGROUND

Virtual rendering systems can be used to create engaging and entertaining augmented reality experiences, in which three-dimensional virtual object graphics content appears to be present in the real world. Such systems can be subject to presentation problems due to environmental conditions, user actions, unanticipated visual interruption between a camera and the object being rendered, and the like. This can cause a virtual object to disappear or otherwise behave erratically, which breaks the illusion of the virtual objects being present in the real world.

US 2018/210628 A1 discloses a method and system that facilitates the manipulation of virtual content displayed in conjunction with images of real-world objects and environments, whereby virtual objects can be moved and manipulated relative to a real-world environment. In some embodiments, an image of a three-dimensional scene may be a still image or previously-recorded video (e.g., previously captured by the camera of the computing device).

US 2018/082430 A1 describes a mobile device that stores a target image and target image data collected contemporaneously with the target image. The mobile device receives a reference position indication that corresponds to the target image data and receives a video feed from a camera while the mobile device is in the reference position. The mobile device detects a match between a first image from the video feed and the target image, unlocks an augmented reality space, and instructs presentation of a virtual object within the augmented reality space.

SUMMARY

According to an aspect of the invention, there is provided a method as defined in claim 1. According to further aspects of the invention, there are provided a system and a machine-readable medium as defined in claims 13 and 14.
Preferred and/or optional features are set out in the dependent claims.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings, which are not necessarily drawn to scale, like numerals may describe similar components in different views. To easily identify the discussion of any particular element or act, the most significant digit or digits in a reference number refer to the figure number in which that element is first introduced. Some embodiments are illustrated by way of example, and not limitation, in the figures of the accompanying drawings, in which:

- FIG. 1 is a block diagram showing an example messaging system for exchanging data (e.g., messages and associated content) over a network, according to example embodiments.
- FIG. 2 is a block diagram illustrating further details regarding a messaging system, according to example embodiments.
- FIG. 3 is a schematic diagram illustrating data which may be stored in the database of the messaging server system, according to example embodiments.
- FIG. 4 is a schematic diagram illustrating a structure of a message generated by a messaging client application for communication, according to example embodiments.
- FIG. 5 is a schematic diagram illustrating an example access-limiting process, in terms of which access to content (e.g., an ephemeral message, and associated multimedia payload of data) or a content collection (e.g., an ephemeral message story) may be time-limited (e.g., made ephemeral), according to example embodiments.
- FIG. 6 is a block diagram illustrating various components of an augmented reality system, according to example embodiments.
- FIGS. 7A-B and 8 are flowcharts illustrating example operations of the augmented reality system in performing a process for rendering a virtual object in a video clip, according to example embodiments.
- FIG. 9 is a flowchart illustrating example operations of the augmented reality system in performing a process for tracking an object rendered in a video clip, according to example embodiments.
- FIGS. 10 and 11 are diagrams depicting an object rendered within a three-dimensional space by an augmented reality system, according to example embodiments.
- FIG. 12 is a block diagram illustrating a representative software architecture, which may be used in conjunction with various hardware architectures herein described, according to example embodiments.
- FIG. 13 is a block diagram illustrating components of a machine able to read instructions from a machine-readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein, according to example embodiments.

DETAILED DESCRIPTION

The description that follows includes systems, methods, techniques, instruction sequences, and computing machine program products that embody illustrative embodiments of the disclosure. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide an understanding of various embodiments. It will be evident, however,