EP-4736458-A1 - MULTI-VIEW MULTIPLANE-IMAGING VIDEO STREAMING

Abstract

Methods and apparatus for multiplane-imaging (MPI) video streaming. According to an example embodiment, a method for streaming an MPI video includes generating a sequence of video frames, each of the video frames including a respective plurality of patches representing texture and transparency layers of one or more multiplane images of the MPI video and applying video compression to the sequence of video frames to generate a video sub-stream. The method also includes generating a sequence of representations of atlas frames corresponding to the sequence of video frames to specify at least a packing arrangement of the patches and applying compression to the sequence of representations to generate an atlas sub-stream. The method further includes multiplexing the video sub-stream and the atlas sub-stream to generate a first coded bitstream encoding at least a portion of the MPI video.
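As a rough illustration of the final multiplexing step summarized above, the following sketch interleaves a compressed atlas sub-stream and a compressed video sub-stream into one byte stream. Everything here is an assumption made for illustration: the record layout (one type byte, a four-byte length, then the payload), the `UNIT_ATLAS`/`UNIT_VIDEO` constants, and the use of `zlib` as a stand-in for both the video codec and the atlas coder. It is not the V3C bitstream syntax.

```python
import struct
import zlib

# Hypothetical unit types for this sketch only; these are NOT V3C unit types.
UNIT_ATLAS = 0
UNIT_VIDEO = 1


def mux(atlas_units, video_units):
    """Interleave units as [type: 1 byte][length: 4 bytes][payload] records."""
    out = bytearray()
    for atlas, video in zip(atlas_units, video_units):
        for unit_type, payload in ((UNIT_ATLAS, atlas), (UNIT_VIDEO, video)):
            out += struct.pack(">BI", unit_type, len(payload)) + payload
    return bytes(out)


def demux(bitstream):
    """Split a muxed byte stream back into (atlas_units, video_units)."""
    streams = {UNIT_ATLAS: [], UNIT_VIDEO: []}
    pos = 0
    while pos < len(bitstream):
        unit_type, length = struct.unpack_from(">BI", bitstream, pos)
        pos += 5  # size of the ">BI" header
        streams[unit_type].append(bitstream[pos:pos + length])
        pos += length
    return streams[UNIT_ATLAS], streams[UNIT_VIDEO]


# Placeholder data: three "packed frames" and three "atlas frame" records.
frames = [bytes([i]) * 64 for i in range(3)]
atlases = [b'{"patches": %d}' % i for i in range(3)]

# zlib stands in for the real video and atlas compression steps.
video_sub = [zlib.compress(f) for f in frames]
atlas_sub = [zlib.compress(a) for a in atlases]
coded = mux(atlas_sub, video_sub)
```

Calling `demux(coded)` recovers the two sub-streams, which is the property a receiver relies on before decoding each one separately.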

Inventors

  • OH, Sejin

Assignees

  • Dolby Laboratories Licensing Corporation

Dates

Publication Date
2026-05-06
Application Date
2024-06-24

Claims (24)

Docket No. D23136WO01

CLAIMS

What is claimed is:

  1. A method for streaming a multiplane-image (MPI) video, the method comprising: generating a sequence of video frames, each of the video frames including a respective plurality of patches representing texture and transparency layers of one or more multiplane images of the MPI video; applying video compression to the sequence of video frames to generate a video sub-stream; generating a sequence of representations of atlas frames corresponding to the sequence of video frames to specify at least a packing arrangement of the patches; applying compression to the sequence of representations to generate an atlas sub-stream; and multiplexing the video sub-stream and the atlas sub-stream to generate a first coded bitstream encoding at least a portion of the MPI video.

  2. The method of claim 1, wherein the first coded bitstream is a visual volumetric video-based coding (V3C) bitstream in accordance with the ISO/IEC 23090-5 standard.

  3. The method of claim 2, further comprising encapsulating the first coded bitstream into a media file configured for file playback.

  4. The method of claim 2, further comprising: encapsulating the first coded bitstream into a sequence of segments according to a selected media container file format; and streaming the sequence of segments over a communication channel to a client device for playback.

  5. The method of claim 2, wherein the first coded bitstream is configured for encapsulation in a single V3C bitstream track.

  6. The method of claim 2, wherein the first coded bitstream is configured for encapsulation in two or more V3C bitstream tracks.

  7. The method of claim 6, wherein the two or more V3C bitstream tracks include: a first track that carries at least a portion of the atlas sub-stream; and a second track that carries at least a portion of the video sub-stream.

  8. The method of claim 1, wherein texture and transparency patches of the respective plurality of patches are packed into a corresponding video frame using a packing arrangement selected from the group consisting of: a side-by-side arrangement; a top-to-bottom arrangement; a vertically interleaved arrangement; and a horizontally interleaved arrangement.

  9. The method of claim 8, wherein the texture and transparency patches include patches of a first size and patches of a different second size.

  10. The method of claim 1, wherein the respective plurality of patches has fewer than all of the patches of a corresponding multiplane image.

  11. The method of claim 1, further comprising: providing to a client device a media presentation description (MPD) of an MPI streaming content stored in a storage container accessible via a server device, the MPI streaming content including a plurality of coded bitstreams that includes the first coded bitstream; for a period, providing to the client device a respective initialization segment from the storage container, the respective initialization segment being configured to inform a selection, at the client device, of one or more views of the MPI streaming content for which to request media segments for rendering; receiving, from the client device, a request identifying the selection and indicating a respective recommended value of at least one parameter selected from the group consisting of a bit rate, a resolution, a codec type, and a frame rate; and transmitting to the client device one or more of the plurality of coded bitstreams carrying the media segments selected in the storage container based on the identified selection and further based on one or more of the respective recommended values.

  12. The method of claim 11, wherein, for the period, the storage container has a plurality of media segments logically organized in accordance with different views and further logically organized in accordance with one or more of different bit rates, different resolutions, different codec types, and different frame rates.

  13. The method of claim 11, wherein, in a multi-track mode, a common atlas, an individual atlas, and a packed video component are represented in the MPD file as separate adaptation sets; and wherein the adaptation set for the common atlas contains the respective initialization segment having parameter sets for initializing a V3C decoder at the client device.

  14. The method of claim 13, wherein media segments for a representation of the adaptation set for the common atlas contain one or more track fragments of the V3C atlas track; and wherein media segments for representations of the video component adaptation sets contain one or more track fragments of the corresponding packed video track.

  15. The method of claim 11, wherein the first coded bitstream is encapsulated in a single V3C bitstream track stored in the storage container.

  16. The method of claim 11, wherein the first coded bitstream is encapsulated in two or more V3C bitstream tracks stored in the storage container.

  17. The method of claim 11, further comprising switching from a first sequence of media segments to a different second sequence of media segments when the request indicates a change in the identified selection.

  18. The method of claim 17, wherein the transmitting includes: transmitting a first coded bitstream carrying the first sequence of media segments corresponding to a first one of the different views; and transmitting a second coded bitstream carrying the different second sequence of media segments corresponding to a second one of the different views, wherein the first coded bitstream and the second coded bitstream have respective media segments corresponding to a same one of the different respective video segment times.

  19. The method of claim 1, wherein the MPI video is a multiview MPI video.

  20. The method of claim 19, further comprising: providing to a client device a media presentation description corresponding to the multiview MPI video; for a period, providing to the client device a respective initialization segment to inform a selection, at the client device, of two or more camera views of the multiview MPI video for which to request media segments for rendering; receiving, from the client device, a request identifying the selection; and transmitting to the client device one or more coded bitstreams carrying media segments in accordance with the identified selection.

  21. The method of claim 20, further comprising switching from transmitting a first sequence of media segments to transmitting a different second sequence of media segments when the request indicates a change of at least one camera view in the identified selection of the two or more camera views.

  22. The method of claim 20, wherein the transmitting includes: transmitting a first sequence of media segments corresponding to a first camera view of a scene; and transmitting a different second sequence of media segments corresponding to a different second camera view of the scene, wherein the first and second sequences of media segments correspond to a same time interval of the multiview MPI video.

  23. A non-transitory computer-readable medium storing instructions that, when executed by an electronic processor, cause the electronic processor to perform operations comprising the method of claim 1.

  24. An apparatus for streaming a multiplane-image (MPI) video, the apparatus comprising: at least one processor; and at least one memory including program code; and wherein the at least one memory and the program code are configured to, with the at least one processor, cause the apparatus at least to: generate a sequence of video frames, each of the video frames including a respective plurality of patches representing texture and transparency layers of one or more multiplane images of the MPI video; apply video compression to the sequence of video frames to generate a video sub-stream; generate a sequence of representations of atlas frames corresponding to the sequence of video frames to specify at least a packing arrangement of the patches; apply compression to the sequence of representations to generate an atlas sub-stream; and multiplex the video sub-stream and the atlas sub-stream to generate a first coded bitstream encoding at least a portion of the MPI video.
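The four packing arrangements named in claim 8 can be sketched on toy data as follows. This is a minimal illustration, not the claimed implementation: patches are plain lists of rows, the texture and transparency patches are assumed to be the same size (claim 9 allows different sizes, which this sketch does not handle), and the reading of "vertically interleaved" as alternating columns and "horizontally interleaved" as alternating rows is an assumption, since the claims do not define those terms. The function names `pack` and `unpack` are hypothetical.

```python
def pack(texture, alpha, arrangement):
    """Pack one texture patch and one transparency patch into a single region."""
    if len(texture) != len(alpha) or any(
        len(t) != len(a) for t, a in zip(texture, alpha)
    ):
        raise ValueError("this sketch assumes equally sized patches")
    if arrangement == "side_by_side":
        return [t + a for t, a in zip(texture, alpha)]
    if arrangement == "top_to_bottom":
        return [list(row) for row in texture + alpha]
    if arrangement == "vertically_interleaved":
        # Assumed meaning: columns alternate texture, alpha, texture, ...
        return [[v for pair in zip(t, a) for v in pair]
                for t, a in zip(texture, alpha)]
    if arrangement == "horizontally_interleaved":
        # Assumed meaning: rows alternate texture, alpha, texture, ...
        return [list(row) for pair in zip(texture, alpha) for row in pair]
    raise ValueError(f"unknown arrangement: {arrangement}")


def unpack(packed, arrangement):
    """Recover (texture, alpha) given the arrangement recorded for the patch."""
    if arrangement == "side_by_side":
        w = len(packed[0]) // 2
        return [r[:w] for r in packed], [r[w:] for r in packed]
    if arrangement == "top_to_bottom":
        h = len(packed) // 2
        return packed[:h], packed[h:]
    if arrangement == "vertically_interleaved":
        return [r[0::2] for r in packed], [r[1::2] for r in packed]
    if arrangement == "horizontally_interleaved":
        return packed[0::2], packed[1::2]
    raise ValueError(f"unknown arrangement: {arrangement}")


ARRANGEMENTS = (
    "side_by_side",
    "top_to_bottom",
    "vertically_interleaved",
    "horizontally_interleaved",
)

texture = [[1, 2, 3], [4, 5, 6]]        # toy 2x3 texture patch
alpha = [[10, 20, 30], [40, 50, 60]]    # toy 2x3 transparency patch
packed = {name: pack(texture, alpha, name) for name in ARRANGEMENTS}
```

The receiver can only undo the packing if it knows which arrangement was used, which is why the atlas frames of claim 1 specify at least the packing arrangement of the patches.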

Description

Docket No. D23136WO01

MULTI-VIEW MULTIPLANE-IMAGING VIDEO STREAMING

1. Cross-Reference to Related Applications

[0001] This patent application claims the benefit of priority to the following applications: U.S. Provisional Patent Application No. 63/556,461, filed February 22, 2024; U.S. Provisional Patent Application No. 63/588,337, filed October 6, 2023; and U.S. Provisional Patent Application No. 63/510,571, filed June 27, 2023, each of which is hereby incorporated by reference in its entirety.

2. Field of the Disclosure

[0002] Various example embodiments relate generally to multiplane imaging (MPI) and, more specifically but not exclusively, to transmission of multiplane images.

3. Background

[0003] Multiplane images embody a relatively new approach to storing volumetric content. MPI can be used to render both still images and video and represents a three-dimensional (3D) scene within a view frustum using, e.g., 8, 16, or 32 planes of texture and transparency (alpha) information per camera. Example applications of MPI include computer vision and graphics, image editing, photo animation, robotics, and virtual reality.

BRIEF SUMMARY OF SOME SPECIFIC EMBODIMENTS

[0004] Example embodiments disclosed herein provide formats for coding, storage, and delivery of multi-view MPI video and a corresponding MPI video system. Some examples use the ISO Base Media File Format (BMFF) for storage and MPEG Dynamic Adaptive Streaming over HTTP (DASH) for streaming of MPI video to provide an immersive volumetric video experience. Some examples enable seamless interactive view switching for multi-view MPI video streaming using modifications to available technologies and infrastructure, e.g., by defining new tools designed to fill the technological gap for the above-stated purposes. At least some of the disclosed solutions can be deployed in a relatively short time, after modifications of pertinent existing solutions are implemented in accordance with various embodiments disclosed herein.
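To make the idea of interactive view switching over segmented streaming more concrete, the following sketch shows one way a client-side selection could be resolved against a catalog of media segments organized by view and bit rate, as claims 11, 12, and 22 describe. It is an assumption-laden toy model rather than the disclosed system: the `Representation` class, the `choose` and `segments_at` helpers, the view names, the bit-rate ladder, and the segment paths are all hypothetical, and no real MPD parsing or HTTP transfer is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    view: str          # camera view identifier (hypothetical naming)
    bitrate_kbps: int
    segments: tuple    # one entry per segment time, in playback order


def choose(catalog, views, max_kbps):
    """Per requested view, pick the highest bit rate not above max_kbps.

    Falls back to the lowest available bit rate when nothing fits.
    """
    chosen = {}
    for view in views:
        candidates = [r for r in catalog if r.view == view]
        if not candidates:
            raise KeyError(f"no representation for view {view!r}")
        fitting = [r for r in candidates if r.bitrate_kbps <= max_kbps]
        if fitting:
            chosen[view] = max(fitting, key=lambda r: r.bitrate_kbps)
        else:
            chosen[view] = min(candidates, key=lambda r: r.bitrate_kbps)
    return chosen


def segments_at(chosen, index):
    """Return, per selected view, the segment covering the same time interval."""
    return {view: rep.segments[index] for view, rep in chosen.items()}


# Hypothetical catalog: three camera views, three bit rates, three segment times.
catalog = [
    Representation(view, kbps,
                   tuple(f"{view}/{kbps}k/seg{i}.m4s" for i in range(3)))
    for view in ("cam0", "cam1", "cam2")
    for kbps in (500, 1000, 2000)
]

# Segment time 0: the client renders from cam0 and cam1 at up to 1500 kbps.
before = segments_at(choose(catalog, ["cam0", "cam1"], 1500), 0)
# Segment time 1: the viewer moves, so the selection switches to cam1 and cam2.
after = segments_at(choose(catalog, ["cam1", "cam2"], 400), 1)
```

In this model a view switch takes effect at a segment boundary: the newly selected views are fetched for the next segment index, so all active views stay aligned to the same time interval.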
[0005] According to an example embodiment, provided is a method for streaming an MPI video, the method comprising: generating a sequence of video frames, each of the video frames including a respective plurality of patches representing texture and transparency layers of one or more multiplane images of the MPI video; applying video compression to the sequence of video frames to generate a video sub-stream; generating a sequence of representations of atlas frames corresponding to the sequence of video frames to specify at least a packing arrangement of the patches; applying compression to the sequence of representations to generate an atlas sub-stream; and multiplexing the video sub-stream and the atlas sub-stream to generate a first coded bitstream encoding at least a portion of the MPI video.

[0006] According to another example embodiment, provided is a non-transitory computer-readable medium storing instructions that, when executed by an electronic processor, cause the electronic processor to perform operations comprising the above method.
[0007] According to yet another example embodiment, provided is an apparatus for streaming an MPI video, the apparatus comprising: at least one processor; and at least one memory including program code; and wherein the at least one memory and the program code are configured to, with the at least one processor, cause the apparatus at least to: generate a sequence of video frames, each of the video frames including a respective plurality of patches representing texture and transparency layers of one or more multiplane images of the MPI video; apply video compression to the sequence of video frames to generate a video sub-stream; generate a sequence of representations of atlas frames corresponding to the sequence of video frames to specify at least a packing arrangement of the patches; apply compression to the sequence of representations to generate an atlas sub-stream; and multiplex the video sub-stream and the atlas sub-stream to generate a first coded bitstream encoding at least a portion of the MPI video.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] Other aspects, features, and benefits of various disclosed embodiments will become more fully apparent, by way of example, from the following detailed description and the accompanying drawings, in which:

[0009] FIG. 1 depicts an example process for a video/image delivery pipeline.

[0010] FIG. 2 pictorially illustrates a 3D-scene representation using a multiplane image according to an embodiment.

[0011] FIG. 3 pictorially illustrates a process of generating a novel view of a 3D scene according to one example.

[0012] FIG. 4 is a block diagram illustrating a change of the set of active views over time according to one example.

[0013] FIG. 5 is a block diagram illustrating an MPI video system that can be used in the delivery pipeline of FIG. 1 according to an embodiment.

[0014] FIG. 6 is a block diagram illustrating an MPI encoder that can be used in the