US-20260129231-A1 - VIEWPORT AND/OR REGION-OF-INTEREST DEPENDENT DELIVERY OF V3C DATA USING RTP


Abstract

Viewport- and/or region-of-interest-dependent delivery of V3C data may be performed using RTP. RTP/RTCP signaling may support spatial-region-based and/or viewport-based partial access of V3C content. An SDP parameter may signal static 3D regions in immersive media content. An RTCP FB message type may carry a 3D region of interest request during an RTP media transmission session. An SDP parameter may indicate an RTCP-based ability to request a desired 3D region during capability negotiations. An RTP header extension type may carry transmitted 3D region information during RTP transmission of immersive media. An SDP parameter may indicate an RTP-based ability to signal transmitted 3D region information during capability negotiations. An SDP parameter may indicate an RTP-based ability to signal updated 3D region information during capability negotiations. An RTCP FB message type may carry viewport information during an RTP media transmission session. An SDP parameter may indicate an RTCP-based capability to signal viewport information during capability negotiations.

Inventors

Ahmed Hamza
Srinivas Gudumasu

Assignees

INTERDIGITAL VC HOLDINGS, INC.

Dates

Publication Date: 20260507
Application Date: 20231013

Claims (20)

1-19. (canceled)
20. A device comprising: a processor configured to: receive a set of session description protocol (SDP) parameters indicating presence of one or more three-dimensional (3D) regions associated with immersive media content; send a real-time control protocol (RTCP) feedback (FB) message, wherein the RTCP FB message indicates a 3D region of interest based on at least three origin position coordinates and at least three dimensions of the 3D region of interest; and receive one or more 3D regions of visual volumetric video-based coding (V3C) content associated with the 3D region of interest.
21. The device of claim 20, wherein the at least three origin position coordinates comprise: a first origin position coordinate of the 3D region of interest comprising an x-axis coordinate of an origin point; a second origin position coordinate of the 3D region of interest comprising a y-axis coordinate of the origin point; and a third origin position coordinate of the 3D region of interest comprising a z-axis coordinate of the origin point.
22. The device of claim 20 or 21, wherein the at least three dimensions of the 3D region of interest comprise: a first dimension of the 3D region of interest comprising an extension of the 3D region of interest along the x-axis relative to an origin point; a second dimension of the 3D region of interest comprising an extension of the 3D region of interest along the y-axis relative to the origin point; and a third dimension of the 3D region of interest comprising an extension of the 3D region of interest along the z-axis relative to the origin point.
23. The device of claim 20, wherein the processor is further configured to: perform capability negotiations, wherein the capability negotiations are performed using SDP, and wherein the set of SDP parameters is received during capability negotiations.
24. The device of claim 20, wherein the processor is further configured to: receive an SDP message indicating use of a real time protocol (RTP) header extension, wherein the received one or more 3D regions of V3C content are received using the RTP header extension.
25. The device of claim 24, wherein the RTCP FB message is sent using an RTCP message, and wherein the RTCP FB message includes an indication of a region ID associated with the 3D region of interest.
26. The device of claim 20, wherein the 3D region of interest is a static 3D region or an arbitrary 3D region.
27. A device comprising a processor configured to: receive a set of session description protocol (SDP) parameters indicating presence of one or more three-dimensional (3D) regions associated with an immersive media content; send a real-time control protocol (RTCP) feedback (FB) message, wherein the RTCP FB message indicates a 3D viewport of interest based on at least three camera position coordinates and at least three rotation values associated with the 3D viewport of interest; and receive one or more 3D regions of visual volumetric video-based coding (V3C) content associated with the 3D viewport of interest.
28. The device of claim 27, wherein the at least three camera position coordinates comprise: a first camera position coordinate of the 3D viewport of interest comprising an x-axis coordinate of a camera in a global reference coordinate system; a second camera position coordinate of the 3D viewport of interest comprising a y-axis coordinate of the camera in the global reference coordinate system; and a third camera position coordinate of the 3D viewport of interest comprising a z-axis coordinate of the camera in the global reference coordinate system.
29. The device of claim 27 or 28, wherein the at least three rotation values comprise: an x-axis component of a quaternion representation of rotation of a camera associated with the 3D viewport of interest; a y-axis component of the quaternion representation of rotation of the camera; and a z-axis component of the quaternion representation of rotation of the camera.
30. The device of claim 27, wherein the processor is further configured to: perform capability negotiations, wherein the capability negotiations are performed using SDP, and wherein the set of SDP parameters is received during capability negotiations.
31. The device of claim 27, wherein the processor is further configured to: receive an SDP message indicating use of a real time protocol (RTP) header extension, wherein the received one or more 3D regions of V3C content are received using the RTP header extension.
32. The device of claim 27, wherein the RTCP FB message is sent using an RTCP message, and wherein the RTCP FB message further indicates whether extrinsic camera parameters associated with the 3D viewport of interest are present in the RTCP message.
33. The device of claim 27, wherein the RTCP FB message is sent using an RTCP message, and wherein the RTCP FB message further indicates whether intrinsic camera parameters associated with the 3D viewport of interest are present in the RTCP message.
34. The device of claim 33, wherein the RTCP FB message further indicates whether a horizontal FOV associated with the 3D viewport of interest and a vertical FOV associated with the 3D viewport of interest are equal.
35. A method comprising: receiving a set of session description protocol (SDP) parameters indicating presence of one or more three-dimensional (3D) regions associated with immersive media content; sending a real-time control protocol (RTCP) feedback (FB) message, wherein the RTCP FB message indicates a 3D region of interest based on at least three origin position coordinates and at least three dimensions of the 3D region of interest; and receiving one or more 3D regions of visual volumetric video-based coding (V3C) content associated with the 3D region of interest.
36. The method of claim 35, wherein the at least three origin position coordinates comprise: a first origin position coordinate of the 3D region of interest comprising an x-axis coordinate of an origin point; a second origin position coordinate of the 3D region of interest comprising a y-axis coordinate of the origin point; and a third origin position coordinate of the 3D region of interest comprising a z-axis coordinate of the origin point.
37. The method of claim 35 or 36, wherein the at least three dimensions of the 3D region of interest comprise: a first dimension of the 3D region of interest comprising an extension of the 3D region of interest along the x-axis relative to an origin point; a second dimension of the 3D region of interest comprising an extension of the 3D region of interest along the y-axis relative to the origin point; and a third dimension of the 3D region of interest comprising an extension of the 3D region of interest along the z-axis relative to the origin point.
38. The method of claim 35, wherein the method further comprises: performing capability negotiations, wherein the capability negotiations are performed using SDP, and wherein the set of SDP parameters is received during capability negotiations.

Description

CROSS REFERENCE

This application claims the benefit of U.S. Provisional Application No. 63/415,893, filed on Oct. 13, 2022 and U.S. Provisional Application No. 63/539,958, filed on Sep. 22, 2023, the contents of which are incorporated by reference herein.

BACKGROUND

Video coding systems may be used to compress digital video signals, e.g., to reduce the storage and/or transmission bandwidth needed for such signals. Video coding systems may include, for example, block-based, wavelet-based, and/or object-based systems.

SUMMARY

Systems, methods, and instrumentalities are described herein for performing viewport- and/or region-of-interest-dependent delivery of visual volumetric video-based coding (V3C) data, for example, using a real-time transport protocol (RTP). RTP/real-time control protocol (RTCP) signaling mechanisms may be used to enable support for spatial region based and/or viewport-based partial access of V3C content.

A session description protocol (SDP) parameter may be used to signal static 3D regions present in immersive media content. An RTCP feedback (FB) message type may carry a desired 3D region of interest request, for example, during an RTP media transmission of a session. The RTCP FB message may be signaled from the receiver to the sender. An SDP parameter may indicate the RTCP-based ability to request the desired 3D region, for example, during capability negotiations.

An RTP header extension type may carry transmitted 3D regions information during the RTP transmission of immersive media. An RTP header extension type may be signaled from the sender to the receiver. Capability negotiations may be performed between a sender and a receiver of V3C content. An SDP parameter may indicate an RTP-based capability to signal transmitted 3D region information, for example, during the capability negotiations. An SDP parameter may indicate an RTP-based capability to signal updated 3D region information, for example, during capability negotiations.
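As a rough illustration of the region-of-interest request summarized above, the following Python sketch packs a 3D region (a region ID, three origin coordinates, and three extensions along the axes) into a byte payload, and shows how a sender might pick the static regions that overlap a requested region. The field widths, field order, byte order, and the names `Region3D`, `pack_roi`, `unpack_roi`, and `select_regions` are assumptions made for this example only; they are not the message syntax defined by this application or by any RTCP specification.

```python
import struct
from dataclasses import dataclass
from typing import List

# Hypothetical payload layout (an assumption for illustration only):
# region ID, origin x/y/z, extension dx/dy/dz, each an unsigned 16-bit
# integer in network byte order.
ROI_FMT = "!7H"


@dataclass(frozen=True)
class Region3D:
    """Axis-aligned 3D region: an origin point plus an extension per axis."""
    region_id: int
    x: int
    y: int
    z: int
    dx: int
    dy: int
    dz: int


def pack_roi(region: Region3D) -> bytes:
    """Serialize a region-of-interest request body (hypothetical layout)."""
    return struct.pack(ROI_FMT, region.region_id, region.x, region.y,
                       region.z, region.dx, region.dy, region.dz)


def unpack_roi(payload: bytes) -> Region3D:
    """Parse a payload produced by pack_roi."""
    return Region3D(*struct.unpack(ROI_FMT, payload))


def overlaps(a: Region3D, b: Region3D) -> bool:
    """True if the two boxes share volume (half-open intervals per axis)."""
    return (a.x < b.x + b.dx and b.x < a.x + a.dx and
            a.y < b.y + b.dy and b.y < a.y + a.dy and
            a.z < b.z + b.dz and b.z < a.z + a.dz)


def select_regions(roi: Region3D, static_regions: List[Region3D]) -> List[Region3D]:
    """Sender-side sketch: keep the static regions that intersect the request."""
    return [s for s in static_regions if overlaps(roi, s)]
```

In this sketch a receiver would call `pack_roi` to build the request body, and a sender would call `unpack_roi` followed by `select_regions` to decide which static regions to transmit. How the body is framed inside an actual RTCP FB packet is outside the scope of the example.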
An RTCP FB message type may carry desired and/or requested viewport information, for example, during an RTP media transmission of a session. An RTCP FB message type may be signaled from the receiver to the sender. An SDP parameter may indicate an RTCP-based capability to signal desired and/or requested viewport information, for example, during capability negotiations.

A device may be configured to receive a set of SDP parameters indicating presence of one or more 3D regions associated with immersive content. The device may send an RTCP FB message indicating a 3D region of interest and/or a 3D viewport of interest. The device may receive one or more 3D regions of V3C content associated with the 3D region of interest and/or the 3D viewport of interest. The device may perform capability negotiations (e.g., using SDP). The set of SDP parameters may be received during the capability negotiations. The 3D region information may be received using RTP header extensions, and the use of RTP header extensions may be signaled in an SDP message. In examples, the 3D region of interest may be a static 3D region and/or an arbitrary 3D region. The RTCP FB message may include an indication of a region ID associated with the 3D region of interest, an indication of a position associated with the 3D region of interest, and/or a size associated with the 3D region of interest. The RTCP FB message may indicate whether extrinsic camera parameters are present in the RTCP message, whether intrinsic camera parameters are present in the RTCP message, and/or whether a horizontal field of view (FOV) associated with the 3D viewport of interest and a vertical FOV associated with the 3D viewport of interest are equal.

Systems, methods, and instrumentalities described herein may involve a decoder.
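The viewport feedback summarized above can be sketched in Python as a flags byte followed by camera parameters. The flag bit positions, the 32-bit float encoding, the field order, and the names `Viewport`, `pack_viewport`, `unpack_viewport`, and `quat_w` are all assumptions for illustration and do not reproduce the message syntax of this application. The sketch also assumes a unit quaternion with a non-negative w component, so that w can be recovered from the three signaled components.

```python
import math
import struct
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical flag bits (assumptions, not taken from the application).
FLAG_EXTRINSICS = 0x80  # camera position and rotation are present
FLAG_INTRINSICS = 0x40  # field-of-view values are present
FLAG_FOV_EQUAL = 0x20   # horizontal and vertical FOV are equal


@dataclass(frozen=True)
class Viewport:
    pos: Tuple[float, float, float]  # camera x, y, z in the global reference system
    rot: Tuple[float, float, float]  # quaternion x, y, z components
    hfov: Optional[float] = None     # horizontal FOV, if intrinsics are sent
    vfov: Optional[float] = None     # vertical FOV, if it differs from hfov


def quat_w(qx: float, qy: float, qz: float) -> float:
    """Recover w, assuming a unit quaternion with non-negative w."""
    return math.sqrt(max(0.0, 1.0 - (qx * qx + qy * qy + qz * qz)))


def pack_viewport(vp: Viewport) -> bytes:
    """Serialize a viewport; extrinsics are always sent in this sketch."""
    flags = FLAG_EXTRINSICS
    body = struct.pack("!6f", *vp.pos, *vp.rot)
    if vp.hfov is not None:
        flags |= FLAG_INTRINSICS
        if vp.vfov is None or vp.vfov == vp.hfov:
            # A single FOV value covers both directions.
            flags |= FLAG_FOV_EQUAL
            body += struct.pack("!f", vp.hfov)
        else:
            body += struct.pack("!2f", vp.hfov, vp.vfov)
    return bytes([flags]) + body


def unpack_viewport(data: bytes) -> Viewport:
    """Parse a payload produced by pack_viewport."""
    flags = data[0]
    vals = struct.unpack_from("!6f", data, 1)
    hfov = vfov = None
    if flags & FLAG_INTRINSICS:
        if flags & FLAG_FOV_EQUAL:
            (hfov,) = struct.unpack_from("!f", data, 25)
            vfov = hfov
        else:
            hfov, vfov = struct.unpack_from("!2f", data, 25)
    return Viewport(vals[:3], vals[3:], hfov, vfov)
```

Signaling only three quaternion components keeps the message smaller, at the cost of requiring the receiver of the feedback to reconstruct the fourth component, as `quat_w` does here under the stated unit-norm assumption.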
In some examples, the systems, methods, and instrumentalities described herein may involve an encoder. In some examples, the systems, methods, and instrumentalities described herein may involve a signal (e.g., from an encoder and/or received by a decoder). A computer-readable medium may include instructions for causing one or more processors to perform methods described herein. A computer program product may include instructions which, when the program is executed by one or more processors, may cause the one or more processors to carry out the methods described herein.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A is a system diagram illustrating an example communications system in which one or more disclosed embodiments may be implemented.
FIG. 1B is a system diagram illustrating an example wireless transmit/receive unit (WTRU) that may be used within the communications system illustrated in FIG. 1A according to an embodiment.
FIG. 1C is a system diagram illustrating an example radio access network (RAN) and an example core network (CN) that may be used within th