EP-4738339-A1 - LOW LATENCY VIDEO STREAMING - PARTIAL FRAMES

EP 4738339 A1

Abstract

In an example, a device may include logic to receive a plurality of video frames from a video source, logic to process the plurality of video frames, logic to provide at least some of the plurality of video frames to a video sink, and logic to reduce a display latency of each of the video frames provided to the video sink by decoding and/or processing only portions of some frames.

Inventors

  • MAMIDWAR, RAJESH
  • CHEN, XUEMIN
  • HENG, BRIAN

Assignees

  • Avago Technologies International Sales Pte. Limited

Dates

Publication Date
2026-05-06
Application Date
2025-10-28

Claims (15)

  1. A device, comprising logic to receive a plurality of video frames from a video source; logic to process the plurality of video frames to produce a plurality of processed video frames, the plurality of processed video frames comprising one or more processed partial video frames; and logic to provide at least some of the plurality of processed video frames to a video sink; wherein the one or more processed partial video frames reduce a display latency of one or more of the video frames provided to the video sink.
  2. The device of claim 1, wherein receiving the plurality of video frames comprises receiving one or more encoded partial video frames.
  3. The device of claim 1 or 2, wherein processing the plurality of video frames comprises creating the one or more partial video frames from one of the plurality of received video frames.
  4. The device of any one of the claims 1 to 3, wherein the plurality of video frames received from the video source comprises a plurality of encoded video frames; processing the plurality of video frames comprises decoding at least some of the plurality of encoded video frames to produce a plurality of decoded video frames, the plurality of decoded video frames comprising one or more decoded partial video frames; providing at least some of the plurality of processed video frames to the video sink comprises providing at least some of the plurality of decoded video frames to the video sink; and reducing the display latency of each of the decoded video frames provided to the video sink further comprises providing a decoded partial video frame to the video sink.
  5. The device of claim 4, wherein (A) providing at least some of the plurality of decoded video frames to the video sink comprises providing each of the plurality of decoded video frames, including the decoded partial video frame, to the video sink via a High-Definition Multimedia Interface (HDMI) connection; and/or wherein (B) providing at least some of the plurality of decoded video frames to the video sink comprises providing at least some of the plurality of decoded video frames to the video sink via a High-Definition Multimedia Interface (HDMI) connection; and providing the decoded partial video frame to the video sink via an alternate path separate from the HDMI connection.
  6. The device of claim 4 or 5, further comprising logic to provide the video sink with metadata enabling the video sink to use the decoded partial video frame to create a decoded full video frame.
  7. The device of any one of the claims 4 to 6, wherein the plurality of encoded video frames comprises one or more encoded partial video frames encoded with adaptive resolution; and decoding the one or more encoded partial video frames comprises decoding the one or more encoded partial video frames encoded with adaptive resolution.
  8. The device of any one of the claims 4 to 7, wherein the plurality of encoded video frames comprises a first encoded full video frame encoded at a first resolution; and an encoded partial video frame subsequent to the encoded full video frame, the encoded partial video frame being encoded at a second resolution; and decoding at least some of the plurality of encoded video frames comprises decoding the encoded full video frame to produce a decoded full video frame; and decoding the encoded partial video frame to produce a decoded partial video frame.
  9. The device of claim 8, further comprising at least one of the following features (A) providing at least some of the plurality of decoded video frames to the video sink comprises providing the decoded partial video frame to the video sink; (B) processing the plurality of video frames comprises producing a second decoded full video frame from the first decoded full video frame and the decoded partial video frame; and providing at least some of the plurality of decoded video frames to the video sink comprises providing the second decoded full video frame to the video sink; and (C) the encoded partial video frame comprises content predicted from a portion of the encoded full video frame corresponding to the encoded partial video frame using inter-frame coding.
  10. The device of claim 8 or 9, wherein processing the plurality of video frames comprises storing the first decoded full video frame in a reference buffer; receiving a second encoded full video frame subsequent to the encoded partial video frame, the second encoded full video frame encoded with inter-frame encoding from the first encoded full video frame; and decoding the second encoded full video frame, using the first decoded full video frame, to produce a second decoded full video frame; and providing at least some of the plurality of decoded video frames to the video sink comprises providing the second decoded full video frame to the video sink.
  11. The device of any one of the claims 8 to 10, further comprising at least one of the following features (A) logic to receive information from the video source indicating a location of the encoded partial video frame within a full frame; and (B) processing the plurality of video frames comprises calculating a location of the decoded partial video frame within a full frame; and providing information to the video sink indicating the location of the decoded partial video frame within the full frame.
  12. The device of any one of the claims 1 to 11, wherein the device is a set-top box, a component of a set-top box or a system on a chip (SoC); or the device is a television.
  13. The device of any one of the claims 1 to 12, further comprising logic to selectively disable the logic to process the plurality of video frames based on an application associated with the video frames; configuration settings; or user controls.
  14. A method, comprising receiving a plurality of video frames from a video source; processing the plurality of video frames to produce a plurality of processed video frames, the plurality of processed video frames comprising one or more processed partial video frames; and providing at least some of the plurality of processed video frames to a video sink; wherein the one or more processed partial video frames reduce a display latency of one or more of the video frames provided to the video sink.
  15. A set-top box, comprising an input interface to receive a plurality of video frames from a video source; a decoder to decode the plurality of video frames to produce a plurality of decoded video frames, the plurality of decoded video frames comprising one or more decoded partial video frames; an output interface to provide at least some of the plurality of decoded video frames to a video sink; and a processor to reduce a display latency of the at least some of the plurality of decoded video frames provided to the video sink using the one or more decoded partial video frames.
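The compositing step implied by claims 6, 8, and 11 — a sink combining a decoded partial frame with the previous decoded full frame using location metadata — can be illustrated with a minimal sketch. The toy grayscale frame model, the `PartialFrame` structure, and the `composite` function below are illustrative assumptions, not part of the claimed device:

```python
from dataclasses import dataclass
from typing import List

Frame = List[List[int]]  # a frame as rows of pixel values (toy grayscale model)

@dataclass
class PartialFrame:
    """A decoded partial video frame plus location metadata within the full
    frame, as contemplated by claims 6 and 11 (hypothetical structure)."""
    x: int          # left offset of the partial region within the full frame
    y: int          # top offset of the partial region within the full frame
    pixels: Frame   # only the changed region is carried

def composite(reference: Frame, partial: PartialFrame) -> Frame:
    """Create a decoded full video frame by overlaying the decoded partial
    frame onto the previous decoded full frame at the signalled location."""
    full = [row[:] for row in reference]       # copy the reference frame
    for dy, row in enumerate(partial.pixels):
        for dx, px in enumerate(row):
            full[partial.y + dy][partial.x + dx] = px
    return full

# A 4x4 reference frame and a 2x2 partial update located at (1, 1):
ref = [[0] * 4 for _ in range(4)]
update = PartialFrame(x=1, y=1, pixels=[[9, 9], [9, 9]])
out = composite(ref, update)
```

Because only the changed region must be decoded and transferred, the sink can present the composited frame without waiting for a complete frame to traverse the pipeline, which is the latency reduction the claims describe.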

Description

Technical Field

This document relates generally to video streaming, and more specifically to low latency video streaming by reducing a display latency of video frames provided to a video sink.

Background

Traditional video streaming, whether broadcast or IP-based, has historically relied on complex network infrastructure to deliver smooth video experiences to end users. This typically involves a video player decoding and sending every video and audio frame to the TV at the right time. Such traditional approaches relied on complex network architectures and extensive buffering to ensure consistent frame delivery, which often resulted in higher costs due to increased bandwidth and storage requirements. To address these challenges, various video coding standards (e.g., MPEG2, AVC, HEVC, VP9, AV1) have been developed for efficient compression, but this often comes at the cost of increased computational complexity. Further, to ensure smooth video playback, buffering mechanisms have been implemented at different stages of a video pipeline, including cloud servers, networks, and video players (e.g., set-top boxes or over-the-top (OTT) clients). For instance, popular streaming services like YouTube and Netflix typically buffer 10-40 seconds of video frames.

Emerging applications such as cloud gaming, video conferencing, and virtual reality demand low-latency performance, and are driving the development of new network, encoder, and system standards. Cloud gaming has gained popularity as internet speeds have improved. In this model, the gaming server resides in the cloud, while the local game controller sends commands to the cloud server. The game is rendered on the cloud server, and the encoded video is transmitted to the user's device (e.g., TV or set-top box) over the video pipeline for display. Further, video conferencing applications have become essential for remote work, online education, and telehealth.
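The latency cost of per-stage frame buffering described above can be made concrete with a small sketch: each buffered frame delays display by one frame period (1/fps). The stage names and per-stage frame counts below are illustrative assumptions, not figures from this document:

```python
def buffering_latency_ms(frames_per_stage, fps):
    """Total display latency, in milliseconds, added by a pipeline that
    buffers the given number of frames at each stage. Each buffered frame
    contributes one frame period (1000 / fps milliseconds) of delay."""
    frame_period_ms = 1000.0 / fps
    return sum(frames_per_stage) * frame_period_ms

# Hypothetical 60 FPS pipeline: decoder, video processing, HDMI input, TV.
stages = [2, 1, 1, 2]                            # frames buffered per stage (assumed)
latency = buffering_latency_ms(stages, fps=60)   # 6 frames of buffering
```

Under these assumed numbers, six buffered frames at 60 FPS add 100 ms of display latency before a single pixel reaches the screen, which illustrates why partial-frame processing can matter for latency-sensitive applications.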
Low latency is crucial for a seamless experience. Many users are turning to OTT devices or set-top boxes for larger screen displays, as opposed to traditional conference equipment. Furthermore, virtual reality experiences often require high-resolution video and low latency. Cloud-based rendering can provide the necessary processing power, while local devices can focus on displaying the rendered content. For applications such as gaming, video conferencing, virtual reality, and the like, end-to-end latency is more important than smooth video.

Traditional set-top boxes and OTT devices, designed for smooth streaming video, often rely on fixed frame rates and buffering at multiple stages of the video pipeline. In other words, the video pipeline typically includes frame buffers at various stages, such as the encoder, decoder, video processing, high-definition multimedia interface (HDMI) input, and within the TV itself. This buffering ensures smooth playback but can also contribute to latency, as each stage buffers multiple frames to ensure that complete frame data is available at the input before feeding the output stage. Traditional pipelines often require the transmission of entire frames, regardless of the number of pixels that have changed. This approach can introduce significant latency as well, which is undesirable for latency-sensitive applications like cloud gaming, video conferencing, and virtual reality. In other examples, at the HDMI interface, the frame rate is typically fixed and cannot be adjusted dynamically during playback. For example, in a 60 FPS configuration, one complete frame of data must be sent every 1/60th of a second. If the next frame is not ready in time (i.e., an underflow condition), the previous frame is repeated, resulting in potential visual artifacts. This fixed frame rate requirement can limit the ability to reduce latency in applications that demand real-time responsiveness.

Brief Description of the Drawings

Fig. 1 is a block diagram illustrating components of a device that can reduce latency in a video pipeline in accordance with some embodiments.
Fig. 2 is a functional block diagram illustrating a device with a video pipeline in accordance with some embodiments.
Fig. 3 is a frame timing diagram for a video pipeline in accordance with some embodiments.
Fig. 4 is a flow diagram illustrating an exemplary method for reducing latency in a video pipeline in accordance with some embodiments.
Fig. 5 is a flow diagram illustrating an exemplary method for reducing latency in a video pipeline in accordance with some embodiments.
Fig. 6 is a functional block diagram illustrating a device comprising a video pipeline with reduced latency in accordance with some embodiments.
Fig. 7 is a functional block diagram illustrating a device with a video pipeline having reduced latency in accordance with some embodiments.
Fig. 8 is a functional block diagram illustrating a device with a video pipeline having reduced latency in accordance with some embodiments.
Fig. 9 is a flow diag