EP-4738337-A1 - LOW LATENCY VIDEO STREAMING WITH MODIFIED FRAME RATES
Abstract
In an example, a device may include logic to receive a plurality of video frames from a video source, logic to process the plurality of video frames, logic to provide at least some of the plurality of video frames to a video sink, and logic to reduce a display latency of each of the video frames provided to the video sink by changing the frame rate of some or all of the frames.
Inventors
- MAMIDWAR, RAJESH
- CHEN, XUEMIN
- HENG, BRIAN
Assignees
- Avago Technologies International Sales Pte. Limited
Dates
- Publication Date
- 20260506
- Application Date
- 20251028
Claims (15)
- A device, comprising logic to receive a plurality of video frames from a video source at a fixed frame rate, the plurality of received video frames comprising a video frame that comprises a first plurality of sub-frames; logic to process at least some of the plurality of video frames to produce a plurality of processed video frames, wherein processing at least some of the plurality of video frames comprises processing at least some of the first plurality of sub-frames, and processing at least some of the plurality of video frames decouples the fixed frame rate of the received video frames from a flexible frame rate of the processed video frames; and logic to provide at least some of the plurality of processed video frames, including at least some of the processed first plurality of sub-frames, to a video sink at the flexible frame rate.
- The device of claim 1, wherein processing at least some of the plurality of video frames comprises encoding the at least some of the plurality of video frames with a video coder/decoder (CODEC) for transmission over a network.
- The device of claim 1 or 2, wherein the received plurality of video frames comprises a plurality of encoded video frames encoded with a video CODEC; and processing at least some of the plurality of video frames comprises decoding the at least some of the plurality of video frames to produce a plurality of decoded video frames.
- The device of claim 3, wherein providing each of the plurality of processed frames at the flexible frame rate comprises providing a first set of the plurality of decoded video frames to the video sink via a dedicated multimedia interface; and providing a second set of the plurality of decoded video frames to the video sink via a local area network; wherein in particular providing at least some of the processed first plurality of sub-frames to the video sink comprises providing fewer than all of the processed first plurality of sub-frames to the video sink.
- The device of any one of the claims 1 to 4, wherein processing at least some of the first plurality of sub-frames comprises processing fewer than all of the first plurality of sub-frames.
- The device of any one of the claims 1 to 5, wherein the device comprises at least one of the following features: (A) providing at least some of the processed first plurality of sub-frames to the video sink comprises providing at least some of the processed sub-frames as parts of separate streams; (B) providing at least some of the processed first plurality of sub-frames to the video sink comprises providing the processed sub-frames as part of a single stream; (C) providing the plurality of processed video frames to the video sink comprises producing a processed video frame from at least some of the processed first plurality of sub-frames and a prior processed video frame; and providing the processed video frame to the video sink.
- The device of any one of the claims 1 to 6, wherein decoding at least some of the plurality of encoded video frames to produce a plurality of decoded video frames comprises decoding two or more of the first plurality of sub-frames in parallel.
- The device of any one of the claims 1 to 7, wherein receiving the plurality of encoded video frames from the video source comprises receiving at least some of the first plurality of sub-frames as separate streams; or receiving the plurality of encoded video frames from the video source comprises receiving the first plurality of sub-frames as a single stream.
- The device of any one of the claims 1 to 8, wherein receiving the plurality of video frames from the video source comprises receiving a full video frame; processing at least some of the plurality of video frames comprises dividing the full video frame into a second plurality of sub-frames; and providing each of the plurality of processed video frames to the video sink at the flexible frame rate comprises providing at least some of the second plurality of sub-frames to the video sink.
- The device of any one of the claims 1 to 9, wherein processing at least some of the plurality of received video frames comprises processing the at least some of the plurality of received video frames at a rate higher than the fixed frame rate; and providing each of the plurality of processed video frames to the video sink at the flexible frame rate comprises providing each of the plurality of processed video frames to the video sink, without post-processing buffering, at the rate at which each respective frame has been processed.
- The device of any one of the claims 1 to 10, wherein the device comprises a set-top box, a component of a set-top box, or a system on a chip (SoC); or the device is a television.
- The device of any one of the claims 1 to 11, wherein each of the received plurality of frames has a decode canvas with a first canvas size; each of the processed video frames has a display canvas with a second canvas size smaller than the first canvas size, the display canvas comprising a portion of the decode canvas; and the portion of the decode canvas that the display canvas comprises changes between subsequent frames.
- The device of any one of the claims 1 to 12, further comprising logic to selectively disable the decoupling of the fixed frame rate from the flexible frame rate and the providing of the processed video frames at the flexible frame rate based on an application associated with the video frames; configuration settings; or user controls.
- A method, comprising receiving a plurality of video frames from a video source at a fixed frame rate, the plurality of received video frames comprising a video frame that comprises a first plurality of sub-frames; processing at least some of the plurality of video frames to produce a plurality of processed video frames, wherein processing at least some of the plurality of video frames comprises processing at least some of the first plurality of sub-frames, and processing at least some of the plurality of video frames decouples the fixed frame rate of the received video frames from a flexible frame rate of the processed video frames; and providing at least some of the plurality of processed video frames, including at least some of the processed first plurality of sub-frames, to a video sink at the flexible frame rate.
- A set-top box, comprising an input interface configured to receive a plurality of video frames from a video source at a fixed frame rate, the plurality of received video frames comprising a video frame that comprises a first plurality of sub-frames; a processor configured to process at least some of the plurality of video frames to produce a plurality of processed video frames, wherein processing at least some of the plurality of video frames comprises processing at least some of the first plurality of sub-frames, and processing at least some of the plurality of video frames decouples the fixed frame rate of the received video frames from a flexible frame rate of the processed video frames; and an output interface configured to provide at least some of the processed video frames, including at least some of the plurality of processed sub-frames, to a video sink at the flexible frame rate.
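The decoupling recited in claim 1 can be illustrated with a minimal sketch: a frame is received as sub-frames, each sub-frame is processed independently, and each processed sub-frame is forwarded to the sink as soon as it is ready rather than waiting for the full frame. All names (`SubFrame`, `process_sub_frames`) and the stand-in "processing" step are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class SubFrame:
    frame_id: int
    index: int
    data: bytes

def process_sub_frames(sub_frames):
    """Process each sub-frame independently and yield it immediately,
    so delivery to the video sink is not tied to full-frame boundaries."""
    for sf in sub_frames:
        # Stand-in for real processing (e.g., decoding) of one sub-frame.
        yield SubFrame(sf.frame_id, sf.index, sf.data.upper())

# One received frame split into four sub-frames (e.g., tiles or slices).
frame = [SubFrame(0, i, b"tile") for i in range(4)]
out = list(process_sub_frames(frame))
```

Because `process_sub_frames` is a generator, a downstream sink could consume each sub-frame the moment it is produced, which is the essence of the flexible output rate.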
Description
Technical Field
This document relates generally to video streaming and more specifically to low latency video streaming by reducing a display latency of video frames provided to a video sink.
Background
Traditional video streaming, whether broadcast or IP-based, has historically relied on complex network infrastructure to deliver smooth video experiences to end users. This typically involves a video player decoding and sending every video and audio frame to the TV at the right time. Such traditional approaches relied on complex network architectures and extensive buffering to ensure consistent frame delivery, which often resulted in higher costs due to increased bandwidth and storage requirements. To address these challenges, various video coding standards (e.g., MPEG2, AVC, HEVC, VP9, AV1) have been developed for efficient compression, but this often comes at the cost of increased computational complexity. Further, to ensure smooth video playback, buffering mechanisms have been implemented at different stages of a video pipeline, including cloud servers, networks, and video players (e.g., set-top boxes or over-the-top (OTT) clients). For instance, popular streaming services like YouTube and Netflix typically buffer 10-40 seconds of video frames. Emerging applications such as cloud gaming, video conferencing, and virtual reality demand low-latency performance, and these applications are driving the development of new network, encoder, and system standards. Cloud gaming has gained popularity as internet speeds have improved. In this model, the actual gaming server resides in the cloud, while the local game controller sends commands to the cloud server. The game is rendered on the cloud server, and the encoded video is transmitted to the user's device (e.g., TV or set-top box) over the video pipeline for display. Further, video conferencing applications have become essential for remote work, online education, and telehealth.
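The Background notes that each pipeline stage buffers frames to guarantee smooth playback. The latency contribution of such buffering can be estimated simply as the total number of buffered frames divided by the frame rate. The stage names and buffer depths below are illustrative assumptions for the sketch, not values from the patent.

```python
# Estimate display latency contributed by per-stage frame buffering
# in a conventional video pipeline. Buffer depths are hypothetical.
FPS = 60
stage_buffers = {
    "encoder": 2,           # frames held before transmission
    "network_jitter": 4,    # frames absorbed against delivery jitter
    "decoder": 2,           # frames held before display processing
    "video_processing": 1,
    "hdmi_and_tv": 3,       # HDMI input and in-TV buffering
}

total_frames = sum(stage_buffers.values())
latency_ms = total_frames / FPS * 1000
print(f"{total_frames} buffered frames ~ {latency_ms:.0f} ms at {FPS} FPS")
```

Even a modest dozen buffered frames across the pipeline adds roughly a fifth of a second of display latency at 60 FPS, which motivates reducing or bypassing buffering for latency-sensitive applications.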
Low latency is crucial for a seamless experience. Many users are turning to OTT devices or set-top boxes for larger screen displays, as opposed to traditional conference equipment. Furthermore, virtual reality experiences often require high-resolution video and low latency. Cloud-based rendering can provide the necessary processing power, while local devices can focus on displaying the rendered content. For applications such as gaming, video conferencing, virtual reality, and the like, end-to-end latency is more important than smooth video. Traditional set-top boxes and OTT devices, designed for smooth streaming video, often rely on fixed frame rates and buffering at multiple stages of the video pipeline. In other words, the video pipeline typically includes frame buffers at various stages, such as the encoder, decoder, video processing, high-definition multimedia interface (HDMI) input, and within the TV itself. This buffering ensures smooth playback but can also contribute to latency, as each stage buffers multiple frames to ensure that complete frame data is available at the input before feeding the output stage. Traditional pipelines often require the transmission of entire frames, regardless of the number of pixels that have changed. This approach can introduce significant latency as well, which is undesirable for latency-sensitive applications like cloud gaming, video conferencing, and virtual reality. In other examples, at the HDMI interface, the frame rate is typically fixed and cannot be adjusted dynamically during playback. For example, in a 60 FPS configuration, one complete frame of data must be sent every 1/60th of a second. If the next frame is not ready in time (i.e., an underflow condition), the previous frame is repeated, resulting in potential visual artifacts. This fixed frame rate requirement can limit the ability to reduce latency in applications that demand real-time responsiveness.
Brief Description of the Drawings
Fig. 1 is a block diagram illustrating components of a device that can reduce latency in a video pipeline in accordance with some embodiments.
Fig. 2 is a functional block diagram illustrating a device with a video pipeline in accordance with some embodiments.
Fig. 3 is a frame timing diagram for a video pipeline in accordance with some embodiments.
Fig. 4 is a flow diagram illustrating an exemplary method for reducing latency in a video pipeline in accordance with some embodiments.
Fig. 5 is a flow diagram illustrating an exemplary method for reducing latency in a video pipeline in accordance with some embodiments.
Fig. 6 is a functional block diagram illustrating a device comprising a video pipeline with reduced latency in accordance with some embodiments.
Fig. 7 is a functional block diagram illustrating a device with a video pipeline having reduced latency in accordance with some embodiments.
Fig. 8 is a functional block diagram illustrating a device with a video pipeline having reduced latency in accordance with some embodiments.
Fig. 9 is a flow diag
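The fixed-frame-rate constraint described in the Background (one frame per 1/60th of a second, with the previous frame repeated on underflow) can be contrasted with the flexible rate in a short sketch. The frame-ready times below are hypothetical, chosen only to show how waiting for the next fixed slot adds delay that a flexible output avoids.

```python
import math

# At a fixed 60 FPS output, a finished frame must wait for the start of
# the next 1/60 s slot; with the fixed rate decoupled, it can be shown
# as soon as it is ready. Ready times (seconds) are hypothetical.
FRAME_PERIOD = 1 / 60

ready_times = [0.001, 0.012, 0.040, 0.043, 0.075]

def fixed_rate_latency(times):
    """Wait from frame-ready until the next fixed 60 FPS slot boundary."""
    return [math.ceil(t / FRAME_PERIOD) * FRAME_PERIOD - t for t in times]

def flexible_rate_latency(times):
    """With a flexible output rate, each frame is shown immediately."""
    return [0.0 for _ in times]

fixed = fixed_rate_latency(ready_times)
print(f"mean fixed-rate wait: {sum(fixed) / len(fixed) * 1000:.1f} ms")
```

On average a frame waits about half a slot (roughly 8 ms at 60 FPS) before display, and an underflow forces a full extra slot of delay plus a repeated frame; providing frames at a flexible rate removes this quantization entirely.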