CN-122027829-A - Intelligent super-resolution coding system for low-bitrate video
Abstract
The invention discloses an intelligent super-resolution coding system for low-bitrate video. The system comprises a multi-stage pipeline and a merging module; each stage of the pipeline runs on a different GPU/thread, and the stages are integrated into a pipelined parallel task. The pipeline comprises, processed in sequence, a video input and analysis module, a task segmentation and dynamic scheduling module, an AI super-resolution module, an HDR/SDR adaptive conversion module, a video coding module, and a quality control and closed-loop optimization module.
Inventors
- YUAN KAI
- SHAO HAIFEI
- WU ZHENGPING
- YU LONG
- CHEN LINNAN
- REN RONG
- LIN QIANG
Assignees
- Wasu Media Network Co., Ltd. (华数传媒网络有限公司)
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2026-02-10
Claims (10)
- 1. An intelligent super-resolution coding system for low-bitrate video, characterized by comprising a multi-stage pipeline and a merging module, wherein each stage of the multi-stage pipeline runs on a different GPU/thread and the stages are integrated into a pipelined parallel task, and the multi-stage pipeline comprises, processed in sequence: a video input and analysis module for receiving a local video file to be transcoded, parsing the metadata of the video container/bitstream and generating an initial transcoding task description; a task segmentation and dynamic scheduling module for segmenting the video into a plurality of independent task segments and dynamically assigning each task segment to a single GPU according to the real-time load of each GPU; an AI super-resolution module for performing super-resolution inference on the video task segments of a single GPU and generating and outputting an enhanced frame sequence; an HDR/SDR adaptive conversion module for automatically detecting the input dynamic range, receiving the enhanced frame sequence after super-resolution, performing tone mapping or inverse mapping for tone correction, outputting the final frame sequence, and ensuring color and brightness fidelity at the target bitrate; a video coding module for compressing the final frame sequence in real time with a GPU encoder and outputting a low-bitrate video stream; and a quality control and closed-loop optimization module for performing sampled quality evaluation on the output low-bitrate video stream and feeding the evaluation back as a control signal to the task segmentation and dynamic scheduling module, the AI super-resolution module and the video coding module to form a closed loop; wherein the merging module integrates the low-bitrate video streams into a complete video stream that is continuous in time and space and conforms to a standard format.
- 2. The intelligent super-resolution coding system for low-bitrate video as set forth in claim 1, wherein the AI super-resolution module, the HDR/SDR adaptive conversion module and the video coding module perform super-resolution, tone mapping and encoding, respectively, on the video task segments of a single GPU.
- 3. The intelligent super-resolution coding system for low-bitrate video of claim 1, wherein the task segmentation and dynamic scheduling module segments the video into a plurality of independent task segments based on a "time window + overlapping boundary" segmentation strategy.
- 4. The intelligent super-resolution coding system for low-bitrate video as recited in claim 3, wherein the segmentation strategy comprises slicing the video by time into task segments of L seconds (or frames), adding an overlap of N frames between adjacent task segments, and calculating the frame interval k of each task segment.
- 5. The intelligent super-resolution coding system for low-bitrate video of claim 1, wherein the task segmentation and dynamic scheduling module dynamically assigns task segments to individual GPUs based on a dynamic scheduling algorithm, the dynamic scheduling algorithm operating on real-time resource evaluation metrics collected by the CPU at runtime.
- 6. The intelligent super-resolution coding system for low-bitrate video according to claim 1, wherein the AI super-resolution module comprises a model selection and dynamic switching module, an inter-frame consistency module, a boundary processing module, a memory/video-memory management module and a model parallelization module.
- 7. The intelligent super-resolution coding system for low-bitrate video of claim 6, wherein the model selection and dynamic switching module dynamically switches the model and the inference precision according to GPU computing power, video memory, task priority and quality targets.
- 8. The intelligent super-resolution coding system for low-bitrate video of claim 6, wherein the inter-frame consistency module has two implementation schemes, comprising a scheme based on feature-map referencing and a scheme based on optical-flow guidance.
- 9. The intelligent super-resolution coding system for low-bitrate video according to claim 6, wherein the boundary processing module is configured to process the overlapping region of task segments produced by the task segmentation and dynamic scheduling module; for the overlapping region, the boundary processing module generates enhancement results O_left and O_right on the two sides respectively, and performs weight mixing or applies a temporal-consistency-priority policy.
- 10. The intelligent super-resolution coding system for low-bitrate video as claimed in claim 1, wherein the HDR/SDR adaptive conversion module comprises a dynamic range detection module and a partition mapping module, the dynamic range detection module determines whether the source is HDR or SDR according to container metadata, and the partition mapping module divides the picture into a plurality of grid blocks and calculates a local mapping parameter for each grid block.
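The "time window + overlapping boundary" strategy of claims 3 and 4 can be sketched as follows. This is a hypothetical illustration: the function name, parameters and the example values of L and N are assumptions, since the patent does not disclose concrete code.

```python
def slice_into_segments(total_frames: int, fps: int, window_s: int, n_overlap: int):
    """Return (start, end) frame ranges (end exclusive) for task segments of
    window_s seconds, with n_overlap frames shared between adjacent segments."""
    window = window_s * fps          # frames per task segment (L seconds)
    segments = []
    start = 0
    while start < total_frames:
        end = min(start + window, total_frames)
        segments.append((start, end))
        if end == total_frames:
            break
        start = end - n_overlap      # next segment re-processes the overlap
    return segments

# Example: 300 frames at 30 fps, 4-second windows, 8-frame overlap.
print(slice_into_segments(300, 30, 4, 8))
```

The overlap means each boundary is super-resolved twice (once in each adjacent segment), which is what makes the boundary blending of claim 9 possible.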
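The dynamic scheduling of claim 5 amounts to picking the least-loaded GPU for each new task segment. A minimal sketch, assuming a weighted load score over utilization, video-memory use and queue depth; the weights and field names are invented for illustration and are not taken from the patent.

```python
def pick_gpu(gpu_stats):
    """gpu_stats: {gpu_id: {'util': 0..1, 'vram_used': 0..1, 'queue': int}}.
    Return the id of the GPU with the lowest weighted load score."""
    def load(stat):
        # Illustrative weighting of the real-time resource evaluation metrics.
        return 0.5 * stat['util'] + 0.3 * stat['vram_used'] + 0.2 * stat['queue']
    return min(gpu_stats, key=lambda g: load(gpu_stats[g]))

stats = {
    0: {'util': 0.9, 'vram_used': 0.8, 'queue': 3},
    1: {'util': 0.4, 'vram_used': 0.5, 'queue': 1},
}
print(pick_gpu(stats))  # GPU 1 is less loaded
```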
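The weight-mixing option of claim 9 can be illustrated as a linear crossfade between the two enhancement results O_left and O_right over the overlap frames. Per-pixel frame arrays are reduced to scalars here for brevity; the blending rule is the point, and the linear ramp is an assumption (the patent does not specify the weight function).

```python
def blend_overlap(o_left, o_right):
    """Blend two equal-length per-frame results with a linear crossfade:
    the weight of the right segment grows across the overlap region."""
    n = len(o_left)
    blended = []
    for i in range(n):
        w = (i + 1) / (n + 1)        # weight of O_right grows frame by frame
        blended.append((1 - w) * o_left[i] + w * o_right[i])
    return blended

print(blend_overlap([10.0, 10.0, 10.0], [20.0, 20.0, 20.0]))
```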
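The partition mapping of claim 10 divides the picture into grid blocks and computes a local mapping parameter per block. A hedged sketch: the parameter here is a simple gain derived from block mean luminance, which is an assumption for illustration; the patent does not disclose the actual mapping function.

```python
def local_mapping_params(frame, block):
    """frame: 2-D list of luminance values in [0, 1]; block: block size.
    Return one illustrative mapping parameter per grid block."""
    h, w = len(frame), len(frame[0])
    params = {}
    for by in range(0, h, block):
        for bx in range(0, w, block):
            vals = [frame[y][x]
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w))]
            mean = sum(vals) / len(vals)
            # Illustrative parameter: gain that maps the block mean to mid-gray.
            params[(by, bx)] = 0.5 / mean if mean > 0 else 1.0
    return params

frame = [[0.25, 0.25, 1.0, 1.0],
         [0.25, 0.25, 1.0, 1.0]]
print(local_mapping_params(frame, 2))  # dark block boosted, bright block attenuated
```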
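The closed-loop optimization of claim 1 feeds sampled quality scores back to the encoder. A minimal sketch under stated assumptions: a sampled VMAF score is compared with a target band and the bitrate is nudged up or down by a fixed step. The target, tolerance and step size are invented for illustration.

```python
def adjust_bitrate(bitrate_kbps, vmaf, target=92.0, step=0.1):
    """Return the next bitrate given a sampled VMAF score (0-100 scale)."""
    if vmaf < target - 2.0:          # quality too low: spend more bits
        return int(bitrate_kbps * (1 + step))
    if vmaf > target + 2.0:          # quality headroom: save bits
        return int(bitrate_kbps * (1 - step))
    return bitrate_kbps              # within tolerance: hold steady

print(adjust_bitrate(2000, 85.0))  # below target, so bitrate is raised
```

In the patented system the same feedback signal also reaches the scheduler and the super-resolution module (e.g., to switch models), not only the encoder.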
Description
Intelligent super-resolution coding system for low-bitrate video

Technical Field

The invention belongs to the technical field of video image processing, and relates to an intelligent super-resolution coding system for low-bitrate video.

Background

With the wide spread of high-definition and ultra-high-definition video content, the volume of video data has increased rapidly, and conventional video compression and transmission systems suffer from high storage pressure, high transmission cost, and poor quality at low bitrates. To reduce network bandwidth and storage cost, low-bitrate coding schemes (such as H.264, H.265 and AV1) are commonly adopted in the industry, but low-bitrate compression often causes obvious image-quality loss, such as blurred detail, edge artifacts and amplified noise.

In recent years, artificial-intelligence Super-Resolution (SR) technology has made remarkable progress in restoring image sharpness. AI models (e.g., Real-ESRGAN, Topaz Video AI, EDVR) achieve high-quality reconstruction of low-resolution images based on deep convolutional networks or generative adversarial network structures. However, these models run as independent inference tools, separated from the traditional video coding flow; as a result, overall efficiency is low and they are difficult to apply to large-scale transcoding tasks. The remaining problems include: super-resolution and coding cannot be fused; GPU resource utilization is low; task scheduling and load are unbalanced; HDR-to-SDR/SDR-to-HDR processing is inaccurate; the ability to balance image quality against bitrate is weak; and temporal consistency is insufficient.

Disclosure of Invention

The invention aims to overcome at least one defect of the prior art and provides an intelligent super-resolution coding system for low-bitrate video.
To achieve this aim, the intelligent super-resolution coding system for low-bitrate video comprises a multi-stage pipeline and a merging module, wherein each stage of the multi-stage pipeline runs on a different GPU/thread and the stages are integrated into pipelined parallel tasks. The multi-stage pipeline comprises, processed in sequence:

The video input and analysis module is used for receiving a local video file to be transcoded, parsing the metadata of the video container/bitstream and generating an initial transcoding task description (Task Descriptor);

The task segmentation and dynamic scheduling module is used for segmenting the video into a plurality of independent task segments and dynamically assigning each task segment to a single GPU according to the real-time load of each GPU;

The AI super-resolution module is used for performing super-resolution inference on the video task segments of a single GPU and generating and outputting an enhanced frame sequence;

The HDR/SDR adaptive conversion module is used for automatically detecting the input dynamic range, receiving the enhanced frame sequence after super-resolution, performing tone mapping (Tone Mapping) or inverse mapping for tone correction, outputting a final frame sequence, and ensuring color and brightness fidelity at the target bitrate;

The video coding module is used for compressing the final frame sequence in real time with a GPU encoder and outputting a low-bitrate video stream;

The quality control and closed-loop optimization module is used for performing sampled quality evaluation (VMAF/SSIM/PSNR) on the output low-bitrate video stream and feeding the evaluation back as a control signal to the task segmentation and dynamic scheduling module, the AI super-resolution module and the video coding module to form closed-loop optimization;

The merging module integrates the low-bitrate video streams into a complete video stream that is continuous in time and space and conforms to a standard format.

Furthermore, the AI super-resolution module, the HDR/SDR adaptive conversion module and the video coding module perform super-resolution, tone mapping and encoding on the video task segments of a single GPU. Furthermore, the task segmentation and dynamic scheduling module segments the video into a plurality of independent task segments based on a "time window + overlapping boundary" segmentation strategy. Further, the segmentation strategy comprises slicing the video by time into task segments of L seconds (or frames), adding an overlap of N frames between adjacent task segments, and calculating the frame interval k of each task segment. Further, the task segmentation and dynamic scheduling module dynamically assigns task segments to individual GPUs based on a dynamic scheduling algorithm, and the dynamic scheduling algorithm operates on real-time resource evaluation metrics collected by the CPU at runtime. Furthermore, the AI super-resolution module