CA-3084867-C - PROCESSING MEDIA BY ADAPTIVE GROUP OF PICTURES (GOP) AND MINI-GOP STRUCTURING

Abstract

A spatial complexity and a temporal complexity associated with one or more frames of media content may be determined. Based on the spatial complexity and the temporal complexity of the media content, a Group of Picture (GOP) size for the one or more frames of the media content may be determined. The GOP size may be inversely proportional to the spatial complexity and the temporal complexity of the one or more frames of media content. Certain frames of the media content may be arranged in a different GOP size as compared to one or more other frames of the media content. By varying the GOP size of the plurality of frames of the media content, the bitrate required to transmit the media content may be decreased without decreasing or substantially decreasing the overall quality of the media content.
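The inverse relation between complexity and GOP size described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the formula used in the patent: the function name `choose_gop_size`, the normalization of both complexities to [0, 1], the equal weighting of the two terms, and the `min_gop`/`max_gop` bounds are all hypothetical choices made for the example.

```python
def choose_gop_size(spatial, temporal, min_gop=8, max_gop=64):
    """Map normalized complexities in [0, 1] to a GOP size in frames.

    Hypothetical mapping: higher complexity yields a shorter GOP.
    """
    for name, value in (("spatial", spatial), ("temporal", temporal)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} complexity must be in [0, 1], got {value}")

    # Combine the two measures; equal weighting is an assumption.
    combined = (spatial + temporal) / 2.0

    # Inverse relation: size = max_gop / (1 + k * combined), with k chosen
    # so that combined == 0 gives max_gop and combined == 1 gives min_gop.
    k = max_gop / min_gop - 1.0
    size = max_gop / (1.0 + k * combined)

    # Clamp to the allowed range after rounding to a whole number of frames.
    return max(min_gop, min(max_gop, round(size)))


if __name__ == "__main__":
    # A static, flat scene gets a long GOP; a busy, fast scene gets a short one.
    for s, t in [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]:
        print(s, t, choose_gop_size(s, t))
```

Because the size is recomputed per group of frames, different parts of the same content can end up with different GOP sizes, which is the behavior the abstract describes.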

Inventors

  • Alexander Giladi
  • Dan Grois

Assignees

  • COMCAST CABLE COMMUNICATIONS, LLC

Dates

Publication Date
2026-05-05
Application Date
2020-06-25
Priority Date
2019-06-28

Claims (20)

  1. A method comprising: determining a spatial feature value associated with one or more frames of a plurality of frames of content; determining a temporal feature value associated with two or more frames of the content; determining a Group of Picture (GOP) size for a first group of frames of the plurality of frames of the content, the first group of frames comprising the one or more frames and the two or more frames, and the GOP size for the first group of frames being inversely proportional to the spatial feature value and the temporal feature value; and encoding, based on the determined GOP size for the first group of frames of the content, the first group of frames of the content.
  2. The method of claim 1, wherein determining the spatial feature value comprises at least one of applying a Fast Discrete Cosine Transform (Fast DCT) to the one or more frames of the content or applying one or more filters to the one or more frames of the content.
  3. The method of claim 2, wherein applying the one or more filters to the one or more frames of the content comprises applying at least one of an edge detection filter and a high pass filter to the one or more frames of the content.
  4. The method of any one of claims 1-3, wherein determining the temporal feature value associated with the one or more frames of the content comprises applying a Mean Co-Located Pixel Difference (MCPD) metric to the two or more frames of the content.
  5. The method of any one of claims 1-4, further comprising determining, based on the spatial feature value and the temporal feature value, a hierarchical structure associated with the GOP.
  6. The method of any one of claims 1-5, further comprising: determining, based on the spatial feature value and the temporal feature value, at least one mini-GOP and at least one mini-GOP size for the first group of frames of the content; and encoding, based on the determined at least one mini-GOP size for the first group of frames of the content, the first group of frames of the content, wherein the mini-GOP comprises a portion of the GOP.
  7. The method of claim 6, wherein the one or more frames of the content comprise at least one I-frame, at least one P-frame, and at least one B-frame; wherein the GOP begins with an I-frame and ends at a frame immediately preceding the next I-frame of the one or more frames of the content; and wherein at least one of the mini-GOPs begins with a P-frame.
  8. A method comprising: determining at least one of a spatial feature value associated with one or more frames of content or a temporal feature value associated with two or more frames of the content; determining, based on the spatial feature value or the temporal feature value, a Group of Picture (GOP) size and at least one mini-GOP size for a first group of frames of the content, the mini-GOP comprises a portion of the GOP, and the GOP size for the first group of frames being inversely proportional to the spatial feature value or the temporal feature value; and encoding, based on the determined GOP size and the at least one mini-GOP size, the first group of frames of the content.
  9. The method of claim 8, wherein determining the spatial feature value comprises at least one of applying a Fast Discrete Cosine Transform (Fast DCT) to the one or more frames of the content or applying a filter to the one or more frames of the content, and wherein determining the temporal feature value comprises applying a Mean Co-Located Pixel Difference (MCPD) metric to the two or more frames of the content.
  10. The method of any one of claims 8-9, wherein the GOP begins with an I-frame and ends at a frame immediately preceding another I-frame; and wherein at least one of the mini-GOPs begins with a P-frame.
  11. The method of any one of claims 8-10, further comprising determining, based on the at least one of the spatial feature value or the temporal feature value, a hierarchical structure associated with the GOP.
  12. The method of any one of claims 8-11, wherein at least one of the GOP size and the mini-GOP size changes adaptively throughout an encoding process for the content.
  13. A method comprising: determining a spatial feature value associated with one or more frames of a plurality of frames of content; based at least on the determined spatial feature value, determining a Group of Picture (GOP) size for a first group of frames of the plurality of frames of the content, wherein the first group of frames comprises the one or more frames; and encoding, based on the determined GOP size for the first group of frames of the content, the first group of frames of the content.
  14. The method of claim 13, wherein determining the spatial feature value comprises at least one of applying a Fast Discrete Cosine Transform (Fast DCT) to the one or more frames of the content or applying one or more filters to the one or more frames of the content.
  15. The method of claim 14, wherein applying the one or more filters to the one or more frames of the content comprises applying at least one of an edge detection filter or a high pass filter to the one or more frames of the content.
  16. The method of any one of claims 13-15, further comprising: determining, based at least on the spatial feature value, at least one mini-GOP and at least one mini-GOP size for the first group of frames of the content; and encoding, based on the determined at least one mini-GOP size for the first group of frames of the content, the first group of frames of the content, wherein the mini-GOP comprises a portion of the GOP.
  17. The method of any one of claims 13-16, wherein determining the spatial feature value comprises determining an amount of high frequency information in the one or more frames of the content.
  18. A device comprising: one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the device to perform the method of any one of claims 1-17.
  19. A computer-readable medium storing instructions that, when executed, cause the method of any one of claims 1-17 to be performed.
  20. A system comprising: a first computing device configured to perform the method of any one of claims 1-17; and a second computing device configured to receive, from the first computing device, the encoded first group of frames of the content.
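The spatial and temporal feature values recited in the claims (a high pass filter response and a Mean Co-Located Pixel Difference) could be measured along the following lines. This is an illustrative sketch only: the function names `high_pass_energy` and `mcpd`, the 3x3 Laplacian kernel, the representation of a frame as a list of rows of luma samples, and the reading of MCPD as the mean absolute difference between co-located pixels are assumptions made for the example, not definitions taken from the claims.

```python
def high_pass_energy(frame):
    """Mean absolute response of a 3x3 Laplacian high-pass filter.

    `frame` is a list of rows of luma values (assumed layout).
    Border pixels are skipped so every sample has four neighbours.
    """
    height, width = len(frame), len(frame[0])
    if height < 3 or width < 3:
        raise ValueError("frame must be at least 3x3")
    total = 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Laplacian: 4 * centre minus the four direct neighbours.
            response = (4 * frame[y][x]
                        - frame[y - 1][x] - frame[y + 1][x]
                        - frame[y][x - 1] - frame[y][x + 1])
            total += abs(response)
    return total / ((height - 2) * (width - 2))


def mcpd(prev_frame, next_frame):
    """Mean absolute difference between co-located pixels of two frames."""
    same_shape = len(prev_frame) == len(next_frame) and all(
        len(a) == len(b) for a, b in zip(prev_frame, next_frame))
    if not same_shape:
        raise ValueError("frames must have identical dimensions")
    total = 0
    count = 0
    for row_a, row_b in zip(prev_frame, next_frame):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


if __name__ == "__main__":
    flat = [[10] * 4 for _ in range(4)]
    checker = [[((x + y) % 2) * 255 for x in range(4)] for y in range(4)]
    print("spatial, flat frame:", high_pass_energy(flat))
    print("spatial, checkerboard:", high_pass_energy(checker))
    print("temporal, flat vs checkerboard:", mcpd(flat, checker))
```

A flat frame produces no high-pass response while a checkerboard produces a large one, and two identical frames give an MCPD of zero. Values like these would then feed whatever rule selects the GOP and mini-GOP sizes.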

Description

PROCESSING MEDIA BY ADAPTIVE GROUP OF PICTURES (GOP) AND MINI-GOP STRUCTURING

BACKGROUND

[0001] Video compression techniques may be used to compress video content in an efficient manner, thereby enabling high-quality video content to be provided to customers while minimizing the bandwidth required to transmit that video content. As video quality continues to improve, the computational complexities for processing the video content and the bitrate requirements for transmitting the video content may also increase. There is currently a need to reduce bit-rate requirements, particularly for high-resolution video content, without decreasing perceived video content quality and while keeping computational complexity at a reasonable level.

SUMMARY

[0002] Methods and systems for improved media content (e.g., video content) compression are described. A spatial complexity associated with one or more frames of media content may be determined. Determining the spatial complexity of the one or more frames of the media content may comprise performing a frequency analysis of the one or more frames in order to determine an amount of high frequency components and low frequency components of the one or more frames. A temporal complexity associated with the one or more frames of the media content may be determined. Determining the temporal complexity of the one or more frames of the media content may comprise determining an amount of motion between the one or more frames of the media content. Based on the spatial complexity and the temporal complexity of the media content, a Group of Picture (GOP) size for the one or more frames of the media content may be determined. The GOP size may be inversely proportional to the spatial complexity and the temporal complexity of the one or more frames of media content. Certain frames of the media content may be arranged in a different GOP size as compared to one or more other frames of the media content.
By varying the GOP size of the plurality of frames of the media content, the bitrate required to transmit the media content may be decreased without decreasing or substantially decreasing the overall quality of the media content.

BRIEF DESCRIPTION OF THE DRAWINGS

[0003] The following detailed description is better understood when read in conjunction with the appended drawings. For the purposes of illustration, examples are shown in the drawings; however, the subject matter is not limited to specific elements and instrumentalities disclosed. In the drawings:

[0004] FIG. 1 shows a block diagram of an example system;
[0005] FIGS. 2A and 2B show an example Group of Pictures (GOP);
[0006] FIG. 3 shows an example of a GOP hierarchical structure;
[0007] FIGS. 4A and 4B show examples of mini-GOP hierarchical structures;
[0008] FIG. 5 shows a flow chart of an example method;
[0009] FIG. 6 shows a flow chart of an example method;
[0010] FIG. 7 shows a flow chart of an example method;
[0011] FIG. 8 shows a block diagram of an example computing device.

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

[0012] The first version of the H.265/MPEG-HEVC (High Efficiency Video Coding) standard enabled the efficient compression of high-resolution video content (e.g., 3840x2160 (4K) video) as compared to its predecessor H.264/MPEG-AVC. This compression provided a good trade-off between the visual quality of the content and its corresponding bit-rate. The Versatile Video Coding (VVC) standard is being developed with ultra high-definition (UltraHD) and high frame rate video requirements in mind (such as 7680x4320 (8K) video). However, the average computational complexity of VVC is expected to be several times higher than that of its predecessor (e.g., HEVC).
There is currently a need to reduce bit-rate requirements, particularly for high-resolution video content, without decreasing perceived video content quality and while keeping computational complexity at a reasonable level.

[0013] Accordingly, methods and systems are described for improved video compression. A spatial complexity associated with one or more frames of media content may be determined. Determining the spatial complexity of the one or more frames of the media content may comprise performing a frequency analysis of the one or more frames in order to determine an amount of high frequency components and low frequency components of the one or more frames. A temporal complexity associated with the one or more frames of the media content may be determined. Determining the temporal complexity of the one or more frames of the media content may comprise determining an amount of motion between the one or more frames of the media content. Based on the spatial complexity and the temporal complexity of the media content, a Group of Picture (GOP) size for the one or more frames of the media content may be determined. The GOP size may be inversely proportional to the spatial complexity