CN-120639986-B - Video coding method, device, equipment and storage medium


Abstract

The present disclosure provides a video encoding method, apparatus, device, and storage medium. In some embodiments, a current frame to be encoded and the reference frames that can be used by the current frame are obtained. An initial search window is expanded for a target reference frame to obtain a target search window, where the target reference frame is an image frame, among the reference frames, that is allocated to a plurality of target motion estimation engines, and the target search window is smaller than a preset search range. The plurality of target motion estimation engines then perform a motion search within the target search window in the target reference frame to obtain a target block that matches a current block of the current frame. By allocating motion estimation resources to non-overlapping search areas or reference frames, the disclosure expands the search range, improves the accuracy of motion vector prediction, reduces motion estimation errors, and improves video coding efficiency.

Inventors

HU HECHEN
ZHANG SHIJIA
Ernesto Andrade Neto
LOU JIAN
Nie Renke
ZHU ZHAOYUAN

Assignees

镕铭微电子(济南)有限公司

Dates

Publication Date: 2026-05-08
Application Date: 2025-07-17

Claims (13)

1. A video encoding method, comprising: acquiring a current frame to be encoded and a reference frame that can be used by the current frame; expanding an initial search window for a target reference frame to obtain a target search window, wherein the target reference frame is an image frame, among the reference frames, that is allocated to a plurality of target motion estimation engines, and the target search window is smaller than a preset search range; and performing a motion search, by the plurality of target motion estimation engines, within the target search window in the target reference frame to obtain a target block matching a current block of the current frame; wherein the initial search window comprises a motion vector prediction search window and a current block search window, and the target search window comprises a first search window and a second search window; wherein expanding the initial search window to obtain the target search window comprises: expanding the motion vector prediction search window by transversely or longitudinally expanding the search window to obtain the first search window; and expanding the current block search window by transversely or longitudinally expanding the search window to obtain the second search window; and wherein performing the motion search within the target search window in the target reference frame to obtain the target block matching the current block of the current frame comprises: calculating, by the plurality of target motion estimation engines, a motion vector prediction from information of the current block and of the surrounding blocks corresponding to the current block, wherein the position pointed to by the motion vector prediction serves as an initial search center; performing a motion search in the first search window using the initial search center to obtain a first matching block; performing a motion search in the second search window using the position center of the current block to obtain a second matching block; and selecting the target block from the first matching block and the second matching block.
2. The method of claim 1, wherein expanding the initial search window to obtain the target search window uses any one of the following expansion modes: expanding the initial search window by transversely expanding the search window to obtain the target search window; expanding the initial search window by longitudinally expanding the search window to obtain the target search window; and expanding the initial search window by transversely and longitudinally expanding the search window to obtain the target search window.
3. The method of claim 2, wherein expanding the initial search window by transversely expanding the search window to obtain the target search window comprises: determining a first transverse search center and a second transverse search center according to an initial motion search center and a transverse offset of the initial search window, wherein the transverse offset is half of the length of the initial search window; and expanding the initial search window according to the first transverse search center and the second transverse search center to obtain the target search window.
4. The method of claim 2, wherein expanding the initial search window by longitudinally expanding the search window to obtain the target search window comprises: determining a first longitudinal search center and a second longitudinal search center according to an initial motion search center and a longitudinal offset of the initial search window, wherein the longitudinal offset is half of the height of the initial search window; and expanding the initial search window according to the first longitudinal search center and the second longitudinal search center to obtain the target search window.
5. The method of claim 2, wherein expanding the initial search window by transversely and longitudinally expanding the search window to obtain the target search window comprises: determining a third transverse search center, a fourth transverse search center, a third longitudinal search center and a fourth longitudinal search center according to the initial motion search center, the transverse offset and the longitudinal offset of the initial search window, wherein the transverse offset is half of the length of the initial search window, and the longitudinal offset is half of the height of the initial search window; and expanding the initial search window according to the third transverse search center, the fourth transverse search center, the third longitudinal search center and the fourth longitudinal search center to obtain the target search window.
6. The method of claim 1, wherein expanding the initial search window to obtain the target search window comprises: acquiring a motion vector of each reference block in the target reference frame; determining a target expansion mode of the initial search window according to the motion vectors; and expanding the initial search window using the target expansion mode to obtain the target search window.
7. The method of claim 6, wherein the target expansion mode comprises transversely expanding the search window and longitudinally expanding the search window, and wherein determining the target expansion mode of the initial search window according to the motion vectors comprises: counting a first number of motion vectors whose horizontal component is greater than the vertical component, and counting a second number of motion vectors whose horizontal component is less than the vertical component; when the first number is greater than the second number, determining that the target expansion mode of the initial search window is transversely expanding the search window; and when the first number is less than or equal to the second number, determining that the target expansion mode of the initial search window is longitudinally expanding the search window.
8. The method of claim 1, wherein performing the motion search, by the plurality of target motion estimation engines, within the target search window in the target reference frame to obtain the target block matching the current block of the current frame comprises: calculating, by the plurality of target motion estimation engines, a motion vector prediction from information of the current block and of the surrounding blocks corresponding to the current block, wherein the position pointed to by the motion vector prediction serves as an initial search center; performing a motion search in the first search window using the initial search center to obtain a first matching block; performing a motion search in the second search window using the position center of the current block to obtain a second matching block; and selecting the target block from the first matching block and the second matching block.
9. The method of claim 8, wherein selecting the target block from the first matching block and the second matching block comprises: calculating a first sum of absolute differences between the pixels of the first matching block and the pixels of the current block; calculating a second sum of absolute differences between the pixels of the second matching block and the pixels of the current block; and selecting, from the first matching block and the second matching block, the matching block with the smaller sum of absolute differences as the target block.
10. A video encoding apparatus, comprising: an acquisition module, configured to acquire a current frame to be encoded and a reference frame that can be used by the current frame; an expansion module, configured to expand an initial search window for a target reference frame to obtain a target search window, wherein the target reference frame is an image frame, among the reference frames, that is allocated to a plurality of target motion estimation engines, and the target search window is smaller than a preset search range; and a search module, configured to perform a motion search, by the plurality of target motion estimation engines, within the target search window in the target reference frame to obtain a target block matching a current block of the current frame; wherein the initial search window comprises a motion vector prediction search window and a current block search window, and the target search window comprises a first search window and a second search window; wherein the expansion module, when expanding the initial search window to obtain the target search window, is further configured to: expand the motion vector prediction search window by transversely or longitudinally expanding the search window to obtain the first search window; and expand the current block search window by transversely or longitudinally expanding the search window to obtain the second search window; and wherein the search module, when performing the motion search within the target search window in the target reference frame to obtain the target block matching the current block of the current frame, is further configured to: calculate, by the plurality of target motion estimation engines, a motion vector prediction from information of the current block and of the surrounding blocks corresponding to the current block, wherein the position pointed to by the motion vector prediction serves as an initial search center; perform a motion search in the first search window using the initial search center to obtain a first matching block; perform a motion search in the second search window using the position center of the current block to obtain a second matching block; and select the target block from the first matching block and the second matching block.
11. An electronic device, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to execute the instructions to implement the steps of the method of any of claims 1-9.
12. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method of any of claims 1-9.
13. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the method of any of claims 1-9.
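The expansion rules of claims 3 and 4 and the mode selection of claim 7 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `expanded_centers` and `choose_mode`, the mode strings, the integer pixel coordinates, and the reading that each new center anchors a window of the original size are all assumptions made for illustration. Claim 7 compares the horizontal and vertical components without saying how signs are handled, so the sketch compares magnitudes, which is also an assumption.

```python
from typing import List, Tuple

Point = Tuple[int, int]    # (x, y) search center in pixels
Vector = Tuple[int, int]   # (horizontal, vertical) motion vector


def expanded_centers(center: Point, width: int, height: int, mode: str) -> List[Point]:
    """Return the two search centers obtained by expanding one search window.

    The offset is half the window length (transverse) or half the window
    height (longitudinal), as stated in claims 3 and 4.
    """
    cx, cy = center
    if mode == "transverse":
        dx = width // 2
        return [(cx - dx, cy), (cx + dx, cy)]
    if mode == "longitudinal":
        dy = height // 2
        return [(cx, cy - dy), (cx, cy + dy)]
    raise ValueError(f"unknown expansion mode: {mode}")


def choose_mode(motion_vectors: List[Vector]) -> str:
    """Pick the expansion mode by counting dominant motion directions (claim 7).

    Assumption: components are compared by absolute value.
    """
    first = sum(1 for mvx, mvy in motion_vectors if abs(mvx) > abs(mvy))
    second = sum(1 for mvx, mvy in motion_vectors if abs(mvx) < abs(mvy))
    # Ties fall through to longitudinal, matching "less than or equal to".
    return "transverse" if first > second else "longitudinal"
```

For a 64x32 window centered at (100, 50), transverse expansion gives the centers (68, 50) and (132, 50), and longitudinal expansion gives (100, 34) and (100, 66). The combined mode of claim 5 is left out because the claim does not fix how its four centers are laid out.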

Description

Video coding method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of video compression technologies, and in particular to a video encoding method, apparatus, device, and storage medium.
Background
Motion estimation is one of the most critical techniques in video coding; its main purpose is to improve coding efficiency by eliminating temporal redundancy in video sequences. The currently prevailing video coding standards (such as H.264/AVC, H.265/HEVC and the latest H.266/VVC) typically employ a block-based hybrid coding framework, in which motion estimation is responsible for finding, in a reference frame, the best matching block for an image block of the current frame, and for recording the corresponding motion vectors and residual information. In this way, the amount of data that needs to be transmitted can be significantly reduced, thereby improving overall compression performance.
Existing motion estimation methods mainly include the full search method, the three-step search method, the four-step search method, the diamond search method, and hierarchical motion estimation. Among them, hierarchical motion estimation is widely used because of its significant advantage in computational complexity. This method decomposes the images at multiple scales and performs a coarse-to-fine motion search across the levels in sequence, which greatly reduces the amount of computation while maintaining a certain level of precision. In addition, some improved hierarchical algorithms introduce a prediction mechanism or a dynamic adjustment strategy to further improve search efficiency and accuracy.
At present, existing motion estimation methods suffer from relatively large motion estimation errors and relatively low video coding efficiency.
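The coarse-to-fine idea behind hierarchical motion estimation can be sketched as follows. This is a generic two-level illustration, not a method taken from the disclosure: the 2x2 averaging pyramid, the sum-of-absolute-differences (SAD) cost, the default block size and search radius, and all function names are assumptions chosen for brevity.

```python
from typing import List, Tuple

Frame = List[List[int]]   # luma samples, indexed frame[y][x]
MV = Tuple[int, int]      # (horizontal, vertical) displacement in pixels


def downsample(frame: Frame) -> Frame:
    """Halve the resolution by averaging each 2x2 neighbourhood."""
    return [
        [(frame[y][x] + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
         for x in range(0, len(frame[0]) - 1, 2)]
        for y in range(0, len(frame) - 1, 2)
    ]


def sad(cur: Frame, ref: Frame, bx: int, by: int, mv: MV, size: int) -> int:
    """SAD between the block at (bx, by) in cur and its displaced copy in ref."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + mv[1] + j][bx + mv[0] + i])
        for j in range(size)
        for i in range(size)
    )


def search(cur: Frame, ref: Frame, bx: int, by: int, size: int,
           center: MV, radius: int) -> MV:
    """Full search of +/-radius around `center`; out-of-frame candidates are skipped."""
    height, width = len(ref), len(ref[0])
    best, best_cost = center, None
    for mvy in range(center[1] - radius, center[1] + radius + 1):
        for mvx in range(center[0] - radius, center[0] + radius + 1):
            x, y = bx + mvx, by + mvy
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue
            cost = sad(cur, ref, bx, by, (mvx, mvy), size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (mvx, mvy), cost
    return best


def hierarchical_search(cur: Frame, ref: Frame, bx: int, by: int,
                        size: int = 8, radius: int = 2) -> MV:
    """Two-level search: coarse search at half resolution, then a +/-1 refinement."""
    coarse = search(downsample(cur), downsample(ref),
                    bx // 2, by // 2, size // 2, (0, 0), radius)
    # A coarse vector covers twice the distance at full resolution.
    return search(cur, ref, bx, by, size, (2 * coarse[0], 2 * coarse[1]), 1)
```

With a radius of 2 at the coarse level plus a one-pixel refinement, the sketch reaches displacements of up to 5 pixels while evaluating 25 small SADs and 9 full-size ones, instead of the 121 full-size SADs a direct +/-5 full search would need.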
Disclosure of Invention
The disclosure provides a video encoding method, apparatus, device, and storage medium, which at least address the problems of large motion estimation error and low video coding efficiency in existing motion estimation. The technical scheme of the present disclosure is as follows.
An embodiment of the present disclosure provides a video encoding method, comprising:
acquiring a current frame to be encoded and a reference frame that can be used by the current frame;
expanding an initial search window for a target reference frame to obtain a target search window, wherein the target reference frame is an image frame, among the reference frames, that is allocated to a plurality of target motion estimation engines, and the target search window is smaller than a preset search range; and
performing a motion search, by the plurality of target motion estimation engines, within the target search window in the target reference frame to obtain a target block matching a current block of the current frame.
An embodiment of the present disclosure provides a video encoding apparatus, comprising:
an acquisition module, configured to acquire a current frame to be encoded and a reference frame that can be used by the current frame;
an expansion module, configured to expand an initial search window for a target reference frame to obtain a target search window, wherein the target reference frame is an image frame, among the reference frames, that is allocated to a plurality of target motion estimation engines, and the target search window is smaller than a preset search range; and
a search module, configured to perform a motion search, by the plurality of target motion estimation engines, within the target search window in the target reference frame to obtain a target block matching a current block of the current frame.
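The search step summarized above, combined with the two-window search and the SAD-based selection described in claims 8 and 9, can be sketched in Python. This is an illustration under stated assumptions, not the disclosed implementation: the names `block_sad`, `search_window`, and `select_target_block` are hypothetical, each window is modelled as a range of candidate top-left positions of +/-`half_w` by +/-`half_h` around its center, the top-left corner of the current block stands in for its "position center", and the two searches run sequentially rather than on separate motion estimation engines.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]                # luma samples, indexed frame[y][x]
Match = Tuple[int, Tuple[int, int]]    # (SAD cost, top-left corner of matched block)


def block_sad(cur: Frame, ref: Frame, bx: int, by: int,
              rx: int, ry: int, size: int) -> int:
    """Sum of absolute differences between cur block (bx, by) and ref block (rx, ry)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def search_window(cur: Frame, ref: Frame, bx: int, by: int, size: int,
                  center: Tuple[int, int], half_w: int, half_h: int) -> Optional[Match]:
    """Full search over candidate positions inside one window, clipped to the frame."""
    height, width = len(ref), len(ref[0])
    best: Optional[Match] = None
    for ry in range(max(0, center[1] - half_h), min(height - size, center[1] + half_h) + 1):
        for rx in range(max(0, center[0] - half_w), min(width - size, center[0] + half_w) + 1):
            cost = block_sad(cur, ref, bx, by, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, (rx, ry))
    return best


def select_target_block(cur: Frame, ref: Frame, bx: int, by: int, size: int,
                        mvp: Tuple[int, int], half_w: int, half_h: int) -> Match:
    """Search the MVP-centered and block-centered windows, keep the lower-SAD match."""
    # First window: centered where the motion vector prediction points.
    first = search_window(cur, ref, bx, by, size,
                          (bx + mvp[0], by + mvp[1]), half_w, half_h)
    # Second window: centered on the current block's own position.
    second = search_window(cur, ref, bx, by, size, (bx, by), half_w, half_h)
    candidates = [m for m in (first, second) if m is not None]
    return min(candidates, key=lambda m: m[0])
```

Keeping the block-centered window alongside the predicted one gives the search a fallback when the motion vector prediction is poor: whichever window yields the lower SAD supplies the target block.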
An embodiment of the present disclosure also provides an electronic device, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to execute the instructions to implement the steps of the method described above.
An embodiment of the present disclosure also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method described above.
An embodiment of the present disclosure also provides a computer program product comprising a computer program/instructions which, when executed by a processor, implement the steps of the method described above.
The technical scheme provided by the embodiments of the present disclosure brings at least the following beneficial effects:
In some embodiments of the present disclosure, a current frame to be encoded and a reference frame that can be used by the current frame are obtained, an initial search window is extended for a target reference frame to obtain a target search window, wherein the target reference frame is an image frame allocated to a plurality of target motion estimation engines in the reference frame, the target search window is smaller than a preset search range, the target search window in the target reference frame is utilized