CN-121985131-A - Video coding dynamic parameter optimization method and device based on deep learning
Abstract
The invention provides a deep-learning-based method and device for dynamically optimizing video coding parameters. Original video stream data is acquired and split into frames; each frame is preprocessed and features are extracted from it; the extracted features are fed into a pre-trained parameter prediction model, which outputs an adjustment parameter vector for the corresponding frame; and the encoder's parameters are adjusted according to that vector. This realizes dynamic adjustment and optimization of the encoder parameters, ensures the real-time performance and adaptability of the parameter optimization, and improves the efficiency and quality of video coding.
Inventors
- Zhu Cong
- Gan Wenrui
- Zhao Huishou
- Zeng Xiangyu
Assignees
- 中国船舶集团有限公司第七〇九研究所 (No. 709 Research Institute of China State Shipbuilding Corporation Limited)
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-01-28
Claims (10)
- 1. A deep-learning-based video coding dynamic parameter optimization method, characterized by comprising the following steps: acquiring original video stream data, preprocessing it, and extracting features from the preprocessed data to obtain a feature set; inputting the feature set into a pre-trained parameter prediction model, which outputs an adjustment parameter vector corresponding to the original video stream data; and adjusting the encoder according to the adjustment parameter vector to realize dynamic optimization of the encoding parameters.
- 2. The deep-learning-based video coding dynamic parameter optimization method according to claim 1, wherein preprocessing the original video stream data and extracting features from the preprocessed data to obtain a feature set specifically comprises: splitting the original video stream data into frames at a preset frame rate and scaling each frame to a preset size to obtain preprocessed frame data; graying each preprocessed frame to obtain the spatial complexity feature of the corresponding frame; acquiring the motion vectors of all adjacent frames and deriving the temporal complexity feature of the corresponding frame from those motion vectors; extracting the static feature vector of the corresponding frame from each preprocessed frame; and taking the spatial complexity feature, temporal complexity feature, and static feature vector of each frame as that frame's feature set.
- 3. The deep-learning-based video coding dynamic parameter optimization method according to claim 2, wherein graying each preprocessed frame to obtain the spatial complexity feature of the corresponding frame specifically comprises: graying the preprocessed frame to obtain its gray-gradient mean, and taking the gray-gradient mean as the spatial complexity feature of that frame; the gray-gradient mean is expressed as: $SI = \frac{1}{N}\sum_{i=1}^{N}\sqrt{G_{x,i}^{2} + G_{y,i}^{2}}$; where $SI$ is the gray-gradient mean, $N$ is the total number of pixels, $G_{x,i}$ is the horizontal gradient at pixel $i$, and $G_{y,i}$ is the vertical gradient at pixel $i$.
- 4. The deep-learning-based video coding dynamic parameter optimization method according to claim 2, wherein acquiring the motion vectors of all adjacent frames and deriving the temporal complexity feature of the corresponding frame from those motion vectors specifically comprises: obtaining the optical-flow vector variance of the corresponding frame from the motion vectors of adjacent frames, and taking the optical-flow vector variance as the temporal complexity feature of that frame; the optical-flow vector variance is expressed as: $TI = \frac{1}{M}\sum_{k=1}^{M}\lVert v_{k} - \bar{v} \rVert^{2}$; where $TI$ is the optical-flow vector variance, $M$ is the number of motion vectors, $v_{k}$ is the $k$-th motion vector, and $\bar{v}$ is the mean of the motion vectors.
- 5. The deep-learning-based video coding dynamic parameter optimization method according to claim 1, wherein inputting the feature set into a pre-trained parameter prediction model that outputs an adjustment parameter vector corresponding to the original video stream data specifically comprises: inputting the spatial complexity feature and the static feature vector into an LSTM layer of the parameter prediction model to obtain a spatial redundancy pattern; inputting the temporal complexity feature into an LSTM layer of the parameter prediction model to obtain a motion continuity feature; and concatenating the spatial redundancy pattern and the motion continuity feature and feeding the result into a fully connected layer of the parameter prediction model to obtain, for the corresponding frame, a predicted quantization parameter, a frame rate adjustment coefficient, a predicted spatial complexity feature, and a predicted temporal complexity feature, which together serve as the adjustment parameter vector of that frame.
- 6. The deep-learning-based video coding dynamic parameter optimization method of claim 1, wherein the loss function used to train the parameter prediction model is: $L = \lambda_{1}\,(QP_{pred} - QP_{act})^{2} + \lambda_{2}\,(S_{pred} - S_{act})^{2}$; where $L$ is the overall loss value, $\lambda_{1}$ and $\lambda_{2}$ are balancing hyperparameters, $QP_{pred}$ is the predicted QP value, $QP_{act}$ is the actual QP value, $S_{pred}$ is the predicted structural similarity, and $S_{act}$ is the actual structural similarity.
- 7. The deep-learning-based video coding dynamic parameter optimization method according to claim 1, wherein adjusting the encoder according to the adjustment parameter vector specifically comprises: obtaining an adjusted quantization parameter from the predicted quantization parameter, the predicted spatial complexity feature, and the predicted temporal complexity feature of the corresponding frame, and applying the adjusted quantization parameter to the encoder; and obtaining an adjusted bitrate from the frame rate adjustment coefficient of the corresponding frame, and applying the adjusted bitrate to the encoder.
- 8. The deep-learning-based video coding dynamic parameter optimization method according to claim 1, wherein the adjusted quantization parameter is expressed as: $QP_{adj} = f(QP_{pred},\, TI_{pred},\, SI_{pred})$; where $QP_{adj}$ is the adjusted quantization parameter, $QP_{pred}$ is the predicted quantization parameter, $TI_{pred}$ is the predicted temporal complexity feature, and $SI_{pred}$ is the predicted spatial complexity feature; and the adjusted bitrate is expressed as: $R_{adj} = \alpha \cdot R_{0} \cdot \frac{B_{avail}}{B_{req}}$; where $R_{adj}$ is the adjusted bitrate, $R_{0}$ is the initial bitrate, $\alpha$ is the frame rate adjustment coefficient, $B_{avail}$ is the available bandwidth, and $B_{req}$ is the required bandwidth.
- 9. A deep-learning-based video coding dynamic parameter optimization device, comprising at least one processor and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to perform the deep-learning-based video coding dynamic parameter optimization method of any one of claims 1-8.
- 10. A non-transitory computer storage medium storing computer program instructions which, when executed by one or more processors, implement the deep-learning-based video coding dynamic parameter optimization method of any one of claims 1-8.
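The two complexity features defined in claims 3 and 4 can be sketched in code. The following is a minimal illustrative NumPy implementation; using `np.gradient` as the gradient operator and supplying the motion vectors as an (M, 2) array are assumptions, since the patent fixes neither a gradient operator nor an optical-flow method:

```python
import numpy as np

def spatial_complexity(gray):
    """Gray-gradient mean SI over one grayscale frame (claim 3):
    SI = (1/N) * sum(sqrt(Gx^2 + Gy^2)) over all N pixels."""
    g = gray.astype(np.float64)
    gx = np.gradient(g, axis=1)  # horizontal gradient, an assumption for G_x
    gy = np.gradient(g, axis=0)  # vertical gradient, an assumption for G_y
    return float(np.mean(np.sqrt(gx ** 2 + gy ** 2)))

def temporal_complexity(vectors):
    """Optical-flow vector variance TI (claim 4):
    TI = (1/M) * sum(||v_k - v_mean||^2) over M motion vectors."""
    v = np.asarray(vectors, dtype=np.float64)
    mean = v.mean(axis=0)
    return float(np.mean(np.sum((v - mean) ** 2, axis=1)))
```

For a flat (constant-intensity) frame SI is 0, and for a frame pair whose motion vectors are all identical TI is 0, matching the intuition that both features measure deviation from uniformity.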
Description
Video coding dynamic parameter optimization method and device based on deep learning
Technical Field
The invention belongs to the technical field of video processing, and particularly relates to a video coding dynamic parameter optimization method and device based on deep learning.
Background
With the growth of network bandwidth, rising user-experience expectations, and the rapid development of ultra-high-definition, virtual reality, panoramic, and intelligent surveillance video, the volume of video data is increasing rapidly and video coding technology faces ever greater challenges. Traditional video coding relies on fixed rules or statistical models for parameter adjustment and suffers from the following problems: (1) a coding-efficiency bottleneck: when compressing high-resolution, high-dynamic-range video, compression efficiency approaches its theoretical limit and struggles to meet the transmission requirements of 5G and ultra-high-definition video; (2) insufficient real-time performance: in complex scenes (such as rapid motion or complex textures), traditional algorithms require multiple iterative computations, increasing video latency; and (3) poor adaptability: a fixed parameter strategy has difficulty adapting to content of varying complexity (such as alternation between static and dynamic scenes), leading to unreasonable bitrate allocation and loss of picture detail at high compression ratios. In view of this, overcoming the drawbacks of the prior art is a problem to be solved in the art.
Disclosure of Invention
The invention aims to improve the efficiency, adaptability, and real-time performance of video coding.
In a first aspect, a deep-learning-based video coding dynamic parameter optimization method is provided, comprising: acquiring original video stream data, preprocessing it, and extracting features from the preprocessed data to obtain a feature set; inputting the feature set into a pre-trained parameter prediction model, which outputs an adjustment parameter vector corresponding to the original video stream data; and adjusting the encoder according to the adjustment parameter vector to realize dynamic optimization of the encoding parameters. Preferably, preprocessing the original video stream data and extracting features from the preprocessed data to obtain a feature set specifically includes: splitting the original video stream data into frames at a preset frame rate and scaling each frame to a preset size to obtain preprocessed frame data; graying each preprocessed frame to obtain the spatial complexity feature of the corresponding frame; acquiring the motion vectors of all adjacent frames and deriving the temporal complexity feature of the corresponding frame from those motion vectors; extracting the static feature vector of the corresponding frame from each preprocessed frame; and taking the spatial complexity feature, temporal complexity feature, and static feature vector of each frame as that frame's feature set.
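The final encoder-adjustment step of the method above can be sketched as follows. This is an illustrative reading only: the bitrate relation R_adj = α · R₀ · (B_avail / B_req) is a reconstruction from the variables named in claim 8, and clamping QP to the H.264/H.265 range [0, 51] is an added safeguard not stated in the patent:

```python
from dataclasses import dataclass

@dataclass
class EncoderSettings:
    qp: int          # quantization parameter applied to the encoder
    bitrate: float   # target bitrate in bits per second

def apply_adjustment(pred_qp, alpha, r0, b_avail, b_req):
    """Map the model's adjustment vector onto encoder settings.

    pred_qp: predicted quantization parameter from the model
    alpha:   frame rate adjustment coefficient
    r0:      initial bitrate (bits/s)
    b_avail: available bandwidth; b_req: required bandwidth
    """
    # Clamp QP to [0, 51] (H.264/H.265 range) -- an assumption.
    qp = int(min(51, max(0, round(pred_qp))))
    # Reconstructed bitrate relation from the claim-8 variable list.
    bitrate = alpha * r0 * (b_avail / b_req)
    return EncoderSettings(qp=qp, bitrate=bitrate)
```

In a live encoder, `bitrate` would be pushed to the rate-control module and `qp` applied at the next frame or GOP boundary, which keeps the adjustment loop per-frame as the method requires.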
Preferably, graying each preprocessed frame to obtain the spatial complexity feature of the corresponding frame specifically includes: graying the preprocessed frame to obtain its gray-gradient mean, and taking the gray-gradient mean as the spatial complexity feature of that frame; the gray-gradient mean is expressed as: $SI = \frac{1}{N}\sum_{i=1}^{N}\sqrt{G_{x,i}^{2} + G_{y,i}^{2}}$; where $SI$ is the gray-gradient mean, $N$ is the total number of pixels, $G_{x,i}$ is the horizontal gradient at pixel $i$, and $G_{y,i}$ is the vertical gradient at pixel $i$. Preferably, acquiring the motion vectors of all adjacent frames and deriving the temporal complexity feature of the corresponding frame from those motion vectors specifically includes: obtaining the optical-flow vector variance of the corresponding frame from the motion vectors of adjacent frames, and taking the optical-flow vector variance as the temporal complexity feature of that frame; the optical-flow vector variance is expressed as: $TI = \frac{1}{M}\sum_{k=1}^{M}\lVert v_{k} - \bar{v} \rVert^{2}$; where $TI$ is the optical-flow vector variance, $M$ is the number of motion vectors, $v_{k}$ is the $k$-th motion vector, and $\bar{v}$ is the mean of the motion vectors. Preferably, the inputt