CN-121239860-B - Video transcoding method, system, electronic equipment and storage medium
Abstract
The application provides a video transcoding method, a video transcoding system, an electronic device and a storage medium, and belongs to the technical field of video processing. The method comprises: obtaining a video stream and decapsulating the video stream to obtain decapsulated data; extracting dynamic metadata from the video code stream and traversing the dynamic metadata to judge whether scene switching exists; when traversing the dynamic metadata determines that scene switching exists, recording the target dynamic metadata corresponding to the scene switching; determining a target frame number according to the target dynamic metadata; determining scene switching frames and non-scene switching frames in the video code stream according to the target frame number; adjusting coding parameters of the scene switching frames and/or the non-scene switching frames in the recoding process; and carrying out video coding processing according to the adjusted coding parameters to obtain the transcoded code stream. In this way, the dynamic metadata is used to identify video scene switching, and the code rate is adjusted during recoding according to the scene switching: non-scene switching frames are coded with fewer code words, so the total number of code words is reduced and coding performance is improved.
Inventors
- YU JING
Assignees
- 马栏山音视频实验室
Dates
- Publication Date
- 20260505
- Application Date
- 20251202
Claims (14)
- 1. A method of transcoding video, the method comprising: obtaining a video stream, and decapsulating the video stream to obtain decapsulated data, wherein the decapsulated data comprises a video code stream; extracting dynamic metadata from the video code stream, traversing the dynamic metadata, and judging whether scene switching exists; when traversing the dynamic metadata determines that the scene switching exists, recording target dynamic metadata corresponding to the scene switching; determining a target frame number according to the target dynamic metadata, and determining a scene switching frame and a non-scene switching frame in the video code stream according to the target frame number; in the recoding process, adjusting the coding parameters of the scene switching frame and/or the non-scene switching frame; and performing video coding processing according to the adjusted coding parameters to obtain a transcoded code stream; wherein the extracting dynamic metadata from the video code stream comprises: extracting NAL units from the video code stream frame by frame, and judging whether each extracted NAL unit is of the NAL type used for storing dynamic metadata; if so, extracting the dynamic metadata, and if not, decoding to obtain YUV data; the judging whether scene switching exists comprises: obtaining a difference value between the dynamic metadata of a current frame and the dynamic metadata of a target frame, wherein the current frame is the N-th frame in a sliding window, the target frame is each of the preceding N-1 frames in the sliding window, and the size of the sliding window is N, where N is greater than or equal to 2; and judging whether the scene switching exists according to the difference value and a preset difference threshold; the difference value comprises a deviation degree value, and when N is greater than or equal to 3, the obtaining the difference value between the dynamic metadata of the current frame and the dynamic metadata of the target frame comprises: substituting the dynamic metadata parameter values of the current frame and the dynamic metadata parameter values of each target frame into a first deviation degree calculation formula, and calculating to obtain a plurality of first deviation degree values; or substituting the dynamic metadata parameter values of the current frame, the dynamic metadata parameter values of each target frame and preset weight values into a second deviation degree calculation formula to obtain a plurality of second deviation degree values; the first deviation degree calculation formula is as follows: (1); the second deviation degree calculation formula is: (2); wherein d1(x, y) represents the first deviation degree value, d2(x, y) represents the second deviation degree value, x_i represents a dynamic metadata parameter value of the current frame, of which there are n, y_i represents a dynamic metadata parameter value of the target frame, of which there are n, (x_i, y_i) represents a dynamic metadata parameter pair in which x_i and y_i are parameter values with the same dynamic metadata name, and w_i represents the weight value of each dynamic metadata parameter, a larger w_i indicating that the corresponding dynamic metadata is more important.
- 2. The method of claim 1, wherein the determining a scene switching frame and a non-scene switching frame in the video code stream according to the target frame number comprises: decoding the video code stream to obtain YUV data, reading the YUV data frame by frame for recoding, and determining the read YUV data as the scene switching frame or the non-scene switching frame according to the target frame number; and the adjusting the coding parameters of the scene switching frame and/or the non-scene switching frame comprises: when recoding, if the read YUV data is the non-scene switching frame, determining the read YUV data as a non-I frame and/or raising the QP value; and when recoding, if the read YUV data is the scene switching frame, determining the read YUV data as an I frame and/or lowering the QP value.
- 3. The method according to claim 2, wherein the performing video coding processing according to the adjusted coding parameters to obtain a transcoded code stream comprises: if the read YUV data is the non-scene switching frame, encoding the read YUV data as a non-I frame and/or encoding according to the raised QP value; if the read YUV data is the scene switching frame, encoding the read YUV data as an I frame and/or encoding according to the lowered QP value; adding the dynamic metadata, and encoding to obtain current frame data; and packaging all the encoded frame data according to a preset packaging format, and outputting the transcoded code stream.
- 4. The method according to claim 1, wherein the difference value includes a deviation degree value, and when N is 2, the obtaining the difference value between the dynamic metadata of the current frame and the dynamic metadata of the target frame includes: substituting the dynamic metadata parameter values of the current frame and the dynamic metadata parameter values of the target frame into a first deviation degree calculation formula, and calculating to obtain the deviation degree value; or substituting the dynamic metadata parameter values of the current frame, the dynamic metadata parameter values of the target frame and preset weight values into a second deviation degree calculation formula, and calculating to obtain the deviation degree value.
- 5. The method according to claim 1, wherein the difference value includes a curve deviation degree, and when N is 2, the obtaining the difference value between the dynamic metadata of the current frame and the dynamic metadata of the target frame includes: performing tone mapping on the dynamic metadata of the current frame to obtain a first mapping curve; performing tone mapping on the dynamic metadata of the target frame to obtain a second mapping curve; and obtaining the curve deviation degree between the first mapping curve and the second mapping curve.
- 6. The method of claim 5, wherein the obtaining the curve deviation degree between the first mapping curve and the second mapping curve comprises: determining coordinate values of the first mapping curve and coordinate values of the second mapping curve; and substituting the coordinate values of the first mapping curve and the coordinate values of the second mapping curve into a preset curve deviation degree calculation formula, and calculating to obtain the curve deviation degree.
- 7. The method according to claim 1, wherein the difference value includes a curve deviation degree, and when N is greater than or equal to 3, the obtaining the difference value between the dynamic metadata of the current frame and the dynamic metadata of the target frame includes: performing tone mapping on the dynamic metadata of the current frame to obtain a first mapping curve; performing tone mapping on the dynamic metadata of each target frame to obtain a plurality of second mapping curves; obtaining a first curve deviation degree between the first mapping curve and each second mapping curve; and performing a mean calculation or a weighted calculation on the plurality of first curve deviation degrees to obtain the curve deviation degree.
- 8. The method according to claim 1, wherein the difference value includes a correlation value, and when N is 2, the obtaining the difference value between the dynamic metadata of the current frame and the dynamic metadata of the target frame includes: substituting the dynamic metadata parameter values of the current frame and the dynamic metadata parameter values of the target frame into a correlation calculation formula, and calculating to obtain the correlation value; and when the difference value includes a correlation value and N is greater than or equal to 3, the obtaining the difference value between the dynamic metadata of the current frame and the dynamic metadata of the target frame includes: substituting the dynamic metadata parameter values of the current frame and the dynamic metadata parameter values of each target frame into the correlation calculation formula, calculating to obtain a plurality of first correlation values, and performing a mean calculation or a weighted calculation on the plurality of first correlation values to obtain the correlation value.
- 9. The method of claim 4, wherein the preset difference threshold comprises a preset deviation degree threshold, and the judging whether the scene switching exists according to the difference value and the preset difference threshold comprises: judging whether the deviation degree value is greater than or equal to the preset deviation degree threshold; if so, determining that the scene switching exists; if not, determining that the scene switching does not exist.
- 10. The method according to any one of claims 5-7, wherein the preset difference threshold includes a preset curve deviation degree threshold, and the judging whether the scene switching exists according to the difference value and the preset difference threshold includes: judging whether the curve deviation degree is greater than or equal to the preset curve deviation degree threshold; if so, determining that the scene switching exists; if not, determining that the scene switching does not exist.
- 11. The method of claim 8, wherein the preset difference threshold comprises a preset correlation threshold, and the judging whether the scene switching exists according to the difference value and the preset difference threshold comprises: judging whether the correlation value is less than or equal to the preset correlation threshold; if so, determining that the scene switching exists; if not, determining that the scene switching does not exist.
- 12. A system for transcoding video, the system comprising: an acquisition module, used for obtaining a video stream and decapsulating the video stream to obtain decapsulated data, wherein the decapsulated data comprises a video code stream; a judging module, used for extracting dynamic metadata from the video code stream, traversing the dynamic metadata, and judging whether scene switching exists; a processing module, used for recording target dynamic metadata corresponding to the scene switching when traversing the dynamic metadata determines that the scene switching exists, determining a target frame number according to the target dynamic metadata, and determining a scene switching frame and a non-scene switching frame in the video code stream according to the target frame number; an adjusting module, used for adjusting the coding parameters of the scene switching frame and/or the non-scene switching frame in the recoding process; and a transcoding module, used for performing video coding processing according to the adjusted coding parameters to obtain a transcoded code stream; wherein the extracting dynamic metadata from the video code stream comprises: extracting NAL units from the video code stream frame by frame, and judging whether each extracted NAL unit is of the NAL type used for storing dynamic metadata; if so, extracting the dynamic metadata, and if not, decoding to obtain YUV data; the judging whether scene switching exists comprises: obtaining a difference value between the dynamic metadata of a current frame and the dynamic metadata of a target frame, wherein the current frame is the N-th frame in a sliding window, the target frame is each of the preceding N-1 frames in the sliding window, and the size of the sliding window is N, where N is greater than or equal to 2; and judging whether the scene switching exists according to the difference value and a preset difference threshold; the difference value comprises a deviation degree value, and when N is greater than or equal to 3, the obtaining the difference value between the dynamic metadata of the current frame and the dynamic metadata of the target frame comprises: substituting the dynamic metadata parameter values of the current frame and the dynamic metadata parameter values of each target frame into a first deviation degree calculation formula, and calculating to obtain a plurality of first deviation degree values; or substituting the dynamic metadata parameter values of the current frame, the dynamic metadata parameter values of each target frame and preset weight values into a second deviation degree calculation formula to obtain a plurality of second deviation degree values; the first deviation degree calculation formula is as follows: (1); the second deviation degree calculation formula is: (2); wherein d1(x, y) represents the first deviation degree value, d2(x, y) represents the second deviation degree value, x_i represents a dynamic metadata parameter value of the current frame, of which there are n, y_i represents a dynamic metadata parameter value of the target frame, of which there are n, (x_i, y_i) represents a dynamic metadata parameter pair in which x_i and y_i are parameter values with the same dynamic metadata name, and w_i represents the weight value of each dynamic metadata parameter, a larger w_i indicating that the corresponding dynamic metadata is more important.
- 13. An electronic device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, performs the method of transcoding video according to any one of claims 1 to 11.
- 14. A computer-readable storage medium, characterized in that it stores a computer program which, when run on a processor, performs the method of transcoding video according to any one of claims 1 to 11.
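
The NAL-unit check recited in claims 1 and 12 can be illustrated with a minimal sketch. The claims do not name a codec or a NAL type, so this example assumes an HEVC Annex B byte stream and assumes that the dynamic metadata travels in SEI NAL units (types 39 and 40 in HEVC). The function names `split_annex_b`, `hevc_nal_type` and `may_carry_dynamic_metadata` are hypothetical and are not taken from the patent.

```python
from typing import List

# HEVC NAL unit types for SEI messages (assumption: the dynamic metadata
# is carried in SEI; the patent only says "the NAL type for storing
# dynamic metadata").
PREFIX_SEI_NUT = 39
SUFFIX_SEI_NUT = 40

START_CODE = b"\x00\x00\x01"


def split_annex_b(stream: bytes) -> List[bytes]:
    """Split an Annex B byte stream into NAL units (start codes removed)."""
    units = []
    i = stream.find(START_CODE)
    while i != -1:
        start = i + len(START_CODE)
        j = stream.find(START_CODE, start)
        end = len(stream) if j == -1 else j
        # Trailing zero bytes belong to the next 4-byte start code or padding.
        unit = stream[start:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        i = j
    return units


def hevc_nal_type(unit: bytes) -> int:
    """Return nal_unit_type: the 6 bits after the forbidden_zero_bit."""
    return (unit[0] >> 1) & 0x3F


def may_carry_dynamic_metadata(unit: bytes) -> bool:
    """True for SEI NAL units; other units would be sent on to the decoder."""
    return hevc_nal_type(unit) in (PREFIX_SEI_NUT, SUFFIX_SEI_NUT)


if __name__ == "__main__":
    sei = bytes([PREFIX_SEI_NUT << 1, 0x01, 0xAA])
    slice_unit = bytes([1 << 1, 0x01, 0xBB])
    demo = b"\x00\x00\x00\x01" + sei + START_CODE + slice_unit
    for nal in split_annex_b(demo):
        print(hevc_nal_type(nal), may_carry_dynamic_metadata(nal))
```

A real extractor would also parse the SEI payload to confirm that it holds dynamic metadata rather than some other SEI message; that step is omitted here.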
Description
Video transcoding method, system, electronic equipment and storage medium
Technical Field
The present application relates to the field of video processing technologies, and in particular to a video transcoding method, a system, an electronic device, and a storage medium.
Background
With the rapid development of the high-definition video industry, users' expectations for video image quality keep rising. As the ultra-high-definition ecosystem evolves, video playing platforms are diversifying, so the same content needs to be adapted to the playback requirements of multiple platforms. Because the video formats supported by each platform differ significantly, market demand for specialized video transcoding tools continues to grow. In this context, how to achieve efficient video transcoding has become a central research topic for video technology developers.
Disclosure of Invention
In order to solve the above technical problems, the application provides a video transcoding method, a video transcoding system, an electronic device and a storage medium.
In a first aspect, the present application provides a method for transcoding video, the method comprising: obtaining a video stream, and decapsulating the video stream to obtain decapsulated data, wherein the decapsulated data comprises a video code stream; extracting dynamic metadata from the video code stream, traversing the dynamic metadata, and judging whether scene switching exists; when traversing the dynamic metadata determines that the scene switching exists, recording target dynamic metadata corresponding to the scene switching; determining a target frame number according to the target dynamic metadata, and determining a scene switching frame and a non-scene switching frame in the video code stream according to the target frame number; in the recoding process, adjusting the coding parameters of the scene switching frame and/or the non-scene switching frame; and performing video coding processing according to the adjusted coding parameters to obtain a transcoded code stream.
In an embodiment, the determining the scene switching frame and the non-scene switching frame in the video code stream according to the target frame number includes: decoding the video code stream to obtain YUV data, reading the YUV data frame by frame for recoding, and determining the read YUV data as the scene switching frame or the non-scene switching frame according to the target frame number; and the adjusting the coding parameters of the scene switching frame and/or the non-scene switching frame includes: when recoding, if the read YUV data is the non-scene switching frame, determining the read YUV data as a non-I frame and/or raising the QP value; and when recoding, if the read YUV data is the scene switching frame, determining the read YUV data as an I frame and/or lowering the QP value.
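
The per-frame decision described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the application: the names `FrameParams`, `plan_frame_params`, `base_qp` and `qp_delta` are hypothetical, the step size of 2 is arbitrary, and the 0 to 51 QP range is the one commonly used by 8-bit H.264/HEVC encoders rather than a value given in the text.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class FrameParams:
    frame_number: int
    force_i_frame: bool  # True: encode as an I frame; False: non-I frame
    qp: int


def plan_frame_params(
    num_frames: int,
    scene_cut_frames: Set[int],
    base_qp: int = 30,   # assumed starting QP
    qp_delta: int = 2,   # assumed adjustment step
    qp_min: int = 0,     # assumed encoder QP range
    qp_max: int = 51,
) -> List[FrameParams]:
    """Build one coding-parameter record per decoded YUV frame."""
    plan = []
    for n in range(num_frames):
        is_cut = n in scene_cut_frames
        # Scene switching frame: I frame with a lower QP (more bits).
        # Non-scene switching frame: non-I frame with a higher QP (fewer bits).
        qp = base_qp - qp_delta if is_cut else base_qp + qp_delta
        qp = max(qp_min, min(qp_max, qp))
        plan.append(FrameParams(n, is_cut, qp))
    return plan


if __name__ == "__main__":
    for params in plan_frame_params(5, {0, 3}):
        print(params)
```

The resulting plan would then be handed to whatever encoder interface is in use; how the frame type and QP are passed to the encoder depends on that encoder and is outside this sketch.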
In an embodiment, the performing video coding processing according to the adjusted coding parameters to obtain a transcoded code stream includes: if the read YUV data is the non-scene switching frame, encoding the read YUV data as a non-I frame and/or encoding according to the raised QP value; if the read YUV data is the scene switching frame, encoding the read YUV data as an I frame and/or encoding according to the lowered QP value; adding the dynamic metadata, and encoding to obtain current frame data; and packaging all the encoded frame data according to a preset packaging format, and outputting the transcoded code stream.
In an embodiment, the judging whether scene switching exists includes: obtaining a difference value between the dynamic metadata of a current frame and the dynamic metadata of a target frame, wherein the current frame is the N-th frame in a sliding window, the target frame is each of the preceding N-1 frames in the sliding window, and the size of the sliding window is N, where N is greater than or equal to 2; and judging whether the scene switching exists according to the difference value and a preset difference threshold.
In an embodiment, the difference value includes a deviation degree value, and when N is 2, the obtaining the difference value between the dynamic metadata of the current frame and the dynamic metadata of the target frame includes: substituting the dynamic metadata parameter values of the current frame and the dynamic metadata parameter values of the target frame into a first deviation degree calculation formula, and calculating to obtain the deviation degree value; or substituting the dynamic metadata parameter values of the current frame, the dynamic metadata parameter values of the target frame and preset weight values into a second deviation degree calculation formula, and calculating to obtain the deviation degree value.
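
The sliding-window judgment can be sketched as below. The deviation degree formulas themselves are not reproduced in this text, so the sketch substitutes a Euclidean distance (unweighted) and a weighted Euclidean distance purely as stand-ins; they are assumptions, not the formulas of the application. Averaging the N-1 values when N is 3 or more is likewise an assumed aggregation, and the names `deviation` and `detect_scene_cuts` and the metadata keys are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence

Metadata = Dict[str, float]  # parameter name -> parameter value


def deviation(x: Metadata, y: Metadata,
              weights: Optional[Dict[str, float]] = None) -> float:
    """Stand-in deviation between same-named parameters of two frames.

    Without weights this is a Euclidean distance; with weights each squared
    difference is scaled by the weight of that parameter (default 1.0).
    """
    total = 0.0
    for name in x.keys() & y.keys():
        w = 1.0 if weights is None else weights.get(name, 1.0)
        total += w * (x[name] - y[name]) ** 2
    return math.sqrt(total)


def detect_scene_cuts(frames: Sequence[Metadata], window: int = 2,
                      threshold: float = 1.0,
                      weights: Optional[Dict[str, float]] = None) -> List[int]:
    """Return the frame numbers judged to be scene switching frames."""
    if window < 2:
        raise ValueError("window size N must be at least 2")
    cuts = []
    for t in range(window - 1, len(frames)):
        # Current frame is the N-th frame; targets are the preceding N-1.
        targets = frames[t - window + 1:t]
        values = [deviation(frames[t], y, weights) for y in targets]
        # Assumed aggregation: mean of the N-1 deviation values.
        if sum(values) / len(values) >= threshold:
            cuts.append(t)
    return cuts


if __name__ == "__main__":
    dark = {"max_lum": 100.0, "avg_lum": 50.0}
    bright = {"max_lum": 400.0, "avg_lum": 50.0}
    print(detect_scene_cuts([dark] * 3 + [bright] * 3, window=2, threshold=100.0))
```

The returned frame numbers play the role of the target frame numbers: frames in the list would be treated as scene switching frames during recoding and all others as non-scene switching frames.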
In an embodiment, the difference value includes a difference va