
CN-115661303-B - Video processing method, device, equipment and storage medium

CN 115661303 B

Abstract

The invention discloses a video processing method, apparatus, device, and storage medium. The method comprises: acquiring original video data, the original video data having multiple frames of original image data; extracting a specified target object from each frame of original image data; adding a border to at least part of the original image data; if the border has been added, mapping the target object back onto the original image data to obtain target image data; and, in the original video data, replacing the original image data with the target image data to obtain target video data. Because the target object is first extracted from the original image data and mapped back only after the border has been added, the target object sits above the border: the border cannot occlude the target object, the target video data still conveys its main information, and the visual separation between the target object and non-target content produces a stereoscopic effect.

Inventors

  • Wang Chuanpeng
  • Li Tengfei
  • Lu Jukang

Assignees

  • 安徽尚趣玩网络科技有限公司

Dates

Publication Date
2026-05-05
Application Date
2022-09-27

Claims (12)

  1. A video processing method, comprising: acquiring original video data, the original video data having multiple frames of original image data; extracting a specified target object from each frame of original image data; adding a border to at least part of the original image data, including setting a size of the border for at least part of the original image data; if the border has been added, mapping the target object back onto the original image data to obtain target image data; and, in the original video data, replacing the original image data with the target image data to obtain target video data; wherein said setting a size of the border for at least part of the original image data comprises: querying a time range of portrait data in the original video data; dividing the time range sequentially into three regions, respectively a head range, a middle range, and a tail range; setting the size of the border incrementally for the original image data within the head range; keeping the size of the border unchanged for the original image data within the middle range; setting the size of the border decrementally for the original image data within the tail range; if the time ranges corresponding to different roles overlap, calculating the length of time for which a first role range and a second role range overlap, the first role range being the time range corresponding to the role ranked first and the second role range being the time range corresponding to the role ranked second; calculating a ratio between that length of time and the first role range; calculating a difference value between the first role range and the second role range as a difference range; if the ratio is smaller than a preset overlap threshold and the difference range is smaller than a preset range threshold, ignoring the first role range or the second role range; and if the ratio is greater than or equal to the preset overlap threshold or the difference range is greater than or equal to the preset range threshold, merging the first role range and the second role range to obtain a new time range.
  2. The method of claim 1, wherein said extracting a specified target object from each frame of original image data comprises: loading a relevance vector machine; and inputting the original image data into the relevance vector machine for semantic recognition, so as to extract portrait data as the target object.
  3. The method of claim 1, wherein said extracting a specified target object from each frame of original image data comprises: loading a text recognition network; segmenting, from the original image data, a region whose content is text as region image data; and inputting the region image data into the text recognition network for optical character recognition, so as to extract text data as the target object.
  4. The method according to any one of claims 1-3, wherein said adding a border to at least part of the original image data comprises: setting, according to the size, a target range at the edges of at least part of the original image data; and filling a specified color within the target range to serve as the border.
  5. The method of claim 4, wherein said setting a size of the border for at least part of the original image data comprises: querying the width and the height of the original image data; taking a preset first proportion of the width to obtain the size of the border in the horizontal direction; and taking a preset second proportion of the height to obtain the size of the border in the vertical direction.
  6. The method of claim 1, wherein said querying a time range of portrait data in the original video data comprises: querying the time points at which the portrait data occurs in the original video data; identifying face data in the portrait data; and, for face data of the same role, if the time difference between two adjacent time points is smaller than or equal to a preset distance threshold, connecting the two adjacent time points to obtain a time range.
  7. The method of claim 1, wherein said dividing the time range sequentially into three regions, respectively a head range, a middle range, and a tail range, comprises: taking a first candidate range sequentially from the start point of the time range, and taking a second candidate range sequentially in reverse order from the end point of the time range; calculating, for each frame of original image data within the first candidate range, a first highlight value representing its degree of highlighting; setting the region from the start point of the time range to the time point of the original image data with the highest first highlight value as the head range; calculating, for each frame of original image data within the second candidate range, a second highlight value representing its degree of highlighting; setting the region from the end point of the time range to the time point of the original image data with the highest second highlight value as the tail range; and setting the region of the time range other than the head range and the tail range as the middle range.
  8. The method of claim 4, wherein said filling a specified color within the target range to serve as the border comprises: clustering at least some of the pixel points in the original image data according to their color values to obtain a plurality of candidate clusters; screening out, in descending order of pixel count, the candidate clusters ranked in the top n as target clusters; screening out a color value satisfying a deviation condition as a target color, the deviation condition being that the distance between the color value and each color value represented by the target clusters is greater than a preset color threshold; and filling the pixel points within the target range with the target color to serve as the border.
  9. A video processing apparatus, comprising: an original video data acquisition module for acquiring original video data, the original video data having multiple frames of original image data; a target object extraction module for extracting a specified target object from each frame of original image data; a border adding module for adding a border to at least part of the original image data, including setting a size of the border for at least part of the original image data; a target object mapping module for mapping the target object back onto the original image data, if the border has been added, to obtain target image data; and a target video data generation module for replacing the original image data with the target image data in the original video data to obtain target video data; wherein the border adding module is further configured to: query a time range of portrait data in the original video data; divide the time range sequentially into three regions, respectively a head range, a middle range, and a tail range; set the size of the border incrementally for the original image data within the head range; keep the size of the border unchanged for the original image data within the middle range; set the size of the border decrementally for the original image data within the tail range; if the time ranges corresponding to different roles overlap, calculate the length of time for which a first role range and a second role range overlap, the first role range being the time range corresponding to the role ranked first and the second role range being the time range corresponding to the role ranked second; calculate a ratio between that length of time and the first role range; calculate a difference value between the first role range and the second role range as a difference range; if the ratio is smaller than a preset overlap threshold and the difference range is smaller than a preset range threshold, ignore the first role range or the second role range; and if the ratio is greater than or equal to the preset overlap threshold or the difference range is greater than or equal to the preset range threshold, merge the first role range and the second role range to obtain a new time range.
  10. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the video processing method of any one of claims 1-8.
  11. A computer-readable storage medium, storing a computer program which, when executed, causes a processor to implement the video processing method of any one of claims 1-8.
  12. A computer program product, comprising a computer program which, when executed by a processor, implements the video processing method of any one of claims 1-8.
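The time-range handling in claims 1 and 6 (linking adjacent detection time points into ranges, then resolving overlaps between per-role ranges) can be sketched as follows. All function names, thresholds, and the choice of which overlapping range to keep are illustrative assumptions, not taken from the patent.

```python
def link_time_points(points, distance_threshold):
    """Claim 6 sketch: connect adjacent detection time points whose gap
    is <= distance_threshold into continuous (start, end) ranges."""
    ranges = []
    start = prev = points[0]
    for t in points[1:]:
        if t - prev <= distance_threshold:
            prev = t  # still within the same range
        else:
            ranges.append((start, prev))
            start = prev = t
    ranges.append((start, prev))
    return ranges


def merge_role_ranges(first, second, overlap_threshold, range_threshold):
    """Claim 1 sketch: either ignore one of two overlapping per-role
    ranges, or merge them into a new time range."""
    overlap = max(0.0, min(first[1], second[1]) - max(first[0], second[0]))
    first_len = first[1] - first[0]
    ratio = overlap / first_len
    # "Difference range" read here as the difference in range lengths
    # (an interpretation; the claim wording is ambiguous).
    diff = abs(first_len - (second[1] - second[0]))
    if ratio < overlap_threshold and diff < range_threshold:
        return first  # ignore one range; keeping the first is an assumption
    return (min(first[0], second[0]), max(first[1], second[1]))
```

For example, `link_time_points([1, 2, 3, 10, 11], 2)` yields `[(1, 3), (10, 11)]`, and `merge_role_ranges((0, 10), (8, 15), 0.5, 3)` merges the two ranges into `(0, 15)` because the length difference reaches the range threshold.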

Description

Video processing method, device, equipment and storage medium

Technical Field

The present invention relates to the field of multimedia technologies, and in particular to a video processing method, apparatus, device, and storage medium.

Background

In scenarios such as promoting games and electronic products, video data is often used to introduce the object in question: the video presents the information of the business object in pictures and sound, which is convenient for the user to take in. After the original video data is recorded, an artist uses a professional video production tool to post-process it so as to improve its quality. In one common form of post-processing, the artist uses the video production tool to add uniform borders to the video data, but these borders can occlude part of the content and thus affect the information the video data expresses.

Disclosure of Invention

The invention provides a video processing method, apparatus, device, and storage medium, which reduce the impact on the information expressed by video data when borders are added to it.

According to one aspect of the present invention, there is provided a video processing method comprising: acquiring original video data, the original video data having multiple frames of original image data; extracting a specified target object from each frame of original image data; adding a border to at least part of the original image data; if the border has been added, mapping the target object back onto the original image data to obtain target image data; and, in the original video data, replacing the original image data with the target image data to obtain target video data.
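The core pipeline above (extract the object, draw the border, then paste the object back on top) can be sketched as follows, assuming the target object is already available as a binary mask; the detection step itself (claims 2-3) is out of scope here, and all names are illustrative.

```python
import numpy as np

def add_border(frame, size, color):
    """Claim 4 sketch: fill a band of `size` pixels at the image edges."""
    out = frame.copy()
    out[:size, :] = color
    out[-size:, :] = color
    out[:, :size] = color
    out[:, -size:] = color
    return out

def process_frame(frame, mask, border_size, border_color):
    """Extract the target object, add the border, then map the object
    back so it sits above the border (claim 1)."""
    target_object = frame.copy()          # target pixels, selected via mask
    bordered = add_border(frame, border_size, border_color)
    bordered[mask] = target_object[mask]  # object overwrites the border
    return bordered

# Tiny synthetic example: 8x8 gray frame with a red 2x2 "object"
# touching the top edge, so the border would otherwise cover it.
frame = np.full((8, 8, 3), 128, dtype=np.uint8)
mask = np.zeros((8, 8), dtype=bool)
mask[0:2, 3:5] = True
frame[mask] = (255, 0, 0)

result = process_frame(frame, mask, border_size=1, border_color=(0, 0, 0))
```

After processing, the object pixels on the edge remain red even though the black border runs through the same row, which is exactly the anti-occlusion property the disclosure describes.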
According to another aspect of the present invention, there is provided a video processing apparatus comprising: an original video data acquisition module for acquiring original video data, the original video data having multiple frames of original image data; a target object extraction module for extracting a specified target object from each frame of original image data; a border adding module for adding a border to at least part of the original image data; a target object mapping module for mapping the target object back onto the original image data, if the border has been added, to obtain target image data; and a target video data generation module for replacing the original image data with the target image data in the original video data to obtain target video data.

According to another aspect of the present invention, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the video processing method according to any one of the embodiments of the present invention.

According to another aspect of the present invention, there is provided a computer-readable storage medium storing a computer program for causing a processor to execute the video processing method according to any one of the embodiments of the present invention.

According to another aspect of the present invention, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the video processing method according to any one of the embodiments of the present invention.
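The border-color selection of claim 8 (choosing a fill color that stands apart from the dominant colors of the frame) can be sketched in pure Python. Clustering is reduced here to exact-value counting for brevity, and the candidate palette is an assumption; the patent does not specify where candidate colors come from.

```python
from collections import Counter

def pick_border_color(pixels, n, color_threshold, candidates):
    """Claim 8 sketch: take the top-n most frequent colors as target
    clusters, then return the first candidate whose Euclidean distance
    to every cluster color exceeds color_threshold."""
    clusters = [color for color, _ in Counter(pixels).most_common(n)]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    for color in candidates:
        if all(dist(color, c) > color_threshold for c in clusters):
            return color
    return None  # no candidate satisfies the deviation condition

# Mostly white/gray frame: black clears the threshold, near-white does not.
pixels = [(255, 255, 255)] * 10 + [(200, 200, 200)] * 5 + [(0, 0, 0)]
chosen = pick_border_color(pixels, 2, 100, [(250, 250, 250), (0, 0, 0)])
```

A real implementation would likely use k-means over sampled pixels rather than exact-value counting, but the deviation condition is the same.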
In these embodiments, original video data having multiple frames of original image data is acquired; a specified target object is extracted from each frame of original image data; a border is added to at least part of the original image data; if the border has been added, the target object is mapped back onto the original image data to obtain target image data; and the original image data is replaced with the target image data in the original video data to obtain target video data. Because the target object is extracted from the original image data first and mapped back only after the border has been added, the target object sits above the border. The border therefore cannot occlude the target object, the target video data still conveys its main information, and the visual separation between the target object and non-target content produces a stereoscopic effect. Moreover, across consecutive frames of target image data the relationship between the target object and the border changes continuously, so various dynamic effects, such as the target object appearing to jump from inside the border to outside it, can be created.
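The continuous change between object and border described above follows from the per-frame border-size schedule of claim 1 (grow over the head range, hold through the middle, shrink over the tail) combined with the proportional base size of claim 5. A minimal sketch, with all ratios and range lengths as illustrative assumptions:

```python
def base_border_size(width, height, first_ratio=0.05, second_ratio=0.05):
    """Claim 5 sketch: horizontal and vertical border sizes as preset
    proportions of the frame dimensions (ratios are illustrative)."""
    return int(width * first_ratio), int(height * second_ratio)

def border_schedule(num_frames, head, tail, max_size):
    """Claim 1 sketch: border size per frame, growing incrementally over
    the head range, constant through the middle, decreasing at the tail."""
    sizes = []
    for i in range(num_frames):
        if i < head:                       # head range: ramp up
            sizes.append(round(max_size * (i + 1) / head))
        elif i >= num_frames - tail:       # tail range: ramp down
            remaining = num_frames - i
            sizes.append(round(max_size * remaining / tail))
        else:                              # middle range: hold
            sizes.append(max_size)
    return sizes
```

For a 10-frame range with 3-frame head and tail and a 6-pixel maximum, `border_schedule(10, 3, 3, 6)` gives `[2, 4, 6, 6, 6, 6, 6, 6, 4, 2]`: the border eases in and out, which is what makes the object appear to move relative to it.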