CN-121981126-A - Video intention understanding method and system based on implicit behavioral entropy

CN121981126A

Abstract

The application relates to the technical field of data processing, and in particular to a video intention understanding method and system based on implicit behavior entropy. The method comprises: collecting the real-time interaction behaviors of a user on any video segment to obtain an implicit interaction sequence; calculating the implicit behavior entropy of the video segment according to the occurrence probabilities of the various interaction behaviors in the implicit interaction sequence; dynamically updating the weights of the edges between video segment nodes and creative concept nodes based on the implicit behavior entropy; calculating the semantic association abundance of any video segment according to the weights of the edges connected to the video segment; and calculating a creative value score for the video segment according to its semantic association abundance. The method provided by the application can deeply mine the value of videos and improve the experience of both creators and users.

Inventors

  • DING ZIJIAN
  • DING JUNWEI
  • CHEN DEPIN

Assignees

  • 钛动科技股份有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-02-09

Claims (10)

  1. A video intention understanding method based on implicit behavior entropy, characterized by comprising the steps of: constructing a dynamic knowledge graph, wherein the dynamic knowledge graph comprises at least video segment nodes and creative concept nodes, and edges connect the video segment nodes with the creative concept nodes; for any video segment, acquiring the real-time interaction behaviors of a user on the video segment to obtain an implicit interaction sequence; calculating the implicit behavior entropy of the video segment according to the occurrence probabilities of the interaction behaviors in the implicit interaction sequence; dynamically updating the weights of the edges between the video segment nodes and the creative concept nodes based on the implicit behavior entropy; and, for any video segment, calculating a semantic association abundance according to the weights of the edges connected to the video segment, and calculating a creative value score of the video segment according to its semantic association abundance, wherein the semantic association abundance is positively correlated with the creative value score.
  2. The method of claim 1, wherein the initial weight of an edge between any video segment node and a creative concept node is positively correlated with the similarity between the video features of the video segment and the semantic features of the creative concept.
  3. The method of claim 1, wherein dynamically updating the weights of the edges between the video segment nodes and the creative concept nodes based on the implicit behavior entropy comprises: calculating an update value for the edge between a video segment node and any creative concept node according to the implicit behavior entropy of the video segment node; fusing the current weight of the edge with the update value by weighting to obtain an updated weight; and replacing the weight of the edge with the updated weight.
  4. The method of claim 3, wherein calculating the update value according to the implicit behavior entropy of the video segment node comprises: obtaining the feature similarity between the video segment node and the creative concept node, and linearly combining the feature similarity with the implicit behavior entropy to obtain the update value.
  5. The method of claim 1, wherein calculating the creative value score of a video segment according to its semantic association abundance comprises taking the normalized result of the semantic association abundance as the creative value score.
  6. The method of claim 1, wherein calculating the creative value score of a video segment according to its semantic association abundance comprises: taking the normalized result of the semantic association abundance as a gain score, and taking the weighted sum of the gain score and a base score as the creative value score; and wherein obtaining the base score comprises calculating the similarity between the video features of the video segment and the semantic features corresponding to each reference word in a preset word bank, and taking the maximum value among the similarity results as the base score.
  7. The method of claim 1, wherein calculating the implicit behavior entropy of a video segment according to the occurrence probabilities of the interaction behaviors in the implicit interaction sequence comprises: dividing the interaction behaviors into positive interaction behaviors and negative interaction behaviors, wherein all positive interaction behaviors in the implicit interaction sequence form a positive interaction set and all negative interaction behaviors form a negative interaction set; calculating a positive attention entropy according to the probability of each positive interaction behavior in the positive interaction set; calculating a negative loss entropy according to the probability of each negative interaction behavior in the negative interaction set; and taking the difference obtained by subtracting the negative loss entropy from the positive attention entropy as the implicit behavior entropy.
  8. The method of claim 6, wherein calculating the similarity between the video features of the video segment and the semantic features corresponding to each reference word in the preset word bank comprises: extracting a video feature vector of the video segment and a semantic vector of each reference word in the preset word bank, and taking the cosine of the angle between the video feature vector and each semantic vector as the similarity between the video features of the video segment and the semantic features of that reference word.
  9. The method of claim 1, wherein obtaining the video segments comprises segmenting an original video stream uploaded by a creator to obtain the video segments.
  10. A video intention understanding system based on implicit behavior entropy, comprising a processor and a memory storing computer program instructions that, when executed by the processor, implement the video intention understanding method based on implicit behavior entropy of any one of claims 1 to 9.
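The entropy construction in claim 7 can be illustrated with a minimal Python sketch. The Shannon-entropy form, the behavior labels (`"replay"`, `"pause"`, `"skip"`), and the positive/negative kind sets below are illustrative assumptions; the claim fixes only that each entropy is computed from the occurrence probabilities of the behaviors in its set, and that the implicit behavior entropy is the positive attention entropy minus the negative loss entropy.

```python
import math
from collections import Counter


def implicit_behavior_entropy(interactions, positive_kinds, negative_kinds):
    """Sketch of claim 7: positive attention entropy minus negative loss entropy.

    `interactions` is a list of behavior labels; which labels count as
    positive or negative is an assumption supplied by the caller.
    """
    pos = [b for b in interactions if b in positive_kinds]
    neg = [b for b in interactions if b in negative_kinds]

    def shannon(seq):
        # Shannon entropy over the empirical probability of each behavior
        # kind within its set; an empty set contributes zero entropy.
        if not seq:
            return 0.0
        counts = Counter(seq)
        n = len(seq)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    # Claim 7: implicit behavior entropy = positive entropy - negative entropy.
    return shannon(pos) - shannon(neg)
```

A segment whose positive interactions are varied (diverse attention) and whose negative interactions are scarce thus receives a higher implicit behavior entropy.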

Description

Video intention understanding method and system based on implicit behavioral entropy

Technical Field

The application relates to the technical field of data processing, and in particular to a video intention understanding method and system based on implicit behavior entropy.

Background

In today's era of rapid digital-economy development, short videos, micro-dramas, and AIGC (AI-generated content) are growing rapidly. As this trend advances, the way video content value is mined has changed. Early on, mining video content value focused mainly on the surface level, namely identifying physical tags such as objects, scenes, or people in the video, which are relatively intuitive and easy to obtain. With industry development and rising user demand, however, the emphasis of video content value mining has shifted to cognitive-level analysis of deep creative elements: emotional tension, narrative logic, cultural metaphor, and aesthetic style. This transition reflects the market's higher demands on the depth and breadth of video content understanding.

Although video understanding technology is continually evolving, existing technology still exhibits a number of problems when dealing with complex creative intents. Existing video recognition models are "static" in nature: they depend heavily on a predefined tag system during construction. Although such a predefined tag system can meet basic video content recognition requirements to some extent, it has limitations. Because the recognition model is fixed and cannot be flexibly adjusted, the instantaneous, implicit aesthetic preferences that users generate during interaction are difficult to capture.
For example, a user may suddenly develop a preference for a particular color palette at a certain moment, or resonate with a particular narrative rhythm while watching a video, but existing recognition models cannot perceive and record these transient, implicit aesthetic trends in time. In addition, conventional video recommendation and mining algorithms are typically trapped in a "traffic trap": they over-focus on explicit indicators such as completion rate and likes, and ignore the process of user interaction behavior. For example, a user's repeated playback of a clip with a particular camera movement, or long pauses on a particular artistic composition, are implicit actions behind which lies the user's craving for a particular creative technique. Existing knowledge graphs cannot be adjusted dynamically, so recommendation results are monotonous and genuinely valuable information is difficult to push to users.

In summary, the insufficient depth of video intention understanding in the related art means that some content of high artistic value on video platforms struggles to obtain a higher exposure rate, so the experience of creators on short-video platforms is poor; at the same time, users find it difficult to obtain accurate video recommendations, so the user experience of short-video platforms is also poor.

Disclosure of Invention

In order to deepen the understanding of video content and improve the experience of creators and users of short-video platforms, the application provides a video intention understanding method and system based on implicit behavior entropy.
In a first aspect, the application provides a video intention understanding method based on implicit behavior entropy, which adopts the following technical scheme. The method comprises: constructing a dynamic knowledge graph, wherein the dynamic knowledge graph comprises at least video segment nodes and creative concept nodes, and edges connect the video segment nodes with the creative concept nodes; for any video segment, acquiring the real-time interaction behaviors of a user on the video segment to obtain an implicit interaction sequence; calculating the implicit behavior entropy of the video segment according to the occurrence probabilities of the interaction behaviors in the implicit interaction sequence; dynamically updating the weights of the edges between the video segment nodes and the creative concept nodes based on the implicit behavior entropy; and, for any video segment, calculating a semantic association abundance according to the weights of the edges connected to the video segment, and calculating a creative value score of the video segment according to its semantic association abundance, wherein the semantic association abundance is positively correlated with the creative value score.

By collecting the user's real-time interaction behaviors on video segments and forming an implicit interaction sequence, the method comes closer to the user's real attention distribution and cognitive interests than relying only on explicit feedback indicators such as likes and completion rate.
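The edge-weight update of claims 3 and 4 and the scoring of claims 5, 6, and 8 can be sketched in Python. The coefficients `alpha`, `beta`, and `gamma`, the plain-sum aggregation for semantic association abundance, and the `x / (1 + x)` normalization are all assumptions for illustration; the patent fixes only the structure: a linear combination of feature similarity and implicit behavior entropy as the update value, weighted fusion with the current weight, a normalized abundance as the gain score, the maximum cosine similarity to reference words as the base score, and a weighted sum of the two as the creative value score.

```python
import math


def cosine_similarity(a, b):
    # Claim 8: the cosine of the angle between two feature vectors
    # serves as their similarity.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def update_edge_weight(current_w, feat_sim, entropy, alpha=0.5, beta=0.3):
    # Claims 3-4: the update value is a linear combination of the
    # feature similarity and the implicit behavior entropy; the new
    # weight fuses the current weight with that update value.
    # alpha and beta are illustrative coefficients, not from the patent.
    update = alpha * feat_sim + (1 - alpha) * entropy
    return (1 - beta) * current_w + beta * update


def creative_value_score(edge_weights, video_vec, reference_vecs, gamma=0.7):
    # Semantic association abundance: an aggregate of the weights of the
    # edges connected to the segment (a plain sum is assumed here).
    abundance = sum(edge_weights)
    # Claims 5-6: normalize the abundance into [0, 1) as the gain score
    # (the squashing function below is assumed; the patent does not fix one).
    gain = abundance / (1.0 + abundance)
    # Claims 6 and 8: the base score is the maximum cosine similarity
    # between the segment features and the reference-word semantics.
    base = max(cosine_similarity(video_vec, r) for r in reference_vecs)
    # Claim 6: the creative value score is a weighted sum of the two.
    return gamma * gain + (1 - gamma) * base
```

Under this sketch, a segment richly connected to many creative concepts gains a high abundance (and thus gain score) even if no single reference word matches it strongly, while the base score preserves a floor for segments with one clear conceptual match.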