CN-122002061-A - EPG system, method, storage medium and computer device based on large language prediction

Abstract

The invention discloses an EPG system, method, storage medium and computer device based on large language prediction. It relates to the technical field of intelligent pushing of electronic programs and mainly addresses the insufficient deep information integration, low navigation efficiency, low degree of personalization and broken commercial conversion paths of traditional EPG systems. The large language prediction model layer comprises a multimodal understanding module, a dynamic metadata factory module, a live stream analysis module, an on-demand spectrum analysis module, a large language prediction engine module and a cross-domain content scheduling module. The modules are connected through a security protocol to form a closed-loop intelligent presentation chain, which is used to overcome these shortcomings of traditional EPG systems.

Inventors

CHEN ZHIYE
GUO YONGKANG
ZHOU MIN
LI PAN
CAI JIEFENG
LUO ZEWEN
ZHANG ZHIJIAN
ZHOU HAILIANG
WANG BING

Assignees

广东南方新媒体股份有限公司

Dates

Publication Date: 2026-05-08
Application Date: 2026-01-04

Claims (10)

1. An EPG system based on large language prediction, characterized by comprising a user interface layer, an intelligent interaction layer, a large language prediction model layer, a metadata management and back-end system, and a unified play control layer; the large language prediction model layer comprises a multimodal understanding module, a dynamic metadata factory module, a live stream analysis module, an on-demand spectrum analysis module, a large language prediction engine module and a cross-domain content scheduling module; the multimodal understanding module is used for processing multimodal instructions from the intelligent interaction layer, extracting multimodal user characteristics, and extracting long-term user behavior characteristics based on the user portrait corresponding to each user; the live stream analysis module is used for analyzing hot spots related to live programs or events; the on-demand spectrum analysis module is used for extracting on-demand heat spectrum characteristics from the new-release status, popularity rankings and social media discussion of on-demand content; the dynamic metadata factory module is used for receiving input from the live stream analysis module and the on-demand spectrum analysis module and generating real-time dynamic metadata characteristics; the large language prediction engine module is used for coupling, and performing intention analysis on, the multimodal user characteristics and the long-term user behavior characteristics output by the multimodal understanding module and the real-time dynamic metadata characteristics output by the dynamic metadata factory module, and for outputting an intention vector and a matching probability; and the cross-domain content scheduling module is used for pushing video content and commercial advertisements according to the intention vector and the matching probability.
2. The system of claim 1, wherein the multimodal instructions processed by the multimodal understanding module include voice instructions, key trajectory instructions, and camera perception instructions; extracting the multimodal user characteristics includes: converting the voice instruction to text and then performing text intention understanding coding to obtain voice intention characteristics; performing behavior intention coding on the key trajectory instruction to obtain behavior intention characteristics; and performing emotion recognition coding on the face and the voice to obtain emotion characteristics.
3. The system of claim 1, wherein the multimodal understanding module comprises a portrait construction unit configured to: collect static basic data and dynamic multimodal data of each user, wherein the static basic data comprises face, voiceprint and fingerprint data, and the dynamic multimodal data comprises behavior data, semantic data, emotion data and content association data; perform identity coding on each user based on the static basic data to obtain an identity label corresponding to each user; perform feature extraction on each type of dynamic multimodal data to obtain behavior features, interest features, emotion and scene features, and business preference features corresponding to each user; and perform multidimensional feature fusion on the behavior features, the interest features, the emotion and scene features, and the business preference features to obtain the user portrait corresponding to each user.
4. The system of claim 3, wherein the multimodal understanding module further comprises a portrait optimization unit configured to: trigger a portrait update when the user generates a new interactive behavior, and acquire actual playing data of each user from the unified play control layer; iteratively process the features in the user portrait based on the actual playing data, wherein the iterative processing comprises strengthening positive features, weakening outdated features and adding new scene features; and, after the iteration, update the user portrait and synchronize the updated user portrait to the large language prediction engine module and the cross-domain content scheduling module.
5. The system of claim 1, wherein the large language prediction engine module is further configured to: couple the multimodal user characteristics, the long-term user behavior characteristics and the real-time dynamic metadata characteristics using a CM-INTENTNET cross-modal intention network, and perform dynamic intention reasoning to obtain the intention vector and the matching probability.
6. The system according to any one of claims 1-5, wherein the cross-domain content scheduling module is further configured to: adopt a multi-objective optimization framework that takes user interest, commercial value and policy compliance as optimization objectives, make decisions through constrained reinforcement learning, and push the video content and commercial advertisements according to the intention vector and the matching probability provided that the policy control requirements are met.
7. The system of claim 6, wherein the cross-domain content scheduling module is further provided with a dynamic weight adjustment mechanism; the dynamic weight adjustment mechanism adjusts the weights of user interest and commercial value in real time according to the scene; and when the matching probability is lower than a preset threshold, no commercial matching is performed, and the EPG configuration from before the last shutdown is retrieved based on the user characteristics alone.
8. An EPG method based on large language prediction, applied to a large language prediction model layer, the method comprising: collecting and processing multimodal user data and user portrait information through a multimodal understanding module, and extracting multimodal user characteristics and long-term user behavior characteristics; acquiring content data through a live stream analysis module and an on-demand spectrum analysis module, and generating real-time dynamic metadata characteristics through a dynamic metadata factory module; using a large language prediction engine module to couple, and perform intention analysis on, the multimodal user characteristics, the long-term user behavior characteristics and the real-time dynamic metadata characteristics, and to output an intention vector and a matching probability; and using a cross-domain content scheduling module to push video content and commercial advertisements according to the intention vector and the matching probability.
9. A storage medium having stored therein at least one executable instruction for performing operations corresponding to the large language prediction based EPG method of claim 8.
10. A computer device comprising a processor, a memory, a communication interface, and a communication bus, said processor, said memory, and said communication interface completing communication with each other via said communication bus; the memory is configured to store at least one executable instruction that causes the processor to perform operations corresponding to the large language prediction based EPG method of claim 8.
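
The scheduling behavior recited in claims 6 and 7 can be illustrated with a minimal sketch. It is not the claimed implementation: the claims recite constrained reinforcement learning, whereas this sketch substitutes a plain weighted sum with a hard policy filter. The names `Candidate`, `SCENE_WEIGHTS`, `schedule`, the scene labels, the weight values and the threshold of 0.5 are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    content_id: str
    user_interest: float     # assumed to be normalized to [0, 1]
    commercial_value: float  # assumed to be normalized to [0, 1]
    policy_compliant: bool   # result of an assumed upstream policy check


# Hypothetical scene-dependent weights: (user interest, commercial value).
SCENE_WEIGHTS = {
    "family_evening": (0.8, 0.2),
    "sports_event": (0.6, 0.4),
}
DEFAULT_WEIGHTS = (0.7, 0.3)


def schedule(candidates: List[Candidate], match_probability: float,
             scene: str, last_epg_config: List[str],
             threshold: float = 0.5) -> List[str]:
    """Return the content ids to present, best first."""
    if match_probability < threshold:
        # Low confidence: skip commercial matching and fall back to the
        # EPG configuration saved before the last shutdown.
        return list(last_epg_config)
    w_interest, w_commercial = SCENE_WEIGHTS.get(scene, DEFAULT_WEIGHTS)
    # Policy compliance is treated as a hard constraint, not a score term.
    allowed = [c for c in candidates if c.policy_compliant]
    ranked = sorted(
        allowed,
        key=lambda c: w_interest * c.user_interest
        + w_commercial * c.commercial_value,
        reverse=True,
    )
    return [c.content_id for c in ranked]
```

Treating policy compliance as a filter rather than a weighted term reflects the wording "under the condition that the policy control requirement is met"; a learned policy would replace the fixed weight table.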

Description

EPG system, method, storage medium and computer device based on large language prediction

Technical Field

The invention relates to the technical field of intelligent pushing of electronic programs, and in particular to an EPG system, method, storage medium and computer device based on large language prediction.

Background

An electronic program guide (Electronic Program Guide, EPG for short) serves as the portal of an IP interactive television system. At its core it provides program indexing, navigation and multimedia content browsing, and it must efficiently integrate and present massive multi-source metadata for live broadcast, on-demand, catch-up viewing and the like. The system comprises four core modules: a user interface layer, a metadata management and back-end system, an intelligent interaction layer and a play control system. Together they support the complete process from application start-up and data loading to content playback. The EPG is the key entry point through which users reach audio-visual content and is essential to improving user experience and platform operating efficiency.
However, with the trend toward content diversification and personalized viewing, the traditional EPG system has obvious shortcomings. First, deep information integration is insufficient: programs are only weakly linked to related content, so users find it difficult to explore the content library in depth. Second, navigation efficiency is low: browsing linear lists requires frequent page turning in multi-channel scenarios. Third, the degree of personalization is low: recommendation relies on simple tag matching and ignores users' real-time interests and multi-member household usage scenarios. Fourth, the commercial conversion path is broken: advertisements are disconnected from content, and hard-sell advertisement insertion degrades the experience. Fifth, the value of data analysis is not realized: behavior logs are coarse-grained and in-depth interaction analysis is lacking. Accordingly, the traditional EPG system struggles to meet the new scenario demands of content diversification and user personalization.

Disclosure of Invention

In view of the above, the present invention provides an EPG system, method, storage medium and computer device based on large language prediction, mainly aimed at solving the insufficient deep information integration, low navigation efficiency, low degree of personalization and broken commercial conversion path of the conventional EPG system.
According to one aspect of the invention, an EPG system based on large language prediction is provided, comprising a user interface layer, an intelligent interaction layer, a large language prediction model layer, a metadata management and back-end system, and a unified play control layer. The large language prediction model layer comprises a multimodal understanding module, a dynamic metadata factory module, a live stream analysis module, an on-demand spectrum analysis module, a large language prediction engine module and a cross-domain content scheduling module. The multimodal understanding module is used for processing multimodal instructions from the intelligent interaction layer, extracting multimodal user characteristics, and extracting long-term user behavior characteristics based on the user portrait corresponding to each user. The live stream analysis module is used for analyzing hot spots related to live programs or events. The on-demand spectrum analysis module is used for extracting on-demand heat spectrum characteristics from the new-release status, popularity rankings and social media discussion of on-demand content. The dynamic metadata factory module is used for receiving input from the live stream analysis module and the on-demand spectrum analysis module and generating real-time dynamic metadata characteristics. The large language prediction engine module is used for coupling, and performing intention analysis on, the multimodal user characteristics and the long-term user behavior characteristics output by the multimodal understanding module and the real-time dynamic metadata characteristics output by the dynamic metadata factory module, and for outputting an intention vector and a matching probability. The cross-domain content scheduling module is used for pushing video content and commercial advertisements according to the intention vector and the matching probability.
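
The data flow among the modules described above can be sketched as follows. This is only an illustration of which module feeds which: the feature vectors are plain lists of floats, and the averaging used to derive a matching probability is a placeholder assumption standing in for the large language prediction engine. The names `UserFeatures`, `Intent`, `dynamic_metadata` and `predict_intent` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserFeatures:
    multimodal: List[float]  # output of the multimodal understanding module
    long_term: List[float]   # long-term behavior derived from the user portrait


@dataclass
class Intent:
    vector: List[float]
    match_probability: float


def dynamic_metadata(live_hotspots: List[float],
                     vod_heat: List[float]) -> List[float]:
    """Dynamic metadata factory: merge live-stream and on-demand signals."""
    return list(live_hotspots) + list(vod_heat)


def predict_intent(user: UserFeatures, metadata: List[float]) -> Intent:
    """Placeholder for the prediction engine.

    Couples the three feature groups by concatenation and uses their mean,
    clamped to [0, 1], as a toy matching probability.
    """
    vector = user.multimodal + user.long_term + metadata
    mean = sum(vector) / len(vector) if vector else 0.0
    return Intent(vector=vector, match_probability=min(1.0, max(0.0, mean)))
```

The resulting `Intent` is what the cross-domain content scheduling module would consume when deciding which video content and advertisements to push.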
Further, the multimodal instructions processed by the multimodal understanding module comprise voice instructions, key trajectory instructions and camera perception instructions; extracting the multimodal user characteristics includes: converting the voice instruction to text and then performing text intention understanding coding to obtain voice intention characteristics; performing behavior intention coding on the key trajectory instruction to obtain behavior intention characteristics; a