CN-121637460-B - Private network content copyright monitoring and evidence obtaining system based on AI and large model

Abstract

The invention discloses a private network content copyright monitoring and evidence obtaining system based on AI and a large model. The system applies AI and large model technology to copyright monitoring and evidence collection for multimedia content in a private network environment. A large model processor performs deep semantic understanding and cross-modal feature extraction on the content, generates a unified semantic representation, and stores it in a copyrighted content knowledge base. The knowledge base stores the metadata and unified semantic representations of copyright-protected original content, together with natural-language copyright statements provided by copyright holders. The system converts each copyright statement into a query vector through semantic understanding, calculates its similarity to the semantic representations of the multimedia content, and identifies infringing content. When infringing content is identified, the system records its original source, release time, and publisher information, performs differential analysis, and generates an evidence chain. A user interface displays the copyright monitoring results, generates infringement alarm information, and presents the details of the evidence chain.

Inventors

LIU XIANG
LI XU
WANG JIAN

Assignees

北京流金岁月科技有限公司

Dates

Publication Date: 2026-05-12
Application Date: 2025-11-18

Claims (8)

1. A private network content copyright monitoring and evidence obtaining system based on AI and a large model, characterized by comprising a content acquisition module, a multi-modal content analysis module, a copyrighted content knowledge base, a zero sample copyright identification module, an infringement behavior evidence obtaining module, and a visualization and alarm module;
the content acquisition module is used for acquiring multimedia content to be monitored from a private network environment, the multimedia content comprising text, images, audio, video, and any mixed form thereof;
the multi-modal content analysis module uses a large model processor to perform deep semantic understanding and cross-modal feature extraction on the acquired multimedia content, so as to generate a unified semantic representation of the multimedia content, and stores the unified semantic representation in the copyrighted content knowledge base, specifically as follows:
performing modality identification and preliminary parsing on the acquired multimedia content, and decomposing the multimedia content into text modality sub-content, image modality sub-content, audio modality sub-content, and video modality sub-content; generating, for each identified modal sub-content, a preliminary modal feature vector through the corresponding encoder, so as to obtain the modal feature vectors of the text modality, the image modality, the audio modality, and the video modality respectively;
the large model processor is a large-scale pre-trained model based on the Transformer architecture which, using a self-attention mechanism and feedforward neural networks, is pre-trained on massive image-text and audio-video data by self-supervised learning, and which generates the unified semantic representation from the preliminary modal feature vectors, further comprising:
mapping the preliminary feature vectors of the different modalities into the shared embedding space of the large model through a linear projection layer, and adding a position code P to form the token sequence input to the model, the projection being the operation of mapping the preliminary feature vector of each modality into the shared embedding space through the linear projection layer;
feeding the projected token sequence into the Transformer encoder of the model, in which the self-attention mechanism dynamically calculates the association weight between any two tokens in the sequence according to the formula $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$, where Q, K, and V are the query, key, and value matrices obtained from the input token sequence through different linear transformation matrices, and $d_k$ is the dimension of the key vectors, used to scale the dot-product result;
after processing by the multi-layer Transformer encoder, the large model processor outputs a final hidden state sequence and generates the unified semantic representation by pooling that sequence; the generated unified semantic representation and the metadata of the original multimedia content are stored in the copyrighted content knowledge base, which also stores the natural-language copyright statements provided by copyright holders;
the copyrighted content knowledge base is used for storing the metadata of copyright-protected original content, the unified semantic representations generated by the multi-modal content analysis module, and the natural-language copyright statements provided by copyright holders;
the zero sample copyright identification module is used for receiving a natural-language copyright statement from a copyright holder, performing semantic understanding on the statement with the large model processor, converting it into a semantic query vector, and performing semantic similarity calculation and matching between the semantic query vector and the unified semantic representations generated by the multi-modal content analysis module, so as to identify infringing content;
the infringement behavior evidence obtaining module, when the zero sample copyright identification module identifies infringing content, records the original source, release time, publisher identity information, and modification history of the infringing content, together with the semantic similarity score and matching regions between the infringing content and the original copyrighted content, and performs differential analysis on the infringing content and the original copyrighted content through the large model processor to generate an evidence chain;
the visualization and alarm module is used for providing a user interface, displaying real-time copyright monitoring results, generating infringement alarm information, presenting complete evidence chain details, and allowing a user to manage copyright statements and query historical monitoring data.
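
The self-attention and pooling steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the token values are made up, the query, key, and value projections are taken to be identity maps for brevity, and mean pooling is an assumption, since the claim only says that the hidden state sequence is pooled.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def mean_pool(hidden):
    """Average the token states into one fixed-size vector (assumed pooling)."""
    count = len(hidden)
    return [sum(col) / count for col in zip(*hidden)]


# Hypothetical token sequence: one token per modality (text, image, audio,
# video), assumed to be already projected into a shared 3-dimensional space
# with the position code added.
tokens = [
    [1.0, 0.0, 0.5],
    [0.2, 0.9, 0.1],
    [0.4, 0.4, 0.4],
    [0.0, 1.0, 0.3],
]

# Assumption: Q = K = V = tokens (identity projections) to keep the sketch short.
hidden, weights = attention(tokens, tokens, tokens)
unified = mean_pool(hidden)
print(unified)
```

Each row of `weights` holds the association weights of one token against all tokens in the sequence, and `unified` plays the role of the unified semantic representation.
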
2. The private network content copyright monitoring and evidence obtaining system based on AI and a large model as set forth in claim 1, wherein the content acquisition module is used for acquiring multimedia content to be monitored from a private network environment, the multimedia content comprising text, images, audio, video, and any mixed form thereof, specifically as follows: for static resources of file servers, databases, and internal websites deployed in the private network, a directed crawler program captures content according to a preset strategy, so as to acquire the multimedia files and data contained in those resources; traffic mirroring devices are deployed at key network nodes of the private network, and the mirrored traffic is directed to a sniffer, which performs deep packet inspection on the data packets circulating in the private network and reassembles the multimedia content carried in files, mail attachments, and instant messaging messages encapsulated during transmission.
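
The "preset strategy" for the directed crawler in claim 2 is not specified further. As a minimal sketch, assuming the strategy is a set of allowed file extensions plus a size limit and that the static resource is a directory tree, the capture step could look like this. `POLICY` and `collect_files` are hypothetical names introduced only for illustration.

```python
import os
import tempfile

# Hypothetical preset strategy: which extensions to collect and a size cap.
POLICY = {
    "extensions": {".txt", ".jpg", ".png", ".wav", ".mp4"},
    "max_bytes": 10_000_000,
}


def collect_files(root, policy):
    """Walk `root` and return the sorted paths of files that match the policy."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lower()
            if ext in policy["extensions"] and os.path.getsize(path) <= policy["max_bytes"]:
                matches.append(path)
    return sorted(matches)


# Usage: build a throwaway directory standing in for a file server share.
with tempfile.TemporaryDirectory() as root:
    for name in ("report.txt", "logo.PNG", "notes.bak"):
        with open(os.path.join(root, name), "w") as handle:
            handle.write("sample")
    found = [os.path.basename(p) for p in collect_files(root, POLICY)]

print(found)
```

The traffic mirroring and deep packet inspection path is not sketched here, because it depends on network equipment that the claim does not describe in detail.
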
3. The private network content copyright monitoring and evidence obtaining system based on AI and a large model as set forth in claim 1, wherein generating the preliminary modal feature vectors of the text modality, the image modality, the audio modality, and the video modality through the corresponding encoders further comprises:
text modality: performing word segmentation and word vector embedding, converting the text sequence into an embedded sequence with a pre-trained word embedding model, and generating a context-aware preliminary modal feature vector through a shallow Transformer encoder;
image modality: performing size normalization and pixel value normalization, in which the image I is scaled to a fixed size and normalized, and extracting visual features with a lightweight convolutional neural network, the preliminary modal feature vector having dimensions H, W, and D, which denote height, width, and depth respectively;
audio modality: unifying the sampling rate and generating a spectrogram through the short-time Fourier transform, treating the spectrogram as an image, and extracting its features with a lightweight CNN similar to that of the image modality to obtain the preliminary modal feature vector;
video modality: decomposing the video into a frame sequence and extracting the accompanying audio; generating a preliminary feature vector for each frame; processing the accompanying audio according to the audio modality to obtain its features; the preliminary feature vector of the video modality is represented as a set comprising the sequence of frame features and the audio feature vector extracted from the audio track of the video file.
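
The audio branch of claim 3 turns a waveform into a spectrogram with the short-time Fourier transform before a CNN extracts features. The sketch below shows only the spectrogram step, using a naive DFT from the standard library so that it stays self-contained. The Hann window, the frame length, the hop size, and the test tone are assumptions chosen for illustration, and a practical system would use an FFT library instead.

```python
import cmath
import math


def stft_magnitude(signal, frame_len, hop):
    """Return a magnitude spectrogram: one list of frequency-bin magnitudes per frame."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        bins = []
        # Only the non-negative frequencies are kept (real-valued input).
        for k in range(frame_len // 2 + 1):
            coeff = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            bins.append(abs(coeff))
        frames.append(bins)
    return frames


# Hypothetical input: an 8 Hz sine sampled at 64 Hz for two seconds.
sample_rate = 64
tone = [math.sin(2 * math.pi * 8 * t / sample_rate) for t in range(128)]

spec = stft_magnitude(tone, frame_len=32, hop=16)
# Bin spacing is sample_rate / frame_len = 2 Hz, so the tone falls in bin 4.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

The resulting frames-by-bins grid is what the claim treats as an image for the lightweight CNN.
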
4. The private network content copyright monitoring and evidence obtaining system based on AI and a large model as set forth in claim 1, wherein the zero sample copyright identification module receives a natural-language copyright statement from a copyright holder, performs semantic understanding on the statement with the large model processor, converts it into a semantic query vector, and performs semantic similarity calculation and matching between the semantic query vector and the unified semantic representations generated by the multi-modal content analysis module so as to identify infringing content, specifically as follows:
representing the natural-language copyright statement provided by the copyright holder as a sequence of tokens, each element of the sequence being one token of the copyright statement, and performing text preprocessing on the copyright statement, including word segmentation, stop word removal, and spelling normalization, so as to form a standardized text sequence;
performing deep semantic encoding on the preprocessed copyright statement with the large model processor, in which the preprocessed text sequence is fed to the text input of the large model processor, the contextual semantic information of the copyright statement is captured through the self-attention mechanism of the Transformer architecture, and a semantic query vector of fixed dimension is output;
comparing the generated semantic query vector with the unified semantic representations produced by the multi-modal content analysis module and stored in the copyrighted content knowledge base, and calculating the semantic similarity between each unified semantic representation and the semantic query vector in the shared semantic space, the semantic similarity function being calculated from the j-th dimension component of the i-th unified semantic representation and the j-th dimension component of the semantic query vector; comparing the calculated similarity with a predetermined similarity threshold, and, when the similarity meets the threshold, judging that the content to be monitored is semantically highly related to the original content described by the copyright statement and constitutes infringement;
identifying, according to the matching result, all content to be monitored that is highly similar in meaning to the copyright statement, and outputting a list of identified infringing content, the list comprising the unified semantic representation of each item and its similarity score with the copyright statement.
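
The matching step of claim 4 compares a query vector against stored representations and keeps the items whose similarity meets a threshold. The similarity formula itself did not survive in this text, so the sketch below assumes cosine similarity, a common choice for comparing embeddings component by component. The vectors, identifiers, and threshold are invented for illustration.

```python
import math


def cosine_similarity(u, q):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, q))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def find_matches(query, knowledge_base, threshold):
    """Return (content_id, score) pairs at or above the threshold, best first."""
    results = []
    for content_id, vector in knowledge_base.items():
        score = cosine_similarity(vector, query)
        if score >= threshold:
            results.append((content_id, score))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical unified semantic representations of monitored content.
knowledge_base = {
    "doc-a": [0.9, 0.1, 0.0],
    "img-b": [0.0, 1.0, 0.0],
    "vid-c": [0.8, 0.2, 0.1],
}
# Hypothetical semantic query vector derived from a copyright statement.
query = [1.0, 0.0, 0.0]

matches = find_matches(query, knowledge_base, threshold=0.9)
for content_id, score in matches:
    print(content_id, round(score, 3))
```

Whether the comparison is inclusive (`>=`) or strict is also an assumption here; the claim only states that the similarity is compared with a predetermined threshold.
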
5. The private network content copyright monitoring and evidence obtaining system based on AI and a large model as set forth in claim 1, wherein, when the zero sample copyright identification module identifies infringing content, the infringement behavior evidence obtaining module records the original source, release time, publisher identity information, and modification history of the infringing content, together with the semantic similarity score and matching regions between the infringing content and the original copyrighted content, and the large model processor performs differential analysis on the infringing content and the original copyrighted content to generate an evidence chain, specifically as follows:
automatically capturing and recording full-dimensional metadata related to the infringing content and integrating it into an electronic evidence file associated with the unique identifier of the infringing content, wherein the electronic evidence file comprises basic identification information, time and identity information, content comparison information, and core feature data; the content comparison information comprises the semantic similarity score between the infringing content and the copyright statement; for image and video content, the coordinates of local regions highly similar to the original content are recorded, and for text content, the start and end positions of similar passages are recorded;
the differential analysis compares the fine-grained differences between the infringing content and the original copyrighted content in their respective modalities: for text, identifying semantic substitution, sentence restructuring, and paragraph deletion; for images, detecting watermark removal, local occlusion, addition or removal of elements, and color adjustment; for audio, analyzing cutting, splicing, and track separation; the propagation path of the infringing content within the private network is analyzed by combining the network traffic logs and access control lists of the private network, and secondary propagation nodes are identified; when different versions of infringing content related to the same copyrighted content exist, a content derivative relation graph is constructed by comparing their respective unified semantic representations and preliminary feature vectors, so that the evolution of the infringing content is displayed intuitively;
all of the above information is automatically compiled into a standardized electronic evidence chain report, and a tamper-proof digital timestamp is embedded for each key event and data record in the evidence chain using blockchain technology.
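
Claim 5 calls for tamper-proof timestamps on each evidence record using blockchain technology, without naming a particular chain or protocol. The sketch below is a simplified stand-in, not a blockchain: each record stores the SHA-256 hash of the previous record, so changing an earlier record breaks verification. The record fields and timestamps are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def record_digest(event, timestamp, prev_hash):
    """Hash the record body deterministically (keys sorted before encoding)."""
    body = {"event": event, "timestamp": timestamp, "prev_hash": prev_hash}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(chain, event, timestamp):
    """Append an evidence record linked to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "event": event,
        "timestamp": timestamp,
        "prev_hash": prev_hash,
        "hash": record_digest(event, timestamp, prev_hash),
    })
    return chain


def verify_chain(chain):
    """Recompute every hash and link; return False on the first mismatch."""
    prev_hash = GENESIS_HASH
    for record in chain:
        expected = record_digest(record["event"], record["timestamp"], record["prev_hash"])
        if record["prev_hash"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True


# Hypothetical evidence events for one piece of infringing content.
chain = []
append_record(chain, {"type": "detected", "content_id": "vid-c", "score": 0.96},
              "2025-11-18T08:00:00Z")
append_record(chain, {"type": "diff_analysis", "finding": "watermark removed"},
              "2025-11-18T08:00:05Z")

ok_before = verify_chain(chain)
chain[0]["event"]["score"] = 0.10  # simulate tampering with an earlier record
ok_after = verify_chain(chain)
print(ok_before, ok_after)
```

A real deployment would anchor these hashes in a distributed ledger or a trusted timestamping service so that the chain itself cannot be silently regenerated.
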
6. The private network content copyright monitoring and evidence obtaining system based on AI and a large model as set forth in claim 1, wherein the visualization and alarm module provides a user interface, displays real-time copyright monitoring results, generates infringement alarm information, presents complete evidence chain details, and allows a user to manage copyright statements and query historical monitoring data, specifically as follows:
displaying the most recently identified infringing content as a list, in which each record comprises a thumbnail of the infringing content, the infringement type, the similarity score, the discovery time, and the preliminary source, and supports a quick jump to the detailed evidence chain page; displaying in charts the trend of the number of infringements over time and the distribution of infringements across content types and platforms, so as to help the user understand infringement dynamics and patterns;
according to alarm rules preset by the user, automatically generating infringement alarm information when the similarity is higher than a preset threshold, and promptly notifying authorized users through in-site messages, e-mail, short messages, and API callbacks; when the user clicks an infringement alarm record, navigating to a detail page that presents in full the electronic evidence file generated by the infringement behavior evidence obtaining module and displays all metadata in the file, including basic identification information, time and identity information, content comparison information, and core feature data; the propagation path graph and the content derivative relation graph are embedded in the page, and the user can interact with the graphs and view details by clicking nodes.
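
The alarm rule in claim 6 can be sketched as a threshold check followed by fan-out to the configured notification channels. `AlarmRule`, `build_alerts`, and the channel names are hypothetical; actual delivery by e-mail, short message, or API callback is left out, and the function only builds the alert payloads.

```python
from dataclasses import dataclass


@dataclass
class AlarmRule:
    """User-defined rule: alert when similarity is higher than `threshold`."""
    threshold: float
    channels: tuple


def build_alerts(detections, rule):
    """Create one alert payload per channel for every detection above the threshold."""
    alerts = []
    for detection in detections:
        if detection["score"] > rule.threshold:
            for channel in rule.channels:
                alerts.append({
                    "channel": channel,
                    "content_id": detection["content_id"],
                    "message": (
                        f"Possible infringement: {detection['content_id']} "
                        f"(similarity {detection['score']:.2f})"
                    ),
                })
    return alerts


# Hypothetical rule and detections.
rule = AlarmRule(threshold=0.9, channels=("in_site_message", "email"))
detections = [
    {"content_id": "doc-a", "score": 0.99},
    {"content_id": "img-b", "score": 0.42},
]

alerts = build_alerts(detections, rule)
for alert in alerts:
    print(alert["channel"], alert["message"])
```
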
7. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the functional modules of the private network content copyright monitoring and evidence obtaining system based on AI and a large model according to any one of claims 1-6.
8. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the functional modules of the private network content copyright monitoring and evidence obtaining system based on AI and a large model according to any one of claims 1-6.

Description

Private network content copyright monitoring and evidence obtaining system based on AI and large model

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a private network content copyright monitoring and evidence obtaining system based on AI and a large model.

Background

With the rapid development of internet technology and the explosive growth of digital content, private networks have become an important carrier for information circulation and content creation. Content within a private network, including but not limited to enterprise documents, internal training materials, industry reports, business presentations, and customer data, forms a large and highly concentrated body of sensitive information and intellectual property. However, as the scope of use of private networks expands and diversifies, particularly when content is shared and co-edited between departments and employees, copyright infringement is becoming increasingly serious.

Traditional private network content copyright protection mechanisms generally rely on technical means such as digital watermarking, content fingerprinting, access control, and rights management. Although these technologies can limit illegal copying and dissemination of content to a certain extent, their limitations are increasingly prominent as user-generated content becomes more diverse and complex and as multimedia content such as video, audio, graphics, and text is edited in mixed forms. In private network environments in particular, content is frequently changed, modified, and reorganized, which makes monitoring methods based on exact matching difficult to apply effectively.
Disclosure of Invention

In order to overcome the above drawbacks of the prior art, an embodiment of the present invention provides a private network content copyright monitoring and evidence obtaining system based on AI and a large model, so as to solve the problems set forth in the background above.

In order to achieve the above purpose, the invention provides the following technical scheme: a private network content copyright monitoring and evidence obtaining system based on AI and a large model, comprising a content acquisition module, a multi-modal content analysis module, a copyrighted content knowledge base, a zero sample copyright identification module, an infringement behavior evidence obtaining module, and a visualization and alarm module;

the content acquisition module is used for acquiring multimedia content to be monitored from a private network environment, the multimedia content comprising text, images, audio, video, and any mixed form thereof;

the multi-modal content analysis module uses a large model processor to perform deep semantic understanding and cross-modal feature extraction on the acquired multimedia content, so as to generate a unified semantic representation of the multimedia content, and stores the unified semantic representation in the copyrighted content knowledge base;

the copyrighted content knowledge base is used for storing the metadata of copyright-protected original content, the unified semantic representations generated by the multi-modal content analysis module, and the natural-language copyright statements provided by copyright holders;

the zero sample copyright identification module is used for receiving a natural-language copyright statement from a copyright holder, performing semantic understanding on the statement with the large model processor, converting it into a semantic query vector, and performing semantic similarity calculation and matching between the semantic query vector and the unified semantic representations generated by the multi-modal content analysis module, so as to identify infringing content;

the infringement behavior evidence obtaining module, when the zero sample copyright identification module identifies infringing content, records the original source, release time, publisher identity information, and modification history of the infringing content, together with the semantic similarity score and matching regions between the infringing content and the original copyrighted content, and performs differential analysis on the infringing content and the original copyrighted content through the large model processor to generate an evidence chain;

the visualization and alarm module is used for providing a user interface, displaying real-time copyright monitoring results, generating infringement alarm information, presenting complete evidence chain details, and allowing a user to manage copyright statements and query historical monitoring data.

Preferably, as a preferred scheme of the private network content copyright monitoring and evidence obtaining system based on AI and large models of the present invention, the content obtaining