CN-122019395-A - Text-to-video model safety test method, system, device and medium

Abstract

The invention discloses a safety test method, system, device and medium for a text-to-video model, and relates to the technical field of text-to-video model safety testing. The method comprises: first, acquiring a video under test generated by the model from an initial test sample, and performing multidimensional compliance analysis on the video to obtain a plurality of analysis results; constructing a unified feature vector from the results and inputting it into a decision tree audit model to obtain a comprehensive risk factor; comparing the risk factor with a preset threshold and mutating the initial test sample according to the comparison result to generate a new test sample; inputting the new sample into the model to generate a new video; and repeating the analysis, feedback and mutation steps until a stop condition is met, then outputting a safety test result for the model. The method can evaluate the safety of a text-to-video model comprehensively, efficiently and accurately without relying on extensive manual intervention.

Inventors

ZHAO YUFEI
LIN YONGFENG
ZHANG DAOJUAN
ZHANG GUANGXIN
WANG WENHUI
DONG PENG
ZHAO KUN
WANG JIANKUAN
CUI JIE
SONG YU
Yan Guanzheng
ZHANG GUOQIANG
Pu Zedai
GONG YAQIANG
LI SHUO
MENG XIANDONG
YAN WEI
LIU YING
CHEN KAI
FEI KEXIONG
WANG XUDONG
WANG XINZHE
LI JIE
Zhang Beng

Assignees

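State Grid Tianjin Electric Power Company Electric Power Research Institute (国网天津市电力公司电力科学研究院)
China Electric Power Research Institute Co., Ltd. (中国电力科学研究院有限公司)
State Grid Tianjin Electric Power Company (国网天津市电力公司)
State Grid Corporation of China (国家电网有限公司)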

Dates

Publication Date: 20260512
Application Date: 20260410

Claims (10)

1. A text-to-video model safety test method, characterized by comprising the following steps: S1, acquiring a video under test generated by a target text-to-video model based on an initial test sample; S2, performing multidimensional compliance analysis on the video under test to obtain a plurality of analysis results, one for each analysis dimension; S3, constructing the plurality of analysis results into a unified feature vector, and inputting the unified feature vector into a pre-trained decision tree audit model to obtain a comprehensive risk factor; S4, comparing the comprehensive risk factor with a preset threshold to obtain a comparison result, and mutating the initial test sample according to the comparison result to generate a new test sample; S5, inputting the new test sample into the target text-to-video model to generate a new video under test, repeating steps S2 to S4 until a preset stop condition is met, and outputting a safety test result for the target text-to-video model.
2. The text-to-video model safety test method according to claim 1, wherein acquiring the video under test generated by the target text-to-video model based on the initial test sample comprises: constructing an adaptive detection synthesis model, the adaptive detection synthesis model being used to generate diverse initial test samples from input prompt words; and inputting the initial test samples into the target text-to-video model to obtain the video under test.
3. The text-to-video model safety test method according to claim 1, wherein the multidimensional compliance analysis includes at least an image content dimension, a text information dimension, a voice content dimension, and a narrative logic dimension.
4. The text-to-video model safety test method according to claim 3, wherein the analysis of the narrative logic dimension comprises: performing a spatio-temporal token representation of the video under test, dividing the tensor of the video under test into feature blocks with local spatio-temporal correlation by means of a three-dimensional block embedding operator, and introducing three-dimensional position codes to construct a spatio-temporal token sequence; modeling the spatio-temporal token sequence with a hierarchical spatio-temporal attention mechanism to obtain a narrative logic analysis model, wherein the narrative logic analysis model extracts, respectively, the spatial interaction features within a single frame of the video under test and the temporal evolution features; and constructing a narrative logic flow of the video under test based on the spatial interaction features and the temporal evolution features, and identifying a non-compliant narrative structure in the video under test by analyzing a target narrative feature vector in the narrative logic flow.
5. The text-to-video model safety test method according to claim 3, wherein inputting the unified feature vector into the pre-trained decision tree audit model to obtain the comprehensive risk factor comprises: inputting the unified feature vector into the pre-trained decision tree audit model, and outputting an independent risk score for each dimension through the decision paths of the decision tree audit model; introducing a correction vector on the basis of the independent risk scores for each dimension to construct a multidimensional risk vector; and performing an aggregation calculation on the multidimensional risk vector to obtain the comprehensive risk factor.
6. The text-to-video model safety test method according to claim 1, wherein mutating the initial test sample according to the comparison result to generate a new test sample comprises: when the comparison result is that the comprehensive risk factor is smaller than the preset threshold, mutating the initial test sample with a first mutation strategy to generate the new test sample, wherein the first mutation strategy increases the diversity of the test samples by introducing random perturbation into the feature embedding space; and when the comparison result is that the comprehensive risk factor is greater than or equal to the preset threshold, mutating the initial test sample with a second mutation strategy to generate the new test sample, wherein the second mutation strategy strengthens the risk-triggering features in a targeted manner by selecting mutation operators related to the risk-triggering dimension and updating along the gradient direction of the risk loss function.
7. The text-to-video model safety test method according to claim 6, wherein the first mutation strategy and the second mutation strategy comprise one or more of the following mutation operators: semantic contrast substitution, visual element enhancement, narrative chain extension, acoustic environment injection, and text embedding induction.
8. A text-to-video model safety test system, comprising: an acquisition module configured to acquire a video under test generated by a target text-to-video model based on an initial test sample; an analysis module configured to perform multidimensional compliance analysis on the video under test to obtain a plurality of analysis results, one for each analysis dimension; a calculation module configured to construct the plurality of analysis results into a unified feature vector, and to input the unified feature vector into a pre-trained decision tree audit model to obtain a comprehensive risk factor; a mutation module configured to compare the comprehensive risk factor with a preset threshold to obtain a comparison result, and to mutate the initial test sample according to the comparison result to generate a new test sample; and an execution module configured to input the new test sample into the target text-to-video model to generate a new video under test, to pass the new video under test repeatedly through the analysis module, the calculation module and the mutation module until a preset stop condition is met, and then to output a safety test result for the target text-to-video model.
9. A terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method according to any one of claims 1 to 7.
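
As an illustration of the risk aggregation recited in claim 5, the following Python sketch combines per-dimension risk scores with a correction vector. The claim does not specify how the correction vector is applied or how the aggregation is computed, so the elementwise product and the normalized weighted sum used here are assumptions, as are the function name `comprehensive_risk` and the example values.

```python
from typing import Dict, Tuple


def comprehensive_risk(
    scores: Dict[str, float],
    correction: Dict[str, float],
) -> Tuple[Dict[str, float], float]:
    """Build a multidimensional risk vector and aggregate it.

    Assumed form (not taken from the patent text):
      risk_vector[d] = scores[d] * correction[d]
      aggregate      = sum(risk_vector) / sum(correction)
    """
    if scores.keys() != correction.keys():
        raise ValueError("scores and correction must cover the same dimensions")
    total_weight = sum(correction.values())
    if total_weight <= 0:
        raise ValueError("correction weights must sum to a positive value")
    # Apply the correction vector dimension by dimension.
    risk_vector = {dim: scores[dim] * correction[dim] for dim in scores}
    # Normalize so that scores in [0, 1] yield an aggregate in [0, 1].
    aggregate = sum(risk_vector.values()) / total_weight
    return risk_vector, aggregate


# Hypothetical scores for the four dimensions named in claim 3.
example_scores = {"image": 0.2, "text": 0.1, "voice": 0.0, "narrative": 0.9}
# Hypothetical correction vector that doubles the weight of narrative logic.
example_correction = {"image": 1.0, "text": 1.0, "voice": 1.0, "narrative": 2.0}
risk_vector, risk_factor = comprehensive_risk(example_scores, example_correction)
```

With these example values the comprehensive risk factor is approximately 0.42, which would then be compared with the preset threshold of step S4.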

Description

Text-to-video model safety test method, system, device and medium

Technical Field

The present invention relates generally to the technical field of text-to-video model safety testing, and in particular to a text-to-video model safety test method, system, device and medium.

Background

With the rapid development of artificial intelligence technology, text-to-video models have become an important research direction in natural language processing and computer vision. Such a model generates video content from a natural language description, and has advanced application scenarios such as automated video production, virtual reality and entertainment content creation. However, although text-to-video models have made significant progress in many areas, safety and compliance issues have emerged in practical use. For example, a generated video may contain inappropriate or harmful content, such as violence, pornography or ethnic discrimination, which raises compliance and ethical problems for practical applications. Effectively evaluating the safety and compliance of text-to-video models has therefore become a pressing problem. Existing safety test methods for text-to-video models, however, suffer from a lack of diversity in test samples, and because they rely on conventional keyword rules or single-frame image features, they struggle to understand complex narrative content in a video in depth. As a result, the testing process has obvious shortcomings in coverage breadth, mining depth and operating efficiency.

Disclosure of Invention

In view of the foregoing drawbacks or shortcomings of the prior art, it is desirable to provide a text-to-video model safety test method, system, device and medium.
In a first aspect, the present invention provides a text-to-video model safety test method, including: S1, acquiring a video under test generated by a target text-to-video model based on an initial test sample; S2, performing multidimensional compliance analysis on the video under test to obtain a plurality of analysis results, one for each analysis dimension; S3, constructing the plurality of analysis results into a unified feature vector, and inputting the unified feature vector into a pre-trained decision tree audit model to obtain a comprehensive risk factor; S4, comparing the comprehensive risk factor with a preset threshold to obtain a comparison result, and mutating the initial test sample according to the comparison result to generate a new test sample; S5, inputting the new test sample into the target text-to-video model to generate a new video under test, repeating steps S2 to S4 until a preset stop condition is met, and outputting a safety test result for the target text-to-video model.
According to the technical scheme provided by the invention, acquiring the video under test generated by the target text-to-video model based on the initial test sample comprises the following steps: constructing an adaptive detection synthesis model, the adaptive detection synthesis model being used to generate diverse initial test samples from input prompt words; and inputting the initial test samples into the target text-to-video model to obtain the video under test.
According to the technical scheme provided by the invention, the multidimensional compliance analysis includes at least an image content dimension, a text information dimension, a voice content dimension and a narrative logic dimension.
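
The iterative procedure of steps S1 to S5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the target model, the analyzers, the audit model and both mutation strategies are passed in as callables, the stop condition is assumed to be a fixed round budget, each round mutates the sample used in the previous round, and all function names and toy stand-ins are hypothetical.

```python
from typing import Callable, Dict, List

# The four analysis dimensions named in the description.
DIMENSIONS = ["image", "text", "voice", "narrative"]


def run_safety_test(
    generate_video: Callable[[str], str],
    analyze: Callable[[str], Dict[str, float]],
    audit: Callable[[List[float]], float],
    mutate_explore: Callable[[str], str],
    mutate_exploit: Callable[[str, Dict[str, float]], str],
    initial_sample: str,
    threshold: float = 0.5,
    max_rounds: int = 10,
) -> List[dict]:
    """Iterate S1-S5 and return one record per round."""
    sample = initial_sample
    records = []
    for round_index in range(max_rounds):  # assumed stop condition: round budget
        video = generate_video(sample)                # S1 / S5: generate video
        results = analyze(video)                      # S2: per-dimension analysis
        features = [results[d] for d in DIMENSIONS]   # S3: unified feature vector
        risk = audit(features)                        # S3: comprehensive risk factor
        risky = risk >= threshold                     # S4: compare with threshold
        records.append(
            {"round": round_index, "sample": sample, "risk": risk, "risky": risky}
        )
        # S4: choose the mutation strategy from the comparison result.
        if risky:
            sample = mutate_exploit(sample, results)  # reinforce risky features
        else:
            sample = mutate_explore(sample)           # increase diversity
    return records


# Toy stand-ins (hypothetical) so the loop can be exercised end to end.
def toy_generate(sample: str) -> str:
    return sample.upper()


def toy_analyze(video: str) -> Dict[str, float]:
    score = min(1.0, 0.2 * video.count("X"))
    return {d: score for d in DIMENSIONS}


def toy_audit(features: List[float]) -> float:
    return max(features)


def toy_explore(sample: str) -> str:
    return sample + "x"


def toy_exploit(sample: str, results: Dict[str, float]) -> str:
    return sample + "xx"
```

Calling `run_safety_test(toy_generate, toy_analyze, toy_audit, toy_explore, toy_exploit, "a", max_rounds=4)` returns four records. In a real test harness the records would form the basis of the safety test result, and the callables would wrap the target model and the trained analyzers.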
According to the technical scheme provided by the invention, the analysis of the narrative logic dimension comprises the following steps: performing a spatio-temporal token representation of the video under test, dividing the tensor of the video under test into feature blocks with local spatio-temporal correlation by means of a three-dimensional block embedding operator, and introducing three-dimensional position codes to construct a spatio-temporal token sequence; modeling the spatio-temporal token sequence with a hierarchical spatio-temporal attention mechanism to obtain a narrative logic analysis model, wherein the narrative logic analysis model extracts, respectively, the spatial interaction features within a single frame of the video under test and the temporal evolution features; and constructing a narrative logic flow of the video under test based on the spatial interaction features and the temporal evolution features, and identifying a non-compliant narrative structure in the video under test by analyzing a target narrative feature vector in the narrative logic flow.
According to the technical scheme provided by the invention, the step of inputting the unified feature vector into a pre-trained decision tree audit model to obtain the comprehensive risk factor comprises the following steps: Inputting the