CN-121982479-A - Layered fine granularity image counterfeiting detection method and system
Abstract
The invention discloses a layered fine-grained image forgery detection method and system. An authenticity judgment module first makes a preliminary authenticity judgment on the input image. A multi-branch pixel-level forgery positioning network is then constructed: a feature extraction backbone network produces multi-scale feature representations from low resolution to high resolution; a pixel-level positioning network predicts a high-resolution forgery mask; and a decision network constructs a coarse-to-fine reasoning path according to a multi-level label system, sequentially completes forgery property judgment, forgery type judgment, method-level tracing and platform-level tracing, and outputs the conditional probability of the label at each level as the tracing result. The invention uses a unified multi-level label system, which improves the systematicness and accuracy of detection, and it presents the source and location of a forgery visually through mask positioning and hierarchical classification, which gives it strong interpretability.
Inventors
- GE SHIMING
- ZENG DAN
- YANG CHEN
- GU YIXIAO
- SHI RUIXIN
- JIANG HAORAN
- LV WEIYI
- HE JIAHUI
Assignees
- Lishui Institute of Hangzhou Dianzi University (杭州电子科技大学丽水研究院)
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-01-28
Claims (9)
- 1. A layered fine-grained image forgery detection method, characterized by comprising the following steps: step 1, constructing a multi-level label system to obtain a training data set; step 2, making a preliminary judgment on the authenticity of an input image through an authenticity judgment module; step 3, constructing a multi-branch pixel-level forgery positioning network, which comprises a feature extraction backbone network, a pixel-level positioning network and a decision network; step 4, obtaining multi-scale feature representations from low resolution to high resolution through the multi-resolution feature extraction mechanism of the feature extraction backbone network; step 5, predicting a high-resolution forgery mask through the pixel-level positioning network based on the obtained multi-scale features, the forgery mask being used for pixel-level positioning of forged regions; step 6, the decision network constructing a coarse-to-fine reasoning path according to the multi-level label system, sequentially completing forgery property judgment, forgery type judgment, method-level tracing and platform-level tracing based on the multi-scale feature representations, and outputting the conditional probability of the label at each level as the tracing result; and step 7, training the multi-branch pixel-level forgery positioning network based on the acquired training data set.
- 2. The layered fine-grained image forgery detection method according to claim 1, wherein step 1 specifically operates as follows: step 1.1, in the multi-level forgery label system, the top level first marks the authenticity of each training sample, dividing the training samples into real images and images containing forged content; step 1.2, in the second-level label, after the authenticity marking is completed, a forgery property label is constructed for each forged sample, dividing forgery properties into two major categories, tampering-type forgery and generation-type forgery; tampering-type forgery edits, enhances, repairs or replaces a local region of a real image, whereas generation-type forgery relies entirely on a model to generate a new image; step 1.3, in the third-level label, after the forgery property is defined, the forged samples are further subdivided along the technical characteristics of the forgery mode to construct a forgery type label; tampering-type samples are divided into two types, image enhancement and image editing; generation-type samples are divided, according to the technical route of the underlying generation framework, into GAN-based samples, diffusion-model-based samples, Transformer-based samples and hybrid samples that fuse several generation mechanisms; step 1.4, after the forgery type labeling is completed, a method-level label is further constructed for each forged sample to identify the specific generation or tampering method used; step 1.5, on the basis of the method-level label, a platform-level label with the finest granularity is constructed to mark the final source attribution of the forged content; through the above steps each training sample is assigned an authenticity label, a forgery type label, a method-level label and a platform-level label, forming a training data set that contains multi-level forgery semantic information.
- 3. The layered fine-grained image forgery detection method according to claim 1, wherein the authenticity judgment module comprises: 1) a multi-domain cue extraction unit, which normalizes the input image and unifies its scale to obtain a color-domain reference feature input as a color-domain cue, and which uses multi-scale Laplacian operators to enhance edge-texture breaks and frequency-domain anomalies of the input image at different scales, obtaining multi-scale high-frequency response feature maps as a frequency-domain cue; and 2) an authenticity preliminary screening unit, which assembles the color-domain cue, the frequency-domain cue and a noise-residual-domain cue into multi-channel input features of the same spatial scale, feeds them into a lightweight feature extraction network for feature learning, and outputs an authenticity confidence.
- 4. The layered fine-grained image forgery detection method according to claim 3, wherein the feature extraction backbone network comprises four resolution feature branches that respectively produce feature maps at four resolutions; each lower-resolution feature map is obtained by downsampling the feature map of the preceding resolution level; specifically, each resolution feature branch takes the feature map output by the preceding resolution feature branch as input and applies a convolution with a stride of 2 as the downsampling operator, reducing the spatial size of the feature map to 1/2 of the original to obtain the lower-resolution feature map.
- 5. The layered fine-grained image forgery detection method according to claim 4, wherein the pixel-level positioning network is arranged on the highest-resolution feature branch of the multi-branch pixel-level forgery positioning network; a forgery probability map spatially aligned with the input image is generated by performing convolution mapping and spatial modeling on the highest-resolution feature map, and the forgery probability map is then thresholded to obtain a high-resolution binary forgery mask.
- 6. The layered fine-grained image forgery detection method according to claim 5, wherein the pixel-level positioning network specifically operates as follows: the highest-resolution feature map is input into the pixel-level positioning network and spatially aligned through convolution mapping, and a pixel-level embedded feature map aligned with the input space is output; a real feature center vector is then obtained by averaging the pixel embeddings of all real samples in the pixel-level embedded feature map; for each pixel position in the pixel-level embedded feature map, the distance between its embedding vector and the center vector is calculated; by setting a distance scale parameter R, the normalized distance between each pixel embedding vector and the real feature center vector is calculated; the distance map composed of the distance values of all pixel positions is mapped to a pixel-level forgery probability map P through a sigmoid function with a scale coefficient, each element of P being the forgery probability value at the corresponding pixel position; and the probability map P is thresholded to obtain a binary forgery mask M indicating the pixel-level positions of forged regions, wherein a mask value marks the corresponding pixel as belonging to a forged region and a probability threshold parameter controls the decision strictness for forged regions.
- 7. The layered fine-grained image forgery detection method according to claim 6, wherein the decision network comprises a cross-scale fusion module, a global convergence module, a hierarchical classification output module and a gating constraint module; the cross-scale fusion module consists of three cross-scale fusion units, each arranged between features of adjacent resolution scales, which align the adjacent-scale features in spatial size and channel dimension and obtain fusion features by element-wise addition; the global convergence module performs global average pooling on the fusion features to obtain the global representation vector of the corresponding level; the hierarchical classification output module comprises a property-level, a type-level, a method-level and a platform-level classification output head, each of which uses a fully connected mapping structure to map the global representation vector of its level to that level's label space and output classification scores; the gating constraint module constructs gating vectors according to the label subordination mapping preset by the multi-level label system in order to obtain the conditional probability vector of each level; specifically, after the conditional probability vector of the property-level label is obtained, type-level tracing, method-level tracing and platform-level tracing each construct a gating vector from the conditional probability vector output by the preceding level and the corresponding label subordination mapping, apply zero-weight suppression and normalization to the classification scores of the current level's classification output head, and output the conditional probability vector of the current-level label.
- 8. The layered fine-grained image forgery detection method according to claim 7, wherein the decision network uses a coarse-to-fine path inference structure consistent with the multi-level label system: property-level tracing: the decision network takes the lowest-resolution feature map L1 as input and obtains a property-level global representation vector through the cross-scale fusion module and the global convergence module; the property-level classification output head outputs classification scores for tampering-type forgery and generation-type forgery, which are normalized to obtain the forgery property probability vector; type-level tracing: a cross-scale fusion unit first upsamples the L1 features to the spatial size of the medium-low-resolution feature map L2 and completes channel alignment through a 1×1 convolution, then adds the aligned L1 features element by element to the L2 features to obtain the type-level fusion feature F2; the global convergence module applies global average pooling to F2 to obtain a type-level global representation vector; the type-level classification output head applies a fully connected mapping to this vector and outputs classification scores for all candidate forgery types; the gating constraint module then constructs a type-level gating vector from the property-level output probability vector according to the forgery property-forgery type subordination mapping in the multi-level label system, applies zero-weight suppression to type classification scores that do not match the current forgery property, and normalizes the retained branches to obtain the type-level conditional probability vector; method-level tracing: a cross-scale fusion unit first upsamples the L2 features to the spatial size of the medium-high-resolution feature map L3 and completes channel alignment through a 1×1 convolution; the global convergence module performs global pooling on the fusion feature F3 and, under the constraint of the forgery mask M, pools only the pixel features predicted to be forged regions, obtaining a method-level global representation vector; the method-level classification output head outputs classification scores for the candidate forgery methods based on this vector; the gating constraint module constructs a method-level gating vector from the type-level conditional probability vector and the preset forgery type-forgery method subordination mapping, and applies zero-weight suppression to method classification scores inconsistent with the current forgery type to obtain the method-level conditional probability vector over the forgery method categories; platform-level tracing: a cross-scale fusion unit first upsamples the L3 features to the spatial size of the highest-resolution feature map L4 and completes channel alignment through a 1×1 convolution; the global convergence module performs global pooling on the fusion feature F4 and, under the constraint of the forgery mask M, pools only the pixel features predicted to be forged regions, obtaining a platform-level global representation vector; the platform-level classification output head applies a fully connected mapping to this vector and outputs classification scores for the candidate platforms; the gating constraint module applies zero-weight suppression to the classification scores of platforms that do not match the current forgery method, according to the method-level conditional probability vector and the forgery method-platform subordination mapping, and normalizes the remaining branches to obtain the platform-level conditional probability vector; when the forgery method category of the sample under test has no corresponding platform label set in the preset forgery method-platform subordination mapping, the gating constraint module sets the platform-level gating vector to activate only the unknown-platform label and applies zero-weight suppression and normalization to the classification scores of the other platform labels to output the platform-level conditional probability vector; the confidence of a tracing result is the product of the conditional probabilities at all levels along its path, and the result with the highest confidence is taken as the final classification result.
- 9. A layered fine-grained image forgery detection system, comprising the following modules: a data set construction module for constructing a multi-level label system and acquiring a training data set; an authenticity judgment module for making a preliminary authenticity judgment on an input image; a feature extraction module for obtaining multi-scale feature representations from low resolution to high resolution through the multi-resolution feature extraction mechanism of a feature extraction backbone network; a pixel-level positioning module for predicting a high-resolution forgery mask through a pixel-level positioning network based on the obtained multi-scale features, the forgery mask being used for pixel-level positioning of forged regions; a decision module for constructing, through a decision network, a coarse-to-fine reasoning path according to the multi-level label system, sequentially completing forgery property judgment, forgery type judgment, method-level tracing and platform-level tracing based on the multi-scale feature representations, and outputting the conditional probability of the label at each level as the tracing result; and a training module for training the multi-branch pixel-level forgery positioning network based on the training data set acquired by the data set construction module.
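As a reading aid for claim 6, the following Python sketch walks through the distance-to-mask computation on a tiny grid of pixel embeddings. The formulas of claim 6 are not reproduced in this text, so several choices are assumptions of the sketch rather than the claimed definitions: the Euclidean distance, the normalization as distance divided by R, the sigmoid argument `alpha * (d / R - 1)`, and the names `mean_vector`, `forgery_mask`, `alpha` and `tau` are all illustrative.

```python
import math


def mean_vector(vectors):
    """Average equal-length embedding vectors to get the real-feature center."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / n for k in range(dim)]


def forgery_mask(embeddings, center, R=1.0, alpha=4.0, tau=0.5):
    """Map an H x W grid of pixel embeddings to a probability map and a binary mask.

    Assumptions (not taken from the patent text): Euclidean distance,
    normalized distance d / R, and probability sigmoid(alpha * (d / R - 1)).
    """
    prob_map, mask = [], []
    for row in embeddings:
        prob_row, mask_row = [], []
        for e in row:
            d = math.dist(e, center)          # distance to the real-feature center
            d_norm = d / R                    # normalize by the distance scale R
            p = 1.0 / (1.0 + math.exp(-alpha * (d_norm - 1.0)))  # sigmoid
            prob_row.append(p)
            mask_row.append(1 if p > tau else 0)  # threshold with tau
        prob_map.append(prob_row)
        mask.append(mask_row)
    return prob_map, mask


# Hypothetical 2-D embeddings of pixels known to be real.
real_pixels = [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [-0.2, -0.2]]
center = mean_vector(real_pixels)

# Hypothetical 2 x 2 embedded feature map; one pixel lies far from the center.
grid = [
    [[0.1, 0.0], [3.0, 4.0]],
    [[0.0, -0.1], [0.05, 0.05]],
]
prob_map, mask = forgery_mask(grid, center, R=1.0, alpha=4.0, tau=0.5)
print(mask)  # [[0, 1], [0, 0]]
```

Under these assumptions a pixel whose embedding sits exactly at distance R from the center receives probability 0.5, so R acts as the boundary radius and `tau` shifts how strict the forged-region decision is.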
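The gating constraint of claims 7 and 8 can be illustrated with a short Python sketch: at each level, scores of labels that are not subordinate to the label chosen at the preceding level are suppressed to zero weight, the remaining scores are normalized, and the path confidence is the product of the per-level conditional probabilities. This is a simplified sketch, not the claimed implementation: it assumes softmax as the normalization, follows a single greedy path (the top label at each level), and uses hypothetical label names such as `method_d1` and `platform_x`; the functions `gated_softmax` and `trace` are illustrative.

```python
import math


def gated_softmax(scores, allowed):
    """Softmax over the `allowed` labels only; every other label gets probability 0."""
    kept = {k: v for k, v in scores.items() if k in allowed}
    top = max(kept.values())
    exps = {k: math.exp(v - top) for k, v in kept.items()}
    total = sum(exps.values())
    return {k: (exps[k] / total if k in exps else 0.0) for k in scores}


def trace(level_scores, children, unknown="unknown_platform"):
    """Greedy coarse-to-fine tracing over property, type, method and platform levels.

    level_scores: list of dicts mapping label -> raw classification score.
    children: dict mapping a label to its subordinate labels at the next level.
    """
    path, confidence = [], 1.0
    allowed = set(level_scores[0])  # the property level is not gated
    for depth, scores in enumerate(level_scores):
        probs = gated_softmax(scores, allowed)
        best = max(probs, key=probs.get)
        path.append(best)
        confidence *= probs[best]  # product of conditional probabilities
        allowed = set(children.get(best, []))
        # Method with no known platform set: activate only the unknown-platform label.
        if not allowed and depth == len(level_scores) - 2:
            allowed = {unknown}
    return path, confidence


# Hypothetical subordination mapping; method and platform names are placeholders.
children = {
    "tampering": ["image_editing", "image_enhancement"],
    "generation": ["gan", "diffusion"],
    "image_editing": ["method_e1"],
    "image_enhancement": ["method_h1"],
    "gan": ["method_g1"],
    "diffusion": ["method_d1", "method_d2"],
    "method_d1": ["platform_x", "platform_y"],
}
level_scores = [
    {"tampering": 0.0, "generation": 2.0},
    {"image_editing": 3.0, "image_enhancement": 0.0, "gan": 0.0, "diffusion": 1.0},
    {"method_e1": 5.0, "method_h1": 0.0, "method_g1": 0.0,
     "method_d1": 0.0, "method_d2": 1.0},
    {"platform_x": 2.0, "platform_y": 0.0, "unknown_platform": -1.0},
]
path, confidence = trace(level_scores, children)
print(path)  # ['generation', 'diffusion', 'method_d2', 'unknown_platform']
```

In this example `image_editing` and `method_e1` have the highest raw scores at their levels but are suppressed because they are not subordinate to the labels chosen above them, and `method_d2` has no platform set in the mapping, so only the unknown-platform label is activated.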
Description
Layered fine granularity image counterfeiting detection method and system
Technical Field
The invention belongs to the technical field of artificial intelligence and computer vision, and in particular relates to an image forgery detection and positioning method and system based on hierarchical fine-grained feature representation, applicable to scenarios such as digital media forensics, deepfake identification and generative model tracing.
Background
In recent years, the rapid development of generative artificial intelligence has enabled generative adversarial networks, diffusion models, convolutional-neural-network image editing tools and the like to generate or tamper with image content with high fidelity. Forged images are highly realistic in visual realism, semantic consistency, cross-modal matching and other respects, which poses great challenges for image forensics and forgery detection.
Research on image forgery detection concentrates on two directions. The first is global detection methods, which detect forgery through whole-image statistical features, frequency-domain inconsistencies or model residuals, but struggle with small local forgeries. The second is local localization methods, which rely on mask-based or pixel-level contrastive learning to segment forged regions, but lack a hierarchical characterization of forgery attributes. In addition, some studies show that different generative models leave distinctive source characteristics in frequency-domain distribution, gradient response and noise residuals, which makes forgery tracing possible; existing schemes, however, often handle detection and attribution as separate tasks and lack a unified hierarchical structure.
Therefore, a new detection framework is needed that combines hierarchical label semantics, fine-grained forgery attribute classification and pixel-level positioning, one that can not only achieve forgery detection and attribution but also provide interpretability and robustness across different forgery sources.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a layered fine-grained image forgery detection method and system which, by constructing a multi-level forgery label system, a multi-branch pixel-level forgery positioning network and a forgery mask, jointly performs forgery type identification, source model attribution and local forged-region positioning, thereby improving the precision, generalization and interpretability of forgery detection.
In a first aspect, an embodiment of the present application provides a layered fine-grained image forgery detection method, including the following steps:
Step 1: constructing a multi-level label system to obtain a training data set.
Step 2: making a preliminary judgment on the authenticity of the input image through an authenticity judgment module.
Step 3: constructing a multi-branch pixel-level forgery positioning network, which comprises a feature extraction backbone network, a pixel-level positioning network and a decision network.
Step 4: obtaining multi-scale feature representations from low resolution to high resolution through the multi-resolution feature extraction mechanism of the feature extraction backbone network.
Step 5: predicting a high-resolution forgery mask through the pixel-level positioning network based on the obtained multi-scale features, the forgery mask being used for pixel-level positioning of forged regions.
Step 6: the decision network constructs a coarse-to-fine reasoning path according to the multi-level label system, sequentially completes forgery property judgment, forgery type judgment, method-level tracing and platform-level tracing based on the multi-scale feature representations, and outputs the conditional probability of the label at each level as the tracing result. The tracing result at each level after the forgery property judgment is constrained by the tracing result of the preceding level as its condition, forming a level-by-level reasoning process that is hierarchically consistent and path-constrained.
Step 7: training the multi-branch pixel-level forgery positioning network based on the acquired training data set.
In one possible embodiment, step 1 specifically operates as follows:
Step 1.1: in the multi-level forgery label system, the top level first marks the authenticity of each training sample, dividing the training samples into real images and images containing forged content.
Step 1.2: in the second-level label, after the authenticity marking is