CN-122023100-A - Digital watermarking method and device for large-capacity encodable information based on diffusion model

Abstract

The invention discloses a digital watermarking method and device for large-capacity encodable information based on a diffusion model, and relates to the technical field of image processing. The method comprises: based on a preset standard Gaussian noise vector, performing watermark embedding according to an original payload by using a robust watermark embedding module, to obtain a watermarked latent variable; performing parameter optimization training on a distribution transformation module based on a distribution loss function, according to a preset normal prior sampling vector and the watermarked latent variable, to obtain an optimized distribution transformation module; acquiring target metadata information and binarizing it to obtain a target payload; and generating a watermarked image from the target payload based on the robust watermark embedding module, the optimized distribution transformation module and a diffusion model, to obtain a target watermarked image. The invention provides an efficient and stable digital watermarking method for large-capacity encodable information based on a diffusion model.
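The following is a minimal sketch of the first two operations named in the abstract: binarizing metadata into a payload, and embedding that payload into a standard Gaussian noise vector. It assumes the sign bit embedding mentioned in claim 4, and it omits the boundary transition step and the error-correction coding, which are not specified in this section. The function names, the JSON serialization and the vector size are hypothetical choices made for illustration, not the patented implementation.

```python
import json
import random


def metadata_to_bits(metadata: dict) -> list[int]:
    """Serialize metadata as compact UTF-8 JSON and unpack it into bits, MSB first."""
    raw = json.dumps(metadata, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return [(byte >> shift) & 1 for byte in raw for shift in range(7, -1, -1)]


def bits_to_metadata(bits: list[int]) -> dict:
    """Inverse of metadata_to_bits: pack bits into bytes and parse the JSON."""
    if len(bits) % 8 != 0:
        raise ValueError("bit count must be a multiple of 8")
    raw = bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[pos:pos + 8]))
        for pos in range(0, len(bits), 8)
    )
    return json.loads(raw.decode("utf-8"))


def embed_sign_bits(bits: list[int], noise: list[float]) -> list[float]:
    """Keep each magnitude, force the sign: bit 1 -> positive, bit 0 -> negative.

    Assumed convention for illustration only. Positions beyond len(bits)
    are left untouched.
    """
    if len(bits) > len(noise):
        raise ValueError("payload is longer than the noise vector")
    marked = list(noise)
    for i, bit in enumerate(bits):
        magnitude = abs(noise[i])
        marked[i] = magnitude if bit else -magnitude
    return marked


def extract_sign_bits(latent: list[float], n_bits: int) -> list[int]:
    """Read the payload back from the signs of the first n_bits entries."""
    return [1 if value > 0 else 0 for value in latent[:n_bits]]


if __name__ == "__main__":
    rng = random.Random(0)
    metadata = {"model": "demo-ldm", "user": 42}      # hypothetical metadata
    payload = metadata_to_bits(metadata)
    noise = [rng.gauss(0.0, 1.0) for _ in range(4 * 64 * 64)]
    latent = embed_sign_bits(payload, noise)
    print(bits_to_metadata(extract_sign_bits(latent, len(payload))))
```

In this sketch only the signs carry information, so the magnitudes of the noise vector are preserved. In the method described above, the recovered latent would come from diffusion inversion and could contain sign flips, which is why the claims add error-correction coding before embedding.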

Inventors

LI YU
Chang Xuanhang
YANG ZHONGHAO
ZHUO CHENG

Assignees

Zhejiang University (浙江大学)

Dates

Publication Date: 2026-05-12
Application Date: 2026-04-10

Claims (10)

1. A digital watermarking method for large-capacity encodable information based on a diffusion model, the method comprising: binary-encoding metadata information to obtain an original payload; based on a preset standard Gaussian noise vector, performing watermark embedding according to the original payload by using a robust watermark embedding module, to obtain a watermarked latent variable; based on a distribution loss function, performing parameter optimization training on a distribution transformation module according to a preset normal prior sampling vector and the watermarked latent variable, to obtain an optimized distribution transformation module; and acquiring target metadata information, binarizing the target metadata information to obtain a target payload, and generating a watermarked image from the target payload based on the robust watermark embedding module, the optimized distribution transformation module and a diffusion model, to obtain a target watermarked image.
2. The digital watermarking method for large-capacity encodable information based on a diffusion model according to claim 1, wherein performing watermark embedding according to the original payload by using the robust watermark embedding module, based on the preset standard Gaussian noise vector, to obtain the watermarked latent variable comprises: based on a preset bit error rate estimate and a preset target reliability value, performing parameter optimization according to the original payload by using an error-correcting code (ECC) automatic parameter selection algorithm, to obtain optimal hyperparameters; based on the optimal hyperparameters, block-encoding the original payload with an ECC encoder to obtain a check symbol sequence; concatenating the original payload and the check symbol sequence to obtain an enhanced key; and based on the preset standard Gaussian noise vector, performing watermark embedding according to the enhanced key to obtain the watermarked latent variable.
3. The digital watermarking method for large-capacity encodable information based on a diffusion model according to claim 2, wherein performing parameter optimization according to the original payload by using the ECC automatic parameter selection algorithm, based on the preset bit error rate estimate and the preset target reliability value, to obtain the optimal hyperparameters comprises: partitioning the original payload with block lengths incremented one at a time, to obtain a set of payload block sequences and a corresponding set of total block counts; based on an RS candidate code length acquisition function, querying according to the number of data symbols of each block in the set of payload block sequences, to obtain a candidate code length set; obtaining a block length set according to the block length of each block in the set of payload block sequences; based on the preset bit error rate estimate, calculating a total message failure probability set according to a codeword configuration pair set and the set of total block counts; based on the preset target reliability value, performing reliability verification on the total message failure probability set to obtain a set of probabilities that pass verification; selecting, according to the set of probabilities that pass verification, the corresponding codeword configuration pairs from the codeword configuration pair set to construct a preliminary optimized pair set; and performing redundancy cost calculation on the preliminary optimized pair set, selecting the pair with the lowest redundancy cost as the optimal pair, and determining the optimal hyperparameters according to the optimal pair.
4. The digital watermarking method for large-capacity encodable information based on a diffusion model according to claim 2, wherein performing watermark embedding according to the enhanced key, based on the preset standard Gaussian noise vector, to obtain the watermarked latent variable comprises: performing sign bit embedding on the Gaussian noise vector according to the enhanced key, to obtain a sign-modified Gaussian noise vector; and performing boundary transition on the sign-modified Gaussian noise vector based on a preset boundary parameter, to obtain the watermarked latent variable.
5. The digital watermarking method for large-capacity encodable information based on a diffusion model according to claim 1, wherein the distribution transformation module is constructed on the network structure of an invertible neural network; the distribution transformation module is configured to take the latent vector that deviates from the standard Gaussian distribution after watermark embedding and apply a bijective mapping through the invertible neural network, so that the latent vector again follows the standard Gaussian distribution, and to support lossless recovery of the watermarked latent vector in the inverse process.
6. The digital watermarking method for large-capacity encodable information based on a diffusion model according to claim 1, wherein performing parameter optimization training on the distribution transformation module based on the distribution loss function, according to the preset normal prior sampling vector and the watermarked latent variable, to obtain the optimized distribution transformation module comprises: performing invertible distribution mapping on the watermarked latent variable by using the distribution transformation module, to obtain a balanced latent variable; based on the distribution loss function, calculating a distribution loss according to the preset normal prior sampling vector and the balanced latent variable; and optimizing the parameters of the distribution transformation module according to the distribution loss, to obtain the optimized distribution transformation module.
7. The digital watermarking method for large-capacity encodable information based on a diffusion model according to claim 6, wherein the distribution loss function is a weighted combination of a maximum likelihood estimation term and a KL divergence regularization term.
8. A digital watermarking apparatus for large-capacity encodable information based on a diffusion model, for implementing the digital watermarking method for large-capacity encodable information based on a diffusion model as defined in any one of claims 1 to 7, characterized in that the apparatus comprises: an original payload acquisition module, configured to acquire metadata information for watermark embedding and binary-encode the metadata information to obtain an original payload; a robust watermark embedding module, configured to perform watermark embedding according to the original payload, based on a preset standard Gaussian noise vector, to obtain a watermarked latent variable; a distribution transformation learning module, configured to perform parameter optimization training on a distribution transformation module based on a distribution loss function, according to a preset normal prior sampling vector and the watermarked latent variable, to obtain an optimized distribution transformation module; and a target watermark generation module, configured to acquire target metadata information, binarize the target metadata information to obtain a target payload, and generate a watermarked image from the target payload based on the robust watermark embedding module, the optimized distribution transformation module and a diffusion model, to obtain a target watermarked image.
9. A digital watermarking apparatus for large-capacity encodable information, characterized in that the apparatus comprises: a processor; and a memory having stored thereon computer-readable instructions which, when executed by the processor, implement the method of any one of claims 1 to 7.
10. A computer readable storage medium having stored therein program code which is callable by a processor to perform the method of any one of claims 1 to 7.
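The ECC automatic parameter selection of claim 3 can be sketched as a small search. This is an illustrative reading, not the patented algorithm: it assumes that "RS" means Reed-Solomon codes over GF(2^8) with 8-bit symbols, that bit errors are independent, that an RS(n, k) block corrects up to (n - k) // 2 symbol errors, that every block carries at least two parity symbols, and that the redundancy cost is the total number of parity symbols. None of these assumptions is stated in the claims, and all names below are hypothetical.

```python
import math
from typing import NamedTuple, Optional

SYMBOL_BITS = 8      # assumption: Reed-Solomon over GF(2^8), 8-bit symbols
MAX_CODE_LEN = 255   # longest codeword for 8-bit symbols (2^8 - 1)


class RSChoice(NamedTuple):
    data_symbols: int    # k, data symbols per block
    code_length: int     # n, total symbols per block
    blocks: int          # number of blocks needed to carry the payload
    failure_prob: float  # estimated probability that at least one block fails
    parity_symbols: int  # assumed redundancy cost: parity symbols over all blocks


def block_failure_prob(n: int, k: int, bit_error_rate: float) -> float:
    """Probability that a block has more symbol errors than RS(n, k) can correct."""
    # A symbol is wrong if any of its bits is wrong (independent bit errors).
    p_sym = 1.0 - (1.0 - bit_error_rate) ** SYMBOL_BITS
    t = (n - k) // 2  # number of correctable symbol errors
    p_ok = sum(
        math.comb(n, i) * p_sym ** i * (1.0 - p_sym) ** (n - i)
        for i in range(t + 1)
    )
    return min(1.0, max(0.0, 1.0 - p_ok))


def select_rs_parameters(
    payload_bits: int, bit_error_rate: float, target_reliability: float
) -> Optional[RSChoice]:
    """Return the (k, n) pair meeting the reliability target at the lowest cost."""
    if payload_bits <= 0:
        raise ValueError("payload_bits must be positive")
    payload_symbols = math.ceil(payload_bits / SYMBOL_BITS)
    max_failure = 1.0 - target_reliability
    best: Optional[RSChoice] = None
    # Try block lengths k = 1, 2, ... (data symbols per block).
    for k in range(1, min(payload_symbols, MAX_CODE_LEN - 2) + 1):
        blocks = math.ceil(payload_symbols / k)
        # Candidate code lengths for this k, in order of increasing parity.
        for n in range(k + 2, MAX_CODE_LEN + 1, 2):
            p_block = block_failure_prob(n, k, bit_error_rate)
            p_total = 1.0 - (1.0 - p_block) ** blocks  # any block failing loses the message
            if p_total > max_failure:
                continue  # fails the reliability check
            cost = blocks * (n - k)
            if best is None or cost < best.parity_symbols:
                best = RSChoice(k, n, blocks, p_total, cost)
            break  # first passing n has the least parity for this k
    return best


if __name__ == "__main__":
    print(select_rs_parameters(payload_bits=256, bit_error_rate=0.01,
                               target_reliability=0.999))
```

The search returns `None` when no candidate pair meets the target. A real implementation would also need to decide how to pad the last block and which bit error rate estimate to use for a given sampler and attack model; both are left open here.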

Description

Digital watermarking method and device for large-capacity encodable information based on diffusion model

Technical Field

The invention relates to the technical field of image processing, in particular to a digital watermarking method and device for large-capacity encodable information based on a diffusion model.

Background

Diffusion models have become the dominant paradigm in generative artificial intelligence, particularly in image generation: they generate new samples from initial noise by iteratively applying transformations that reverse the diffusion process. The latent diffusion model (Latent Diffusion Model, LDM) extends this framework and significantly reduces computational cost by running diffusion in a low-dimensional latent space. Sampling from a trained LDM may follow standard Denoising Diffusion Probabilistic Models (DDPM) sampling or use faster variants such as Denoising Diffusion Implicit Models (DDIM) sampling, which accelerates generation by removing redundant steps while maintaining quality. Other accelerated samplers, such as the Diffusion Probabilistic Model Solver (DPM-Solver) and Pseudo Numerical Methods for Diffusion Models (PNDM), are equally applicable to latent diffusion models. As these models are now used to generate massive amounts of content, tracking and verifying their output has become increasingly important to prevent misuse. Digital watermarking offers a practical approach by embedding verifiable information in the generated content.
Standards such as those of the Coalition for Content Provenance and Authenticity (C2PA) emphasize the importance of LDM watermarking schemes that are robust, imperceptible, and able to carry enough metadata to support real-world provenance tracing, all without degrading image quality, i.e. without introducing visible differences. Early digital watermarking directly modified the original image by adding a visible mark identifying the copyright holder. This approach is simple and intuitive, but its protective value is questionable, because the watermark can be altered or removed by simple operations such as cropping and retouching. Invisible digital watermarking was therefore proposed, which adds an imperceptible watermark to the image. It falls into two main types: rule-based techniques, which apply rule-based transformations in the frequency domain or Fourier space, and neural-network-based techniques, which use deep learning to train modules that embed and extract the invisible watermark. Both are post-processing watermarking schemes, i.e. the watermark is applied after the image has been generated. This improves the concealment of the watermark to some extent, but still inevitably perturbs image quality. To overcome the limitations of post-processing watermarking, recent research has turned to diffusion-native methods, which embed the watermark directly into the generation process of the LDM.
Embedding the watermark during image generation makes it harder to detect and more resistant to later tampering. One line of work modifies the model itself, e.g. by fine-tuning the variational autoencoder (Variational Autoencoder, VAE), so that generated images naturally carry the watermark. However, retraining the LDM may cause unexpected changes in model behavior and is costly, especially because retraining is required for each different watermark payload size. These limitations have prompted the development of latent-variable-based watermarking techniques, which embed the watermark directly into the latent variable and recover it by diffusion inversion while leaving the underlying model unchanged. Latent-variable-based techniques generally fall into two categories. Encoder-decoder architectures attempt to learn an explicit mapping between the latent representation and the embedded watermark, training a decoder to extract the signal after DDIM inversion reconstructs the latent trajectory. Structured perturbation approaches embed the watermark through carefully chosen modifications to the initial latent noise. The Tree-Ring Watermark (Tree-Ring) stores individual bits by perturbing the Fourier representation of the initial noise, which is then spread by the st