KR-20260065597-A - System for Protecting Weight Assets and Verifying Ownership of Mixture of Experts (MoE) Artificial Intelligence Models
Abstract
The present invention relates to a system that inserts a statistical bias corresponding to unique identification information into the expert-module weights of an AI model with a Mixture of Experts (MoE) structure, verifies ownership even in a black-box environment by analyzing output probability distributions and token patterns for trigger input data, and traces weight-leakage paths through per-user fingerprinting. The system includes a watermark insertion unit that embeds the watermark without loss of model performance through optimization of a KL-divergence loss function; a user expert module distributed with the watermarked weights embedded; a response analysis unit that defends against reverse-engineering attacks using dynamic trigger data; an ownership verification server that integrates black-box and white-box verification; and a fingerprinting management system equipped with a blockchain-based leakage-tracking database.
Inventors
- 안범주
Assignees
- 안범주
Dates
- Publication Date
- 20260508
- Application Date
- 20260421
Claims (1)
- A system for protecting the knowledge assets of a specific expert module in an artificial intelligence model with a Mixture of Experts (MoE) structure including multiple expert modules, the system comprising: a watermark insertion unit that adds a statistical bias corresponding to unique identification information to the weight parameters of the specific expert module, within a fine range that does not affect its inference performance; a user expert module comprising weights with the above-mentioned watermark inserted, distributed to an external user system to perform inference in a specific domain; a response analysis unit that injects trigger input data, a predefined specific data set, into the above-mentioned user expert module and collects the probability distribution or specific token patterns of the output data derived from the expert module; and an ownership verification server that determines the ownership and license validity of the weight assets within the user expert module by comparing the above-mentioned collected patterns with previously stored reference patterns; whereby an MoE expert weight-asset protection and anti-tampering system is provided.
Description
System for Protecting Weight Assets and Verifying Ownership of Mixture of Experts (MoE) Artificial Intelligence Models

The present invention relates to a technology for securing the knowledge assets of an artificial intelligence model, and more specifically, to a system that inserts a statistical bias corresponding to unique identification information into the weight parameters of a specific expert module in an artificial intelligence model with a Mixture of Experts (MoE) structure composed of a plurality of expert modules, verifies weight ownership even in a black-box environment through analysis of output probability distributions and token patterns for predefined trigger input data, and traces the weight-leakage path through fingerprinting for each individual user.

With the rapid advancement of artificial intelligence technology and the commercialization of Large Language Models (LLMs), the weight parameters of deep neural networks are establishing themselves as core knowledge assets for enterprises. Large models, possessing billions to trillions of parameters, can be obtained only by investing massive computing resources and high-quality training data over several months, and the cost of producing them ranges from billions to tens of billions of won. A trained model containing these weight parameters is, in itself, an asset of immense economic value.

In particular, the Mixture of Experts (MoE) architecture divides the entire model into multiple specialized expert modules and adopts a method in which a gating network routes computation to the appropriate expert module based on the characteristics of the input data.
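By way of illustration only, a statistical-bias watermark of the kind described above can be sketched as follows. This is a minimal toy implementation, not the patented method: the function names, the perturbation magnitude `eps`, and the detection `threshold` are assumptions chosen for the example. A sign pattern is derived deterministically from an identification string and added to the weights at a magnitude far below the weight scale; detection correlates the weights with the regenerated pattern.

```python
# Illustrative sketch (hypothetical, not the patented algorithm):
# embed a statistical bias keyed to an identification string into an
# expert's weight matrix, then detect it via correlation.
import hashlib
import numpy as np

def id_pattern(owner_id: str, shape):
    # Derive a reproducible +/-1 pattern from the identification string.
    seed = int(hashlib.sha256(owner_id.encode()).hexdigest()[:8], 16)
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=shape)

def embed_watermark(weights, owner_id, eps=1e-3):
    # Shift each weight by a tiny signed amount; eps is small relative
    # to the weight scale, so inference behavior is barely affected.
    return weights + eps * id_pattern(owner_id, weights.shape)

def detect_watermark(weights, owner_id, threshold=5e-4):
    # The embedded bias survives as a positive mean projection onto the
    # regenerated pattern; unrelated weights correlate near zero.
    pattern = id_pattern(owner_id, weights.shape)
    return float(np.mean(weights * pattern)) > threshold

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(256, 256))     # toy expert weight matrix
w_marked = embed_watermark(w, "licensee-0042")
```

Because the pattern is keyed to the identification string, issuing a distinct string to each licensee also yields a simple per-user fingerprint: a leaked copy correlates only with the pattern of the licensee it was distributed to.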
This structure is attracting significant commercial attention because it offers high computational efficiency, allowing an explosive increase in the total number of parameters while limiting the actually active parameters to only a fraction, and because it enables business models that selectively license only domain-specific expert modules to external users. Consequently, an AI knowledge-asset trading market is emerging in which specialized expert modules covering professional domains such as law, medicine, finance, and engineering are distributed to external companies or provided in the form of APIs.

However, weight parameters possess characteristics fundamentally different from those of software source code. While source code is written in a human-readable language, so that attribution can be inferred from authorship information, comments, and coding style, weight parameters are stored as high-dimensional floating-point matrices, making it essentially impossible for the human eye to identify their contents or determine the attribution relationship between two sets of weights. Consequently, there is a fundamental problem: even if the weight files of an expert module are copied or leaked without authorization, it is difficult to prove, at the level of legal evidence, that the module originated from one's own assets using only existing copyright-protection systems or software-identification technologies.

Conventional digital watermarking technology has primarily been applied to media content such as images, audio, and video; however, there are specific technical barriers to applying it directly to high-dimensional numerical tensors such as neural network weights.
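The routing behavior described above, in which only a fraction of parameters is active per input, can be sketched minimally as follows. This is a generic top-k gating illustration under assumed shapes and names (`gate_w`, `experts`, `k`), not the architecture of any particular model.

```python
# Generic top-k MoE routing sketch: a gating network scores experts per
# input and only the k best-scoring experts are actually evaluated.
import numpy as np

def top_k_gate(x, gate_w, k=2):
    # Score every expert for this input, keep only the k highest.
    logits = gate_w @ x                      # (n_experts,)
    top = np.argsort(logits)[-k:]
    p = np.exp(logits[top] - logits[top].max())
    return top, p / p.sum()                  # softmax over selected experts

def moe_forward(x, gate_w, experts, k=2):
    # Only the routed experts execute, so active parameters stay a small
    # fraction of the total parameter count.
    idx, probs = top_k_gate(x, gate_w, k)
    return sum(p * experts[i](x) for i, p in zip(idx, probs))

rng = np.random.default_rng(1)
d, n_experts = 8, 4
gate_w = rng.normal(size=(n_experts, d))
# Toy linear "experts"; in practice each is a full feed-forward block.
experts = [(lambda W: (lambda v: W @ v))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
y = moe_forward(rng.normal(size=d), gate_w, experts)
```

In the licensing scenario described in the text, each element of `experts` corresponds to a separately distributable domain module, which is why per-module weight protection matters.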
First, adding a watermark pattern to the weights can distort the original weight distribution and degrade the model's inference accuracy, which is a critical problem for the quality assurance of commercially distributed expert modules. Second, if users apply additional fine-tuning or transfer learning to a distributed expert module, or use model-lightweighting techniques such as quantization or pruning to improve inference speed, watermarks inserted by conventional methods can be damaged or lost, rendering them ineffective as proof of ownership. Third, in a black-box environment where access to the weight files themselves is restricted and only the model's inputs and outputs can be observed, conventional methods provide no means of detecting a watermark. Fourth, when the same original expert module has been distributed to multiple users, conventional technology makes it difficult to implement a fingerprinting function that traces which user's copy was the source of a leak.

Because of these issues, there is growing demand for an integrated, robust technical solution specialized for protecting AI model weight assets and verifying their ownership. In particular, there is an urgent need for a system that integrates expert-module-level weight protection in MoE structures, trigger-response-based ownership verification in black-box environments, resistance to model-transformation operations, and fingerprinting for individual user identification.
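The black-box verification idea, querying a deployed module with predefined triggers and comparing its output distributions to stored reference patterns, can be sketched as follows. This is a hedged illustration: the symmetric-KL distance, the tolerance `tol`, and the toy trigger/reference data are assumptions for the example, not the thresholds or statistics of the claimed system.

```python
# Sketch of black-box trigger-response verification: compare the
# deployed module's output probability distributions on predefined
# trigger inputs against stored reference patterns.
import numpy as np

def sym_kl(p, q, eps=1e-12):
    # Symmetric KL divergence between two probability distributions.
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def verify_ownership(query_model, triggers, references, tol=0.05):
    # query_model maps a trigger input to an output distribution; only
    # input/output access is needed, so this works in a black-box setting.
    dists = [sym_kl(query_model(t), r) for t, r in zip(triggers, references)]
    return float(np.mean(dists)) < tol

# Toy data: a "genuine" module reproduces the references with small
# noise, while an unrelated module produces different distributions.
triggers = [f"trigger-{i}" for i in range(4)]
references = [np.array([0.7, 0.2, 0.1]), np.array([0.1, 0.8, 0.1]),
              np.array([0.3, 0.3, 0.4]), np.array([0.05, 0.05, 0.9])]
genuine = dict(zip(triggers, references))

def genuine_model(t):
    p = genuine[t] + 0.001                  # slight deployment noise
    return p / p.sum()

def unrelated_model(t):
    return np.array([1 / 3, 1 / 3, 1 / 3])  # no embedded trigger behavior
```

Averaging over several triggers, rather than deciding on a single query, is what gives the decision statistical weight; a real verifier would also need the robustness to fine-tuning and quantization discussed above.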