CN-121984757-A - AI content security interception method and system based on plug-in forced access
Abstract
The invention relates to the technical field of artificial intelligence security, and discloses an AI content security interception method and system based on a plug-in forced data path, aiming to solve the problems that existing AI model security protection is fragmented, non-mandatory and easy to bypass. The key innovation is to deploy a plug-in forced data path, independent of the model body, at the AI model service entry, forcing all input data to flow through the path for feature extraction and security decision; the innovation comprises a plug-in architecture, a forced path and a permission isolation mechanism. The invention realizes unified security interception for multi-source heterogeneous AI models, forms a defense system that the model can perceive but not intervene in, provides an infrastructure-level solution for AI security governance, and can serve as a minimum security standard for the artificial intelligence industry.
Inventors
- Tu Yuchen
Assignees
- Tu Yuchen (涂宇辰)
Dates
- Publication Date
- 20260505
- Application Date
- 20260204
Claims (10)
- 1. An AI content security interception method, comprising the steps of: S1, deploying a plug-in mandatory data path at the service request entry of an AI model, wherein the data path runs independently of the AI model body; S2, intercepting all service request data initiated to the AI model and forcing the service request data to undergo feature extraction through the data path; S3, performing a security decision on the extracted features based on a preset security rule base; S4, if a security risk is judged to exist, intercepting or correcting the service request data, generating a security instruction and transmitting it to the AI model; and if no risk is judged to exist, forwarding the service request data to the AI model for inference.
- 2. An AI content security interception system, comprising: a plug-in access module, deployed at the service entry of the AI model to establish a forced data flow path; a feature extraction module, for performing multidimensional feature analysis on the data flowing through; a security decision module, for performing risk judgment by matching the features against a preset rule base; and an instruction execution module, for executing interception, correction or forwarding operations according to the judgment result.
- 3. The method according to claim 1, wherein in step S1 the plug-in mandatory data path is implemented in the form of an API gateway plug-in, a reverse proxy service or a Sidecar container.
- 4. The method of claim 1, wherein the AI model is capable of perceiving the existence of the mandatory data path but is not authorized to intervene in, modify, close or scan the operating logic, feature extraction behavior or decision execution process of the data path; wherein control authority over the data path belongs to an external management entity independent of the AI model; and wherein the operating environment of the AI model is limited to a low-privilege sandbox from which read-write, debugging or network probing operations against the path module cannot be initiated.
- 5. The method of claim 1, wherein the preset security rule base comprises a sensitive word base based on regular expression matching, a semantic risk base based on vector similarity calculation, and a logic risk base based on a knowledge graph.
- 6. The method of claim 1, wherein the security rule base supports hot updates that enable dynamic loading of new security policies without restarting AI model services.
- 7. The method of claim 1, wherein the data path records all passing request data, feature extraction results, and decision logs, and generates an auditable report.
- 8. The method of claim 1, wherein request data determined to be risk-free is passed through transparently using zero-copy or memory-mapping techniques to reduce latency.
- 9. The method of claim 1, wherein the plug-in mandatory data path supports configuration of multiple adapters capable of adapting to the AI model interface standards of different vendors and different protocols.
- 10. The method of claim 1, wherein when the data path fails, the system enters a preset "security mode", by default blocking all new service requests, or switching to a degraded black-and-white-list filtering mode, preventing the security capability from failing.
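The S2–S4 flow of claim 1 can be sketched as a small decision loop. This is a minimal illustrative sketch, assuming a regex-based sensitive-word base as in claim 5; all function names, rule patterns, and return strings are hypothetical and are not taken from the patent:

```python
import re

# Minimal sketch of the claim-1 flow (S2-S4). The rule format and all
# names here are illustrative assumptions, not the patent's own.
SENSITIVE_PATTERNS = [re.compile(p, re.IGNORECASE)
                      for p in (r"\bbomb recipe\b", r"\bcredit card dump\b")]

def extract_features(request_text):
    """S2: feature extraction -- here simply the matched rule patterns."""
    return [p.pattern for p in SENSITIVE_PATTERNS if p.search(request_text)]

def decide(features):
    """S3: security decision against the preset rule base."""
    return "intercept" if features else "forward"

def forced_path(request_text, model_infer):
    """S4: intercept risky requests; forward safe ones for inference."""
    if decide(extract_features(request_text)) == "intercept":
        return "[request intercepted by security path]"
    return model_infer(request_text)
```

A real implementation would extend `decide` with the semantic (vector-similarity) and knowledge-graph bases of claim 5; the regex base alone is the simplest tier.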
Description
AI content security interception method and system based on plug-in forced access
Technical Field
The invention relates to the technical field of artificial intelligence security, in particular to a method and system for realizing forced content interception through a plug-in architecture independent of the AI model body, which is particularly suitable for general security protection in multi-source heterogeneous large model environments.
Background
With the rapid development of artificial intelligence technology, large models have become a core driving force for industry upgrading. Currently, a large number of AI models are developed and deployed independently by different "model factories" worldwide, and their security policies are highly fragmented, proprietary and non-standardized. Existing security controls mostly depend on implicit alignment or post-training fine-tuning inside the model, and lack transparency, auditability and cross-platform consistency. Particularly in highly sensitive scenarios involving national information security, public order and the like, the existing decentralized protection system has revealed serious defects. A more prominent problem is that there is no unified, cross-model, cross-industry mandatory security interception standard. The market urgently needs a universal security infrastructure that does not depend on a specific model, does not constrain the deployment architecture, can operate independently and has mandatory enforcement, to serve as an industry-level baseline protection standard for artificial intelligence applications. Against this background, the invention aims to provide a plug-in forced data path, construct a security fence mechanism that is independent of the AI model body and spans all data flow paths, and provide consistent, transparent and manageable basic security capability for the whole industry.
Disclosure of Invention
Object of the Invention
The invention aims to solve the problems that AI model security protection in the prior art is fragmented, non-mandatory and easy to bypass, and to provide a general interception scheme serving as a minimum security standard for the industry.
Technical Proposal
To achieve the above purpose, the invention adopts the following technical scheme: S1, deploying a plug-in mandatory data path at the service request entry of an AI model, wherein the data path runs independently of the AI model body; S2, intercepting all service request data initiated to the AI model and forcing the service request data to undergo feature extraction through the data path; S3, performing a security decision on the extracted features based on a preset security rule base; S4, if a security risk is judged to exist, intercepting or correcting the service request data, generating a security instruction and transmitting it to the AI model; and if no risk is judged to exist, forwarding the service request data to the AI model for inference. The system comprises: a plug-in access module, deployed at the service entry of the AI model to establish a forced data flow path; a feature extraction module, for performing multidimensional feature analysis on the data flowing through; a security decision module, for performing risk judgment by matching the features against a preset rule base; and an instruction execution module, for executing interception, correction or forwarding operations according to the judgment result. Plug-in architecture: deployed in the form of an API gateway, reverse proxy or Sidecar container without modifying the internal parameters of the AI model, giving it strong universality and portability. Forced path: establishing a data flow path that cannot be bypassed, ensuring that all input data must pass the security check.
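The plug-in deployment described above can be illustrated by a proxy that wraps the model's request handler without touching model internals, so every request is forced through the security check. This is a hedged sketch only; the class, method, and parameter names are assumptions for illustration, not the patent's terminology:

```python
# Illustrative sketch of the plug-in ("externally hung") deployment.
# The proxy sits at the service entry; the model handler itself is
# never modified. All names here are hypothetical.
class ForcedPathProxy:
    def __init__(self, model_handler, rules):
        self._model = model_handler   # the unmodified AI model endpoint
        self._rules = rules           # preset security rule base (callables)

    def handle(self, request):
        # Forced path: there is no route to the model around this check.
        if any(rule(request) for rule in self._rules):
            return {"status": "blocked", "answer": None}
        return {"status": "ok", "answer": self._model(request)}

# Usage: wrap a stand-in model with a single substring rule.
proxy = ForcedPathProxy(lambda r: "model output",
                        [lambda r: "attack" in r])
```

In a production deployment this role would be played by an API gateway plug-in, a reverse proxy, or a Sidecar container, as claim 3 enumerates; the in-process wrapper above only shows the control-flow shape.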
Permission isolation mechanism: the AI model operates in a low-privilege sandboxed environment and, while able to perceive the presence of the path (e.g., to sense that its traffic is being proxied), is deprived of the rights to modify, close, scan or probe the path's operating logic.
Drawings
FIG. 1 is a system architecture diagram provided by an embodiment of the present invention; FIG. 2 is a method flow chart provided by an embodiment of the present invention; FIG. 3 is a schematic diagram of permission isolation provided by an embodiment of the present invention.
Advantageous Effects
The invention realizes unified security interception for AI models provided by all industries and all model factories, forms a perceivable but non-intervenable defense system, effectively blocks the path by which a user could bypass interception through prompt-word engineering, and provides a deployable infrastructure-level solution for AI security governance.
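The fail-safe behavior of claim 10 (block all new requests by default, or degrade to black/white-list filtering when the path fails) can be sketched as follows. The function name, mode strings, and decision labels are illustrative assumptions, not terms from the patent:

```python
# Hedged sketch of the claim-10 fail-safe: if the main rule engine
# fails, the path either blocks everything or degrades to a
# blacklist-only filter, so the security capability never silently
# disappears. All names here are hypothetical.
def guarded_decide(request, rule_engine, blacklist, failure_mode="block_all"):
    try:
        return rule_engine(request)          # normal operation
    except Exception:
        if failure_mode == "blacklist":      # degraded filtering mode
            hit = any(word in request for word in blacklist)
            return "intercept" if hit else "forward"
        return "intercept"                   # default: block all new requests
```

Defaulting to "block all" on failure is the conservative choice for the highly sensitive scenarios the description targets; the blacklist mode trades some safety for availability.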