CN-121984788-A - AI acceleration card information detection and emergency response method based on intelligent agent chip

Abstract

The invention discloses an AI acceleration card information detection and emergency response method based on an Agent chip, and relates to the technical field of information security. While the Agent chip is in a started state, the method periodically performs the following operations: collecting multidimensional end-side operation data, and, when the network environment meets the uploading condition, uploading the data to a cloud server for remote decision making, so that attack identification and instruction issuing are realized. By combining cloud decision making with local self-decision on the end side, the method ensures highly reliable protection under a normal network and maintains security under weak-network or no-network environments. In addition, the chip is deployed in a general computing unit, so the core computing resources of the acceleration card need not be occupied, and the real-time performance, continuity and environmental adaptability of information detection are comprehensively improved while the efficient operation of the AI acceleration card is guaranteed.

Inventors

ZHANG XINYAN
ZHANG YIWEI
ZHANG YALIN

Assignees

上海燧原科技股份有限公司

Dates

Publication Date: 20260505
Application Date: 20260407

Claims (17)

1. An AI acceleration card information detection and emergency response method based on an Agent chip, characterized in that the method is executed by an Agent chip disposed in a general-purpose computing unit of an AI acceleration card, the method comprising: while the Agent chip is in a started state, periodically executing the following operations: acquiring end-side operation data of the AI acceleration card in a current detection period, wherein the end-side operation data comprises network communication data, main processing chip operation state information, data processing information and hardware interface operation state information; when the network environment of the current detection period meets a data uploading condition, sending the collected end-side operation data to a cloud server for remote decision making, so that the cloud server detects whether the AI acceleration card is attacked in the current detection period; when a function instruction is received that the cloud server feeds back after determining that the AI acceleration card is attacked in the current detection period, parsing and executing the function instruction so as to protect the AI acceleration card; and when the network environment of the current detection period does not meet the data uploading condition, making a self-decision, according to the collected end-side operation data, as to whether the AI acceleration card is attacked in the current detection period, and executing a preliminary protection operation on the AI acceleration card when it is determined that the AI acceleration card is attacked in the current detection period.
2. The method according to claim 1, wherein the method further comprises: while the Agent chip is in the started state, if it is detected that the network environment switches from not meeting the data uploading condition to meeting the data uploading condition, detecting whether the Agent chip made at least one self-decision during the time period in which the network environment did not meet the data uploading condition; if so, uploading the end-side operation data, the self-decision results and the preliminary protection operations of that time period to the cloud server as self-decision backtracking data, so that the cloud server re-judges the execution correctness of each preliminary protection operation; and executing a final protection instruction fed back by the cloud server for the re-judgment result.
3. The method of claim 2, wherein executing the final protection instruction fed back by the cloud server for the re-judgment result specifically comprises: if the final protection instruction is a confirmation response instruction, confirming the execution correctness of each preliminary protection operation by executing the confirmation response instruction; and if the final protection instruction comprises an instruction rollback sub-instruction for canceling the preliminary protection operation and a correction protection operation sub-instruction, executing the instruction rollback sub-instruction and the correction protection operation sub-instruction in sequence to correct the preliminary protection operation.
4. The method according to any one of claims 1-3, wherein parsing the function instruction comprises: if the function instruction is determined after parsing to be an information isolation instruction, executing the operation corresponding to the information isolation instruction to cut off the association between attack traffic and the core components of the AI acceleration card; and if the function instruction is determined after parsing to be a side channel attack protection instruction, executing the operation corresponding to the side channel attack protection instruction to interfere with the information acquisition and analysis process of the side channel attack by regulating hardware operation parameters of the AI acceleration card or by adding interference data to a data transmission link; wherein the information isolation instruction comprises a self-destruction operation instruction and/or a data receiving and transmitting control instruction for the AI acceleration card, and the side channel attack protection instruction comprises an energy consumption and temperature regulation operation instruction.
5. The method according to any one of claims 1 to 3, wherein the Agent chip comprises a firmware carrier loaded with an Agent program, the Agent program is specifically configured to execute the Agent chip-based AI acceleration card information detection and emergency response method, and a side-injection weight item is preconfigured in the Agent program so that the Agent chip can obtain the data processing information and the hardware interface operation state information of the AI acceleration card.
6. An AI acceleration card information detection and emergency response method based on an Agent chip, characterized in that the method is executed by a cloud server and comprises: receiving end-side operation data collected by a target Agent chip in a target detection period, wherein the Agent chip is deployed in a general computing unit of an AI acceleration card, and the end-side operation data comprises network communication data, main processing chip operation state information, data processing information and hardware interface operation state information; performing classification and sorting, structured integration and encryption operations on the received end-side operation data according to a preset classification standard, and then storing the encrypted end-side operation data in an end-side operation database; judging, based on the end-side operation data stored in the end-side operation database, whether the AI acceleration card to which the target Agent chip belongs is subjected to a network attack; and if it is judged that the AI acceleration card is attacked, generating a matched function instruction and sending the matched function instruction to the target Agent chip.
7. The method of claim 6, wherein the method further comprises: receiving self-decision backtracking data uploaded by the target Agent chip, wherein the self-decision backtracking data comprises the end-side operation data, the self-decision results and the preliminary protection operations of a time period in which the network environment did not meet the data uploading condition; and re-judging the execution correctness of each preliminary protection operation based on the end-side operation data, the self-decision results and the preliminary protection operations in the self-decision backtracking data, generating a final protection instruction according to the re-judgment result, and feeding the final protection instruction back to the matched Agent chip.
8. The method of claim 7, wherein generating the final protection instruction according to the re-judgment result comprises: if it is judged that the preliminary protection operation is correct, generating a confirmation response instruction as the final protection instruction; and if it is judged that the preliminary protection operation is a misoperation, generating an instruction rollback instruction and a correction protection operation instruction as the final protection instruction.
9. The method of any of claims 6-8, wherein generating the matched function instruction comprises: if it is judged that the AI acceleration card is subjected to a network attack, generating an information isolation instruction as the function instruction; and if it is judged that the AI acceleration card is subjected to a side channel attack, generating a side channel attack protection instruction as the function instruction, wherein the information isolation instruction comprises a self-destruction operation instruction and/or a data receiving and transmitting control instruction for the AI acceleration card, and the side channel attack protection instruction comprises an energy consumption and temperature regulation operation instruction.
10. The method of claim 6, wherein judging whether the AI acceleration card to which the target Agent chip belongs is subjected to a network attack comprises: if the AI acceleration card is not attacked, executing a dynamic calculation operation based on the end-side operation data in the end-side operation database; and judging, according to the dynamic calculation result, whether the current attack traffic judgment threshold matches the operation state of the AI acceleration card, and if it does not match, updating the attack traffic judgment threshold based on the dynamic calculation result.
11. An AI acceleration card information detection and emergency response device based on an Agent chip, characterized in that the device is configured on an Agent chip disposed in a general computing unit of an AI acceleration card and, while the Agent chip is in a started state, periodically operates through the following modules: an end-side acquisition module, configured to acquire end-side operation data of the AI acceleration card in a current detection period, wherein the end-side operation data comprises network communication data, main processing chip operation state information, data processing information and hardware interface operation state information; a remote sending module, configured to send the collected end-side operation data to a cloud server for remote decision making when the network environment of the current detection period meets a data uploading condition, so that the cloud server detects whether the AI acceleration card is attacked in the current detection period; an instruction parsing module, configured to parse and execute a function instruction, when the function instruction is received that the cloud server feeds back after determining that the AI acceleration card is attacked in the current detection period, so as to protect the AI acceleration card; and a preliminary protection module, configured to make a self-decision, according to the collected end-side operation data, as to whether the AI acceleration card is attacked in the current detection period when the network environment of the current detection period does not meet the data uploading condition, and to execute a preliminary protection operation on the AI acceleration card when it is determined that the AI acceleration card is attacked in the current detection period.
12. An AI acceleration card information detection and emergency response device based on an Agent chip, characterized in that the device is configured in a cloud server and comprises: a data receiving module, configured to receive end-side operation data collected by a target Agent chip in a target detection period, wherein the Agent chip is deployed in a general computing unit of an AI acceleration card, and the end-side operation data comprises network communication data, main processing chip operation state information, data processing information and hardware interface operation state information; a data storage module, configured to perform classification and sorting, structured integration and encryption operations on the received end-side operation data according to a preset classification standard, and then store the encrypted end-side operation data in an end-side operation database; an attack judging module, configured to judge, based on the end-side operation data stored in the end-side operation database, whether the AI acceleration card to which the target Agent chip belongs is subjected to a network attack; and an instruction generation module, configured to generate a matched function instruction and send the matched function instruction to the target Agent chip if it is judged that the AI acceleration card is attacked.
13. An Agent chip, characterized in that the Agent chip comprises: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the Agent chip-based AI acceleration card information detection and emergency response method of any of claims 1-5.
14. An AI accelerator card, wherein the general purpose computing unit of the AI accelerator card has the Agent chip of claim 13 disposed therein.
15. A terminal device comprising the AI acceleration card of claim 14.
16. A cloud server, the cloud server comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the Agent chip-based AI acceleration card information detection and emergency response method of any of claims 6-10.
17. An AI acceleration card information detection and emergency response system based on an Agent chip, characterized by comprising at least one terminal device according to claim 15 and the cloud server according to claim 16, wherein a communication connection is established between each terminal device and the cloud server through a preset communication link.
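
For illustration only, the backtracking flow recited in claims 2, 3, 7 and 8 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `BacktrackRecord`, `FinalInstruction`, `rejudge` and `apply_final`, the operation label `pause_transfers`, and the `pps` field are all hypothetical, and the cloud-side re-judgment is reduced to a caller-supplied policy function.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class BacktrackRecord:
    """One offline detection period uploaded as self-decision backtracking data."""
    period: int
    end_side_data: dict
    preliminary_operation: str  # hypothetical label of the operation the chip executed


@dataclass
class FinalInstruction:
    """Final protection instruction fed back by the cloud server (hypothetical shape)."""
    kind: str                   # "confirm" or "rollback_and_correct"
    rollback: List[str]         # preliminary operations to cancel
    corrections: List[str]      # corrected protection operations to execute


def rejudge(records: List[BacktrackRecord],
            expected_operation: Callable[[dict], Optional[str]]) -> FinalInstruction:
    """Cloud side: re-judge each preliminary protection operation.

    expected_operation is an assumed cloud policy that returns the operation the
    cloud would have chosen for the given data, or None if it sees no attack.
    """
    rollback: List[str] = []
    corrections: List[str] = []
    for rec in records:
        expected = expected_operation(rec.end_side_data)
        if expected == rec.preliminary_operation:
            continue  # the preliminary operation was correct
        rollback.append(rec.preliminary_operation)
        if expected is not None:
            corrections.append(expected)
    if not rollback:
        return FinalInstruction("confirm", [], [])
    return FinalInstruction("rollback_and_correct", rollback, corrections)


def apply_final(instruction: FinalInstruction,
                undo: Callable[[str], None],
                do: Callable[[str], None]) -> List[str]:
    """Chip side: execute rollback sub-instructions first, then corrections."""
    if instruction.kind == "confirm":
        return ["confirmed"]
    trace: List[str] = []
    for op in instruction.rollback:
        undo(op)
        trace.append("undo:" + op)
    for op in instruction.corrections:
        do(op)
        trace.append("do:" + op)
    return trace
```

In this sketch a record whose preliminary operation matches the cloud policy contributes nothing, so a batch with no mismatches yields a confirmation response, while any mismatch yields a rollback followed by the corrected operation, in that order.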

Description

AI acceleration card information detection and emergency response method based on intelligent agent chip

Technical Field

The invention relates to the technical fields of information security, data security and chips, and in particular to an AI (Artificial Intelligence) acceleration card information detection and emergency response method based on an Agent chip.

Background

An AI acceleration card is a dedicated hardware module designed to optimize AI-related computing tasks, and its security capability directly determines the trusted boundary of AI technology applications. With the development of technology, information security has been upgraded from an auxiliary requirement to a necessary technical index for AI acceleration cards. For information security detection of an AI acceleration card, a hardware-level vulnerability detection module is generally embedded in the acceleration card architecture, a standardized vulnerability scanning tool is developed, and an integrated chip, firmware and software detection flow is built, so that bottom-layer vulnerabilities, firmware vulnerabilities, interface risks and the like are monitored in multiple dimensions. For emergency response of an AI acceleration card, a hierarchical response mechanism is generally built, rapid handling is achieved by linking hardware capabilities, the repair period is shortened through technologies such as online firmware upgrading and hot deployment, and continuous operation of AI services is ensured.

However, the hardware-level detection module has insufficient capability to recognize novel attacks and risks specific to AI scenarios and lacks unified detection standards, which reduces the accuracy of attack detection and protection; it is difficult to adapt to third-party tools and partly relies on manual work, so efficiency is low; the hardware coordination capability of emergency response is weak and security handling conflicts with computing performance, which reduces security response capability and efficiency; and the research, development and deployment costs of detection and response are too high, so related functions are forced to be simplified, which amplifies security risks and reduces the real-time performance, continuity and environmental adaptability of detection.

Disclosure of Invention

The invention provides an AI acceleration card information detection and emergency response method based on an Agent chip, which aims to solve the problems of low real-time performance, continuity and environmental adaptability of AI acceleration card information detection, and the resulting low accuracy of attack detection and protection and low security response capability and efficiency.
According to an aspect of the embodiments of the present invention, there is provided an AI acceleration card information detection and emergency response method based on an Agent chip, executed by an Agent chip disposed in a general-purpose computing unit of an AI acceleration card, the method including: while the Agent chip is in a started state, periodically executing the following operations: collecting end-side operation data of the AI acceleration card in a current detection period, wherein the end-side operation data comprises network communication data, main processing chip operation state information, data processing information and hardware interface operation state information; when the network environment of the current detection period meets a data uploading condition, sending the collected end-side operation data to a cloud server for remote decision making, so that the cloud server detects whether the AI acceleration card is attacked in the current detection period; when a function instruction is received that the cloud server feeds back after determining that the AI acceleration card is attacked in the current detection period, parsing and executing the function instruction so as to protect the AI acceleration card; and when the network environment of the current detection period does not meet the data uploading condition, making a self-decision, according to the collected end-side operation data, as to whether the AI acceleration card is attacked in the current detection period, and executing a preliminary protection operation on the AI acceleration card when it is determined that the AI acceleration card is attacked in the current detection period.
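
As a non-limiting illustration, one detection period of the Agent-chip-side method described above could be sketched as follows. This is only a sketch under stated assumptions: the class names `EndSideData` and `AgentChip`, every hook (`collect`, `upload_allowed`, `cloud_decide`, `self_decide`, `execute_instruction`, `preliminary_protect`), the helper `make_demo_chip` and the instruction label `information_isolation` are hypothetical names introduced here, and the cloud round trip is modeled as a plain function call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class EndSideData:
    """End-side operation data for one detection period (the four categories in the text)."""
    network_communication: dict
    main_chip_state: dict
    data_processing: dict
    hardware_interface_state: dict


@dataclass
class AgentChip:
    # Every callable below is a hypothetical hook; the source does not define them.
    collect: Callable[[], EndSideData]
    upload_allowed: Callable[[], bool]                    # "data uploading condition"
    cloud_decide: Callable[[EndSideData], Optional[str]]  # function instruction, or None
    self_decide: Callable[[EndSideData], bool]            # local attack decision
    execute_instruction: Callable[[str], None]
    preliminary_protect: Callable[[], None]

    def run_period(self) -> str:
        """Run one detection period and report which branch was taken."""
        data = self.collect()
        if self.upload_allowed():
            # Remote decision: the cloud server judges and may feed back an instruction.
            instruction = self.cloud_decide(data)
            if instruction is None:
                return "cloud:no_attack"
            self.execute_instruction(instruction)
            return "cloud:" + instruction
        # Weak or no network: fall back to local self-decision.
        if self.self_decide(data):
            self.preliminary_protect()
            return "local:preliminary_protection"
        return "local:no_attack"


def make_demo_chip(online: bool, attacked: bool) -> Tuple[AgentChip, List[str]]:
    """Build a chip with stub hooks; returns the chip and the list of executed actions."""
    actions: List[str] = []
    sample = EndSideData({}, {}, {}, {})
    chip = AgentChip(
        collect=lambda: sample,
        upload_allowed=lambda: online,
        cloud_decide=lambda d: "information_isolation" if attacked else None,
        self_decide=lambda d: attacked,
        execute_instruction=actions.append,
        preliminary_protect=lambda: actions.append("preliminary_protection"),
    )
    return chip, actions
```

Calling `run_period` repeatedly, for example from a timer, would correspond to the periodic execution; the sketch deliberately leaves the actual upload condition and both decision functions open, since the passage above does not specify them.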
According to another aspect of the embodiments of the present invention, there is provided an AI acceleration card information detection and emergency response method based on an Agent chip, executed by a cloud server, the method including: receiving end-side operation data collected by a target Agent chip in a target detection period, wherein the Agent chip is deployed in a general computing unit of an AI acceleration card, and the end-side operation data comprises network communication data, main processing chip operation state information, data processing information a