CN-121743098-B - NPU processor architecture-based integrated circuit memory fault management method and system
Abstract
The invention belongs to the technical field of integrated circuit design and manufacture, and particularly relates to an integrated circuit memory fault management method and system based on an NPU processor architecture. According to the fault detection result, a control signal is fed back to intervene in the address mapping logic, which prevents a fault from spreading through the continued transmission of abnormal addresses. When an address abnormality is detected, subsequent physical addresses can be forcibly relocated to a preset safe address area, providing a reliable safety fallback and improving operational stability.
Inventors
- NING YAFEI
- WANG ZHAOSHENG
- SUN TAO
- SHENG XIAOFEI
- XING JIANPING
- QU JIANBO
Assignees
- Shandong University (山东大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260302
Claims (5)
- 1. An integrated circuit memory fault management method based on an NPU processor architecture, comprising: S1, after a virtual address is output in response to a memory access request, determining the storage area of the virtual address by combining the memory size limit parameter and the base address parameter of the address mapping, and converting the virtual address to a physical address according to a preset mapping rule; S2, performing fault detection based on the ordered physical address or the original virtual address and the accompanying bus signals, as follows: if the fault detection switch signal value is 1, checking the validity of the bus handshake protocol; if the protocol is valid, checking the memory size of the ordered physical address and the storage area to which the physical address is mapped, identifying the fault type, and synchronously outputting fault identification data, specifically: determining whether the memory size of the ordered physical address exceeds the memory size limit parameter, and if so, outputting an address size overrun fault identifier; determining whether the mapped physical address conforms to a preset region rule, and if not, outputting a region access error fault identifier; S2 further comprising, after a fault signal is detected, taking the physical address corresponding to the fault signal as a node and forcibly relocating all physical addresses ordered after that physical address to a preset safe address area; S3, generating corresponding control signals according to the fault identification data, feeding the signals back to S1, and adjusting the memory size limit parameter of the address mapping and replacing the storage area identification bits of the physical address, specifically: for the address size overrun fault identifier, adjusting the memory size limit parameter value of the address mapping through a memory size limit signal; and for the region access error fault identifier, directly replacing the storage area identification bits of the physical address with the identifier of the preset safe address area.
- 2. The integrated circuit memory fault management method based on an NPU processor architecture of claim 1, wherein checking the validity of the bus handshake protocol in S2 comprises determining that the protocol is valid if both the valid signal value and the ready signal value of the protocol are 1.
- 3. The integrated circuit memory fault management method based on an NPU processor architecture of claim 1, wherein S2 further comprises, if the fault detection switch signal value is 0, buffering the ordered physical address or the original virtual address together with the accompanying bus signals and transmitting them to the target storage terminal.
- 4. The method of claim 1, wherein S3 further comprises, when no fault signal is detected, stopping the feedback of the control signals and restoring the original memory size limit parameter and storage area determination result of the address mapping.
- 5. An integrated circuit memory fault management system based on an NPU processor architecture, comprising: an address mapping module, configured to, after a virtual address is output in response to a memory access request, determine the storage area of the virtual address by combining the memory size limit parameter and the base address parameter of the address mapping, and convert the virtual address to a physical address according to a preset mapping rule; a fault detection module, configured to perform fault detection based on the ordered physical address or the original virtual address and the accompanying bus signals, as follows: if the fault detection switch signal value is 1, checking the validity of the bus handshake protocol; if the protocol is valid, checking the memory size of the ordered physical address and the storage area to which the physical address is mapped, identifying the fault type, and synchronously outputting fault identification data, specifically: determining whether the memory size of the ordered physical address exceeds the memory size limit parameter, and if so, outputting an address size overrun fault identifier; determining whether the mapped physical address conforms to a preset region rule, and if not, outputting a region access error fault identifier; the fault detection module further comprising a safe area redirection module, configured to, after a fault signal is detected, take the physical address corresponding to the fault signal as a node and forcibly relocate all physical addresses ordered after that physical address to a preset safe address area; and a regulation module, configured to generate corresponding control signals according to the fault identification data, feed the signals back to the address mapping module, and adjust the memory size limit parameter of the address mapping and replace the storage area identification bits of the physical address, specifically: for the address size overrun fault identifier, adjusting the memory size limit parameter value of the address mapping through a memory size limit signal; and for the region access error fault identifier, directly replacing the storage area identification bits of the physical address with the identifier of the preset safe address area.
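The detection and feedback behaviour recited in claims 1, 2, and 4 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the names `Beat`, `Fault`, `SAFE_REGION`, and `SAFE_BASE` are hypothetical, "memory size" is read as the byte size of one access, the region rule is modelled as a set of allowed region tags, and the faulting access itself is redirected along with the ones that follow it. The claims do not say how an invalid handshake is handled, so the model simply skips the checks in that case.

```python
from dataclasses import dataclass
from enum import Flag, auto

class Fault(Flag):
    NONE = 0
    SIZE_OVERRUN = auto()   # address size overrun fault identifier
    REGION_ERROR = auto()   # region access error fault identifier

@dataclass
class Beat:
    """One ordered access with its bus signals (hypothetical model)."""
    pa: int       # ordered physical address
    size: int     # access size in bytes (assumed meaning of "memory size")
    region: str   # storage area tag carried with the PA (assumed encoding)
    valid: int    # handshake valid signal
    ready: int    # handshake ready signal

SAFE_REGION = "SAFE"      # hypothetical identifier of the preset safe area
SAFE_BASE = 0xFFFF_0000   # hypothetical address inside the safe area

def handshake_ok(beat: Beat) -> bool:
    # Claim 2: the protocol is valid when valid and ready are both 1.
    return beat.valid == 1 and beat.ready == 1

def detect(beat: Beat, enable: int, limit_size: int, allowed: set[str]) -> Fault:
    """S2: return the fault identifiers raised for one access."""
    if enable != 1 or not handshake_ok(beat):
        return Fault.NONE  # assumption: no checks without a valid handshake
    fault = Fault.NONE
    if beat.size > limit_size:
        fault |= Fault.SIZE_OVERRUN
    if beat.region not in allowed:
        fault |= Fault.REGION_ERROR
    return fault

def process(beats, enable, limit_size, allowed):
    """Run S2 over a sequence; once a fault is seen, redirect to the safe area."""
    out, latched = [], False
    for beat in beats:
        fault = detect(beat, enable, limit_size, allowed)
        latched = latched or bool(fault)
        if latched:  # assumption: the faulting access is redirected as well
            out.append((SAFE_BASE, SAFE_REGION, fault))
        else:
            out.append((beat.pa, beat.region, fault))
    return out

def regulate(fault: Fault) -> dict:
    """S3: control signals fed back to the mapping stage (inactive if no fault)."""
    return {
        "adjust_limit_size": bool(fault & Fault.SIZE_OVERRUN),
        "region_override": SAFE_REGION if fault & Fault.REGION_ERROR else None,
    }
```

With the switch signal at 0, `process` forwards every access unchanged, which corresponds to the pass-through path of claim 3. How the memory size limit parameter is actually adjusted is not specified in the claims, so `regulate` only reports that an adjustment is requested.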
Description
NPU processor architecture-based integrated circuit memory fault management method and system
Technical Field
The invention belongs to the technical field of integrated circuit design and manufacture, and particularly relates to an integrated circuit memory fault management method and system based on an NPU processor architecture.
Background
With the rapid development of artificial intelligence and deep learning, the neural network processor (NPU), as the dedicated computing core of an artificial intelligence chip, must handle demanding workloads such as large-scale tensor operations and concurrent multi-task inference, which places extremely high requirements on the bandwidth, latency, and memory capacity of on-chip data transmission. Thanks to its distributed interconnection, scalable bandwidth, and low-latency transmission, the network on chip (NoC) has become the core architecture for interconnecting functional modules inside the NPU, such as compute core clusters, storage units, and interface modules. The AXI (Advanced eXtensible Interface) bus, with its standardized interfaces and support for burst transfers and out-of-order access, is the mainstream interconnect bus protocol between the NPU's NoC and each functional module.
In the memory access flow of an NPU that integrates a NoC architecture, the core path runs from the virtual address (VA) through the segment memory management unit (SMMU) to the physical address (PA) and then to the target storage area. The hardware comprises a NoC Core (including an AXI Master device), an SMMU (including the smmu_core core sub-module and the smmu_in_order ordering sub-module), a target storage area (such as DDR, local memory, or on-chip cache), and the corresponding Slave device. The SMMU is the core component that performs VA-to-PA address conversion.
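As a rough illustration of the VA-to-PA conversion the SMMU performs, the following sketch models a segment-style mapping driven by a base address and a size limit. The mapping rule itself is not given in the text, so everything here is an assumption: the `Segment` type, the `va_start` field, the region labels, and the rule that the PA equals `base_address` plus the offset into the window are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    """One hypothetical mapping entry: a VA window mapped onto a storage area."""
    region: str         # storage area label, e.g. "DDR" (illustrative)
    va_start: int       # first virtual address of the window (assumed field)
    base_address: int   # physical base address the window maps to
    limit_size: int     # window size in bytes

def translate(va: int, segments: list[Segment]) -> tuple[str, int]:
    """Return (region, physical address) for va, or raise if nothing covers it."""
    for seg in segments:
        offset = va - seg.va_start
        if 0 <= offset < seg.limit_size:
            # Assumed rule: PA is the segment base plus the offset in the window.
            return seg.region, seg.base_address + offset
    raise LookupError(f"no segment covers VA {va:#x}")

# Illustrative address map; the values are made up for the example.
SEGMENTS = [
    Segment("DDR", va_start=0x0000_0000, base_address=0x8000_0000, limit_size=0x1000_0000),
    Segment("LOCAL", va_start=0x1000_0000, base_address=0x0010_0000, limit_size=0x0004_0000),
]
```

For example, `translate(0x100, SEGMENTS)` falls in the first window and yields the `"DDR"` region with the offset added to that window's base, while an address outside every window raises `LookupError`.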
Conventional memory management proceeds as follows: the Master of the NoC Core initiates a memory access request and outputs a VA; the control signals limit_size and base_address for VA-to-PA mapping are input to smmu_core; smmu_core determines the address area of the VA and completes the VA-to-PA mapping; the converted PA passes through a two-to-one data selector and is ordered by smmu_in_order according to the order of receipt; finally, the Slave device of the NoC Cluster completes the data read or write.
However, the conventional NPU memory management method has significant technical defects. First, it cannot actively check the size and region of the PA or the validity of the AXI handshake protocol, cannot output a fault identifier, and can only passively receive and forward abnormal signals. Second, it lacks an active fault regulation mechanism: even when problems such as address size overrun or incorrect region mapping occur, it cannot feed back into the mapping logic of smmu_core to adjust limit_size or correct the region identifier of the PA, so abnormal addresses continue to be output with no active remedy. Third, because no safe-area switching logic is designed for fault scenarios, subsequent PAs cannot be actively directed to a safe address area when an address is abnormal; the system therefore lacks a safety fallback in the fault state and has insufficient resilience to risk.
Disclosure of Invention
In order to solve the problems described in the background, the invention provides an integrated circuit memory fault management method and system based on an NPU processor architecture.
The technical scheme of the invention is as follows:
The invention provides an integrated circuit memory fault management method based on an NPU processor architecture, which comprises the following steps:
S1, after a virtual address is output in response to a memory access request, determining the storage area of the virtual address by combining the memory size limit parameter and the base address parameter of the address mapping, and converting the virtual address to a physical address according to a preset mapping rule;
S2, performing fault detection based on the ordered physical address or the original virtual address and the accompanying bus signals, as follows: if the fault detection switch signal value is 1, checking the validity of the bus handshake protocol; if the protocol is valid, checking the memory size of the ordered physical address and the storage area to which the physical address is mapped, identifying the fault type, and synchronously outputting fault identification data;
S3, generating corresponding control signals according to the fault identification data, feeding the signals back to S1, and adjusting the memory size limit parameter of the address mapping and replacing the storage area identification bits of the physical address.
Based on the above-men