CN-122018991-A - Cache block clearing method, processor and chip
Abstract
Embodiments of this specification relate to a cache block clearing method, a processor, and a chip, in the technical field of computers. The method comprises: obtaining a cache block zero clearing micro-instruction carrying parameters, wherein the parameters comprise a physical address and do not comprise data; writing the physical address into a storage address buffer; matching the corresponding cache block in a cache according to the physical address in the storage address buffer; and writing data 0 into the cache block. By abstracting the cache block zero clearing operation into a dedicated micro-instruction that carries only a physical address and no data payload, a series of steps is eliminated: obtaining zero-valued data from a register file or data bus, writing the zero value into a storage data buffer, and reading the data from the data buffer and transmitting it to the cache. The simplified execution path directly shortens the processing time of the instruction in the pipeline and reduces operation latency, thereby improving the instruction throughput and overall execution efficiency of the processor.
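The flow summarized in the abstract can be sketched as a small software model. This is an illustrative sketch only, not the patented hardware: the names `StoreAddressBuffer`, `Cache`, and `cbo_zero_uop` are hypothetical, the 64-byte block size is an assumption borrowed from the block size mentioned in the description, and the cache is reduced to a dictionary keyed by block-aligned physical address.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 64  # bytes per cache block (assumed for this sketch)


@dataclass
class StoreAddressBuffer:
    """Hypothetical storage address buffer: it records addresses only, never data."""
    entries: list = field(default_factory=list)

    def push(self, physical_address: int) -> int:
        # Record the physical address and return the index of the new entry.
        self.entries.append(physical_address)
        return len(self.entries) - 1


@dataclass
class Cache:
    """Hypothetical cache: maps a block-aligned physical address to the block's bytes."""
    blocks: dict = field(default_factory=dict)

    def match(self, physical_address: int):
        # Align the address down to its block base and look the block up.
        base = physical_address & ~(BLOCK_SIZE - 1)
        return base if base in self.blocks else None

    def zero(self, base: int) -> None:
        # The zeros are produced at the cache itself; no data buffer is involved.
        self.blocks[base] = bytes(BLOCK_SIZE)


def cbo_zero_uop(sab: StoreAddressBuffer, cache: Cache, physical_address: int) -> bool:
    """Model of the address-only zero clearing micro-instruction.

    Returns True if a matching block was found and cleared, False otherwise.
    """
    index = sab.push(physical_address)       # write the physical address into the buffer
    base = cache.match(sab.entries[index])   # match the cache block by that address
    if base is None:
        return False                         # miss handling is outside this sketch
    cache.zero(base)                         # write data 0 into the cache block
    return True
```

The point of the sketch is that `cbo_zero_uop` takes no data argument at all: the only state it places in the buffer is the address, which mirrors the "physical address and no data" parameter set described above.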
Inventors
- GOU JUNLIN
Assignees
- 成都群芯微电子科技有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20251217
Claims (16)
- 1. A method for clearing a cache block, comprising: obtaining a cache block zero clearing micro-instruction carrying parameters, wherein the parameters comprise a physical address and do not comprise data; writing the physical address into a storage address buffer; matching the corresponding cache block in a cache according to the physical address in the storage address buffer; and writing data 0 into the cache block.
- 2. The method of claim 1, wherein the parameters further comprise an instruction type; and the method further comprises: writing the instruction type into the storage address buffer.
- 3. The method of claim 1, wherein the cache comprises at least a first-level cache, a second-level cache, and a third-level cache; and the matching of the corresponding cache block in the cache according to the physical address in the storage address buffer comprises: searching the first-level cache for a cache block that matches the physical address in the storage address buffer and is in a writable state.
- 4. The method of claim 3, wherein the method further comprises: if no cache block that matches the physical address in the storage address buffer and is in a writable state exists in the first-level cache, acquiring a first count value of a first counter, wherein the first counter is used for counting executed zero clearing instructions; judging whether the first count value is smaller than a first write threshold, wherein the first write threshold is used for controlling the maximum number of consecutive zero clearing instructions written into the first-level cache; if yes, writing the data 0 into the first-level cache; and if not, writing the data 0 into the second-level cache or the third-level cache.
- 5. The method of claim 4, wherein writing the data 0 into the first-level cache comprises: reading the instruction type from the storage address buffer; and writing the data 0 into the first-level cache according to the instruction type.
- 6. The method of claim 4, wherein writing the data 0 into the second-level cache or the third-level cache comprises: judging whether the first count value is smaller than a second write threshold, wherein the second write threshold is used for controlling the maximum number of consecutive zero clearing instructions written into the first-level cache and the second-level cache; if yes, writing the data 0 into the second-level cache; and if not, writing the data 0 into the third-level cache.
- 7. The method of claim 6, wherein writing the data 0 into the third-level cache comprises: if no cache block that matches the physical address in the storage address buffer exists in the second-level cache, reading the instruction type from the storage address buffer; and writing the data 0 into the corresponding cache block in the third-level cache according to the instruction type.
- 8. The method of claim 7, wherein writing the data 0 into the third-level cache further comprises: if a cache block that matches the physical address in the storage address buffer exists in the second-level cache and the cache block is in a writable state, reading the instruction type from the storage address buffer; and writing the data 0 into the cache block in the second-level cache according to the instruction type.
- 9. The method of claim 7, wherein the first write threshold is the associativity of the first-level cache; and the second write threshold is determined by the following formula: alloc_l2_threshold = (L2_C / L1_C) * alloc_l1_threshold; in the formula, alloc_l2_threshold represents the write threshold of the second-level cache, L2_C represents the cache capacity of the second-level cache, L1_C represents the cache capacity of the first-level cache, and alloc_l1_threshold represents the write threshold of the first-level cache.
- 10. The method of claim 4, wherein the method further comprises: acquiring a second count value of a second counter, wherein the second counter is used for counting instructions that are neither zero clearing instructions nor write instructions; resetting the second counter if the first count value is greater than or equal to a first count threshold; and resetting the first counter if the second count value is equal to a second count threshold.
- 11. The method of claim 4, wherein the method further comprises: after the data 0 is written into the first-level cache, the second-level cache, or the third-level cache, increasing the first count value of the first counter by 1.
- 12. The method of claim 3, wherein the method further comprises: if a cache block that matches the physical address in the storage address buffer exists in the first-level cache and the cache block is in a non-writable state, sending a cache miss instruction to the second-level cache; and writing the data 0 into the cache block in the first-level cache according to the writable instruction returned by the second-level cache.
- 13. The method of claim 3, wherein writing the data 0 into the cache block comprises: determining, according to the physical address in the storage address buffer, a plurality of instructions to be executed that access the cache block; arbitrating between the instruction that writes the data 0 and the plurality of instructions to be executed; and after the instruction that writes the data 0 wins arbitration, writing the data 0 into the cache block.
- 14. A method for implementing a read instruction, comprising: checking the target physical address of the read instruction for a match against the physical addresses of zero clearing instructions that are recorded in the storage address buffer and precede the read instruction; and if a zero clearing instruction with a matching physical address exists, returning data 0 to the read instruction according to the instruction type of the zero clearing instruction.
- 15. A processor for performing the method of any one of claims 1-14.
- 16. A chip, comprising the processor of claim 15.
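The level-selection logic recited in claims 4, 6, 9 and 11 can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the class and function names are hypothetical, the capacities and associativity used in the usage note are made-up example values, the second-level capacity is assumed to be an integer multiple of the first-level capacity, and the second-level match check of claims 7 and 8 and the counter resets of claim 10 are left out.

```python
def l2_write_threshold(l1_capacity: int, l2_capacity: int, l1_threshold: int) -> int:
    """alloc_l2_threshold = (L2_C / L1_C) * alloc_l1_threshold (claim 9).

    Integer division is used on the assumption that L2_C is a multiple of L1_C.
    """
    return (l2_capacity // l1_capacity) * l1_threshold


def select_level(zero_count: int, l1_threshold: int, l2_threshold: int) -> str:
    """Pick the cache level that receives the zeros when the first-level lookup
    finds no matching writable block (claims 4 and 6)."""
    if zero_count < l1_threshold:
        return "L1"
    if zero_count < l2_threshold:
        return "L2"
    return "L3"


class ZeroAllocator:
    """Hypothetical holder of the first counter and the two write thresholds."""

    def __init__(self, l1_capacity: int, l2_capacity: int, l1_ways: int):
        # Claim 9: the first write threshold is the first-level associativity.
        self.l1_threshold = l1_ways
        self.l2_threshold = l2_write_threshold(l1_capacity, l2_capacity, l1_ways)
        self.zero_count = 0  # first counter: executed zero clearing instructions

    def on_l1_miss(self) -> str:
        level = select_level(self.zero_count, self.l1_threshold, self.l2_threshold)
        self.zero_count += 1  # claim 11: count the write after it is performed
        return level
```

With an assumed 64 KiB, 4-way first-level cache and a 512 KiB second-level cache, `ZeroAllocator(64 * 1024, 512 * 1024, 4)` has a second write threshold of 32, so in a run of consecutive zero clearing misses the first 4 go to the first-level cache, the next 28 to the second-level cache, and later ones to the third-level cache. The apparent intent is that a long zeroing run does not evict more than one way's worth of first-level data per set.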
Description
Cache block clearing method, processor and chip
Technical Field
The embodiments of this specification relate to the technical field of computers, and in particular to a cache block clearing method, a processor, and a chip.
Background
In current CPU designs, because the latency of directly accessing memory is very high, a cache is used to hold data, which speeds up data reads and improves CPU performance. The use of a cache also raises some processing problems. For applications that handle highly sensitive information, such as encryption keys, the operating system needs to use an instruction that writes 0 to the cache block, so that the data is completely removed from the cache and security is enhanced. In addition, during hardware fault detection and recovery, after a hardware fault is detected the operating system may need to perform a series of maintenance operations, including writing 0 to cache blocks, to resume normal operation. And when optimizing memory initialization, particularly when memory is allocated frequently, cache blocks need to be written to 0 frequently.
In mainstream CPU architectures such as RISC-V (RV), the cbo.zero instruction is used to clear a cache block to 0. The cbo.zero instruction in RV carries a virtual address (VA), which needs to be translated into a physical address (PA) in the load-store unit (LSU). The PA is compared with the PAs of the cache blocks in the LSU, and if a match occurs, 0 needs to be written into the cache block.
As shown in Fig. 1, in the prior-art scheme the cbo.zero instruction reuses the microarchitecture of the ordinary STORE instruction: the address-related information of the cbo.zero instruction is stored in the address buffer of the store buffer (SB), and data 0 is stored in the data buffer of the SB. Address entries and data entries in the SB are associated by links rather than one-to-one, because some STORE instructions carry no data and the data size of STORE instructions is small; in general, the number of SB data entries is smaller than the number of SB address entries. Because cbo.zero needs to write data 0 of the cache block size, and the current cache block size is 64B while the data buffer width of the existing SB is 16B, one cbo.zero instruction needs to be sent to the LSU 4 times. The cbo.zero instruction in the SB looks up the tag: if the corresponding PA hits and has write permission, 0 is written to the corresponding cache block; on a miss, write permission is obtained from the L2 cache and then 0 is written to the cache block. Because the existing cbo.zero technique reuses the LSU microarchitectural processing flow of the ordinary STORE instruction, it has lower performance and higher power consumption.
Disclosure of Invention
The embodiments of this specification aim to provide a cache block clearing method, a processor, and a chip, so as to solve the problems of low performance and high power consumption in the existing method.
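The transfer count given in the Background for the prior-art flow (a 64B cache block written through a 16B-wide SB data buffer) is a ceiling division, which the short worked example below makes explicit. The function names are hypothetical and the model is deliberately minimal: it counts data-buffer transfers only and says nothing about cycle timing.

```python
def data_buffer_transfers(block_size_bytes: int = 64, buffer_width_bytes: int = 16) -> int:
    """Number of data-buffer transfers needed to supply one cache block of zeros
    when the zeros travel through a data buffer of the given width."""
    if block_size_bytes <= 0 or buffer_width_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: a partially filled last transfer still counts as one.
    return -(-block_size_bytes // buffer_width_bytes)


def total_transfers(num_cbo_zero: int, address_only: bool) -> int:
    """Data-buffer transfers for a run of cbo.zero instructions.

    address_only=True models a micro-instruction that carries a physical
    address and no data, so nothing passes through the data buffer.
    """
    if address_only:
        return 0
    return num_cbo_zero * data_buffer_transfers()
```

With the default sizes, `data_buffer_transfers()` evaluates to 4, matching the "sent to the LSU 4 times" figure, and a run of 8 prior-art cbo.zero instructions accounts for 32 data-buffer transfers, against none for an address-only micro-instruction.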
In order to solve the above technical problems, the specific technical solutions of the embodiments of this specification are as follows.
In one aspect, an embodiment of this specification provides a method for clearing a cache block, including: obtaining a cache block zero clearing micro-instruction carrying parameters, wherein the parameters comprise a physical address and do not comprise data; writing the physical address into a storage address buffer; matching the corresponding cache block in a cache according to the physical address in the storage address buffer; and writing data 0 into the cache block.
Further, the parameters further include an instruction type; and the method further comprises: writing the instruction type into the storage address buffer.
Further, the cache comprises at least a first-level cache, a second-level cache, and a third-level cache; and the matching of the corresponding cache block in the cache according to the physical address in the storage address buffer comprises: searching the first-level cache for a cache block that matches the physical address in the storage address buffer and is in a writable state.
Further, the method further comprises: if no cache block that matches the physical address in the storage address buffer and is in a writable state exists in the first-level cache, acquiring a first count value of a first counter, wherein the first counter is used for counting executed zero clearing instructions; judging whether the first count value is smaller than a first write threshold, wherein the first write threshold is used for controlling the maximum number of consecutive zero clearing instructions written into the first-level cache; if yes, writing the data 0 into the first-level cache; if not, writing the data