
CN-121986327-A - Snoop filter entry using partial vectors


Abstract

The described technology provides a method comprising: generating a full tracking vector, wherein each bit of the full tracking vector indicates a cache validity state for a coherency granule (cogran) in an agent cache for an associated agent; dividing the full tracking vector into a plurality of partial vectors (PVECs); determining, for each PVEC, whether the cache validity state of at least one bit in the PVEC is set to valid; and, in response to determining that the cache validity state of at least one bit in a given PVEC is set to valid, storing the given PVEC and its PVEC pointer in a tracking_info field of a base snoop filter (SFT) entry for the cogran, wherein the PVEC pointer indicates the location of the given PVEC in the full tracking vector.
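
The splitting step described above can be illustrated with a minimal Python sketch. It assumes the sizes given in claim 8 (128 tracked agents, 32-bit PVECs), and it further assumes that agent i maps to bit i of the full vector and that the PVEC pointer is simply the index of the 32-bit slice; neither the bit ordering nor the function names come from the patent text.

```python
PVEC_BITS = 32   # width of one partial vector (claim 8)
NUM_PVECS = 4    # 128 tracked agents / 32 bits per PVEC

def split_tracking_vector(full_vector: int) -> list[tuple[int, int]]:
    """Return (pvec_pointer, pvec) pairs for every PVEC with at least one valid bit.

    Assumption: agent i is bit i of full_vector, and the pointer is the
    index of the 32-bit slice the PVEC was taken from.
    """
    mask = (1 << PVEC_BITS) - 1
    stored = []
    for pointer in range(NUM_PVECS):
        pvec = (full_vector >> (pointer * PVEC_BITS)) & mask
        if pvec != 0:                      # at least one bit is set to valid
            stored.append((pointer, pvec))
    return stored

def summary_bits(full_vector: int) -> int:
    """One summary bit per PVEC: bit p is 1 when PVEC p has any valid bit."""
    bits = 0
    for pointer, _ in split_tracking_vector(full_vector):
        bits |= 1 << pointer
    return bits

# Hypothetical example: agents 3 and 70 hold a valid copy of the cogran.
vec = (1 << 3) | (1 << 70)
print(split_tracking_vector(vec))   # [(0, 8), (2, 64)]
print(bin(summary_bits(vec)))       # 0b101
```

Only the non-empty slices are kept, which is what lets a sparsely shared cogran be tracked in far fewer than 128 bits.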

Inventors

  • E F Robinson

Assignees

  • Microsoft Technology Licensing, LLC

Dates

Publication Date
20260505
Application Date
20240924
Priority Date
20231025

Claims (20)

  1. A method, comprising: generating a full tracking vector (302-306), wherein each bit of the full tracking vector indicates a cache validity state of a coherency granule (cogran) in an agent cache for an associated agent; dividing the full tracking vector into a plurality of partial vectors (PVECs) (310-314); determining, for each PVEC (310-314), whether the cache validity state of at least one bit in the PVEC (310-314) is set to valid; and in response to determining that the cache validity state of at least one bit in a given PVEC (310-314) is set to valid, storing the given PVEC and its PVEC pointer (806) in a tracking_info field of a base snoop filter (SFT) entry for the cogran, and setting a summary bit associated with the given PVEC to one (1), wherein the PVEC pointer (806) indicates the location of the given PVEC in the full tracking vector.
  2. The method of claim 1, further comprising changing the tracking_mode of the base SFT entry to PVEC.
  3. The method of claim 1, wherein dividing the full tracking vector into a plurality of PVECs further comprises dividing the full tracking vector into four PVECs.
  4. The method of claim 1, further comprising: determining that more than one PVEC of the full tracking vector has at least one bit set to valid; and in response to determining that more than one PVEC of the full tracking vector has at least one bit set to valid, acquiring an available SFT entry as an additional SFT entry, and storing an additional PVEC and its PVEC pointer in the tracking_info field of the additional SFT entry.
  5. The method of claim 4, further comprising changing the entry_state of the base SFT entry to SEARCHABLE_e and changing the entry_state of the additional SFT entry to EXTRA.
  6. The method of claim 1, further comprising: determining that the additional SFT entry cannot store any more PVECs; and in response to determining that the additional SFT entry cannot store any more PVECs, changing the tracking_mode of the base SFT entry to IMPRECISE.
  7. The method of claim 1, further comprising setting the entry_state of the base SFT entry to SEARCHABLE_e in response to determining that more than one PVEC has its summary bit set to one.
  8. The method of claim 1, wherein the full tracking vector tracks 128 agents and each PVEC has 32 bits.
  9. One or more physically manufactured computer-readable storage media encoding computer-executable instructions for executing a computer process on a computer system (1900), the computer process comprising: generating a full tracking vector (302-306), wherein each bit of the full tracking vector indicates a cache validity state of a coherency granule (cogran) in an agent cache for an associated agent; dividing the full tracking vector into a plurality of partial vectors (PVECs) (310-314); determining, for each PVEC (310-314), whether the cache validity state of at least one bit in the PVEC (310-314) is set to valid; and in response to determining that the cache validity state of at least one bit in a given PVEC (310-314) is set to valid, storing the given PVEC and its PVEC pointer (806) in a tracking_info field of a base snoop filter (SFT) entry for the cogran, and setting a summary bit associated with the given PVEC to one (1), wherein the PVEC pointer (806) indicates the location of the given PVEC in the full tracking vector.
  10. The one or more physically manufactured computer-readable storage media of claim 9, wherein the computer process further comprises changing the tracking_mode of the base SFT entry to PVEC.
  11. The one or more physically manufactured computer-readable storage media of claim 9, wherein dividing the full tracking vector into a plurality of PVECs further comprises dividing the full tracking vector into four PVECs.
  12. The one or more physically manufactured computer-readable storage media of claim 9, wherein the computer process further comprises: determining that more than one PVEC of the full tracking vector has at least one bit set to valid; and in response to determining that more than one PVEC of the full tracking vector has at least one bit set to valid, acquiring an available SFT entry as an additional SFT entry, and storing an additional PVEC and its PVEC pointer in the tracking_info field of the additional SFT entry.
  13. The one or more physically manufactured computer-readable storage media of claim 12, wherein the computer process further comprises changing the entry_state of the base SFT entry to SEARCHABLE_e and changing the entry_state of the additional SFT entry to EXTRA.
  14. The one or more physically manufactured computer-readable storage media of claim 9, wherein the computer process further comprises: determining that the additional SFT entry cannot store any more PVECs; and in response to determining that the additional SFT entry cannot store any more PVECs, changing the tracking_mode of the base SFT entry to IMPRECISE.
  15. The one or more physically manufactured computer-readable storage media of claim 9, wherein the computer process further comprises setting the entry_state of the base SFT entry to SEARCHABLE_e in response to determining that more than one PVEC has its summary bit set to one.
  16. A system (1900), comprising: a memory; one or more processor units; and a cache coherence system (1900) (1910) (100) stored in the memory and executable by the one or more processor units, the cache coherence system (1900) (1910) (100) encoding computer-executable instructions on the memory for executing a computer process on the one or more processor units, the computer process comprising: generating a full tracking vector (302-306), wherein each bit of the full tracking vector indicates a cache validity state of a coherency granule (cogran) in an agent cache for an associated agent; dividing the full tracking vector into a plurality of partial vectors (PVECs) (310-314); determining, for each PVEC (310-314), whether the cache validity state of at least one bit in the PVEC (310-314) is set to valid; and in response to determining that the cache validity state of at least one bit in a given PVEC (310-314) is set to valid, storing the given PVEC and its PVEC pointer (806) in a tracking_info field of a base snoop filter (SFT) entry for the cogran, and setting a summary bit associated with the given PVEC to one (1), wherein the PVEC pointer (806) indicates the location of the given PVEC in the full tracking vector.
  17. The system of claim 16, wherein the computer process further comprises changing the tracking_mode of the base SFT entry to PVEC.
  18. The system of claim 16, wherein dividing the full tracking vector into a plurality of PVECs further comprises dividing the full tracking vector into four PVECs.
  19. The system of claim 18, wherein the computer process further comprises: determining that more than one PVEC of the full tracking vector has at least one bit set to valid; and in response to determining that more than one PVEC of the full tracking vector has at least one bit set to valid, acquiring an available SFT entry as an additional SFT entry, and storing an additional PVEC and its PVEC pointer in the tracking_info field of the additional SFT entry.
  20. The system of claim 16, wherein the computer process further comprises storing one or more PVECs in the base SFT entry in response to determining that a cache validity state of one or more of the PVECs is set to valid.
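
The overflow behavior recited in claims 4 to 6 (spilling extra PVECs into an additional SFT entry, and falling back to IMPRECISE tracking when they do not fit) can be sketched as follows. This is an illustrative Python model, not the claimed hardware: the capacities `BASE_CAPACITY` and `EXTRA_CAPACITY`, the plain `SEARCHABLE` state name, the `SftEntry` class, and the choice to fall back to IMPRECISE when no free entry is available are all assumptions that the claims do not specify. The state name from the claims is written here as `SEARCHABLE_e`.

```python
from dataclasses import dataclass, field

BASE_CAPACITY = 1    # assumption: PVECs that fit in a base entry's tracking_info
EXTRA_CAPACITY = 2   # assumption: PVECs that fit in one additional entry

@dataclass
class SftEntry:
    entry_state: str = "SEARCHABLE"    # assumed name for a base entry with no extras
    tracking_mode: str = "PVEC"
    tracking_info: list = field(default_factory=list)  # (pvec_pointer, pvec) pairs

def record_pvecs(pvecs, free_entries):
    """Place non-empty PVECs in a base entry, spilling into one additional entry.

    pvecs        -- list of (pvec_pointer, pvec) pairs, each with a valid bit set
    free_entries -- list of currently unused SftEntry objects (may be empty)
    Returns (base_entry, additional_entry_or_None).
    """
    base = SftEntry()
    base.tracking_info = list(pvecs[:BASE_CAPACITY])
    overflow = list(pvecs[BASE_CAPACITY:])
    if not overflow:
        return base, None                      # a single PVEC fits in the base entry
    if len(overflow) > EXTRA_CAPACITY or not free_entries:
        # The additional entry cannot hold the remaining PVECs (claim 6),
        # or none is available (assumption): give up precise tracking.
        base.tracking_mode = "IMPRECISE"
        base.tracking_info = []
        return base, None
    extra = free_entries.pop()                 # acquire an available SFT entry (claim 4)
    extra.entry_state = "EXTRA"                # claim 5
    extra.tracking_info = overflow
    base.entry_state = "SEARCHABLE_e"          # claim 5
    return base, extra
```

For instance, calling `record_pvecs([(0, 8), (2, 64)], [SftEntry()])` under these assumed capacities leaves PVEC 0 in the base entry and PVEC 2 in the additional entry.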

Description

Snoop filter entry using partial vectors

Background

A processor-based device may include a plurality of processing elements (PEs) (e.g., processor cores, as a non-limiting example), each PE providing one or more local caches for storing frequently accessed data. Because multiple PEs of a processor-based device may share memory resources such as system memory, multiple copies of shared data read from a given memory address may exist simultaneously in the system memory and in the local caches of the PEs. Thus, to ensure that all PEs have a consistent view of shared data, processor-based devices support cache coherency protocols that enable local changes to shared data within one PE to be propagated to other PEs.

Disclosure of Invention

The described technology provides a method comprising: generating a full tracking vector, wherein each bit of the full tracking vector indicates a cache validity state for a coherency granule (cogran) in an agent cache for an associated agent; dividing the full tracking vector into a plurality of partial vectors (PVECs); determining, for each PVEC, whether the cache validity state of at least one bit in the PVEC is set to valid; and, in response to determining that the cache validity state of at least one bit in a given PVEC is set to valid, storing the given PVEC and its PVEC pointer in a tracking_info field of a base snoop filter (SFT) entry for the cogran, wherein the PVEC pointer indicates the location of the given PVEC in the full tracking vector.

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Other implementations are also described and recited herein.

Drawings

FIG. 1 illustrates an implementation of a system that provides cache coherency using snoop filters.
FIG. 2 illustrates an example structure of snoop filter entries implementing the techniques disclosed herein.
FIG. 3 illustrates an example representation of a full tracking vector of the cache coherency system disclosed herein.
FIG. 4 illustrates example entry states of logical SFT entries of the cache coherency system disclosed herein.
FIG. 5 illustrates example tracking modes of logical SFT entries of the cache coherency system disclosed herein.
FIG. 6 illustrates example values of the tracking_info field of the base SFT entry of the cache coherency system disclosed herein.
FIG. 7 illustrates example values of the tracking_info field of the base SFT entry of the cache coherency system disclosed herein.
FIG. 8 illustrates example values of the tracking_info field of additional SFT entries of the cache coherency system disclosed herein.
FIG. 9 illustrates example operations for the case when an agent newly caches a copy of a cogran that is currently tracked by the SFT.
FIG. 10 illustrates example operations for the case when the SFT determines that it cannot accurately track which agents hold a cogran that the SFT is tracking.
FIG. 11 illustrates example operations for the case when the SFT attempts to change the tracking_mode of an SFT entry to PVEC while adding a new agent to its tracking for a cogran.
FIG. 12 illustrates example operations for recording one or more PVECs into an SFT.
FIG. 13 illustrates example operations for the case when the tracking_mode of an SFT entry is PVEC and the SFT needs to add an agent to its tracking for that SFT entry.
FIG. 14 illustrates example operations to form a full tracking vector for an SFT entry when the SFT updates the SFT entry.
FIG. 15 illustrates example operations for the case when an SFT needs to update an existing logical SFT entry to remove an agent from its tracking.
FIG. 16 illustrates example operations for the case when the tracking_mode of an SFT entry is PVEC and the SFT needs to remove an agent from its tracking for that SFT entry.
FIG. 17 illustrates example operations for the case when a cogran is accessed and an SFT lookup is performed to determine whether snooping is required.
FIG. 18 illustrates example operations for the case when an agent may require exclusive access to a cogran in order to invalidate all other cached copies.
FIG. 19 illustrates an example system that can be used to implement the cache coherency system disclosed herein.

Detailed Description

Implementations disclosed herein provide multiprocessor systems employing hardware (HW)-enforced cache coherency, wherein when an agent, such as a CPU or GPU, wants to access a memory location, the HW automatically determines whether another agent currently holds a cached copy of that memory location. If the access is a read and the memory location is cached by another agent, the copy in system memory may be stale, in which case the access must be satisfied by obtaining the data from the other agent's cache. If the access is a write, then other cached copies must typically