US-12625705-B1 - Cache line hold state for multiprocessing computing systems

US 12625705 B1

Abstract

A present invention embodiment provides a cache line hold state that reduces cache contention. A first processor executes a store instruction to acquire a lock on a cache line in a shared memory system of a multiprocessor computing system. The first processor sets the cache line to an exclusive state in a local cache of the first processor upon successful acquisition of the lock. A lock state indicative of a windowed hold on the cache line is recorded by the first processor, wherein the windowed hold enables non-exclusive fetch requests from one or more other processors.
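For orientation, the lock that the claimed hold states accelerate can be pictured as an ordinary compare-and-swap spinlock. The following C sketch (not part of the patent disclosure; all names are illustrative) shows only the software-visible side: the exclusive, windowed-hold, and long-hold cache states are recorded by hardware beneath this interface.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Software view of the lock the patented mechanism accelerates: a
 * compare-and-swap spinlock.  The cache line holding `word` is what
 * the hardware places in an exclusive or windowed-hold state. */
typedef struct {
    atomic_int word;   /* 0 = free, 1 = held */
} spinlock_t;

/* One CAS attempt: the "store instruction to acquire a lock".
 * Returns true on successful acquisition. */
static bool lock_try_acquire(spinlock_t *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&l->word, &expected, 1);
}

/* Releasing the lock corresponds to the "second store instruction",
 * after which the recorded hold state can be cleared. */
static void lock_release(spinlock_t *l)
{
    atomic_store(&l->word, 0);
}
```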

Inventors

  • Michael Fee
  • Andrew Walter Piechowski
  • Deanna Postles Dunn Berger
  • Peter Kenneth Szwed
  • Seth E. Lederer
  • Aaron Tsai
  • Timothy J Slegel
  • Jason D Kohl
  • Robert J Sonnelitter, III
  • Gregory William Alexander

Assignees

  • INTERNATIONAL BUSINESS MACHINES CORPORATION

Dates

Publication Date
2026-05-12
Application Date
2024-12-13

Claims (20)

  1. A computer-implemented method comprising: executing, by a first processor, a store instruction to acquire a lock on a cache line in a shared memory system of a multiprocessor computing system; setting, by the first processor, the cache line to an exclusive state in a local cache of the first processor upon successful acquisition of the lock; and recording a lock state indicative of a windowed hold on the cache line by the first processor, wherein the windowed hold enables non-exclusive fetch requests from one or more other processors.
  2. The method of claim 1, further comprising: receiving a non-exclusive fetch request from a second processor; in response to receiving the non-exclusive fetch request, demoting, by the first processor, the cache line to read-only; and returning, by the first processor, data of the cache line to the second processor.
  3. The method of claim 2, wherein the returning further comprises returning an indication that the cache line is held in the windowed hold by the first processor.
  4. The method of claim 1, further comprising: in response to receiving a non-exclusive fetch request from a second processor, setting, by the first processor, the cache line to a primed-for-long-hold state, wherein the primed-for-long-hold state causes an exclusive fetch request for the cache line by the one or more other processors to be rejected and permits a read-only request for the cache line by the one or more other processors.
  5. The method of claim 4, further comprising: in response to a write request being issued by the first processor, reacquiring, by the first processor, the cache line in the exclusive state; and recording the lock state indicative of a long-hold on the cache line by the first processor, wherein the lock state indicative of the long-hold causes read and write requests to the cache line by the one or more other processors to be rejected.
  6. The method of claim 5, further comprising: completing, by the first processor, a second store instruction and in response, releasing the lock and clearing the lock state.
  7. The method of claim 1, wherein the store instruction comprises a hint instruction on a compare and swap instruction.
  8. The method of claim 1, wherein the store instruction comprises a Next Instruction Access Intent (NIAI) hint instruction preceding the store instruction.
  9. A computer system comprising: a processor set; one or more computer-readable storage media; and program instructions stored on the one or more computer-readable storage media to cause the processor set to perform operations comprising: executing, by a first processor, a store instruction to acquire a lock on a cache line in a shared memory system of a multiprocessor computing system; setting, by the first processor, the cache line to an exclusive state in a local cache of the first processor upon successful acquisition of the lock; and recording a lock state indicative of a windowed hold on the cache line by the first processor, wherein the windowed hold enables non-exclusive fetch requests from one or more other processors.
  10. The computer system of claim 9, wherein the program instructions further cause the processor set to perform operations comprising: receiving a non-exclusive fetch request from a second processor; in response to receiving the non-exclusive fetch request, demoting, by the first processor, the cache line to read-only; and returning, by the first processor, data of the cache line to the second processor.
  11. The computer system of claim 10, wherein the returning further comprises returning an indication that the cache line is held in the windowed hold by the first processor.
  12. The computer system of claim 9, wherein the program instructions further cause the processor set to perform operations comprising: in response to receiving a non-exclusive fetch request from a second processor, setting, by the first processor, the cache line to a primed-for-long-hold state, wherein the primed-for-long-hold state causes an exclusive fetch request for the cache line by the one or more other processors to be rejected and permits a read-only request for the cache line by the one or more other processors.
  13. The computer system of claim 12, wherein the program instructions further cause the processor set to perform operations comprising: in response to a write request being issued by the first processor, reacquiring, by the first processor, the cache line in the exclusive state; and recording the lock state indicative of a long-hold on the cache line by the first processor, wherein the lock state indicative of the long-hold causes read and write requests to the cache line by the one or more other processors to be rejected.
  14. The computer system of claim 13, wherein the program instructions further cause the processor set to perform operations comprising: completing, by the first processor, a second store instruction and in response, releasing the lock and clearing the lock state.
  15. The computer system of claim 9, wherein the store instruction comprises a hint instruction on a compare and swap instruction.
  16. The computer system of claim 9, wherein the store instruction comprises a Next Instruction Access Intent (NIAI) hint instruction preceding the store instruction.
  17. A method comprising: executing, by a first processor, a non-exclusive fetch request to fetch data from a cache line that is held in a windowed hold state by a second processor, wherein the windowed hold state enables the second processor to provide data in the cache line to the first processor; in response to receiving the non-exclusive fetch request, demoting, by the second processor, the cache line to read-only; and obtaining, by the second processor, the data in the cache line that is subject to the non-exclusive fetch request.
  18. The method of claim 17, further comprising: in response to receiving the non-exclusive fetch request, receiving, by one or more other processors, an indication that the cache line is held in the windowed hold state by the second processor.
  19. The method of claim 18, wherein the indication prevents the one or more other processors from issuing a fetch request for the cache line.
  20. The method of claim 17, wherein the second processor sets the cache line to a primed-for-long-hold state in response to receiving the non-exclusive fetch request from the first processor, wherein the primed-for-long-hold state causes an exclusive fetch request for the cache line by one or more other processors to be rejected and permits a read-only request for the cache line by the one or more other processors.
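The transitions recited in claims 1 through 6 can be summarized as a small state machine. The C model below is an editorial illustration only: the enum and function names mirror the claim language, but an actual implementation lives in cache-controller hardware, not software.

```c
#include <stdbool.h>

/* Illustrative model of the hold-state transitions in claims 1-6. */
typedef enum {
    LINE_INVALID,                   /* lock released, hold state cleared */
    LINE_EXCLUSIVE_WINDOWED_HOLD,   /* claim 1: exclusive + windowed hold */
    LINE_READ_ONLY_PRIMED,          /* claims 2 and 4: demoted read-only,
                                       primed for a long hold */
    LINE_EXCLUSIVE_LONG_HOLD,       /* claim 5: reacquired exclusive */
} line_state_t;

/* Claims 2 and 4: a non-exclusive fetch from another core demotes the
 * windowed-held line to read-only (priming it for a long hold) and the
 * data is returned; a long-held line rejects the request. */
line_state_t on_nonexclusive_fetch(line_state_t s, bool *data_returned)
{
    if (s == LINE_EXCLUSIVE_WINDOWED_HOLD) {
        *data_returned = true;
        return LINE_READ_ONLY_PRIMED;
    }
    *data_returned = (s != LINE_EXCLUSIVE_LONG_HOLD);
    return s;
}

/* Claim 5: a write by the holding core reacquires the line exclusive
 * and records a long-hold, rejecting reads and writes by others. */
line_state_t on_owner_write(line_state_t s)
{
    return (s == LINE_READ_ONLY_PRIMED) ? LINE_EXCLUSIVE_LONG_HOLD : s;
}

/* Claim 6: completing the second store releases the lock and clears
 * the recorded lock state. */
line_state_t on_second_store(line_state_t s)
{
    (void)s;
    return LINE_INVALID;
}
```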

Description

BACKGROUND

1. Technical Field

Present invention embodiments relate to multiprocessing computing systems, and more specifically, to cache line hold states for multiprocessing computing systems.

2. Discussion of the Related Art

In a multiprocessing computing system, some instructions can synchronize events across the processors. One type of synchronizing event performs a lock that permits only one core or thread to perform a particular task at a given time. This type of instruction may require a core or thread to exclusively hold a memory location (e.g., a cache line), which prevents the other cores or threads from accessing the memory location. When other processors compete to access a cache line at the same time, such locks can introduce significant delays in referencing and/or manipulating memory, a process known as cache line contention.

SUMMARY

According to one embodiment of the present invention, a system is provided utilizing a cache line hold state that reduces cache contention. A first processor executes a store instruction to acquire a lock on a cache line in a shared memory system of a multiprocessor computing system. The first processor sets the cache line to an exclusive state in a local cache of the first processor upon successful acquisition of the lock. A lock state indicative of a windowed hold on the cache line is recorded by the first processor, wherein the windowed hold enables non-exclusive fetch requests from one or more other processors. Embodiments of the present invention further include methods for utilizing a cache line hold state that reduces cache contention in substantially the same manner described above.

BRIEF DESCRIPTION OF THE DRAWINGS

Generally, like reference numerals in the various figures are utilized to designate like components.

FIG. 1 is a diagrammatic illustration of an example computing environment according to an embodiment of the present invention.

FIG. 2 is a block diagram of application code according to an embodiment of the present invention.

FIG. 3 is a block diagram of a multiprocessing computing environment according to an embodiment of the present invention.

FIGS. 4A-4D are block diagrams of a multiprocessing computing environment in which a windowed hold is performed according to an embodiment of the present invention.

FIG. 5 is a flowchart of a method for utilizing a windowed hold according to an embodiment of the present invention.

DETAILED DESCRIPTION

A present invention embodiment relates to multiprocessing computing systems, and more specifically, to cache line hold states for multiprocessing computing systems. A cache line hold state refers to a status of a cache line, which is a portion of a multiprocessing system's cache memory. In order to maintain cache coherency across multiple processors, various cache line states can determine which processors can perform operations with respect to the subject cache line, such as read or write operations. A cache line can be held by a core in any of various states, including an exclusive state (in which only the holding processor can access or modify the cache line), a read-only state (in which other processors may read but not modify the cache line), and/or other states or variations thereof. In addition to determining whether cores can read, write, and/or share cache lines, cache line states can also indicate whether data in a cache line is up-to-date, and can enforce cache coherence rules.

One cache line state is a long-hold state, which can be utilized in cases where a serialized semaphore lock is contested. Specifically, a long-hold state enables a core to hold a cache line as exclusive for a length of time, after which the core can release the lock without having to reacquire the line.
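The work-queue scenario in which long-holds hurt can be sketched in C: when a core can still read a held line non-exclusively, it can observe that an element's lock word is busy and simply move on to the next element rather than stalling. The queue layout and names below are hypothetical, chosen only to illustrate the pattern.

```c
#include <stdatomic.h>
#include <stddef.h>

#define QUEUE_LEN 8

/* One element of a shared work queue; `lock` occupies the contended
 * cache line.  (Illustrative layout, not from the patent.) */
typedef struct {
    atomic_int lock;   /* 0 = free, 1 = held by some core */
    int payload;
} work_item_t;

/* Scan for the first claimable item, or return -1 if all are held.
 * The plain atomic load is a non-exclusive read of the lock word:
 * under a windowed hold it can complete without stealing the line
 * from the holder, letting the scanner skip busy elements cheaply. */
int find_free_item(work_item_t q[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (atomic_load(&q[i].lock) == 0) {
            int expected = 0;
            if (atomic_compare_exchange_strong(&q[i].lock, &expected, 1))
                return (int)i;
        }
    }
    return -1;   /* every element busy: caller can retry or do other work */
}
```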
However, conventional long-hold states may be counterproductive because the cache line is held as exclusive for a longer duration than is necessary (i.e., a length of time that is detrimental to other cores or threads that may attempt to observe the value of the lock). When a work queue involves several cores moving through the queue simultaneously, there can be contention over the same cache line that is subject to a long-hold.

The embodiments presented herein address this problem by providing a mechanism in which contesting cores can move on to another work element should they encounter a busy (e.g., long-held) cache line, which also minimizes interference with the process owned by the holding core. In particular, a new hold, referred to herein as a windowed hold, is provided that enables other cores to access the data of a held cache line, and optionally, to inform other cores that the cache line is held. Thus, present embodiments reduce cache line contention by enabling other cores to move forward in their processing queues rather than becoming stuck until the long-hold is released. Present invention embodiments therefore improve the technical field of computing by improving the performance of multiprocessing