US-20260127051-A1 - APPLICATION PROGRAMMING INTERFACE TO IDENTIFY MEMORY
Abstract
Apparatuses, systems, and techniques to execute one or more application programming interface (API) functions to facilitate parallel computing. In at least one embodiment, one or more APIs are to indicate one or more storage locations using various novel techniques described herein.
Inventors
- Fnu Vishnuswaroop Ramesh
- Houston Thompson Hoffman
Assignees
- NVIDIA CORPORATION
Dates
- Publication Date
- 20260507
- Application Date
- 20250905
Claims (20)
- 1-28. (canceled)
- 29. One or more processors, comprising: circuitry to: in response to an application programming interface (API) call: identify whether a pool of memory addresses comprises an allocated portion of memory based, at least in part, on a memory address and metadata associated with one or more pools of memory addresses, wherein the memory address is indicated by an input parameter of the API call; and indicate, as an output of the API call, an identifier of the pool of memory addresses based on whether the pool of memory addresses comprises the allocated portion of memory, wherein the identifier is different from the memory address.
- 30. The one or more processors of claim 29, wherein the metadata associated with the one or more pools of memory addresses comprises information indicating allocation status, memory type, or access permissions for the one or more pools of memory addresses.
- 31. The one or more processors of claim 29, wherein the identifier of the pool of memory addresses is represented as a handle that uniquely identifies the pool of memory addresses within a memory management system.
- 32. The one or more processors of claim 29, wherein the API call further comprises an input parameter indicating a memory operation type that comprises allocation, deallocation, or querying memory attributes.
- 33. The one or more processors of claim 29, wherein the circuitry is further configured to perform a validation check on the memory address indicated by the input parameter to determine whether the memory address conforms to a predefined memory address format.
- 34. The one or more processors of claim 29, wherein the identifier of the pool of memory addresses is used to facilitate inter-process communication (IPC) between software modules executing on different processors.
- 35. The one or more processors of claim 29, wherein the circuitry is to, in response to the API call, return a null identifier in response to determining that the memory address indicated by the input parameter does not belong to any pool of memory addresses.
- 36. The one or more processors of claim 29, wherein the metadata associated with the one or more pools of memory addresses includes historical usage statistics that comprise peak memory allocation or average memory utilization.
- 37. A system, comprising: one or more processors to: in response to an application programming interface (API) call: identify whether a pool of memory addresses comprises an allocated portion of memory based, at least in part, on a memory address and metadata associated with one or more pools of memory addresses, wherein the memory address is indicated by an input parameter of the API call; and indicate, as an output of the API call, an identifier of the pool of memory addresses based on whether the pool of memory addresses comprises the allocated portion of memory, wherein the identifier is different from the memory address.
- 38. The system of claim 37, wherein the metadata associated with the one or more pools of memory addresses comprises information indicating allocation status, memory type, or access permissions for the one or more pools of memory addresses.
- 39. The system of claim 37, wherein the identifier of the pool of memory addresses is represented as a handle that uniquely identifies the pool of memory addresses within a memory management system.
- 40. The system of claim 37, wherein the API call further comprises an input parameter indicating a memory operation type that comprises allocation, deallocation, or querying memory attributes.
- 41. The system of claim 37, wherein the one or more processors further perform a validation check on the memory address indicated by the input parameter to determine whether the memory address conforms to a predefined memory address format.
- 42. The system of claim 37, wherein the identifier of the pool of memory addresses is used to facilitate inter-process communication (IPC) between software modules executing on different processors.
- 43. The system of claim 37, wherein the one or more processors are to, in response to the API call, return a null identifier in response to determining that the memory address indicated by the input parameter does not belong to any pool of memory addresses.
- 44. The system of claim 37, wherein the metadata associated with the one or more pools of memory addresses includes historical usage statistics that comprise peak memory allocation or average memory utilization.
- 45. A method, comprising: in response to an application programming interface (API) call: identifying whether a pool of memory addresses comprises an allocated portion of memory based, at least in part, on a memory address and metadata associated with one or more pools of memory addresses, wherein the memory address is indicated by an input parameter of the API call; and indicating, as an output of the API call, an identifier of the pool of memory addresses based on whether the pool of memory addresses comprises the allocated portion of memory, wherein the identifier is different from the memory address.
- 46. The method of claim 45, wherein the metadata associated with the one or more pools of memory addresses comprises information indicating allocation status, memory type, or access permissions for the one or more pools of memory addresses.
- 47. The method of claim 45, wherein the identifier of the pool of memory addresses is represented as a handle that uniquely identifies the pool of memory addresses within a memory management system.
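The behavior recited in independent claims 29, 37, and 45 — mapping an input memory address to a pool identifier distinct from that address, or a null identifier when no allocated pool owns the address — can be sketched as follows. This is an illustrative model only; the class and function names (`MemPool`, `pool_from_address`, the `"allocated"` metadata key) are hypothetical and do not correspond to any actual driver or runtime API.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class MemPool:
    handle: int        # pool identifier; distinct from any memory address
    base: int          # start of the pool's reserved address range
    size: int          # size of the reserved range, in bytes
    metadata: dict = field(default_factory=dict)  # e.g. allocation status

    def contains(self, addr: int) -> bool:
        # An address belongs to this pool if it falls in [base, base+size).
        return self.base <= addr < self.base + self.size

def pool_from_address(pools: list[MemPool], addr: int) -> Optional[int]:
    """Return the handle of the allocated pool that owns `addr`, or
    None (a 'null identifier') when no registered pool owns it."""
    for pool in pools:
        if pool.contains(addr) and pool.metadata.get("allocated", False):
            return pool.handle
    return None
```

For example, with a single pool `MemPool(7, 0x1000, 0x100, {"allocated": True})`, querying address `0x1010` yields handle `7`, while any address outside the range (or in a pool whose metadata does not mark it allocated) yields `None`, mirroring the null-identifier behavior of dependent claims 35 and 43.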
Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 17/720,179, entitled “APPLICATION PROGRAMMING INTERFACE TO IDENTIFY MEMORY,” filed Apr. 13, 2022, which claims the benefit of U.S. Provisional Application No. 63/174,895, entitled “ENHANCEMENTS TO STREAM ORDERED ALLOCATORS,” filed Apr. 14, 2021, the entire contents of which are incorporated herein by reference.

FIELD

At least one embodiment pertains to processing resources used to execute one or more application programming interfaces (APIs) to facilitate parallel computing. For example, at least one embodiment pertains to processors or computing systems used to execute one or more programs that implement one or more APIs to facilitate parallel computing comprising various novel techniques described herein.

BACKGROUND

While the development of various accelerators (e.g., graphics processing units (GPUs)) has provided numerous advantages, with these advantages comes greater complexity. Generally, different programming models create complexities that, if not managed effectively, can result in less than optimal performance. As one example, memory management can become complex, especially in contexts where one processor causes another processor to perform operations, such as by running kernels. Some techniques to address these issues either over-allocate or under-allocate memory, potentially causing inefficiencies and/or performance issues.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating a driver and/or runtime comprising one or more libraries to provide one or more application programming interfaces (APIs), in accordance with at least one embodiment;
FIG. 2 is a block diagram illustrating an API to determine if a memory address is included in one or more memory blocks to be used by one or more computing devices, in accordance with at least one embodiment;
FIG. 3 is a block diagram illustrating an API to determine if one or more memory blocks to be used by one or more computing devices are sharable during performance of one or more software modules, in accordance with at least one embodiment;
FIG. 4 is a block diagram illustrating an API to determine one or more attributes of one or more memory blocks to be used by one or more computing devices, in accordance with at least one embodiment;
FIG. 5 illustrates a process to determine, by one or more APIs, if a memory address is included in one or more memory blocks to be used by one or more computing devices, in accordance with at least one embodiment;
FIG. 6 illustrates a process to determine, by one or more APIs, if one or more memory blocks to be used by one or more computing devices are to be shared and/or have been shared during performance of one or more software modules, in accordance with at least one embodiment;
FIG. 7 illustrates a process to determine, in response to one or more calls to one or more APIs, one or more attributes of one or more memory blocks to be used by one or more computing devices, in accordance with at least one embodiment;
FIG. 8 illustrates an exemplary data center, in accordance with at least one embodiment;
FIG. 9 illustrates a processing system, in accordance with at least one embodiment;
FIG. 10 illustrates a computer system, in accordance with at least one embodiment;
FIG. 11 illustrates a system, in accordance with at least one embodiment;
FIG. 12 illustrates an exemplary integrated circuit, in accordance with at least one embodiment;
FIG. 13 illustrates a computing system, according to at least one embodiment;
FIG. 14 illustrates an APU, in accordance with at least one embodiment;
FIG. 15 illustrates a CPU, in accordance with at least one embodiment;
FIG. 16 illustrates an exemplary accelerator integration slice, in accordance with at least one embodiment;
FIGS. 17A and 17B illustrate exemplary graphics processors, in accordance with at least one embodiment;
FIG. 18A illustrates a graphics core, in accordance with at least one embodiment;
FIG. 18B illustrates a GPGPU, in accordance with at least one embodiment;
FIG. 19A illustrates a parallel processor, in accordance with at least one embodiment;
FIG. 19B illustrates a processing cluster, in accordance with at least one embodiment;
FIG. 19C illustrates a graphics multiprocessor, in accordance with at least one embodiment;
FIG. 20 illustrates a graphics processor, in accordance with at least one embodiment;
FIG. 21 illustrates a processor, in accordance with at least one embodiment;
FIG. 22 illustrates a processor, in accordance with at least one embodiment;
FIG. 23 illustrates a graphics processor core, in accordance with at least one embodiment;
FIG. 24 illustrates a PPU, in accordance with at least one embodiment;
FIG. 25 illustrates a GPC, in accordance with at least one embodiment;
FIG. 26 illustrates a streaming multiprocessor, in accordance with at least one embodiment;
FIG. 27 illustrates a software stack of a programming platform, in accordance with at least one embodiment;
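FIGS. 2-4 describe APIs that, given a memory block or pool, report properties such as address membership, sharability during performance of software modules, and other attributes. A minimal sketch of such an attribute query follows; the names (`PoolRegistry`, `get_attribute`, the `sharable` and `peak_bytes` attributes) are hypothetical illustrations, not the APIs disclosed in the embodiments.

```python
class PoolRegistry:
    """Toy registry mapping a pool handle to named attributes such as
    sharability or usage statistics (illustrative names only)."""

    def __init__(self) -> None:
        self._attrs: dict[int, dict[str, object]] = {}

    def register(self, handle: int, **attrs: object) -> None:
        # Record the attributes the runtime would track for this pool.
        self._attrs[handle] = dict(attrs)

    def get_attribute(self, handle: int, name: str):
        # An unknown handle or attribute yields None, mirroring a null
        # result from an attribute-query API call.
        return self._attrs.get(handle, {}).get(name)

registry = PoolRegistry()
registry.register(7, sharable=True, peak_bytes=4096)
```

A caller holding handle `7` could then ask `registry.get_attribute(7, "sharable")` to decide whether the pool may be shared across software modules, without ever handling raw addresses, which is the separation the claims draw between a pool identifier and the memory addresses it covers.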