US-20260127119-A1 - MEMORY CONTROLLER FOR PROCESSING IN MEMORY AND MEMORY GENERATION METHOD USING MEMORY CONTROLLER
Abstract
A processing in memory (PIM) command generator for a PIM device includes: an input buffer, wherein the PIM command generator is configured to: store input data into the input buffer in response to receiving, from a host, a first PIM request indicating to write the input data; receive, from the host, a second PIM request indicating a PIM operation between the input data and data stored in the PIM device; in response to the second PIM request, scan elements of the input data stored in the input buffer to generate a PIM command corresponding to a non-zero element among the input data and to skip generating a PIM command corresponding to a zero element; and transmit the generated PIM command to the PIM device.
Inventors
- Seungwoo Seo
- Sunjung Lee
- Yeongon CHO
Assignees
- SAMSUNG ELECTRONICS CO., LTD.
Dates
- Publication Date: 2026-05-07
- Application Date: 2025-11-03
- Priority Date: 2024-11-05
Claims (20)
- 1. A processing in memory (PIM) command generator for a PIM device comprising: an input buffer, wherein the PIM command generator is configured to: store input data into the input buffer in response to receiving, from a host, a first PIM request indicating to write the input data; receive, from the host, a second PIM request indicating a PIM operation between the input data and data stored in the PIM device; in response to the second PIM request, scanning elements of the input data stored in the input buffer to generate a PIM command corresponding to a non-zero element among the input data and to skip generating a PIM command corresponding to a zero element; and transmit the generated PIM command to the PIM device.
- 2. The PIM command generator of claim 1, further comprising: one or more registers configured to store scanning information comprising tiling information of the PIM device, hardware information of the PIM device, or address mapping information, wherein the scanning is performed based on the scanning information.
- 3. The PIM command generator of claim 1, wherein the second PIM request indicates a matrix-vector multiplication operation between a matrix stored in memory of the PIM device and an input vector stored as input data in the input buffer of the PIM device, and wherein the memory of the PIM device performs the matrix-vector multiplication operation.
- 4. The PIM command generator of claim 3, wherein the PIM command generator is configured to: in response to the second PIM request, determine, based on a first element among the elements of the input vector being non-zero, a memory address of at least a portion of the matrix to be multiplied by the first element; and based on a second element among the elements of the input vector being zero, skip determining a memory address of at least a portion of the matrix to be multiplied by the second element among the memory of the PIM device.
- 5. The PIM command generator of claim 4, wherein the PIM command generator is configured to: based on tiling information of the PIM device, divide the matrix into one or more memory tiles; with respect to each memory tile having an element to be multiplied by the first element among the one or more memory tiles, generate one PIM command indicating multiplication between the first element and a column of a corresponding memory tile corresponding to the first element and indicating accumulation of each multiplication result into an element of a corresponding output vector.
- 6. The PIM command generator of claim 1, wherein the PIM command generator is configured to: based on information indicating each non-zero element among the input data and based on information about the PIM device, determine an address of a memory area corresponding to a corresponding element among memory of the PIM device; and generate one or more PIM commands using the element and the address of the memory area.
- 7. The PIM command generator of claim 1, wherein the PIM command generator is configured to transmit one or more generated PIM commands to a memory command queue.
- 8. An electronic device comprising: a processing in memory (PIM) command generator configured to generate, based on a PIM request received from a host, one or more PIM commands configured to implement the PIM request when executed by a PIM device, wherein the PIM command generator comprises an input buffer, and wherein the PIM command generator is configured to: store the input data in the input buffer in response to receiving a first PIM request indicating to write input data; receive a second PIM request indicating a PIM operation between the input data and data stored in a PIM device; in response to the second PIM request, scanning elements of the input data stored in the input buffer to determine to skip generating a PIM command corresponding to a first element of the input data based on the first element being zero and to determine to generate a PIM command corresponding to a second element of the input data based on the second element being non-zero; and transmit the generated PIM command to the PIM device.
- 9. The electronic device of claim 8, further comprising: an arbiter configured to determine whether to classify a memory request received from the host as a standard memory request or as a PIM request; and a standard command generator, wherein the arbiter is configured to: based on classifying a first memory request as a standard memory request, transmit the first memory request to the standard command generator; and based on classifying a second memory request as a PIM request, transmit the second memory request to the PIM command generator, wherein the standard command generator is configured to generate a standard memory command based on receiving the first memory request from the arbiter, and wherein the PIM command generator is configured to generate the one or more PIM commands based on receiving the second memory request from the arbiter.
- 10. The electronic device of claim 8, wherein the PIM command generator further comprises one or more registers configured to store scanning information comprising tiling information of the PIM device, hardware information of the PIM device, or address mapping information, wherein the scanning is performed based on the scanning information.
- 11. The electronic device of claim 8, wherein the second PIM request indicates a matrix-vector multiplication operation between a matrix stored in a memory of the PIM device and an input vector stored as input data in the input buffer.
- 12. The electronic device of claim 11, wherein generating the PIM command comprises determining a memory address of at least a portion of the matrix to be multiplied by the first element.
- 13. The electronic device of claim 12, wherein the PIM command generator is configured to: based on tiling information of the PIM device, divide the matrix into one or more memory tiles; and with respect to each memory tile having an element to be multiplied by the first element among the one or more memory tiles, generate one respective PIM command indicating multiplication between the first element and a column of a corresponding memory tile corresponding to the first element and indicating accumulation of each multiplication result into an element of a corresponding output vector.
- 14. The electronic device of claim 8, wherein the PIM command generator is configured to: determine, based on information indicating each non-zero element among the input data, information about the PIM device, an address of a memory area corresponding to a corresponding element, the memory area in memory of the PIM device; and generate one or more PIM commands using the element and an address of the memory area.
- 15. The electronic device of claim 8, further comprising: a memory command queue, wherein the PIM command generator is configured to transmit the generated one or more PIM commands to the memory command queue.
- 16. The electronic device of claim 8, further comprising: a standard command generator; a standard memory command queue; and a PIM command queue, wherein the PIM command generator is configured to transmit the generated one or more PIM commands to the PIM command queue, and wherein the standard command generator is configured to generate, based on a standard memory request received from the host, a standard memory command and then transmit the generated standard memory command to the standard memory command queue.
- 17. The electronic device of claim 16, further comprising: a scheduler connected to the standard memory command queue and the PIM command queue, wherein the scheduler is configured to: determine, through scheduling, one queue among the standard memory command queue or the PIM command queue; and transmit, to the PIM device, at least one of the standard memory command or the PIM command received from the determined one queue.
- 18. A method of generating a memory command, the method performed by a processing in memory (PIM) command generator configured to generate commands to be executed by a PIM device, the method comprising: storing input data into an input buffer of the PIM command generator, the storing in response to receiving, from a host, by the PIM command generator, a first PIM request indicating to write input data; receiving, from the host, by the PIM command generator, a second PIM request indicating a PIM operation between the input data and data stored in the PIM device; in response to the second PIM request, scanning elements of the input data stored in the input buffer, and based on the scanning generating a PIM command corresponding to a non-zero element among the input data and further based on the scanning skipping generating a PIM command corresponding to a zero element; and transmitting, by the PIM command generator, the generated PIM command to the PIM device.
- 19. The method of claim 18, wherein the second PIM request is configured to indicate a matrix-vector multiplication operation between a matrix stored in a memory of the PIM device and the input vector stored as the input data in the input buffer.
- 20. A non-transitory computer-readable storage medium storing commands that, when executed by one or more processors, cause the one or more processors to perform the method of claim 18.
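As a non-limiting illustration of the tiled matrix-vector multiplication recited in claims 3 to 5, the following Python sketch emits one command per memory tile for each non-zero input element and none for zero elements. The command format (`TileCommand`), the row-wise tiling controlled by `tile_rows`, and the software `execute` function standing in for the PIM device are assumptions made for illustration and are not taken from the disclosure.

```python
from typing import List, NamedTuple, Sequence


class TileCommand(NamedTuple):
    """Hypothetical command: scale one column of one tile and accumulate."""
    row_start: int   # first matrix row covered by the tile
    row_end: int     # one past the last matrix row covered by the tile
    column: int      # matrix column paired with the input element
    value: float     # the non-zero input element


def generate_commands(x: Sequence[float], num_rows: int,
                      tile_rows: int) -> List[TileCommand]:
    """Emit one command per (non-zero element, row tile) pair."""
    commands = []
    for column, value in enumerate(x):
        if value == 0:
            continue  # zero element: no address is determined, no command emitted
        for row_start in range(0, num_rows, tile_rows):
            row_end = min(row_start + tile_rows, num_rows)
            commands.append(TileCommand(row_start, row_end, column, value))
    return commands


def execute(commands: Sequence[TileCommand],
            matrix: Sequence[Sequence[float]]) -> List[float]:
    """Software stand-in for the PIM device: multiply and accumulate."""
    y = [0.0] * len(matrix)
    for cmd in commands:
        for row in range(cmd.row_start, cmd.row_end):
            y[row] += matrix[row][cmd.column] * cmd.value
    return y


matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0],
          [7.0, 8.0, 9.0],
          [1.0, 0.0, 1.0]]
x = [2.0, 0.0, 1.0]  # the middle element is zero and is skipped

commands = generate_commands(x, num_rows=len(matrix), tile_rows=2)
result = execute(commands, matrix)
dense = [sum(w * v for w, v in zip(row, x)) for row in matrix]
print(len(commands), result == dense)  # 4 True
```

In this example a generator that did not skip zero elements would emit six commands (three columns across two tiles), whereas the sketch emits four while producing the same output vector.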
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit under 35 USC § 119(a) of Korean Patent Application No. 10-2024-0155324, filed on Nov. 5, 2024, in the Korean Intellectual Property Office, the entire disclosure of which is incorporated by reference herein for all purposes.
BACKGROUND
1. Field
The following description relates to a memory controller for a processing in memory (PIM) device.
2. Description of Related Art
With the rise of large-scale artificial intelligence (AI) models (e.g., large language models (LLMs)), AI models continue to grow in size. The operation of an LLM may generally be divided into a summarization stage and a generation stage. However, as the size of AI models increases and the number of tokens generated based on the AI models grows, the processing time of the generation stage may dominate the overall operation time of an AI model due to high memory bandwidth demand significantly exceeding available memory bandwidth capacity.
To address this memory bandwidth issue, processing-in-memory (PIM) is being studied. PIM devices not only function as memory devices by storing data but also include a function to process the data directly within the memory. A PIM device may perform mathematical operations on data that remains stored in the device before, during, and after the operations, and results of the operations may be stored in or output from the PIM device. PIM technology can improve overall system performance by performing computations closer to the memory, thereby reducing data transfer bottlenecks between the memory and a host processor. A memory device with PIM technology may integrate operation units or processing cores near memory cells, enabling a processing task to be performed within the memory without the need to temporarily move data in and out of the memory.
As a result, the load on the host processor may be reduced, host processor idle time may be reduced, power consumption may be decreased, and data processing speed may be improved.
The above description is information the inventor(s) acquired during the course of conceiving the present disclosure, or already possessed at the time, and is not necessarily art publicly known before the present application was filed.
SUMMARY
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
In one general aspect, a processing in memory (PIM) command generator for a PIM device includes: an input buffer, wherein the PIM command generator is configured to: store input data into the input buffer in response to receiving, from a host, a first PIM request indicating to write the input data; receive, from the host, a second PIM request indicating a PIM operation between the input data and data stored in the PIM device; in response to the second PIM request, scan elements of the input data stored in the input buffer to generate a PIM command corresponding to a non-zero element among the input data and to skip generating a PIM command corresponding to a zero element; and transmit the generated PIM command to the PIM device.
The PIM command generator may further include: one or more registers configured to store scanning information including tiling information of the PIM device, hardware information of the PIM device, or address mapping information, wherein the scanning may be performed based on the scanning information.
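For illustration only, the two-request flow of the general aspect above may be modeled in Python as follows. The class and method names (`PimCommandGenerator`, `handle_write_request`, `handle_operation_request`) and the `PimCommand` fields are hypothetical and do not describe an actual hardware interface; the sketch shows only that zero elements in the input buffer produce no command.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass(frozen=True)
class PimCommand:
    """Hypothetical command pairing one input element with stored data."""
    element_index: int
    value: float


@dataclass
class PimCommandGenerator:
    """Toy model: an input buffer plus a zero-skipping scan."""
    input_buffer: List[float] = field(default_factory=list)

    def handle_write_request(self, input_data: Sequence[float]) -> None:
        # First PIM request: store the input data in the generator's buffer.
        self.input_buffer = list(input_data)

    def handle_operation_request(self) -> List[PimCommand]:
        # Second PIM request: scan the buffered elements in order.
        commands = []
        for index, value in enumerate(self.input_buffer):
            if value == 0:
                continue  # zero element: command generation is skipped
            commands.append(PimCommand(index, value))
        return commands


generator = PimCommandGenerator()
generator.handle_write_request([0.0, 2.5, 0.0, -1.0])
commands = generator.handle_operation_request()
print([(c.element_index, c.value) for c in commands])  # [(1, 2.5), (3, -1.0)]
```

Keeping the input data in the generator's own buffer is what allows the scan to decide, element by element, whether any command needs to reach the PIM device at all.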
The second PIM request may indicate a matrix-vector multiplication operation between a matrix stored in memory of the PIM device and an input vector stored as input data in the input buffer of the PIM device, and the memory of the PIM device may perform the matrix-vector multiplication operation.
The PIM command generator may be configured to: in response to the second PIM request, determine, based on a first element among the elements of the input vector being non-zero, a memory address of at least a portion of the matrix to be multiplied by the first element; and based on a second element among the elements of the input vector being zero, skip determining a memory address of at least a portion of the matrix to be multiplied by the second element among the memory of the PIM device.
The PIM command generator may be configured to: based on tiling information of the PIM device, divide the matrix into one or more memory tiles; with respect to each memory tile having an element to be multiplied by the first element among the one or more memory tiles, generate one PIM command indicating multiplication between the first element and a column of a corresponding memory tile corresponding to the first element and indicating accumulation of each multiplication result into an element of