EP-4738107-A1 - SKEW RESISTANCE PROCESSING IN DIMM DEVICE AND PROCESSING METHOD
Abstract
The present disclosure relates to a skew resistance Processing in DIMM (PID) device and a processing method. The device includes: Dual In-line Memory Modules (DIMMs) composed of multiple ranks respectively having multiple banks, and In-DIMM Processors (IDPs) that process internal memory operations; a memory controller; and a host CPU connected to the memory modules through the memory controller and configured to enhance parallel processing performance of the IDPs by replicating a join key in units of bank sets and rank sets.
Inventors
- KIM, Youngsok
- LEE, Suhyun
- LIM, CHAEMIN
- CHOI, JINWOO
- KIM, HANJUN
Assignees
- University-Industry Foundation, Yonsei University
Dates
- Publication Date: 20260506
- Application Date: 20250207
Claims (10)
- A skew resistance Processing in DIMM (PID) device, comprising: Dual In-line Memory Modules (DIMMs) composed of multiple ranks respectively having multiple banks, and In-DIMM Processors (IDPs) that process internal memory operations; a memory controller; and a host CPU connected to the memory modules through the memory controller and configured to enhance parallel processing performance of the IDPs by replicating a join key in units of bank sets and rank sets.
- The skew resistance PID device of claim 1, wherein the host CPU analyzes a configuration of R and S tables and determines a cost model for determining a replication ratio based on a configuration of the PID device.
- The skew resistance PID device of claim 2, wherein the host CPU determines a bank set count and a rank set count by calculating an optimal join key replication ratio through the cost model.
- The skew resistance PID device of claim 2, wherein the host CPU performs a Host-to-DIMM Scatter operation to distribute the R and S tables to the DIMM.
- The skew resistance PID device of claim 1, wherein the IDPs perform a Bank and Rank Set-aware Partitioning operation to replicate the R table to bank sets and rank sets and distribute the S table to the bank sets and the rank sets based on the replication of the R table.
- The skew resistance PID device of claim 5, wherein the host CPU performs an All-to-All Inter-IDP Shuffle operation to transmit data of the R and S tables to each of the IDPs so that each of the IDPs exchanges and processes data of the R and S tables.
- The skew resistance PID device of claim 6, wherein each of the IDPs performs a Single-IDP Join operation to generate a join result by performing a join operation based on data of the R and S tables.
- The skew resistance PID device of claim 7, wherein each of the IDPs performs hash join or sort-merge join as the join operation.
- The skew resistance PID device of claim 7, wherein the IDPs transmit corresponding join results to the host CPU so that the host CPU collects the join results to generate a final result.
- A skew resistance Processing in DIMM (PID) processing method, performed by a skew resistance PID device which comprises: Dual In-line Memory Modules (DIMMs) composed of multiple ranks respectively including multiple banks, and In-DIMM Processors (IDPs) that process internal memory operations; a memory controller; and a host CPU connected to the memory modules through the memory controller and configured to enhance parallel processing performance of the IDPs by replicating a join key in units of bank sets and rank sets, the method comprising: determining a replication ratio based on a configuration of the PID device; and enhancing parallel processing performance of the IDPs by replicating a join key in units of bank sets and rank sets based on the determined replication ratio.
Description
This application claims under 35 U.S.C. §119(a) the benefit of Korean Patent Application No. 10-2024-0150358 filed on October 30, 2024, the entire contents of which are incorporated herein by reference.

[TECHNICAL FIELD]
The present disclosure relates to Processing-In-DIMM (PID) technology, and more specifically, to a skew resistance PID device and processing method for enhancing parallel processing performance of In-DIMM Processors (IDPs) by replicating a join key on the basis of bank and rank units through a memory controller.

[BACKGROUND]
Recent advances in dual in-line memory modules (DIMMs) have enabled DIMMs to support Processing-In-DIMM (PID) by placing In-DIMM Processors (IDPs) closer to the memory banks. PID may accelerate applications suffering from the memory wall problem by offloading memory-intensive tasks to the IDPs. Offloading work to the IDPs allows applications to take advantage of the DIMM's high internal memory bandwidth while minimizing data movement between the host central processing unit (CPU) and the DIMM. Although commercial DIMMs supporting PID were not available until recently, the introduction of UPMEM DIMMs and Samsung AxDIMMs has led to growing interest in PID across a range of fields including bioinformatics, machine learning, and security.

In-memory databases often suffer from the memory wall problem, which PID has been shown to alleviate considerably. In particular, previous studies have proposed a PID join algorithm to accelerate in-memory join operations. A join operation involves two tables, R and S. The CPU evenly distributes the tuples of R and S to the IDPs, and each IDP then performs global partitioning independently. The CPU then reshuffles the tuples between the IDPs so that each IDP receives its own partition and performs a local join operation on it. The CPU then collects the output tuples from all IDPs and performs a fast in-memory join operation.
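The prior-art flow described above (even scatter, per-IDP global partitioning, inter-IDP shuffle, local join, and collection) can be sketched as a small host-side simulation. This is an illustrative sketch only, not the algorithm of the cited studies: the function names are hypothetical, tuples are assumed to be `(integer_key, payload)` pairs, `key % num_idps` stands in for the partitioning hash, and a hash join is assumed as the local join.

```python
from collections import defaultdict


def scatter(table, num_idps):
    """Host step (assumed round-robin): deal tuples evenly across the IDPs."""
    return [table[i::num_idps] for i in range(num_idps)]


def partition_and_shuffle(per_idp, num_idps):
    """Each IDP partitions its tuples by key; the shuffle delivers partition p to IDP p.

    key % num_idps is a stand-in for the real partitioning hash (assumption).
    """
    inbox = [[] for _ in range(num_idps)]
    for local_tuples in per_idp:
        for key, payload in local_tuples:
            inbox[key % num_idps].append((key, payload))
    return inbox


def local_hash_join(r_part, s_part):
    """Local join on one IDP: build a hash table on R, probe it with S."""
    build = defaultdict(list)
    for key, r_payload in r_part:
        build[key].append(r_payload)
    return [
        (key, r_payload, s_payload)
        for key, s_payload in s_part
        for r_payload in build.get(key, [])
    ]


def pid_join(r, s, num_idps):
    """Simulate the whole flow and collect the per-IDP outputs on the host."""
    r_parts = partition_and_shuffle(scatter(r, num_idps), num_idps)
    s_parts = partition_and_shuffle(scatter(s, num_idps), num_idps)
    result = []
    for idp in range(num_idps):
        result.extend(local_hash_join(r_parts[idp], s_parts[idp]))
    return result


if __name__ == "__main__":
    r = [(1, "a"), (2, "b"), (3, "c")]
    s = [(1, "x"), (1, "y"), (3, "z"), (4, "w")]
    print(sorted(pid_join(r, s, num_idps=4)))
```

Because every tuple with a given key lands on the same IDP after the shuffle, each local join sees all of its matches, so the concatenated outputs equal the full join result.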
However, the existing PID join algorithm suffers from poor performance and scalability when an input table is skewed. The algorithm relies on global partitioning per IDP to distribute the computational load evenly, but skewed input tables cause severe load imbalance, leaving some IDPs overloaded while others remain idle.

(Prior Art Literature)
(Patent Document) Korean Patent Application Publication No. 2022-0062399 (May 16, 2022)

[SUMMARY]
In view of the above, the present disclosure provides a skew resistance Processing in DIMM (PID) device and processing method which enable enhancing parallel processing performance of In-DIMM Processors (IDPs) by replicating a join key in units of bank sets and rank sets.

The present disclosure also provides a skew resistance PID device and processing method which enable determining a cost model for determining a replication ratio based on a configuration of the PID device.

The present disclosure also provides a skew resistance PID device and processing method which enable determining a bank set count and a rank set count by calculating an optimal join key replication ratio RR_optimal through a cost model.

In one aspect, there is provided a skew resistance PID device, including: Dual In-line Memory Modules (DIMMs) composed of multiple ranks respectively having multiple banks, and In-DIMM Processors (IDPs) that process internal memory operations; a memory controller; and a host CPU connected to the memory modules through the memory controller and configured to enhance parallel processing performance of the IDPs by replicating a join key in units of bank sets and rank sets.

The host CPU may analyze a configuration of R and S tables and determine a cost model for determining a replication ratio based on a configuration of the PID device. The host CPU may determine a bank set count and a rank set count by calculating an optimal join key replication ratio through the cost model.
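The load imbalance caused by a skewed table, and the way replicating a join key across sets of IDPs can relieve it, can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the disclosed method: a single flat notion of "set" stands in for both bank sets and rank sets, `num_sets` plays the role of the replication ratio, `key % n` stands in for the hash function, S tuples are assumed to be spread over the sets round-robin, and all function names are hypothetical.

```python
def hash_only_loads(s_keys, num_idps):
    """Baseline: every S tuple goes to the single IDP chosen by its key."""
    loads = [0] * num_idps
    for key in s_keys:
        loads[key % num_idps] += 1
    return loads


def r_targets(key, num_idps, num_sets):
    """With replication, an R tuple is copied to one IDP in every set (assumption)."""
    idps_per_set = num_idps // num_sets
    return [s * idps_per_set + key % idps_per_set for s in range(num_sets)]


def replicated_loads(s_keys, num_idps, num_sets):
    """S tuples are spread round-robin over the sets, then hashed inside the set.

    Each S tuple still meets every matching R tuple exactly once, because R was
    copied to the same in-set position of every set (see r_targets).
    """
    idps_per_set = num_idps // num_sets
    loads = [0] * num_idps
    for i, key in enumerate(s_keys):
        set_id = i % num_sets
        loads[set_id * idps_per_set + key % idps_per_set] += 1
    return loads


if __name__ == "__main__":
    # Hypothetical skewed S table: key 7 accounts for 90 of 100 tuples.
    s_keys = [7] * 90 + list(range(10))
    print("hash only :", hash_only_loads(s_keys, num_idps=8))
    print("replicated:", replicated_loads(s_keys, num_idps=8, num_sets=4))
    print("R copies of key 7 go to IDPs", r_targets(7, num_idps=8, num_sets=4))
```

In this sketch a larger `num_sets` spreads a heavy key over more IDPs, but it also multiplies the number of R copies and leaves fewer IDPs per set for hashing. A trade-off of this kind is presumably what a cost model for the replication ratio has to weigh.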
The host CPU may perform a Host-to-DIMM Scatter operation to distribute the R and S tables to the DIMMs. The IDPs may perform a Bank and Rank Set-aware Partitioning operation to replicate the R table to bank sets and rank sets and distribute the S table to the bank sets and the rank sets based on the replication of the R table. The host CPU may perform an All-to-All Inter-IDP Shuffle operation to transmit data of the R and S tables to each of the IDPs so that each of the IDPs exchanges and processes data of the R and S tables. Each of the IDPs may perform a Single-IDP Join operation to generate a join result by performing a join operation based on data of the R and S tables. Each of the IDPs may perform hash join or sort-merge join as the join operation. The IDPs may transmit corresponding join results to the host CPU so that the host CPU collects the join results to generate a final result. In another aspect, there is provided a skew resistance Processing in DIMM (PID) processing method, performed by a skew resistance PID device which includes: Dual In-line Memory Modules (DIMMs) composed of multiple ranks respectively including