CN-122019403-A - Dynamic combination mapping method and system for 3D NAND flash memory
Abstract
The application belongs to the field of data storage, and particularly discloses a dynamic combination mapping method and system for a 3D NAND flash memory. The method comprises the steps of dividing a data stream received by an SSD into temporally local data, spatially local data, and random data; and using the current state and reward to explore the optimal action online, configuring a mapping method for the data with the goal of maximizing a reward function, wherein the state comprises the data access type, the read/write ratio, the access randomness rate, and the data stream size; the actions are page mapping and block mapping; and the reward comprises a block-mapping random-read penalty term, a page-mapping random-read reward term, and a mapping-method stability regularization term. According to the application, a fixed mapping method is adopted in each time period and switched at appropriate times, so that the data are matched with the optimal mapping mode and the balance between performance and cost for ultra-large-capacity SSDs is satisfied.
Inventors
- ZHOU YANG
- WANG FANG
- FENG DAN
- DONG CHAO
- ZHANG JIANSHUN
Assignees
- 华中科技大学 (Huazhong University of Science and Technology)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-14
Claims (10)
- 1. A dynamic combination mapping method for a 3D NAND flash memory, characterized by comprising the following steps: Step S1, dividing a data stream received by an SSD into temporally local data, spatially local data, and random data; Step S2, adopting a scheduling method based on a reinforcement learning model that uses the current state and reward to explore the optimal action online, and configuring a mapping method for the data with the goal of maximizing a reward function, wherein the state comprises the access types corresponding to temporally local data, spatially local data, and random data, the read/write ratio, the access randomness rate, and the data stream size; the mapping methods comprise page mapping and block mapping; the action is the selection between page mapping and block mapping; and the reward comprises a block-mapping random-read penalty term, a page-mapping random-read reward term, and a mapping-method stability regularization term.
- 2. The dynamic combination mapping method according to claim 1, wherein the data stream is divided as follows: if the randomness rate of the current data in the data stream exceeds a first preset value, the current data is judged to be random data; if the current data is not random data, the revisit rate of the current data is calculated; if the revisit rate exceeds a second preset value, the current data is judged to be temporally local data; otherwise, it is judged to be spatially local data.
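The division rule of claim 2 can be sketched as follows. The two thresholds are the claim's unstated "preset values", so the numbers used here are illustrative assumptions only:

```python
# Sketch of the data-stream division rule of claim 2.
# Threshold values are illustrative; the patent leaves them as presets.
RANDOM_THRESHOLD = 0.7   # first preset value (assumed)
REVISIT_THRESHOLD = 0.5  # second preset value (assumed)

def classify_stream(random_rate: float, revisit_rate: float) -> str:
    """Classify a data stream as random, temporally local, or spatially local."""
    if random_rate > RANDOM_THRESHOLD:
        return "random"
    # Not random: distinguish temporal vs. spatial locality by revisit rate.
    if revisit_rate > REVISIT_THRESHOLD:
        return "temporal"
    return "spatial"
```

For example, `classify_stream(0.9, 0.1)` returns `"random"`, while a low-randomness, high-revisit stream is classified as temporally local.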
- 3. The dynamic combination mapping method according to claim 1, wherein the action is updated as follows: a probability P is set and a random number R in [0, 1] is generated; if R < P, the action adopts page mapping; otherwise, the action adopts block mapping; the probability P is then adjusted, a new random number R in [0, 1] is generated, and the action is updated again.
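The probabilistic action selection of claim 3 reduces to one random draw per decision; a minimal sketch:

```python
import random

def select_mapping(p: float) -> str:
    """Draw R in [0, 1) and choose page mapping when R < P (claim 3)."""
    r = random.random()
    return "page" if r < p else "block"
```

Adjusting `p` between decisions (as the claim describes) shifts the exploration balance between the two mapping methods: `p = 1.0` always yields page mapping, `p = 0.0` always block mapping.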
- 4. The dynamic combination mapping method according to any one of claims 1 to 3, wherein the reward function combines three terms: a block-mapping random-read penalty term, computed from the random read count and the total read count of the block-mapped data; a page-mapping random-read reward term, computed from the random read count and the total read count of the page-mapped data; and a mapping-method stability regularization term representing the degree of difference between the mapping methods of the current period and the previous period; a small minimum value is included in the computation; the weights of the penalty and reward terms are set according to the hardware characteristics of the SSD, and the weight of the regularization term is set according to the workload volatility.
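The claim's reward formula is not reproduced in this text, but the terms it names suggest a form like the following sketch. The weights `w1`..`w3`, the small constant `eps`, and the exact combination are assumptions, not the patent's formula:

```python
def reward(block_rand_reads, block_total_reads,
           page_rand_reads, page_total_reads,
           mapping_diff, w1=1.0, w2=1.0, w3=0.1, eps=1e-6):
    """One plausible form of the claimed reward: penalize random reads
    under block mapping, reward random reads served by page mapping,
    and regularize against switching mapping methods between periods.
    Weights and eps are assumptions; the patent elides the formula."""
    penalty = w1 * block_rand_reads / (block_total_reads + eps)
    bonus = w2 * page_rand_reads / (page_total_reads + eps)
    stability = w3 * mapping_diff  # difference vs. previous period
    return -penalty + bonus - stability
```

Here `eps` plays the role of the claim's "minimum value" (guarding against a zero read count), and `mapping_diff` stands in for the degree of difference between the current and previous period's mapping methods.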
- 5. The method according to claim 4, wherein, to implement a lightweight reinforcement learning model, a Q-Learning algorithm is used to update a Q-table, the Q-table being constructed from the state and action sets and used to guide the optimization of the reward function of the reinforcement learning model.
- 6. The dynamic combination mapping method according to claim 5, wherein the Q-table is updated as: Q(s, a) ← Q(s, a) + α[r + γ·max_{a'} Q(s', a') − Q(s, a)]; wherein Q(s, a) represents the value in the Q-table; s represents the current state; a represents the action performed; α and γ represent the learning rate and the discount factor, respectively; s' represents the next state reached after the last action is executed, and r is the reward used to update the Q-table; a' is the action selected to maximize the Q value in the next state s'; and Q(s', a') records the value in the Q-table for the next state s' and the next action a'.
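The update the claim describes matches the standard tabular Q-Learning rule; a minimal sketch, with the action set taken from claim 1 (page vs. block mapping) and the default `alpha`/`gamma` values assumed:

```python
from collections import defaultdict

# Tabular Q-Learning update:
#   Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
ACTIONS = ("page", "block")

def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Update one Q-table entry; q maps (state, action) -> value."""
    best_next = max(q[(s_next, a2)] for a2 in ACTIONS)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
    return q[(s, a)]
```

Using a `defaultdict(float)` as the Q-table initializes unseen (state, action) pairs to zero, so the first update from an all-zero table moves Q(s, a) by `alpha * r`.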
- 7. The dynamic combination mapping method according to claim 1, further comprising: for the mapping table corresponding to random access, setting the cache ratio of that mapping table in the DRAM cache space to a saturation threshold and allocating the remaining DRAM cache space to the data cache; and, for the mapping tables corresponding to spatially local and temporally local access, solving the optimal ratio of the data cache to the mapping-table cache by a greedy algorithm.
- 8. The method according to claim 7, wherein the greedy algorithm solves the optimal ratio of the data cache to the mapping-table cache as follows: taking the access randomness probability as input, the optimal mapping-table cache interval under that probability is determined; based on this interval, the minimum mapping-table cache size is selected in combination with the current cache hit rate, and the remaining DRAM cache space is allocated to the data cache.
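Claims 7 and 8 together can be sketched as a greedy DRAM split. The saturation share, the candidate sizes, the hit-rate model (`hit_rate_of`), and the tying of the candidate interval to the randomness probability are all assumptions, since the claims leave these unspecified:

```python
def allocate_cache(total_dram, random_prob, hit_rate_of, saturation=0.2):
    """Greedy split of DRAM between mapping-table cache and data cache
    (sketch of claims 7-8; hit_rate_of and candidate sizes are assumed).

    total_dram:  DRAM cache space in pages
    random_prob: access randomness probability of the workload
    hit_rate_of: callable giving mapping-table hit rate for a cache size
    """
    # Pin a fixed saturation share for the random-access mapping table.
    random_map = int(total_dram * saturation)
    remaining = total_dram - random_map
    # Candidate interval for the locality mapping-table cache; bounding
    # it by the access randomness probability is an assumption.
    upper = max(1, int(remaining * min(1.0, random_prob + 0.1)))
    candidates = [max(1, upper * k // 10) for k in range(1, 11)]
    # Greedy: smallest candidate within tolerance of the best hit rate.
    best = max(hit_rate_of(c) for c in candidates)
    map_cache = min(c for c in candidates if hit_rate_of(c) >= best - 0.01)
    return map_cache, remaining - map_cache
```

The "minimum mapping-table cache size" step of claim 8 appears in the final `min(...)`: among candidates whose hit rate is near the best achievable, the smallest one wins, leaving the most space for the data cache.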
- 9. A dynamic combination mapping system for a 3D NAND flash memory, characterized by comprising a feature collector and a mapping selector; the feature collector is used for dividing the data stream received by the SSD into temporally local data, spatially local data, and random data; the mapping selector is used for exploring the optimal action online, based on a reinforcement learning model, using the current state and reward, and for configuring a mapping method for the data with the goal of maximizing a reward function, wherein the state comprises the access types corresponding to temporally local data, spatially local data, and random data, the read/write ratio, the access randomness rate, and the data stream size; the mapping methods are page mapping and block mapping; the action is the selection between them; and the reward comprises a block-mapping random-read penalty term, a page-mapping random-read reward term, and a mapping-method stability regularization term.
- 10. The dynamic combination mapping system according to claim 9, further comprising a cache allocator configured to: for the mapping table corresponding to random access, set the cache ratio of that mapping table in the DRAM cache space to a saturation threshold and allocate the remaining DRAM cache space to the data cache; and, for the mapping tables corresponding to spatially local and temporally local access, solve the optimal ratio of the data cache to the mapping-table cache by a greedy algorithm.
Description
Dynamic combination mapping method and system for 3D NAND flash memory
Technical Field
The application belongs to the field of data storage, and particularly relates to a dynamic combination mapping method and system for a 3D NAND flash memory.
Background
The rapid development of cloud computing, big data, and artificial intelligence has led to an exponential increase in data volume, and the demand of cloud service providers and enterprise users for storage capacity continues to rise. Solid State Drives (SSDs) have evolved from early capacities of around 100 GB to TB-level and PB-level storage. High-capacity SSDs can meet the long-term storage requirements of mass data and can improve compute-storage cooperation efficiency in scenarios such as memory expansion and cache acceleration, making the SSD a core component of modern data-center storage architectures. Early solid state disks mainly adopted a block mapping method that maps a continuous logical region to the same physical block. While this approach simplifies the mapping-table structure and reduces DRAM consumption, it has significant limitations: when random writes occur, frequent updates to non-consecutive logical addresses trigger repeated read-modify-write operations, resulting in severe write amplification. To make up for this shortfall of block mapping, page mapping has gradually become the mainstream scheme. In page mapping, a flash page is the mapping unit, and a logical page can be mapped to any physical page in the flash memory, so each page corresponds to its own mapping entry. Since the number of flash pages far exceeds the number of flash blocks, a much larger capacity is required to store the mapping table.
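The capacity gap between page-level and block-level mapping tables described above can be made concrete with a rough sizing calculation; the geometry values below are illustrative assumptions, not figures from the patent:

```python
# Rough mapping-table sizing for page vs. block mapping.
# Geometry values are illustrative assumptions, not from the patent.
CAPACITY = 4 * 2**40          # 4 TiB SSD
PAGE_SIZE = 16 * 2**10        # 16 KiB flash page
PAGES_PER_BLOCK = 256         # pages per flash block
ENTRY_BYTES = 4               # 4-byte physical address per entry

num_pages = CAPACITY // PAGE_SIZE
num_blocks = num_pages // PAGES_PER_BLOCK

page_table_bytes = num_pages * ENTRY_BYTES    # one entry per page
block_table_bytes = num_blocks * ENTRY_BYTES  # one entry per block

print(page_table_bytes // 2**20, "MiB vs", block_table_bytes // 2**20, "MiB")
```

Under these assumed parameters the page-level table needs 1 GiB of DRAM against 4 MiB for the block-level table, which is the 256x gap (one entry per page vs. one per block) that motivates combining the two mapping methods.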
Disclosure of Invention
Aiming at the defects of the prior art, the application provides a dynamic combination mapping method and system for a 3D NAND flash memory, to solve two problems of existing solid state disks: with block mapping, random writes frequently update non-consecutive logical addresses and trigger repeated read-modify-write operations, causing severe write amplification; with page mapping, the number of flash pages far exceeds the number of flash blocks, so storing the mapping table requires a much larger capacity.
The first aspect of the application relates to a dynamic combination mapping method for a 3D NAND flash memory, comprising the following steps: Step S1, dividing a data stream received by an SSD into temporally local data, spatially local data, and random data; Step S2, adopting a scheduling method based on a reinforcement learning model that uses the current state and reward to explore the optimal action online, and configuring a mapping method for the data with the goal of maximizing a reward function, wherein the state comprises the access types corresponding to temporally local data, spatially local data, and random data, the read/write ratio, the access randomness rate, and the data stream size; the mapping methods comprise page mapping and block mapping; the action is the selection between page mapping and block mapping; and the reward comprises a block-mapping random-read penalty term, a page-mapping random-read reward term, and a mapping-method stability regularization term.
In some embodiments, the data stream is divided as follows: if the randomness rate of the current data in the data stream exceeds a first preset value, the current data is judged to be random data; if the current data is not random data, the revisit rate of the current data is calculated; if the revisit rate exceeds a second preset value, the current data is judged to be temporally local data; otherwise, it is judged to be spatially local data.
In some embodiments, the action is updated as follows: a probability P is set and a random number R in [0, 1] is generated; if R < P, page mapping is adopted; otherwise, block mapping is adopted; the probability P is then adjusted, a new random number R in [0, 1] is generated, and the action is updated again.
In some embodiments, the reward function combines three terms: a block-mapping random-read penalty term, computed from the random read count and the total read count of the block-mapped data; a page-mapping random-read reward term, computed from the random read count and the total read count of the page-mapped data; and a mapping-method stability regularization term representing the degree of difference between the mapping methods of the current period and the previous period; a small minimum value is included in the computation; the weights of the penalty and reward terms are set according to the hardware characteristics of the SSD, and the weight of the regularization term is set according to the workload volatility.
In some embodiments, to achieve a lightweight reinforcement learning model, a Q-Learning algorithm is used to update a Q-table, wherein the Q-table