US-20260127181-A1 - DATA ACCESS METHOD AND STORAGE DEVICE

Abstract

A data access method includes: receiving, by a storage device, a write access request, where the write access request includes a first key and to-be-written first data; writing the first key and the first data into a memory of the storage device based on a first index structure; and recording a correspondence between the first key and first metadata in an index file based on a second index structure, where the first metadata indicates a storage address of the first data in the memory or a disk of the storage device, the index file is stored in the memory of the storage device, and the index file is used for reading data stored in the memory or the disk.

Inventors

  • Xiaoyang Zhang
  • Zhenbo Zhang
  • Zheng Li

Assignees

  • HUAWEI TECHNOLOGIES CO., LTD.

Dates

  • Publication Date: 2026-05-07
  • Application Date: 2025-12-29
  • Priority Date: 2023-07-07

Claims (20)

  1. A data access method, applied to a storage device, wherein the method comprises: receiving a write access request, wherein the write access request comprises a first key and to-be-written first data; writing the first key and the first data into a memory of the storage device based on a first index structure; and recording a correspondence between the first key and first metadata in an index file based on a second index structure, wherein: the first data is stored in the memory and the first metadata indicates a storage address of the first data in the memory, or the first data is stored in a disk of the storage device and the first metadata indicates a storage address of the first data in the disk, the index file is stored in the memory of the storage device, and the index file is used for reading data stored in the memory or the disk.
  2. The method according to claim 1, wherein the method further comprises: receiving a read access request, wherein the read access request comprises the first key; querying the index file for the first metadata based on the first key; and reading the first data from the memory or the disk based on the first metadata.
  3. The method according to claim 1, wherein recording the correspondence between the first key and the first metadata in the index file based on the second index structure comprises: when the first key and the first data are written into the memory, recording the correspondence between the first key and the first metadata in the index file based on the second index structure, wherein the first metadata indicates a storage address of the first data in the memory.
  4. The method according to claim 1, wherein recording the correspondence between the first key and the first metadata in the index file based on the second index structure comprises: when second data stored in the memory is compacted and stored in the disk, recording the correspondence between the first key and the first metadata in the index file based on the second index structure, wherein the second data comprises the first data, and the first metadata indicates the storage address of the first data in the disk.
  5. The method according to claim 4, wherein the method further comprises: when an amount of data stored in the memory reaches a first threshold, compacting the second data stored in the memory and storing it in the disk.
  6. The method according to claim 4, wherein the disk uses tiered storage, and the method further comprises: when third data stored at an ith level of the disk is compacted and stored to an (i+1)th level of the disk, recording the correspondence between the first key and the first metadata in the index file based on the second index structure, wherein the third data comprises the first data, and the first metadata indicates a storage address of the first data at the (i+1)th level.
  7. The method according to claim 6, wherein the method further comprises: when an amount of data stored at the ith level reaches a second threshold, compacting the third data stored in the ith level and storing it in the (i+1)th level of the disk.
  8. The method according to claim 1, wherein the first index structure comprises a log-structured merge-tree (LSM-tree) structure.
  9. The method according to claim 1, wherein the second index structure comprises any one of the following: a hash structure, a B+ tree structure, or a learned index structure.
  10. A storage device, wherein the storage device comprises at least one processor, a first memory, and a disk; the first memory is configured to store computer program instructions; and when the program instructions are executed by the processor, the storage device is enabled to: receive a write access request, wherein the write access request comprises a first key and to-be-written first data; write the first key and the first data into a memory of the storage device based on a first index structure; and record a correspondence between the first key and first metadata in an index file based on a second index structure, wherein: the first data is stored in the memory and the first metadata indicates a storage address of the first data in the memory, or the first data is stored in the disk and the first metadata indicates a storage address of the first data in the disk, the index file is stored in the memory of the storage device, and the index file is configured to be used for reading data stored in the memory or the disk.
  11. The storage device according to claim 10, wherein when the program instructions are executed by the processor, the storage device is further enabled to: receive a read access request, wherein the read access request comprises the first key; query the index file for the first metadata based on the first key; and read the first data from the memory or the disk based on the first metadata.
  12. The storage device according to claim 10, wherein recording the correspondence between the first key and the first metadata in the index file based on the second index structure comprises: when the first key and the first data are written into the memory, recording the correspondence between the first key and the first metadata in the index file based on the second index structure, wherein the first metadata indicates a storage address of the first data in the memory.
  13. The storage device according to claim 10, wherein recording the correspondence between the first key and the first metadata in the index file based on the second index structure comprises: when second data stored in the memory is compacted and stored in the disk, recording the correspondence between the first key and the first metadata in the index file based on the second index structure, wherein the second data comprises the first data, and the first metadata indicates the storage address of the first data in the disk.
  14. The storage device according to claim 13, wherein when the program instructions are executed by the processor, the storage device is further enabled to: when an amount of data stored in the memory reaches a first threshold, compact the second data stored in the memory and store it in the disk.
  15. The storage device according to claim 13, wherein the disk uses tiered storage, and when the program instructions are executed by the processor, the storage device is further enabled to: when third data stored at an ith level of the disk is compacted and stored to an (i+1)th level of the disk, record the correspondence between the first key and the first metadata in the index file based on the second index structure, wherein the third data comprises the first data, and the first metadata indicates a storage address of the first data at the (i+1)th level.
  16. The storage device according to claim 15, wherein when the program instructions are executed by the processor, the storage device is further enabled to: when an amount of data stored at the ith level reaches a second threshold, compact the third data stored in the ith level and store it in the (i+1)th level of the disk.
  17. The storage device according to claim 10, wherein the first index structure comprises a log-structured merge-tree (LSM-tree) structure.
  18. The storage device according to claim 10, wherein the second index structure comprises any one of the following: a hash structure, a B+ tree structure, or a learned index structure.
  19. A non-transitory computer-readable storage medium storing instructions, wherein when the instructions are executed by a computing device, the computing device is enabled to: receive a write access request, wherein the write access request comprises a first key and to-be-written first data; write the first key and the first data into a memory of a storage device based on a first index structure; and record a correspondence between the first key and first metadata in an index file based on a second index structure, wherein: the first data is stored in the memory and the first metadata indicates a storage address of the first data in the memory, or the first data is stored in a disk of the storage device and the first metadata indicates a storage address of the first data in the disk, the index file is stored in the memory of the storage device, and the index file is configured to be used for reading data stored in the memory or the disk.
  20. The computer-readable storage medium according to claim 19, wherein when the instructions are executed by the computing device, the computing device is further enabled to: receive a read access request, wherein the read access request comprises the first key; query the index file for the first metadata based on the first key; and read the first data from the memory or the disk based on the first metadata.
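
The compaction behavior recited in claims 4 to 7 can be illustrated with a small sketch. It is only a toy model under stated assumptions, not the claimed implementation: the names `TieredStore`, `MEMORY`, `level_thresholds`, and `_merge_into` are hypothetical, each disk level is a single sorted run held in a Python list, a dict stands in for the second index structure, and a memory-resident entry is addressed by its key rather than by a real address. Whenever data moves from memory to level 0, or from level i to level i+1, the index entry for each moved key is rewritten to point at the new location.

```python
from typing import Dict, List, Optional, Tuple

MEMORY = -1  # hypothetical tier id for the in-memory buffer

Meta = Tuple[int, Optional[int]]  # (tier, offset inside that tier's run)


class TieredStore:
    """Toy tiered store whose index is updated on every compaction."""

    def __init__(self, memory_threshold: int = 2,
                 level_thresholds: Tuple[int, ...] = (3, 6)) -> None:
        self.memory_threshold = memory_threshold      # "first threshold"
        self.level_thresholds = level_thresholds      # "second threshold" per level
        self.memory: Dict[str, bytes] = {}
        # One sorted run per level; the last level never compacts further.
        self.levels: List[List[Tuple[str, bytes]]] = [
            [] for _ in range(len(level_thresholds) + 1)
        ]
        self.index: Dict[str, Meta] = {}              # key -> metadata

    def put(self, key: str, value: bytes) -> None:
        self.memory[key] = value
        self.index[key] = (MEMORY, None)
        if len(self.memory) >= self.memory_threshold:
            self._merge_into(0, self.memory, source_tier=MEMORY)
            self.memory = {}
            for i, limit in enumerate(self.level_thresholds):
                if len(self.levels[i]) >= limit:
                    self._merge_into(i + 1, dict(self.levels[i]), source_tier=i)
                    self.levels[i] = []

    def _merge_into(self, level: int, incoming: Dict[str, bytes],
                    source_tier: int) -> None:
        merged = dict(self.levels[level])  # older versions already at this level
        merged.update(incoming)            # newer versions override them
        self.levels[level] = sorted(merged.items())
        for offset, (key, _) in enumerate(self.levels[level]):
            # Skip keys whose newest version lives in a shallower tier.
            if self.index[key][0] >= source_tier:
                self.index[key] = (level, offset)

    def get(self, key: str) -> Optional[bytes]:
        meta = self.index.get(key)
        if meta is None:
            return None
        tier, offset = meta
        if tier == MEMORY:
            return self.memory[key]
        return self.levels[tier][offset][1]
```

In this sketch a read never scans the levels: `get` consults the index once and then performs a single lookup at the recorded tier and offset, which is the effect the claims attribute to keeping the index file current during compaction.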

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of International Application PCT/CN2024/098237, filed on June 7, 2024, which claims priority to Chinese Patent Application No. 202310835543.4, filed on July 7, 2023, and Chinese Patent Application No. 202311361220.2, filed on October 18, 2023. All of the aforementioned patent applications are hereby incorporated by reference in their entireties.

TECHNICAL FIELD

This application relates to the field of computer technologies, and in particular, to a data access method and a storage device.

BACKGROUND

With the rapid development of computer technologies, requirements for storage performance are increasingly high. In the storage field, index information lies on the critical path of the input/output (I/O) flow and is critical to the read-write performance of a storage product. The log-structured merge-tree (LSM-tree) is a basic indexing algorithm that is widely used by large companies and in open-source products. However, the core idea of the LSM-tree is to give up part of the read capability in exchange for maximizing the write capability. As more data is written into a storage system, a read operation based on the LSM-tree requires a large quantity of disk reads, which is time-consuming. Therefore, how to construct an index solution with excellent read-write performance is still an important problem to be urgently resolved.

SUMMARY

Embodiments of this application provide a data access method and a storage device, to provide an index solution with excellent read-write performance, so as to reduce the overheads and delay of data read access.

According to a first aspect, an embodiment of this application provides a data access method. The method may be performed by a storage device, and the storage device may be a memory that stores data by using an LSM-tree or a storage node in a storage system.
In the method, the storage device receives a write access request, where the write access request includes a first key and to-be-written first data; writes the first key and the first data into a memory of the storage device based on a first index structure; and records a correspondence between the first key and first metadata in an index file based on a second index structure, where the first metadata indicates a storage address of the first data in the memory or a disk of the storage device, the index file is stored in the memory of the storage device, and the index file is used for reading data stored in the memory or the disk.

According to the foregoing method, the LSM-tree-based write I/O path of the storage device is not changed, but a quick read path, different from the conventional LSM-tree-based read I/O path, is additionally constructed. When receiving a read access request, the storage device may use this quick read path to access the memory or the disk of the storage device and quickly find the target data that needs to be read. For example, the storage device receives the read access request, where the read access request includes the first key; queries the index file stored in the memory for the first metadata based on the first key; and reads the first data from the memory or the disk based on the first metadata.

In a possible implementation, if the first data is stored in the memory, the first metadata includes the storage address of the first data in the memory. In another possible implementation, if the first data is stored in the disk, the first metadata includes the storage address of the first data in the disk.
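
The write path and quick read path described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the names `DualIndexStore`, `Metadata`, `put`, and `get` are hypothetical, an append-only list stands in for the LSM-tree write buffer, and a Python dict plays the role of a hash-based second index structure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Metadata:
    """Hypothetical 'first metadata': where the value currently lives."""
    location: str  # "memory" or "disk"
    address: int   # slot in the corresponding list below


class DualIndexStore:
    def __init__(self) -> None:
        # Stand-in for the write buffer managed by the first index structure.
        self.memory: List[Tuple[str, bytes]] = []
        # Stand-in for data already persisted to disk (unused by put here).
        self.disk: List[Tuple[str, bytes]] = []
        # Index file kept in memory: key -> metadata (second index structure).
        self.index: Dict[str, Metadata] = {}

    def put(self, key: str, value: bytes) -> None:
        """Write path: append to the buffer, then record key -> metadata."""
        self.memory.append((key, value))
        self.index[key] = Metadata("memory", len(self.memory) - 1)

    def get(self, key: str) -> Optional[bytes]:
        """Quick read path: one index lookup, then one direct access."""
        meta = self.index.get(key)
        if meta is None:
            return None
        store = self.memory if meta.location == "memory" else self.disk
        _, value = store[meta.address]
        return value


store = DualIndexStore()
store.put("k1", b"v1")
store.put("k1", b"v2")   # a newer version of the same key
latest = store.get("k1")  # resolved through the index, without scanning
```

The write path is unchanged apart from the extra index update, while a read resolves the key through the index and touches exactly one location, instead of probing the buffer and then each disk level in turn.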
In other words, the index file may be updated as the write location of the first data changes, so that the correspondence between the first key and the metadata of the first data is kept current in the index file and the storage device reads correct data when processing a received read access request.

In a possible implementation, recording the correspondence between the first key and the first metadata in the index file based on the second index structure may include: when the first key and the first data are written into the memory, recording the correspondence between the first key and the first metadata in the index file based on the second index structure.

In a possible implementation, recording the correspondence between the first key and the first metadata in the index file based on the second index structure includes: when second data already stored in the memory is compacted and stored in the disk, recording the correspondence between the first key and the first metadata in the index file based on the second index structure, where the second data includes the first data, and the first metadata indicates the storage address of the first data in the disk (for example, at level 0). According to the foregoing method, in the LSM-tree-based storage device, when data in the memory is compacted and stored in the