CN-122018794-A - Data reading method, device, equipment and storage medium
Abstract
The application discloses a data reading method, apparatus, computer device, and storage medium. The method comprises: setting a cache layer in a memory; in response to receiving a first data reading request, reading target data information from a hard disk layer; selecting a data block adjacent to the target data information from the hard disk layer according to the physical position of the target data information in the hard disk layer and a preloading algorithm, and preloading the data block to the cache layer; and in response to receiving a second data reading request, reading the target data information from the cache layer, or, if the target data information does not exist in the cache layer, reading it from the hard disk layer. This method can greatly improve data reading efficiency.
Inventors
- LI FENGQI
Assignees
- 济南浪潮数据技术有限公司 (Jinan Inspur Data Technology Co., Ltd.)
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2026-01-22
Claims (10)
- 1. A data reading method, comprising: setting a cache layer in a memory; in response to receiving a first data reading request, reading target data information from a hard disk layer; selecting a data block adjacent to the target data information from the hard disk layer according to the physical position of the target data information in the hard disk layer and a preloading algorithm, and preloading the data block to the cache layer; and in response to receiving a second data reading request, reading the target data information from the cache layer, and if the target data information does not exist in the cache layer, reading the target data information from the hard disk layer.
- 2. The data reading method according to claim 1, further comprising: in response to receiving the second data reading request while the cache layer is enabled, reading the target data information from the cache layer; and in response to receiving the second data reading request while the cache layer is disabled, reading the target data information from the hard disk layer.
- 3. The data reading method according to claim 1, wherein the hard disk layer comprises a first hard disk layer and a second hard disk layer, and reading the target data information from the hard disk layer in response to receiving the first data reading request comprises: querying the target data information in the first hard disk layer according to the first data reading request; in response to the target data information existing in the first hard disk layer, returning the target data information; and in response to the target data information not existing in the first hard disk layer, querying the target data information in the second hard disk layer and returning the target data information.
- 4. The data reading method according to claim 1, further comprising: in response to receiving a data reading request, querying the target data information in the cache layer; in response to the target data information not existing in the cache layer, reading the target data information from the hard disk layer and writing a corresponding record to the head of the cache layer queue; in response to the target data information existing in the cache layer, moving the record corresponding to the target data information to the head of the cache layer queue; and in response to the cache capacity of the cache layer reaching a first threshold, deleting the data information at the tail of the cache layer queue according to a least recently used algorithm.
- 5. The data reading method according to claim 1, wherein selecting a data block adjacent to the target data information from the hard disk layer according to the physical position of the target data information in the hard disk layer and a preloading algorithm, and preloading the data block to the cache layer, comprises: in response to receiving the first data reading request, acquiring the corresponding target data information; obtaining an access log, extracting the data reading request history within a preset time range from the access log, and calculating the physical address difference between the current data reading request and the previous data reading request; in response to the physical address difference being smaller than a second threshold, selecting the target data and a first preset number of adjacent data blocks for preloading in a sequential access mode; and in response to the physical address difference being larger than the second threshold, selecting a second preset number of high-frequency access data blocks near the target data for preloading in a non-sequential access mode.
- 6. The method of claim 5, wherein selecting the target data and the first preset number of adjacent data blocks for preloading in the sequential access mode in response to the physical address difference being smaller than the second threshold comprises: acquiring the physical address number of the target data information; sequentially selecting, based on the physical address number, the data blocks corresponding to the first preset number of consecutive subsequent physical addresses; sequentially loading the consecutive data blocks to preset positions of the cache layer; and in response to the capacity of the cache layer reaching the first threshold, deleting the data block at the tail of the cache queue according to a least recently used algorithm.
- 7. The method of claim 5, wherein selecting the second preset number of high-frequency access data blocks near the target data for preloading in the non-sequential access mode in response to the physical address difference being larger than the second threshold comprises: taking the physical address of the target data information as a center address and determining a preset range around the physical address of the target data information; screening from the preset range at least one data block whose access frequency is higher than a preset frequency threshold as a candidate set of high-frequency access data blocks; selecting the second preset number of high-frequency access data blocks from the candidate set; loading the high-frequency access data blocks to preset positions of the cache layer; and in response to the capacity of the cache layer reaching the first threshold, deleting the data block at the tail of the cache queue according to a least recently used algorithm.
- 8. A data reading apparatus, comprising: a cache setting module for setting a cache layer in a memory; a first data reading module for reading target data information from the hard disk layer in response to receiving a first data reading request; a data preloading module for selecting a data block adjacent to the target data information from the hard disk layer according to the physical position of the target data information in the hard disk layer and a preloading algorithm, and preloading the data block to the cache layer; and a second data reading module for reading the target data information from the cache layer in response to receiving a second data reading request, and reading the target data information from the hard disk layer if the target data information does not exist in the cache layer.
- 9. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
- 10. A computer readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
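The preloading decision in claims 5 to 7 can be illustrated with a short sketch. This is a hypothetical reading of the claims, not an implementation from the patent: the threshold values, block counts, and the `choose_preload_blocks` helper name are all assumptions made for illustration.

```python
from collections import Counter

# Illustrative constants; the patent leaves all of these as "preset" values.
SEQUENTIAL_THRESHOLD = 8   # "second threshold": max address gap treated as sequential
SEQ_PRELOAD_COUNT = 4      # "first preset number" of adjacent data blocks
HOT_PRELOAD_COUNT = 2      # "second preset number" of high-frequency blocks
HOT_WINDOW = 16            # preset range around the target's physical address
FREQ_THRESHOLD = 3         # preset access-frequency threshold

def choose_preload_blocks(target_addr, prev_addr, access_log):
    """Return the physical addresses to preload for one read request.

    access_log: physical addresses accessed within the preset time range,
    used to estimate per-block access frequency.
    """
    delta = abs(target_addr - prev_addr)
    if delta < SEQUENTIAL_THRESHOLD:
        # Sequential mode: take the next consecutive addresses after the target.
        return [target_addr + i for i in range(1, SEQ_PRELOAD_COUNT + 1)]
    # Non-sequential mode: pick frequently accessed blocks near the target.
    freq = Counter(access_log)
    window = range(target_addr - HOT_WINDOW, target_addr + HOT_WINDOW + 1)
    candidates = [a for a in window
                  if a != target_addr and freq[a] >= FREQ_THRESHOLD]
    candidates.sort(key=lambda a: freq[a], reverse=True)
    return candidates[:HOT_PRELOAD_COUNT]
```

With these example values, a request two addresses away from the previous one triggers sequential preloading of the next four blocks, while a large jump triggers selection of the most frequently accessed blocks inside the window.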
Description
Technical Field
The present application relates to the field of data transmission technologies, and in particular to a data reading method, apparatus, device, and storage medium.
Background
With the rapid development of cloud computing and virtualization technologies, hyper-converged infrastructure (HCI, Hyper-Converged Infrastructure) is becoming a mainstream data center architecture. Hyper-convergence integrates computing, storage, and network resources in a software-defined manner, which greatly simplifies deployment, operation, and maintenance, and offers good scalability and resource utilization. Current mainstream hyper-converged systems generally use SSD or NVMe drives as the main storage medium and support upper-layer services such as virtual machines and containers on top of it. To improve data access speed, existing hyper-converged systems typically adopt techniques such as tiered storage (separating hot and cold data), data caching (for example, SSD-based read/write caching), and IO scheduling optimization. For example, some systems configure an SSD cache layer in front of the NVMe drives or use a log-structured write mechanism to reduce random IO pressure, and some products support using part of the memory as a cache layer in a software-defined manner to accelerate disk IO. In addition, there are techniques that exploit data access hotness for data prefetching, or that use IO prediction algorithms to optimize IO scheduling policies. Although the above approaches alleviate the storage performance bottleneck to some extent, significant drawbacks remain.
The main drawbacks are as follows: first, because of the large gap between hard disks (NVMe SSDs) and memory in terms of IOPS, bandwidth, and latency, the storage layer still becomes a performance bottleneck under highly concurrent, low-latency data requests; second, most existing SSD or hard disk caching mechanisms fail to fully utilize the large-capacity memory of modern servers, leading to low memory utilization; and third, traditional cache management is weak at distinguishing sequential from non-sequential access patterns, resulting in low prefetch accuracy and a tendency toward cache pollution or ineffective prefetching, which degrades performance.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a data reading method, apparatus, device, and storage medium capable of improving data reading efficiency. In a first aspect, a data reading method is provided, comprising: setting a cache layer in a memory; in response to receiving a first data reading request, reading target data information from the hard disk layer; selecting a data block adjacent to the target data information from the hard disk layer according to the physical position of the target data information in the hard disk layer and a preloading algorithm, and preloading the data block to the cache layer; and in response to receiving a second data reading request, reading the target data information from the cache layer, and if the target data information does not exist in the cache layer, reading the target data information from the hard disk layer.
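The cache-layer bookkeeping described in claim 4 (insert on miss, promote on hit, evict the least recently used entry when a capacity threshold is reached) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `CacheLayer` class, the `disk_read` callback, and the capacity value are assumptions.

```python
from collections import OrderedDict

class CacheLayer:
    """Minimal in-memory cache layer with LRU eviction (sketch of claim 4)."""

    def __init__(self, disk_read, capacity=4):
        self.disk_read = disk_read   # fallback read from the hard disk layer
        self.capacity = capacity     # "first threshold" on cache capacity
        self.queue = OrderedDict()   # head = most recently used record

    def read(self, key):
        if key in self.queue:
            # Hit: move the record to the head of the cache layer queue.
            self.queue.move_to_end(key, last=False)
            return self.queue[key]
        # Miss: read from the hard disk layer and insert at the queue head.
        value = self.disk_read(key)
        self.queue[key] = value
        self.queue.move_to_end(key, last=False)
        if len(self.queue) > self.capacity:
            # Capacity threshold reached: evict the record at the queue tail.
            self.queue.popitem(last=True)
        return value
```

A usage example: with capacity 2, reading keys 1, 2, 1, 3 in order evicts key 2 (the least recently used), so a later read of key 2 goes back to the disk layer.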
In a second aspect, there is provided a data reading apparatus applying the data reading method of the first aspect, comprising: a cache setting module for setting a cache layer in a memory; a first data reading module for reading target data information from the hard disk layer in response to receiving a first data reading request; a data preloading module for selecting a data block adjacent to the target data information from the hard disk layer according to the physical position of the target data information in the hard disk layer and a preloading algorithm, and preloading the data block to the cache layer; and a second data reading module for reading the target data information from the cache layer in response to receiving a second data reading request, and reading the target data information from the hard disk layer if the target data information does not exist in the cache layer. In a third aspect, there is also provided a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the data reading method of the first aspect when executing the computer program. In a fourth aspect, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the data reading method of the first aspect. By implementing the data reading method, apparatus, device, and storage medium described above, the data reading efficiency can be greatly improved.
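The two-tier hard disk lookup of claim 3 (query the first hard disk layer, fall back to the second) reduces to a simple lookup chain. In this sketch the dict-backed tiers and the `read_from_disk_layers` helper name are illustrative stand-ins for real storage devices, not elements of the patent.

```python
def read_from_disk_layers(key, first_tier, second_tier):
    """Query the first hard disk layer first; fall back to the second layer.

    first_tier / second_tier: mappings from data keys to data blocks,
    standing in for the two hard disk layers of claim 3.
    """
    if key in first_tier:
        # Target data exists in the first hard disk layer: return it directly.
        return first_tier[key]
    if key in second_tier:
        # Not in the first layer: query the second hard disk layer.
        return second_tier[key]
    raise KeyError(f"target data {key!r} not found in any hard disk layer")
```

Note that when the same key exists in both tiers, the first layer wins, matching the claim's query order.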