CN-122019094-A - Data access method, device and storage medium

Abstract

The application discloses a data access method, device, and storage medium, and relates to the technical field of storage. The data access method comprises: obtaining a task request, generating a computing job according to the task request, and submitting the computing job to a job queue; obtaining the computing job covered by a prefetch window in the job queue as a job to be loaded, and transmitting the data to be called by the job to be loaded to a first storage device; determining a job to be processed in the job queue, and establishing a correspondence between the job to be processed and a computing unit; querying the storage state of the data to be called by the job to be processed to obtain a data query result; and, according to the data query result, caching the data to be called by the job to be processed in the cache device corresponding to the computing unit that corresponds to the job to be processed, so that the computing unit performs computation on that data. This improves the speed at which a large model accesses data and thereby the inference speed of the model.

Inventors

  • MU XIANGDONG
  • LI XUESHENG

Assignees

  • Jinan Inspur Data Technology Co., Ltd. (济南浪潮数据技术有限公司)

Dates

Publication Date
20260512
Application Date
20260130

Claims (10)

  1. A data access method, comprising: acquiring a task request, generating a computing job according to the task request, and submitting the computing job to a job queue; acquiring the computing job covered by a prefetch window in the job queue as a job to be loaded, and transmitting the data to be called by the job to be loaded to a first storage device, wherein the prefetch window indicates the job to be loaded in the job queue; determining a job to be processed in the job queue, and establishing a correspondence between the job to be processed and a computing unit; querying the storage state of the data to be called by the job to be processed to obtain a data query result; and, according to the data query result, caching the data to be called by the job to be processed in a cache device corresponding to the computing unit that corresponds to the job to be processed, so that the computing unit performs computation on the data to be called by the job to be processed.
  2. The data access method according to claim 1, wherein acquiring the computing job covered by the prefetch window in the job queue as the job to be loaded, and transmitting the data to be called by the job to be loaded to the first storage device, comprises: setting a prefetch window in the job queue, and taking the computing job covered by the prefetch window as the job to be loaded; obtaining a cache key value corresponding to the job to be loaded; determining, through a key value mapping table and according to the cache key value, the storage address at which the data to be called by the job to be loaded is stored, and determining, according to the storage address, the storage device that stores the data to be called by the job to be loaded, wherein the key value mapping table comprises at least the cache key value and the storage address of the data to be called by the job to be loaded; and, in response to the data to be called by the job to be loaded being stored in a second storage device, transmitting the data to be called by the job to be loaded to the first storage device.
  3. The data access method of claim 2, wherein setting the prefetch window comprises: determining the length of the prefetch window according to $L_{pw} = C_a / S_{kv}$, where $L_{pw}$ is the length of the prefetch window and represents the number of computing jobs covered by the prefetch window, $C_a$ is the available space of the first storage device, and $S_{kv}$ is the cache size required by a computing job.
  4. The data access method according to claim 1, wherein querying the storage state of the data to be called by the job to be processed to obtain a data query result comprises: searching for the data to be called by the job to be processed in a key value mapping table; in response to the data to be called by the job to be processed being hit in the key value mapping table, taking the storage address stored for that data in the key value mapping table as the data query result; in response to the data to be called by the job to be processed not being hit in the key value mapping table, searching for the data to be called by the job to be processed in the cache device corresponding to any computing unit; in response to the data to be called by the job to be processed being hit in the cache device corresponding to any computing unit, taking the storage address of the data to be called by the job to be processed in the cache device as the data query result; in response to the data to be called by the job to be processed not being hit in the cache device corresponding to any computing unit, searching for the data to be called by the job to be processed in the first storage device; in response to the data to be called by the job to be processed being hit in the first storage device, taking the storage address of the data to be called by the job to be processed in the first storage device as the data query result; in response to the data to be called by the job to be processed not being hit in the first storage device, searching for the data to be called by the job to be processed in the second storage device; in response to the data to be called by the job to be processed being hit in the second storage device, taking the storage address of the data to be called by the job to be processed in the second storage device as the data query result; and, in response to the data to be called by the job to be processed not being hit in the second storage device, marking the data to be called by the job to be processed as new data and taking that marking as the data query result.
  5. The data access method according to claim 4, wherein caching the data to be called by the job to be processed in the cache device corresponding to the computing unit that corresponds to the job to be processed, according to the data query result, comprises: in response to the data query result being a storage address, transmitting the data corresponding to the storage address, through the first storage device, to the cache device corresponding to the computing unit that corresponds to the job to be processed; and, in response to the data query result being new data, computing the data to be called by the job to be processed by the computing unit corresponding to the job to be processed, and transmitting the data to be called by the job to be processed to the cache device corresponding to that computing unit.
  6. The data access method according to any one of claims 1 to 5, wherein, after caching the data to be called by the job to be processed in the cache device corresponding to the computing unit that corresponds to the job to be processed, the method further comprises: in response to the cache device having acquired the data to be called by the job to be processed, performing computation on the data to be called by the job to be processed by the computing unit to obtain a computation result; and returning the computation result to the job to be processed.
  7. The data access method according to claim 6, wherein, after the computation result is returned to the job to be processed, the method further comprises: acquiring the number of references per unit time of each cache key value stored in the key value mapping table; marking each cache key value whose number of references per unit time is smaller than a preset threshold as to be reclaimed; setting a reclamation window; and releasing, from the key value mapping table, the cache key values that are covered by the reclamation window and marked as to be reclaimed, so as to update the key value mapping table.
  8. The data access method of claim 7, wherein setting the reclamation window comprises: determining the length of the reclamation window according to $L_{ev} = (C_{dram} + C_{ssd}) / S_{kv}$, where $L_{ev}$ is the length of the reclamation window and represents the number of cache key values covered by the reclamation window, $C_{dram}$ is the available space of the first storage device, $C_{ssd}$ is the available space of the second storage device, and $S_{kv}$ is the cache size required by a computing job.
  9. A computer device comprising a memory, a processor, and a data access program stored on the memory and executable on the processor, wherein the processor implements the data access method of any one of claims 1 to 8 when executing the data access program.
  10. A computer-readable storage medium on which a data access program is stored, wherein the data access program, when executed by a processor, implements the data access method of any one of claims 1 to 8.
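
The window lengths in claims 3 and 8 and the release step in claim 7 can be illustrated with a minimal Python sketch. The function names, the use of byte counts, rounding down to a whole number, and letting the reclamation window cover the first entries of the mapping table in insertion order are all assumptions made for illustration; the claims do not specify them.

```python
def prefetch_window_length(c_a: int, s_kv: int) -> int:
    """L_pw = C_a / S_kv: number of computing jobs the prefetch window covers.

    c_a  -- available space of the first storage device (bytes, assumed unit)
    s_kv -- cache size required by one computing job (bytes, assumed unit)
    Rounding down to a whole job is an assumption.
    """
    if s_kv <= 0:
        raise ValueError("s_kv must be positive")
    return c_a // s_kv


def reclamation_window_length(c_dram: int, c_ssd: int, s_kv: int) -> int:
    """L_ev = (C_dram + C_ssd) / S_kv: number of cache key values covered."""
    if s_kv <= 0:
        raise ValueError("s_kv must be positive")
    return (c_dram + c_ssd) // s_kv


def reclaim(mapping_table: dict, refs_per_unit_time: dict,
            threshold: int, window_len: int) -> list:
    """Release rarely referenced keys that fall inside the reclamation window.

    Assumption: the window covers the first `window_len` keys of the
    mapping table in insertion order.
    """
    covered = list(mapping_table)[:window_len]
    # A key is marked "to be reclaimed" when its reference count per unit
    # time is below the preset threshold.
    released = [k for k in covered if refs_per_unit_time.get(k, 0) < threshold]
    for key in released:
        del mapping_table[key]  # updates the key value mapping table
    return released
```

With 64 units of free space in the first storage device, 192 in the second, and 8 per job, this sketch yields a prefetch window of 8 jobs and a reclamation window of 32 cache key values.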

Description

Data access method, device and storage medium

Technical Field

The present invention relates to the field of storage technologies, and in particular to a data access method, device, and storage medium.

Background

Dialog processing with a large language model (LLM) requires fast access to previously processed data. However, the high bandwidth memory (HBM) attached to a graphics processing unit (GPU) has limited capacity and cannot hold the key value data generated by large-scale conversations for long periods. Large-scale workloads therefore cause historical data to be recomputed frequently, which reduces model inference speed.

Disclosure of Invention

The application provides a data access method, device, and storage medium that at least address the drop in inference speed caused by storage device performance and data retrieval during inference with a large language model.
In a first aspect, the present application provides a data access method, including: acquiring a task request, generating a computing job according to the task request, and submitting the computing job to a job queue; acquiring the computing job covered by a prefetch window in the job queue as a job to be loaded, and transmitting the data to be called by the job to be loaded to a first storage device, wherein the prefetch window indicates the job to be loaded in the job queue; determining a job to be processed in the job queue, and establishing a correspondence between the job to be processed and a computing unit; querying the storage state of the data to be called by the job to be processed to obtain a data query result; and, according to the data query result, caching the data to be called by the job to be processed in the cache device corresponding to the computing unit that corresponds to the job to be processed, so that the computing unit performs computation on the data to be called by the job to be processed.
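
The storage state query in the method above is spelled out in claim 4 as a fixed search order: key value mapping table, then the cache device of any computing unit, then the first storage device, then the second storage device, and finally a "new data" result. A minimal Python sketch of that order follows. Modelling each tier as a dictionary, encoding an address as a `(tier, location)` tuple, and the `NEW_DATA` sentinel are assumptions made for illustration only.

```python
NEW_DATA = "new-data"  # hypothetical sentinel for the "new data" query result


def query_storage_state(key, mapping_table, unit_caches,
                        first_storage, second_storage):
    """Return a storage address for `key`, or NEW_DATA if no tier holds it.

    mapping_table  -- cache key value -> address (hypothetical tuple encoding)
    unit_caches    -- computing unit id -> that unit's cache contents
    first_storage  -- contents of the first storage device
    second_storage -- contents of the second storage device
    """
    # 1. Key value mapping table.
    if key in mapping_table:
        return mapping_table[key]
    # 2. Cache device corresponding to any computing unit.
    for unit_id, cache in unit_caches.items():
        if key in cache:
            return (f"cache:{unit_id}", key)
    # 3. First storage device.
    if key in first_storage:
        return ("first", key)
    # 4. Second storage device.
    if key in second_storage:
        return ("second", key)
    # 5. Not found anywhere: the data must be computed.
    return NEW_DATA


if __name__ == "__main__":
    table = {"k1": ("second", "k1")}
    caches = {"gpu0": {"k2": b"kv"}}
    first, second = {"k3": b"kv"}, {"k1": b"kv", "k4": b"kv"}
    for k in ("k1", "k2", "k3", "k4", "k5"):
        print(k, query_storage_state(k, table, caches, first, second))
```

Under claim 5, an address result leads to a transfer into the computing unit's cache device through the first storage device, while a new-data result leads to the computing unit producing the data itself; the sketch stops at the query result and leaves both follow-up actions out.
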
In a second aspect, the present application also provides a data access apparatus, including: a job processing module, used for acquiring a task request, generating a computing job according to the task request, and submitting the computing job to a job queue; a first data loading module, used for acquiring the computing job covered by the prefetch window in the job queue as a job to be loaded, and transmitting the data to be called by the job to be loaded to the first storage device, wherein the prefetch window indicates the job to be loaded in the job queue; an association establishing module, used for determining the job to be processed in the job queue and establishing the correspondence between the job to be processed and the computing unit; a data query module, used for querying the storage state of the data to be called by the job to be processed to obtain a data query result; and a second data loading module, used for caching, according to the data query result, the data to be called by the job to be processed in the cache device corresponding to the computing unit that corresponds to the job to be processed, so that the computing unit performs computation on the data to be called by the job to be processed.
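
The behaviour of the first data loading module, as detailed in claim 2, can be sketched in Python as a pass over the jobs covered by the prefetch window. The job representation (a dictionary with a `cache_key` field), the `(tier, location)` address encoding, copying rather than moving the data, and updating the mapping table after the transfer are assumptions made for illustration; the source only states that data found in the second storage device is transmitted to the first storage device.

```python
from collections import deque
from itertools import islice


def prefetch(job_queue: deque, window_len: int, mapping_table: dict,
             first_storage: dict, second_storage: dict) -> list:
    """Load data for the jobs covered by the prefetch window.

    Returns the cache key values whose data was transferred.
    """
    moved = []
    # The prefetch window covers the first `window_len` jobs in the queue.
    for job in islice(job_queue, window_len):
        key = job["cache_key"]            # cache key value of the job to be loaded
        address = mapping_table.get(key)  # look up the storage address
        if address is None:
            continue                      # no stored data for this job yet
        tier, location = address
        if tier == "second":
            # Data sits in the second storage device: transmit it to the first.
            first_storage[location] = second_storage[location]
            mapping_table[key] = ("first", location)  # assumed bookkeeping step
            moved.append(key)
    return moved
```

Jobs beyond the window are left untouched, so the amount of data staged in the first storage device is bounded by the window length.
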
In a third aspect, the present application further provides a computer device, including a memory, a processor, and a data access program stored in the memory and capable of running on the processor, where the processor implements the data access method described in the first aspect when executing the data access program, the method including: acquiring a task request, generating a computing job according to the task request, and submitting the computing job to a job queue; acquiring the computing job covered by a prefetch window in the job queue as a job to be loaded, and transmitting the data to be called by the job to be loaded to a first storage device, wherein the prefetch window indicates the job to be loaded in the job queue; determining a job to be processed in the job queue, and establishing a correspondence between the job to be processed and a computing unit; querying the storage state of the data to be called by the job to be processed to obtain a data query result; and, according to the data query result, caching the data to be called by the job to be processed in the cache device corresponding to the computing unit that corresponds to the job to be processed, so that the computing unit performs computation on the data to be called by the job to be processed.
In a fourth aspect, the present application further provides a computer-readable storage medium having stored thereon a data access program which, when executed by a processor, implements the data access method described in the first aspect, comprising: acquiring a task request, gene