CN-121979669-A - Data processing method and device, computer equipment and storage medium

Abstract

Embodiments of the application provide a data processing method and device, computer equipment and a storage medium, belonging to the technical field of data processing. The method comprises: obtaining the total data amount of the data to be processed for a plurality of tasks and the available memory capacity; calculating a numerical relationship between the total data amount and the available memory capacity, and determining a data loading strategy for the data to be processed according to that relationship, wherein the data loading strategy is either a full loading strategy or a sharded loading strategy; loading the data to be processed according to the data loading strategy; and executing a post-processing operation on the loaded data. Because the memory requirement of the data loading stage is calculated in advance, memory occupation can be planned ahead and the data loading strategy can be determined dynamically. This avoids the excessive memory consumption of loading all data to be processed at once, which would otherwise reduce the memory available to the data computing stage, and thereby improves computing efficiency and stability.

Inventors

  • ZHANG XUEGUI

Assignees

  • 远景能源有限公司

Dates

Publication Date
2026-05-05
Application Date
2025-12-25

Claims (10)

  1. A data processing method, the method comprising: acquiring the total data amount of the data to be processed for a plurality of tasks and the available memory capacity; calculating a numerical relationship between the total data amount and the available memory capacity, and determining a data loading strategy for the data to be processed according to the numerical relationship, wherein the data loading strategy comprises a full loading strategy and a sharded loading strategy; and loading the data to be processed according to the data loading strategy, and executing a post-processing operation on the loaded data to be processed.
  2. The method of claim 1, wherein calculating the numerical relationship between the total data amount and the available memory capacity and determining the data loading strategy for the data to be processed according to the numerical relationship comprises: calculating a first ratio of the total data amount to the available memory capacity, and judging whether the first ratio exceeds a first safety threshold; and, where the first ratio exceeds the first safety threshold, adopting the sharded loading strategy for the data to be processed.
  3. The method of claim 2, further comprising, after calculating the first ratio of the total data amount to the available memory capacity and judging whether the first ratio exceeds the first safety threshold: where the first ratio does not exceed the first safety threshold, adopting the full loading strategy for the data to be processed.
  4. The method of claim 2, further comprising, before adopting the sharded loading strategy for the data to be processed: acquiring the single-task data amount corresponding to any one task in the data to be processed; calculating a second ratio of the single-task data amount to the available memory capacity, and calculating a third ratio of the first safety threshold to the second ratio; and determining the task concurrency of the sharded loading strategy according to the third ratio.
  5. The method according to any one of claims 2-4, further comprising, after loading the data to be processed: acquiring the real-time memory usage; calculating a fourth ratio of the real-time memory usage to the available memory capacity, and judging whether the fourth ratio exceeds a second safety threshold, wherein the second safety threshold is higher than the first safety threshold; and reducing the task concurrency where the fourth ratio exceeds the second safety threshold.
  6. The method of claim 5, further comprising, after reducing the task concurrency where the fourth ratio exceeds the second safety threshold: increasing the task concurrency after the current task has finished executing and its memory has been released.
  7. The method of claim 1, wherein the sharded loading strategy comprises loading, one by one, the single-task data corresponding to each task in the data to be processed.
  8. A data processing apparatus, comprising: a data statistics module, configured to acquire the total data amount of the data to be processed and the available memory capacity, and further configured to calculate a numerical relationship between the total data amount and the available memory capacity and to determine a data loading strategy for the data to be processed according to the numerical relationship, wherein the data loading strategy comprises a full loading strategy and a sharded loading strategy; and a data loading module, configured to load the data to be processed and to execute a post-processing operation on the loaded data to be processed.
  9. A computer device comprising a processor and a memory, wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method according to any one of claims 1-7.
  10. A non-transitory computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the method of any one of claims 1-7.
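
The ratio tests in claims 2-6 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names, the threshold defaults (0.6 and 0.8), the step size, the rounding down of the third ratio, and the cap on restored concurrency are all assumptions made for illustration, since the claims fix none of these values.

```python
import math
from enum import Enum


class LoadingStrategy(Enum):
    FULL = "full"
    SHARDED = "sharded"


def choose_strategy(total_bytes, available_bytes, first_threshold=0.6):
    """Claims 2-3: compare the first ratio with the first safety threshold.

    The default threshold of 0.6 is an assumed value.
    """
    if available_bytes <= 0:
        raise ValueError("available_bytes must be positive")
    first_ratio = total_bytes / available_bytes
    if first_ratio > first_threshold:
        return LoadingStrategy.SHARDED
    return LoadingStrategy.FULL


def task_concurrency(single_task_bytes, available_bytes, first_threshold=0.6):
    """Claim 4: derive the task concurrency from the third ratio.

    Rounding down and enforcing a minimum of 1 are assumptions; the claim
    only says the concurrency is determined "according to" the third ratio.
    """
    if single_task_bytes <= 0 or available_bytes <= 0:
        raise ValueError("sizes must be positive")
    second_ratio = single_task_bytes / available_bytes
    third_ratio = first_threshold / second_ratio
    return max(1, math.floor(third_ratio))


def reduce_if_over(current, used_bytes, available_bytes,
                   second_threshold=0.8, step=1):
    """Claim 5: lower the concurrency when the fourth ratio is too high.

    The default threshold of 0.8 and the step of 1 are assumed values.
    """
    fourth_ratio = used_bytes / available_bytes
    if fourth_ratio > second_threshold:
        return max(1, current - step)
    return current


def restore_after_release(current, planned, step=1):
    """Claim 6: raise the concurrency once a task has released its memory.

    Capping at the originally planned value is an assumption.
    """
    return min(planned, current + step)
```

With 16 units of available memory, a total of 10 units gives a first ratio of 0.625, which exceeds the assumed 0.6 threshold and selects sharded loading. A single task of 2 units then gives a second ratio of 0.125 and a third ratio of 4.8, so this sketch would run four tasks concurrently.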

Description

Data processing method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular to a data processing method and apparatus, a computer device, and a storage medium.
Background
Early computer systems could not process multiple computing tasks simultaneously and could only execute them one at a time. This serial approach left computer resources underused and greatly limited overall operating efficiency. With the development of hardware technology, concurrent multitask computation has become feasible.
In simulation technology, the data to be processed in the simulation post-processing stage is often large, and when there are multiple simulation tasks, the total amount of data to be processed is larger still. When a computer system performs simulation post-processing on multiple tasks concurrently, the tasks are generally run with a preset number of threads or processes, and the data to be processed for all tasks is loaded into memory at once, that is, loaded in full. When the loaded data volume exceeds the memory capacity currently available to the system, an out-of-memory error is triggered and the task terminates abnormally. Moreover, under heavy contention for memory, some sub-processes may wait for a long time because they cannot obtain the memory they need in time, or may even be forcibly terminated by the system, which seriously degrades the computing efficiency of the task as a whole.
Disclosure of Invention
Embodiments of the present application provide a data processing method and device, computer equipment and a storage medium, which can solve some or all of the problems in the prior art.
To achieve the above object, the technical solution provided by the embodiments of the present application is as follows.
In a first aspect, an embodiment of the present application provides a data processing method, including: acquiring the total data amount of the data to be processed and the available memory capacity; calculating a numerical relationship between the total data amount and the available memory capacity, and determining a data loading strategy for the data to be processed according to the numerical relationship, wherein the data loading strategy comprises a full loading strategy and a sharded loading strategy; and loading the data to be processed according to the data loading strategy, and executing a post-processing operation on the loaded data to be processed.
In a second aspect, an embodiment of the present application provides a data processing apparatus, including: a data statistics module, configured to acquire the total data amount of the data to be processed and the available memory capacity, and further configured to calculate a numerical relationship between the total data amount and the available memory capacity and to determine a data loading strategy for the data to be processed according to the numerical relationship, wherein the data loading strategy comprises a full loading strategy and a sharded loading strategy; and a data loading module, configured to load the data to be processed and to execute a post-processing operation on the loaded data to be processed.
In a third aspect, an embodiment of the present application provides a computer device comprising a processor and a memory, wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method according to the first aspect.
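
The split between the data statistics module and the data loading module in the second aspect can be sketched as two small Python classes. This is only an illustrative reading of that aspect: the class and method names, the string strategy labels, the 0.6 threshold, and the caller-supplied `load` and `post_process` callables are all hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, List


class DataStatisticsModule:
    """Gathers sizes and picks a loading strategy (names are hypothetical)."""

    def __init__(self, first_threshold: float = 0.6) -> None:
        # 0.6 is an assumed threshold; the source gives no value.
        self.first_threshold = first_threshold

    def decide(self, task_sizes: Dict[str, int], available_bytes: int) -> str:
        total = sum(task_sizes.values())
        ratio = total / available_bytes
        return "sharded" if ratio > self.first_threshold else "full"


class DataLoadingModule:
    """Loads task data and post-processes it under the chosen strategy."""

    def __init__(self, load: Callable[[str], Any],
                 post_process: Callable[[Any], Any]) -> None:
        self.load = load
        self.post_process = post_process

    def run(self, task_ids: Iterable[str], strategy: str) -> List[Any]:
        ids = list(task_ids)
        if strategy == "full":
            # Full loading: every task's data is in memory before processing.
            loaded = {t: self.load(t) for t in ids}
            return [self.post_process(loaded[t]) for t in ids]
        # Sharded loading: load, process and drop one task at a time, so at
        # most one task's data is held by this loop at any moment.
        results = []
        for t in ids:
            data = self.load(t)
            results.append(self.post_process(data))
            del data
        return results
```

A caller would first ask the statistics module for a strategy and then pass it to the loading module, for example `DataLoadingModule(load, post_process).run(task_ids, stats.decide(sizes, available))`. The sequential loop shown for sharded loading corresponds to loading tasks one by one; a concurrent variant would bound the number of in-flight tasks instead.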
In a fourth aspect, embodiments of the present application provide a non-transitory computer readable storage medium, wherein the storage medium stores a plurality of instructions adapted to be loaded by a processor and to carry out the above-described method steps. The intelligent interactive tablet may comprise a processor and a memory, wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps described above.
In the embodiments of the present application, before the data to be processed for a plurality of tasks is loaded, the total data amount of all the data to be processed is estimated and compared with the available memory capacity of the system, and the data loading strategy is determined according to the numerical relationship between the two. If the available memory is sufficient, a full loading strategy can be adopted, and if the available memory is insufficient, a sharded loading strategy can be adopted, so that sm