CN-122019077-A - Task processing method and device, electronic equipment and storage medium
Abstract
The disclosure provides a task processing method, a task processing device, an electronic device and a storage medium, and relates to the field of computer technology. The method is applied to an electronic device whose first processor comprises a plurality of computing cores. When a shader task is received, the shared resources among the thread bundles contained in the shader task are determined and written into a shared storage space of the first processor. The thread bundles are distributed to one or more target computing cores of the plurality of computing cores. When a computing task in the shader task is executed, each target computing core acquires the shared resources from the shared storage space and performs computation based on them. This scheme can save a large amount of computing core resources and improve the hardware resource utilization of the first processor.
Inventors
- Request for anonymity
- Request for anonymity
- Request for anonymity
- Request for anonymity
- Request for anonymity
Assignees
- 摩尔线程智能科技(北京)股份有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20251226
Claims (15)
- 1. A task processing method applied to an electronic device, a first processor of the electronic device comprising a plurality of computing cores, the method comprising: when a shader task is received, determining shared resources among thread bundles contained in the shader task, and writing the shared resources into a shared storage space of the first processor; distributing each of the thread bundles to one or more target computing cores of the plurality of computing cores; and acquiring, by each target computing core, the shared resources from the shared storage space when executing a computing task in the shader task, and performing computation based on the shared resources.
- 2. The method according to claim 1, wherein the electronic device further comprises a second processor, and wherein writing the shared resources into the shared storage space of the first processor comprises: constructing, by the second processor, an asynchronous shared task for the shared resources; issuing the asynchronous shared task to the first processor in response to an execution instruction for the shader task; and writing the shared resources into the shared storage space of the first processor when the asynchronous shared task is executed by the first processor.
- 3. The task processing method according to claim 2, wherein constructing, by the second processor, the asynchronous shared task for the shared resources comprises: creating, by the second processor, a global memory write instruction for the shared resources, wherein the global memory write instruction instructs that the shared resources be written into the shared storage space of the first processor; and constructing the asynchronous shared task for the shared resources based on the global memory write instruction.
- 4. The task processing method according to claim 2, wherein writing the shared resources into the shared storage space of the first processor when the asynchronous shared task is executed by the first processor comprises: writing the shared resources into the shared storage space of the first processor when the asynchronous shared task is executed by a shared computing core in the first processor, wherein the shared computing core is a computing core, among the plurality of computing cores, that is different from the target computing cores.
- 5. The task processing method according to claim 2, wherein the second processor runs a GPU driver, the method further comprising: generating, by the GPU driver, a synchronization instruction between the computing task and the corresponding asynchronous shared task, and issuing the synchronization instruction to the first processor; and executing, by the first processor and based on the synchronization instruction, the corresponding computing task after execution of the asynchronous shared task is completed.
- 6. The method according to any one of claims 1 to 5, wherein acquiring, by each target computing core, the shared resources from the shared storage space when executing a computing task in the shader task comprises: determining a preset register read instruction in the computing task of the shader task, wherein the register read instruction is an instruction by which the computing task reads shared data from a shared register; converting the register read instruction into a global memory read instruction targeting the shared storage space, to obtain an updated computing task; and acquiring, by each target computing core, the shared resources from the shared storage space according to the global memory read instruction when executing the updated computing task.
- 7. The task processing method according to any one of claims 1 to 5, wherein performing computation based on the shared resources comprises: for each target computing core, loading the shared resources into a cache of the target computing core; and acquiring, by each computing unit in the target computing core, the shared resources from the cache for computation.
- 8. The method according to any one of claims 1 to 5, wherein the electronic device further comprises a second processor running a GPU driver and a GPU compiler, and wherein determining, when a shader task is received, the shared resources among the thread bundles contained in the shader task comprises: when a compilation request for the shader task is received, sending the compilation request to the GPU compiler through a user-mode driver in the GPU driver, wherein the compilation request comprises a binary file of the shader task; and parsing the binary file of the shader task by the GPU compiler to determine the shared resources among the thread bundles contained in the shader task.
- 9. A task processing method, applied to a second processor of an electronic device, the method comprising: when a shader task is received, determining shared resources among thread bundles contained in the shader task, and constructing an asynchronous shared task for the shared resources, wherein the asynchronous shared task is used by the first processor to write the shared resources into a shared storage space of the first processor; and each thread bundle is used for acquiring the shared resources from the shared storage space and performing computation based on the shared resources when each target computing core executes the computing tasks in the shader task.
- 10. A task processing method, applied to a first processor of an electronic device, the first processor comprising a plurality of target computing cores, the method comprising: in response to receiving an asynchronous shared task, executing the asynchronous shared task and writing shared resources among thread bundles contained in a shader task into a shared storage space of the first processor, wherein the shared resources are determined by a second processor when the shader task is received; and acquiring, by each thread bundle received by each target computing core, the shared resources from the shared storage space when executing the computing task corresponding to the thread bundle, and performing computation based on the shared resources.
- 11. A task processing device, applied to an electronic device, a first processor of which comprises a plurality of computing cores, the device comprising: a shared resource writing module, configured to, when a shader task is received, determine shared resources among thread bundles contained in the shader task and write the shared resources into a shared storage space of the first processor; a thread bundle distribution module, configured to distribute each of the thread bundles to one or more target computing cores of the plurality of computing cores; and a computing task execution module, configured to acquire the shared resources from the shared storage space and perform computation based on the shared resources when each target computing core executes a computing task in the shader task.
- 12. A task processing device, applied to a second processor of an electronic device, the device comprising: an asynchronous shared task construction module, configured to, when a shader task is received, determine shared resources among thread bundles contained in the shader task and construct an asynchronous shared task for the shared resources, wherein the asynchronous shared task is used by the first processor to write the shared resources into a shared storage space of the first processor; and a thread bundle distribution module, configured to distribute each thread bundle to one or more target computing cores of the first processor, wherein each target computing core is used for acquiring the shared resources from the shared storage space and performing computation based on the shared resources when executing the computing tasks in the shader task.
- 13. A task processing device, applied to a first processor of an electronic device, the first processor comprising a plurality of target computing cores, the device comprising: a shared resource writing module, configured to, in response to receiving an asynchronous shared task, execute the asynchronous shared task and write shared resources among thread bundles contained in a shader task into a shared storage space of the first processor; and a computing task execution module, configured to acquire, by each thread bundle received by each target computing core, the shared resources from the shared storage space and perform computation based on the shared resources when executing the computing task corresponding to the thread bundle.
- 14. An electronic device, comprising: a first processor; a second processor; and a memory storing computer-readable instructions which, when executed by the electronic device, implement the task processing method of any one of claims 1 to 8.
- 15. A computer-readable storage medium storing a computer program which, when executed by an electronic device, implements the task processing method of any one of claims 1 to 8.
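The instruction conversion recited in claim 6 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the tuple-based instruction format, the opcode names `READ_SHARED_REG` and `LOAD_GLOBAL`, the base address, and the slot size are hypothetical and do not come from the claims.

```python
# Hypothetical instruction format: (opcode, destination, source).
# Opcode names and the address mapping are assumptions for illustration.
SHARED_BASE = 0x1000  # assumed base address of the shared storage space
SLOT_SIZE = 4         # assumed bytes per shared register slot


def rewrite_register_reads(instructions):
    """Replace each shared-register read with a global memory read."""
    updated = []
    for op, dst, src in instructions:
        if op == "READ_SHARED_REG":
            # Map the register slot index to an address in the shared storage space.
            updated.append(("LOAD_GLOBAL", dst, SHARED_BASE + src * SLOT_SIZE))
        else:
            # All other instructions are carried over unchanged.
            updated.append((op, dst, src))
    return updated


task = [("READ_SHARED_REG", "r0", 2), ("MUL", "r1", "r0")]
updated_task = rewrite_register_reads(task)
print(updated_task)
```

In this sketch only the read of the shared register is rewritten; the rest of the computing task is left as it was, which matches the idea of obtaining an "updated computing task" from the original one.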
Description
Task processing method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the field of computer technology, and in particular to a task processing method, a task processing device, an electronic device and a storage medium.
Background
With the development of computer technology, processors with multi-core parallel computing architectures are widely used in graphics rendering, artificial intelligence inference, scientific computing and general-purpose parallel computing. Such processors include, for example, graphics processors (Graphics Processing Unit, GPU), tensor processors (Tensor Processing Unit, TPU), neural network processors (Neural network Processing Unit, NPU), AI accelerators, and the like. Within the processor, the shared data of the thread bundles is stored in a shared register in the computing core unit, so that the thread bundles in the computing core unit that belong to the same shader task can access this shared register, which helps reduce the register usage of the thread bundles.
In the conventional task processing method, when the processor processes a hardware instruction stream and distributes a shader task to the computing cores, the GPU front end cannot know in advance how many thread bundles the shader task will generate. It therefore has to broadcast to all computing cores in the processor, so that each computing core initializes the shared register for the thread bundles and thereby completes the initialization of the shared data.
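The conventional broadcast described above can be modeled with a short sketch. The `ComputeCore` class, the helper functions, the core count of 16, and the example shared data are all hypothetical and chosen only to show how many shared-register initializations land on cores that never receive a thread bundle.

```python
# Hypothetical model of the conventional broadcast; all names are illustrative.
from dataclasses import dataclass, field


@dataclass
class ComputeCore:
    core_id: int
    shared_register: dict = field(default_factory=dict)
    thread_bundles: list = field(default_factory=list)


def broadcast_init(cores, shared_data):
    """Conventional approach: every core initializes its shared register."""
    for core in cores:
        core.shared_register = dict(shared_data)
    return len(cores)  # number of initializations performed


def distribute(cores, bundles, bundles_per_core):
    """Fill cores in order; a small task occupies only the first few cores."""
    for i, bundle in enumerate(bundles):
        index = (i // bundles_per_core) % len(cores)
        cores[index].thread_bundles.append(bundle)


def wasted_inits(cores):
    """Count initializations on cores that received no thread bundle."""
    return sum(1 for c in cores if c.shared_register and not c.thread_bundles)


cores = [ComputeCore(i) for i in range(16)]
inits = broadcast_init(cores, {"mvp_matrix": [1.0] * 16})
distribute(cores, bundles=["tb0", "tb1", "tb2"], bundles_per_core=2)
print(inits, wasted_inits(cores))  # prints "16 14"
```

With three thread bundles and an assumed two bundles per core, only two of the sixteen cores do useful work, while all sixteen perform the initialization.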
However, in scenarios where a shader task contains only a small number of thread bundles, for example when few vertices need to be rendered or few pixels need to be rasterized, the thread bundles of the shader task are allocated to only one or a few computing cores for execution. Having all computing cores initialize their shared registers then causes the multi-core processor to execute a large number of invalid shared tasks, which wastes a large amount of computing core resources and results in low resource utilization of the multi-core processor.
It should be noted that the information disclosed in the above background section is only intended to enhance understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The embodiments of the disclosure aim to provide a task processing method, a device, an electronic device and a storage medium that can save a large amount of computing core resources and improve the resource utilization of the first processor.
Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by practice of the disclosure.
According to a first aspect of embodiments of the present disclosure, there is provided a task processing method applied to an electronic device, where a first processor of the electronic device includes a plurality of computing cores, the method including: when a shader task is received, determining shared resources among thread bundles contained in the shader task, and writing the shared resources into a shared storage space of the first processor; distributing each of the thread bundles to one or more target computing cores of the plurality of computing cores; and acquiring, by each target computing core, the shared resources from the shared storage space when executing a computing task in the shader task, and performing computation based on the shared resources.
In some example embodiments of the present disclosure, based on the foregoing solution, the electronic device further includes a second processor, and writing the shared resources into the shared storage space of the first processor includes: constructing, by the second processor, an asynchronous shared task for the shared resources; issuing the asynchronous shared task to the first processor in response to an execution instruction for the shader task; and writing the shared resources into the shared storage space of the first processor when the asynchronous shared task is executed by the first processor.
In some example embodiments of the present disclosure, based on the foregoing solution, constructing, by the second processor, the asynchronous shared task for the shared resources includes: creating, by the second processor, a global memory write instruction for the shared resources, wherein the global memory write instruction instructs that the shared resources be written into the shared storage space of the first processor; and constructing the asynchronous shared task for the shared resources based on the global memory write instruction.
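The flow described in the embodiments above can be sketched in a simplified form. The classes `AsyncSharedTask` and `FirstProcessor`, the function `build_async_shared_task`, the address `0x1000`, and the doubling used as a stand-in computation are hypothetical; the sketch only illustrates that the shared resources are written once into a shared storage space and then read by whichever target computing cores received thread bundles.

```python
# Hypothetical model of the described flow; all names are illustrative.
from dataclasses import dataclass, field


@dataclass
class AsyncSharedTask:
    """Built by the second processor from global memory write instructions."""
    writes: list  # assumed encoding: (address, value) pairs
    done: bool = False


@dataclass
class FirstProcessor:
    num_cores: int
    shared_storage: dict = field(default_factory=dict)  # stands in for the shared storage space
    write_count: int = 0

    def run_shared_task(self, task):
        # Executed once, instead of once per computing core.
        for address, value in task.writes:
            self.shared_storage[address] = value
            self.write_count += 1
        task.done = True

    def run_compute(self, task, core_id, bundle, address):
        # The computing task may only start after the shared task has finished.
        if not task.done:
            raise RuntimeError("shared task has not completed")
        return (core_id, bundle, self.shared_storage[address] * 2)


def build_async_shared_task(shared_resources):
    """Second processor: one global memory write per shared resource."""
    return AsyncSharedTask(writes=list(shared_resources.items()))


gpu = FirstProcessor(num_cores=16)
task = build_async_shared_task({0x1000: 21})
gpu.run_shared_task(task)
# Three thread bundles land on two target cores; the other cores are untouched.
placement = [(0, "tb0"), (0, "tb1"), (1, "tb2")]
results = [gpu.run_compute(task, core, tb, 0x1000) for core, tb in placement]
print(gpu.write_count, results)
```

Here a single write serves every thread bundle regardless of how many cores exist, in contrast to a per-core initialization. The `done` flag is a simple stand-in for ordering the computing task after the shared task.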
In some example embodiments of the disclosure, based on the foregoing solution, the writing, by the first processor, the shared resource into the shared memory space of the first processor