US-20260127024-A1 - DATA HANDLING
Abstract
An apparatus comprising storage, an execution unit and a handling unit. The handling unit is configured to obtain task data that describes a task to be executed. The task comprises a plurality of operations representable as a directed graph of operations. The task data comprises task-specific variable data representative of a task-specific variable for use in executing an operation of the plurality of operations. The handling unit is configured to obtain a data move instruction and, based on the data move instruction, move the task-specific variable data into a physical storage location of the storage. The handling unit is configured to dispatch invocation data, based on the task data and the physical storage location, to the execution unit to cause the execution unit to execute the operation.
Inventors
- Rune Holm
- Dominic Hugo Symes
- Elliot Maurice Simon ROSEMARINE
Assignees
- ARM LIMITED
Dates
- Publication Date: 2026-05-07
- Application Date: 2025-03-07
Claims (20)
- 1. An apparatus comprising storage, an execution unit and a handling unit, wherein the handling unit is configured to: obtain task data that describes a task to be executed, the task comprising a plurality of operations representable as a directed graph of operations, the task data comprising task-specific variable data representative of a task-specific variable for use in executing an operation of the plurality of operations; obtain a data move instruction; based on the data move instruction, move the task-specific variable data into a physical storage location of the storage; and dispatch invocation data, based on the task data and the physical storage location, to the execution unit to cause the execution unit to execute the operation.
- 2. The apparatus of claim 1, wherein the invocation data comprises at least one of: the task-specific variable data; or a pointer to the physical storage location storing the task-specific variable data.
- 3. The apparatus of claim 1, wherein the task data defines a multi-dimensional nested loop defining an operation space, the handling unit is configured to iterate over the operation space in blocks, the storage comprises, for each dimension of the multi-dimensional nested loop, a respective boundary register for storing, for a given block of the blocks, range data defining a range of the given block in the respective dimension, and the physical storage location comprises at least a field of a particular boundary register of the boundary registers.
- 4. The apparatus of claim 3, wherein the boundary register for each respective dimension comprises: a low bound field for storing a low bound of the given block in the respective dimension; and a high bound field for storing a high bound of the given block in the respective dimension, and the physical storage location comprises at least one of the low bound field or the high bound field of the particular boundary register.
- 5. The apparatus of claim 4, wherein the handling unit is configured to, in dependence on a boundary register modifier associated with the data move instruction, at least one of: set a particular low bound field of the particular boundary register to a particular value and move the task-specific variable data to a particular high bound field of the particular boundary register; or move the task-specific variable data to the particular low bound field.
- 6. The apparatus of claim 3, wherein the handling unit is configured to set a value of the task-specific variable on a per-block basis for at least a plurality of the blocks.
- 7. The apparatus of claim 3, wherein, for at least one of the blocks, the handling unit is configured to modify the range data based on the task-specific variable, to modify a range of the at least one block in at least one dimension.
- 8. The apparatus of claim 1, wherein the task data defines a multi-dimensional nested loop defining an operation space, the handling unit is configured to iterate over the operation space in blocks, comprising mapping respective blocks in the operation space to different local blocks in an operation-specific local space, based on the task-specific variable data, wherein the invocation data for each respective block of the blocks specifies a local range of a local block, in the operation-specific local space, to be operated on for the respective block.
- 9. The apparatus of claim 1, wherein the task data comprises compiled task data compiled prior to setting a value of the task-specific variable.
- 10. The apparatus of claim 1, wherein the handling unit is configured to, after moving the task-specific variable data into the physical storage location, modify the task-specific variable data stored in the physical storage location based on the task data.
- 11. The apparatus of claim 1, wherein the apparatus is configurable to execute the task on behalf of a processor and the apparatus comprises control interface circuitry configured to receive, from the processor, at least one command message to instruct execution of the task by the apparatus, and wherein the handling unit is configured to: obtain, from the at least one command message, a set of fields of an instruction to execute the task, the set of fields comprising a task-specific variable field comprising the task-specific variable data.
- 12. The apparatus of claim 1, wherein the operation comprises processing of an input feature map, and a padding to be applied to at least a portion of the input feature map in executing the operation is based on the task-specific variable.
- 13. The apparatus of claim 1, wherein the task-specific variable corresponds to a predetermined value to be used in response to an attempt to access an out-of-bounds value during execution of the operation.
- 14. The apparatus of claim 1, comprising: a core for executing the task, the core comprising the handling unit, the storage and the execution unit; and a further core for executing a further task of a job comprising the task and the further task, the further core comprising further storage, a further execution unit and a further handling unit configured to: obtain further task data that describes the further task, the further task data comprising further task-specific variable data representative of a further task-specific variable for use in executing a further operation; based on the data move instruction, move the further task-specific variable data into a further physical storage location of the further storage; and dispatch further invocation data, based on the further task data and the further physical storage location, to the further execution unit to cause the further execution unit to execute the further operation.
- 15. The apparatus of claim 14, wherein the task comprises applying the operation to a first portion of a tensor and the further task comprises applying the operation to a second portion of the tensor.
- 16. The apparatus of claim 15, wherein: the task data comprises reference data defining a reference portion of the tensor and the handling unit is configured to process the reference data based on the task-specific variable data to obtain first tensor data defining the first portion of the tensor; and the further task data comprises the reference data and the further handling unit is configured to process the reference data based on the further task-specific variable data to obtain second tensor data defining the second portion of the tensor.
- 17. A system comprising: the apparatus of claim 1, implemented in at least one packaged chip; at least one system component; and a board, wherein the at least one packaged chip and the at least one system component are assembled on the board.
- 18. A chip-containing product comprising the system of claim 17, wherein the system is assembled on a further board with at least one other product component.
- 19. A non-transitory computer-readable medium having stored thereon computer-readable code for fabrication of the apparatus of claim 1.
- 20. A method comprising: obtaining, by handling circuitry, task data that describes a task to be executed, the task comprising a plurality of operations representable as a directed graph of operations, the task data comprising task-specific variable data representative of a task-specific variable for use in executing an operation of the plurality of operations; obtaining, by the handling circuitry, a data move instruction; based on the data move instruction, the handling circuitry moving the task-specific variable data into a physical storage location of storage accessible to the handling circuitry; and dispatching, by the handling circuitry, invocation data, based on the task data and the physical storage location, to execution circuitry for execution of the operation.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part under 35 U.S.C. § 120 of U.S. application Ser. No. 18/939,277, filed Nov. 6, 2024. Each of the above-referenced patent applications is incorporated by reference in its entirety.
BACKGROUND
Technical Field
The disclosure herein relates to apparatuses and methods for use in executing an operation, such as a data processing operation.
Description of the Related Technology
Certain data processing techniques, such as neural network processing and graphics processing, involve the processing and generation of considerable amounts of data using operations. It is desirable to handle data such as this in an efficient and/or flexible manner.
SUMMARY
According to a first aspect of the present disclosure, there is provided an apparatus comprising storage, an execution unit and a handling unit, wherein the handling unit is configured to: obtain task data that describes a task to be executed, the task comprising a plurality of operations representable as a directed graph of operations, the task data comprising task-specific variable data representative of a task-specific variable for use in executing an operation of the plurality of operations; obtain a data move instruction; based on the data move instruction, move the task-specific variable data into a physical storage location of the storage; and dispatch invocation data, based on the task data and the physical storage location, to the execution unit to cause the execution unit to execute the operation.
According to a second aspect of the present disclosure, there is provided a method comprising: obtaining, by handling circuitry, task data that describes a task to be executed, the task comprising a plurality of operations representable as a directed graph of operations, the task data comprising task-specific variable data representative of a task-specific variable for use in executing an operation of the plurality of operations; obtaining, by the handling circuitry, a data move instruction; based on the data move instruction, the handling circuitry moving the task-specific variable data into a physical storage location of storage accessible to the handling circuitry; and dispatching, by the handling circuitry, invocation data, based on the task data and the physical storage location, to execution circuitry for execution of the operation.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flow diagram of a method of moving data based on a data move instruction according to an example;
FIG. 2 is a schematic representation of an example apparatus;
FIG. 3 illustrates an example directed graph;
FIG. 4 is a schematic diagram of a data processing system;
FIG. 5 is a schematic diagram of an example neural engine;
FIG. 6 is a schematic diagram of an example system for allocating handling data;
FIG. 7 is a schematic diagram of a data processing apparatus including a central processing unit (CPU) and a hardware accelerator;
FIG. 8 is a schematic diagram of a data structure for storing an instruction;
FIG. 9 is a schematic representation of a simulator implementation according to an example; and
FIG. 10 is a schematic diagram of manufacture of a system and a chip-containing product.
DETAILED DESCRIPTION
Data Move Instruction
FIG. 1 is a flow diagram 100 showing a method of moving data based on a data move instruction. The method may be performed by a handling unit of an apparatus comprising storage and an execution unit.
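By way of informal illustration only, the four steps of the method of the second aspect (obtain task data, obtain a data move instruction, move the variable data into storage, dispatch invocation data) may be modelled in software along the following lines. This is a minimal sketch under stated assumptions, not a description of any actual hardware or instruction encoding: the names TaskData, DataMove, Invocation, HandlingUnit and ExecutionUnit, the flat list used as storage, and the "conv", "relu" and "pad_value" labels are all hypothetical and do not appear in the disclosure. The sketch assumes the invocation data carries a pointer to the physical storage location rather than the variable data itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TaskData:
    """Hypothetical task description (all names are illustrative only)."""
    graph: Dict[str, List[str]]   # directed graph: operation -> successor operations
    variables: Dict[str, int]     # task-specific variable data, keyed by name


@dataclass
class DataMove:
    """Hypothetical data move instruction."""
    variable: str      # which task-specific variable to move
    destination: int   # index of the physical storage location


@dataclass
class Invocation:
    """Hypothetical invocation data sent to the execution unit."""
    operation: str
    location: int      # pointer to the storage location holding the variable data


class ExecutionUnit:
    def __init__(self, storage: List[int]) -> None:
        self.storage = storage
        self.executed: List[Tuple[str, int]] = []

    def execute(self, invocation: Invocation) -> None:
        # Read the variable through the pointer carried by the invocation data.
        value = self.storage[invocation.location]
        self.executed.append((invocation.operation, value))


class HandlingUnit:
    def __init__(self, storage: List[int], unit: ExecutionUnit) -> None:
        self.storage = storage
        self.unit = unit

    def handle(self, task: TaskData, move: DataMove, operation: str) -> Invocation:
        # The task data and the data move instruction are obtained as arguments.
        if operation not in task.graph:
            raise KeyError(f"unknown operation: {operation}")
        # Move the task-specific variable data into the physical storage location.
        self.storage[move.destination] = task.variables[move.variable]
        # Dispatch invocation data based on the task data and that location.
        invocation = Invocation(operation, move.destination)
        self.unit.execute(invocation)
        return invocation


# Usage: move a (hypothetical) padding value into slot 2, then run "conv".
storage = [0] * 4
unit = ExecutionUnit(storage)
handler = HandlingUnit(storage, unit)
task = TaskData(graph={"conv": ["relu"], "relu": []}, variables={"pad_value": 7})
invocation = handler.handle(task, DataMove("pad_value", 2), "conv")
```

In this sketch the task data never names a storage slot: the destination comes from the data move instruction alone, so the same task description could in principle be reused with a different value or a different slot.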
The handling unit may be implemented by handling circuitry so that the method is executed by the handling circuitry. The execution unit may be implemented by execution circuitry, which may be considered to be an example of processing circuitry.
At item 102 of the flow diagram 100, task data that represents a task to be executed is obtained. The task comprises a plurality of operations representable as a directed graph of operations, as explained further with reference to FIG. 3. The task data comprises task-specific variable data representative of a task-specific variable for use in executing an operation of the plurality of operations, for example by processing circuitry such as the execution circuitry implementing the execution unit. The task-specific variable is for example a number or bit pattern, which may be a constant, which can be used for various purposes to execute a variety of different types of operation.
At item 104, a data move instruction is obtained.
At item 106, the task-specific variable is moved into a physical storage location of the storage of the apparatus, based on the data move instruction.
At item 108, invocation data, based on the task data and the physical storage location, is dispatched to the execution unit to cause the execution unit to execute the operation. The invocation data for example includes, or otherwise indicates, the data that is to be processed in executing the operation (which for example includes the task-specific variable data and input data t