CN-122018984-A - Instruction processing method, programmable processor, electronic device and storage medium


Abstract

An instruction processing method for a programmable processor, an electronic device, and a storage medium. A virtual machine kernel function runs on the programmable processor. The virtual machine kernel function comprises at least one pre-generated machine instruction sequence adapted to the programmable processor, and each of the at least one pre-generated machine instruction sequence has a corresponding bytecode array. The instruction processing method comprises: compiling a target kernel function corresponding to source code into a target bytecode array that includes N target bytecodes; determining, from the virtual machine kernel function, the pre-generated machine instruction sequence corresponding to the kth target bytecode by using the kth target bytecode of the N target bytecodes; and executing the pre-generated machine instruction sequence corresponding to the kth target bytecode. The method enables efficient cross-platform execution of the source code and significantly improves the flexibility and efficiency of deploying processor programs.

Inventors

Request for anonymity

Assignees

上海壁仞科技股份有限公司

Dates

Publication Date: 20260512
Application Date: 20260410

Claims (17)

1. An instruction processing method for a programmable processor, wherein a virtual machine kernel function runs on the programmable processor, the virtual machine kernel function comprising at least one pre-generated machine instruction sequence adapted to the programmable processor, each of the at least one pre-generated machine instruction sequence having a corresponding bytecode array, the instruction processing method comprising: compiling a target kernel function corresponding to source code into a target bytecode array, wherein the target bytecode array comprises N target bytecodes, and N is a positive integer; determining, from the virtual machine kernel function, a pre-generated machine instruction sequence corresponding to a kth target bytecode by using the kth target bytecode of the N target bytecodes, wherein k = 0, 1, ..., N-1; and executing the pre-generated machine instruction sequence corresponding to the kth target bytecode.
2. The instruction processing method of claim 1, wherein the kth target bytecode includes a first instruction address of the pre-generated machine instruction sequence corresponding to the kth target bytecode, and determining, from the virtual machine kernel function, the pre-generated machine instruction sequence corresponding to the kth target bytecode by using the kth target bytecode of the N target bytecodes comprises: determining, by using the first instruction address, a position in the virtual machine kernel function of the pre-generated machine instruction sequence corresponding to the kth target bytecode.
3. The instruction processing method of claim 2, wherein a jump table is maintained in the virtual machine kernel function, the jump table including a mapping relationship between the first instruction address and a second instruction address, in the virtual machine kernel function, of the pre-generated machine instruction sequence corresponding to the kth target bytecode, and determining, by using the first instruction address, the position in the virtual machine kernel function of the pre-generated machine instruction sequence corresponding to the kth target bytecode comprises: determining the second instruction address from the first instruction address based on the jump table; and jumping, based on the second instruction address, to the position in the virtual machine kernel function of the pre-generated machine instruction sequence corresponding to the kth target bytecode.
4. The instruction processing method of claim 1, wherein the kth target bytecode includes a first bytecode address of the kth target bytecode in the target bytecode array, and the virtual machine kernel function comprises a first jump instruction, the first jump instruction being used to determine the position of the kth target bytecode in the target bytecode array based on the first bytecode address before the pre-generated machine instruction sequence corresponding to the kth target bytecode is determined from the virtual machine kernel function.
5. The instruction processing method of claim 1, wherein a local variable table is created in the shared memory space of the programmable processor, the local variable table including at least one local variable, the at least one local variable including a target local variable, the pre-generated machine instruction sequence corresponding to the kth target bytecode corresponds to a first operation on the target local variable, the kth target bytecode includes a first variable address corresponding to the target local variable, and executing the pre-generated machine instruction sequence corresponding to the kth target bytecode comprises: determining a position of the target local variable in the local variable table by using the first variable address; and performing the first operation on the target local variable.
6. The instruction processing method of claim 1, wherein a local variable table is created in the shared memory space of the programmable processor, the local variable table including at least one local variable, and the instruction processing method further comprises: loading at least one kernel parameter of the target kernel function and an operand of the at least one kernel parameter into the local variable table, so that the at least one kernel parameter serves as a local variable in the local variable table.
7. The instruction processing method of claim 1, wherein a local variable table and an operand stack are created in the shared memory space of the programmable processor, the local variable table including at least one local variable, the at least one local variable including a first local variable, an operand of the first local variable being stored in the local variable table, and, in response to the pre-generated machine instruction sequence corresponding to the kth target bytecode corresponding to a load operation on the first local variable, executing the pre-generated machine instruction sequence corresponding to the kth target bytecode comprises: copying the operand of the first local variable from the local variable table to the operand stack.
8. The instruction processing method of claim 1, wherein a local variable table and an operand stack are created in the shared memory space of the programmable processor, the local variable table including at least one local variable, the at least one local variable including at least one second local variable, an operand of the at least one second local variable being stored in the operand stack, and, in response to the pre-generated machine instruction sequence corresponding to the kth target bytecode corresponding to a first operation on the at least one second local variable, executing the pre-generated machine instruction sequence corresponding to the kth target bytecode comprises: fetching the operand of the at least one second local variable from the operand stack; performing the first operation by using the operand of the at least one second local variable to obtain a first operation result; and storing the first operation result to the operand stack.
9. The instruction processing method of claim 8, wherein a local variable table and an operand stack are created in the shared memory space of the programmable processor, the local variable table including at least one local variable, the at least one local variable including a third local variable, an operand of the third local variable corresponding to a first operation result of a first operation performed in the programmable processor, the first operation result being stored in the operand stack, and, in response to the pre-generated machine instruction sequence corresponding to the kth target bytecode corresponding to a store operation on the third local variable, executing the pre-generated machine instruction sequence corresponding to the kth target bytecode comprises: copying the first operation result from the operand stack to the location of the operand of the third local variable in the local variable table.
10. The instruction processing method of claim 1, wherein, in response to the pre-generated machine instruction sequence corresponding to the kth target bytecode not including an exit instruction and k being less than N-1, the instruction processing method further comprises, after executing the pre-generated machine instruction sequence corresponding to the kth target bytecode: jumping to the (k+1)th target bytecode in the target bytecode array; determining, from the virtual machine kernel function, a pre-generated machine instruction sequence corresponding to the (k+1)th target bytecode by using the (k+1)th target bytecode; and executing the pre-generated machine instruction sequence corresponding to the (k+1)th target bytecode.
11. The instruction processing method of claim 10, wherein the (k+1)th target bytecode includes a second bytecode address of the (k+1)th target bytecode in the target bytecode array, the virtual machine kernel function includes a second jump instruction corresponding to the second bytecode address, and jumping to the (k+1)th target bytecode in the target bytecode array after executing the pre-generated machine instruction sequence corresponding to the kth target bytecode comprises: executing the second jump instruction to determine a position of the (k+1)th target bytecode in the target bytecode array based on the second bytecode address.
12. The instruction processing method of claim 1, further comprising: compiling the source code into an intermediate form representation; in response to the intermediate form representation including a hotspot kernel function, just-in-time compiling the hotspot kernel function into a native machine instruction sequence suitable for the programmable processor; and executing the native machine instruction sequence.
13. The instruction processing method of claim 12, wherein compiling the target kernel function corresponding to the source code into the target bytecode array comprises: in response to the intermediate form representation including the target kernel function, compiling the target kernel function into the target bytecode array.
14. The instruction processing method of claim 1, further comprising: compiling the source code into an intermediate form representation; determining the intermediate form representation as the target kernel function; and, while compiling the target kernel function corresponding to the source code into the target bytecode array, initiating just-in-time compilation of the intermediate form representation, so that the intermediate form representation is just-in-time compiled in the background of the programmable processor into a native machine instruction sequence suitable for the programmable processor.
15. A programmable processor, wherein a virtual machine kernel function runs on the programmable processor, the virtual machine kernel function comprising at least one pre-generated machine instruction sequence adapted to the programmable processor, each of the at least one pre-generated machine instruction sequence having a corresponding bytecode array, the programmable processor comprising: a compiling module configured to compile a target kernel function corresponding to source code into a target bytecode array, wherein the target bytecode array comprises N target bytecodes, and N is a positive integer; a determining module configured to determine, from the virtual machine kernel function, a pre-generated machine instruction sequence corresponding to a kth target bytecode by using the kth target bytecode of the N target bytecodes, wherein k = 0, 1, ..., N-1; and an execution module configured to execute the pre-generated machine instruction sequence corresponding to the kth target bytecode.
16. An electronic device, comprising: one or more processors; and a memory including one or more computer program modules, wherein the one or more computer program modules are stored in the memory and configured to be executed by the one or more processors, and the one or more computer program modules are configured to implement the instruction processing method of any one of claims 1-14.
17. A storage medium storing non-transitory computer readable instructions which, when executed by a computer, implement the instruction processing method of any one of claims 1-14.
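
The local variable table and operand stack recited in claims 5 to 9 behave like the frame of a conventional stack machine. The following is a minimal host-side sketch of that behaviour, not the claimed implementation: Python lists stand in for the processor's shared memory, and the class `Frame`, its method names, and the choice of integer addition as the "first operation" are all assumptions made for illustration. The claims say the store operation copies the result from the operand stack; this sketch pops it, as a typical stack machine would.

```python
from typing import List


class Frame:
    """Hypothetical stand-in for per-kernel state held in shared memory."""

    def __init__(self, kernel_params: List[int], num_locals: int) -> None:
        # Claim 6: kernel parameters are loaded into the local variable
        # table and then treated as ordinary local variables.
        padding = [0] * (num_locals - len(kernel_params))
        self.locals: List[int] = list(kernel_params) + padding
        self.stack: List[int] = []

    def load(self, var_addr: int) -> None:
        # Claim 7: copy a local variable's operand from the table to the stack.
        self.stack.append(self.locals[var_addr])

    def add(self) -> None:
        # Claim 8: fetch operands from the stack, perform the operation,
        # and store the result back on the stack. Addition is an assumed
        # example of the "first operation".
        rhs = self.stack.pop()
        lhs = self.stack.pop()
        self.stack.append(lhs + rhs)

    def store(self, var_addr: int) -> None:
        # Claim 9: move the result from the stack into the table slot
        # addressed by the bytecode's variable address (popped here).
        self.locals[var_addr] = self.stack.pop()


# Usage: compute locals[2] = locals[0] + locals[1] for parameters 2 and 3.
frame = Frame(kernel_params=[2, 3], num_locals=3)
frame.load(0)
frame.load(1)
frame.add()
frame.store(2)
```

In this sketch the variable address carried by a bytecode is simply an index into `locals`, which mirrors how claim 5 uses the first variable address to locate the target local variable.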

Description

Instruction processing method, programmable processor, electronic device and storage medium

Technical Field

Embodiments of the present disclosure relate to the field of computers, and in particular to an instruction processing method, a processor, an electronic device, and a storage medium.

Background

With the rapid development of artificial intelligence (AI), text processing, image processing, and other technologies, the requirements on computing performance and data processing efficiency keep increasing. In these scenarios, processors such as graphics processing units (GPUs) and general-purpose graphics processing units (GPGPUs) serve as core acceleration devices because of their powerful parallel computing capabilities.

To run efficiently on parallel computing hardware platforms such as GPUs and GPGPUs in scenarios such as AI and image processing, programming languages and programming models targeting these processors have become a key bridge between algorithm logic and hardware computing power. Current general-purpose programming languages can map tasks such as the matrix operations of deep neural networks and the pixel-level operations of image processing onto kernel functions executable by the processor, making full use of the computing potential of multi-core parallel architectures.

As AI models and application scenarios diversify, different hardware architectures impose different requirements on programming interfaces, memory management, execution scheduling, and other aspects. General-purpose programming languages struggle to balance execution efficiency with hardware adaptability, so programming languages dedicated to a specific processor hardware platform have become a core support for the efficient deployment and operation of AI computing tasks.
Disclosure of Invention

At least one embodiment of the present disclosure provides an instruction processing method for a programmable processor. A virtual machine kernel function runs on the programmable processor, the virtual machine kernel function includes at least one pre-generated machine instruction sequence applicable to the programmable processor, and each of the at least one pre-generated machine instruction sequence has a corresponding bytecode array. The instruction processing method includes: compiling a target kernel function corresponding to source code into a target bytecode array, where the target bytecode array includes N target bytecodes and N is a positive integer; determining, from the virtual machine kernel function, a pre-generated machine instruction sequence corresponding to a kth target bytecode by using the kth target bytecode of the N target bytecodes, where k = 0, 1, ..., N-1; and executing the pre-generated machine instruction sequence corresponding to the kth target bytecode.

For example, in an instruction processing method provided in at least one embodiment of the present disclosure, the kth target bytecode includes a first instruction address of the pre-generated machine instruction sequence corresponding to the kth target bytecode, and determining, from the virtual machine kernel function, the pre-generated machine instruction sequence corresponding to the kth target bytecode by using the kth target bytecode of the N target bytecodes includes: determining, by using the first instruction address, a position in the virtual machine kernel function of the pre-generated machine instruction sequence corresponding to the kth target bytecode.
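
The dispatch flow described above can be illustrated with a small host-side sketch. It is an illustration under stated assumptions rather than the disclosed implementation: Python functions stand in for the pre-generated machine instruction sequences inside the virtual machine kernel function, and the names `HANDLERS`, `JUMP_TABLE`, `run`, the address values, and the `(first_instruction_address, operand)` bytecode layout are all hypothetical. The table lookup models the jump table variant recited in claim 3, where a first instruction address carried by the bytecode is mapped to a second instruction address inside the virtual machine kernel function.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical bytecode layout: (first_instruction_address, operand).
Bytecode = Tuple[int, int]
# A handler returns True when it acts as an exit instruction.
Handler = Callable[[List[int], int], bool]


def op_push(stack: List[int], operand: int) -> bool:
    stack.append(operand)
    return False


def op_add(stack: List[int], operand: int) -> bool:
    rhs = stack.pop()
    lhs = stack.pop()
    stack.append(lhs + rhs)
    return False


def op_exit(stack: List[int], operand: int) -> bool:
    return True


# Stand-ins for the pre-generated machine instruction sequences.
HANDLERS: List[Handler] = [op_push, op_add, op_exit]

# Assumed jump table: first instruction address -> second instruction
# address (here, an index into HANDLERS).
JUMP_TABLE: Dict[int, int] = {0x10: 0, 0x20: 1, 0xF0: 2}


def run(bytecodes: List[Bytecode]) -> List[int]:
    """Interpret the target bytecode array for k = 0, 1, ..., N-1."""
    stack: List[int] = []
    k = 0
    while k < len(bytecodes):
        first_addr, operand = bytecodes[k]
        second_addr = JUMP_TABLE[first_addr]   # look up the handler position
        exited = HANDLERS[second_addr](stack, operand)  # execute the sequence
        if exited:
            break
        k += 1  # no exit instruction: move on to the (k+1)th bytecode
    return stack


# Usage: a four-bytecode program that pushes 2 and 3, adds them, then exits.
program: List[Bytecode] = [(0x10, 2), (0x10, 3), (0x20, 0), (0xF0, 0)]
result = run(program)
```

Because only the bytecode array changes between kernels, the handlers can be generated once per processor target, which is the property the disclosure relies on for cross-platform execution.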
For example, in an instruction processing method provided in at least one embodiment of the present disclosure, a jump table is maintained in the virtual machine kernel function, where the jump table includes a mapping relationship between the first instruction address and a second instruction address, in the virtual machine kernel function, of the pre-generated machine instruction sequence corresponding to the kth target bytecode, and determining, by using the first instruction address, the position in the virtual machine kernel function of the pre-generated machine instruction sequence corresponding to the kth target bytecode includes: determining the second instruction address from the first instruction address based on the jump table; and jumping, based on the second instruction address, to the position in the virtual machine kernel function of the pre-generated machine instruction sequence corresponding to the kth target bytecode.

For example, in an instruction processing method provided in at least one embodiment of the present disclosure, the kth target bytecode includes a first bytecode address of the kth target bytecode in the target bytecode array, and the virtual machine kernel function includes a first jump instruction, where the first jump instruction is used to determine, before determining, from the virtual machine kernel function, a pre-generated machine instruction sequence corresponding to the kth