CN-122018987-A - RISC-V vector sliding method and execution unit
Abstract
The application relates to the field of RISC-V architecture and provides a RISC-V vector sliding method and an execution unit. The method comprises: receiving sliding configuration information and a plurality of micro-operation requests sent by an instruction scheduling module, wherein the micro-operation requests are obtained by an instruction decoding module splitting a vector sliding instruction and indicate a first vector register and a second vector register; splicing the data in the first vector registers indicated by all the micro-operation requests to obtain spliced data; performing sliding processing on the spliced data by using a cyclic shifter according to the sliding configuration information to obtain a cyclic shift result; and splitting the cyclic shift result and storing the split result in the second vector registers indicated by the micro-operation requests. By using a cyclic shifter to respond to a vector sliding instruction, the application realizes RISC-V vector sliding in a way that can reduce circuit area, process data in parallel, and improve system performance, and that needs only 1 execution cycle for different numbers of micro-operation requests.
Inventors
- Xian Youlong
Assignees
- Chengdu Qunxin Microelectronics Technology Co., Ltd. (成都群芯微电子科技有限公司)
Dates
- Publication Date
- 20260512
- Application Date
- 20260121
Claims (18)
- 1. A RISC-V vector sliding method, applied to an execution unit, comprising: receiving sliding configuration information and a plurality of micro-operation requests sent by an instruction scheduling module, wherein the micro-operation requests are obtained by an instruction decoding module splitting a vector sliding instruction and indicate a first vector register and a second vector register; splicing the data in the first vector registers indicated by all the micro-operation requests to obtain spliced data; and performing sliding processing on the spliced data by using a cyclic shifter according to the sliding configuration information to obtain a cyclic shift result, splitting the cyclic shift result, and storing the split cyclic shift result in the second vector registers indicated by the micro-operation requests.
- 2. The method of claim 1, wherein performing sliding processing on the spliced data by using a cyclic shifter according to the sliding configuration information to obtain a cyclic shift result comprises: determining an element bit width, an offset, and a sliding type according to the sliding configuration information; and inputting the element bit width, the offset, the sliding type, and the spliced data to the cyclic shifter, so that the cyclic shifter performs sliding processing on the spliced data according to the element bit width, the offset, and the sliding type to obtain the cyclic shift result.
- 3. The method of claim 2, wherein performing sliding processing on the spliced data by using a cyclic shifter according to the sliding configuration information to obtain a cyclic shift result further comprises: when the sliding type is downward sliding, calculating a first mask according to a maximum effective element amount, the offset, and an amount to be shifted, wherein the amount to be shifted is an all-ones value whose data amount is equal to the maximum effective element amount; and performing a mask operation on the cyclic shift result by using the first mask to obtain a first correction shift result.
- 4. The method of claim 3, wherein calculating the first mask according to the maximum effective element amount, the offset, and the amount to be shifted comprises: subtracting the offset from the maximum effective element amount to obtain a left shift amount; shifting the amount to be shifted left by the left shift amount to obtain first data; and performing an inversion operation on the first data to obtain the first mask.
- 5. The method of claim 3, wherein performing sliding processing on the spliced data by using a cyclic shifter according to the sliding configuration information to obtain a cyclic shift result further comprises: when the sliding type is upward sliding, the offset is 1, and the initial sliding index is 0, replacing the minimum index element in the cyclic shift result with an integer or floating point value to obtain a second correction shift result.
- 6. The method of claim 3, wherein performing sliding processing on the spliced data by using a cyclic shifter according to the sliding configuration information to obtain a cyclic shift result further comprises: when the sliding type is downward sliding and the offset is 1, calculating a second mask and a third mask according to the maximum effective element amount, an actual effective element amount, and the amount to be shifted, wherein the second mask indicates the index positions of the actual effective elements other than the maximum index; performing a mask operation on the cyclic shift result by using the second mask to obtain second data; performing a mask operation on an integer or floating point value by using the third mask to obtain third data; and performing an OR operation on the second data and the third data to obtain a third correction shift result.
- 7. The method of claim 6, wherein calculating the second mask and the third mask according to the maximum effective element amount, the actual effective element amount, and the amount to be shifted comprises: subtracting the actual effective element amount from the maximum effective element amount to obtain a right shift amount; shifting the amount to be shifted right by the right shift amount to obtain fourth data; shifting the fourth data right by 1 bit to obtain the second mask; and performing an exclusive-OR operation on the fourth data and the second mask to obtain the third mask.
- 8. An execution unit, comprising: a receiving module configured to receive sliding configuration information and a plurality of micro-operation requests sent by an instruction scheduling module, wherein the micro-operation requests are obtained by an instruction decoding module splitting a vector sliding instruction and indicate a first vector register and a second vector register; a splicing module configured to splice the data in the first vector registers indicated by all the micro-operation requests to obtain spliced data; and a cyclic shifter configured to perform sliding processing on the spliced data according to the sliding configuration information to obtain a cyclic shift result, split the cyclic shift result, and store the split cyclic shift result in the second vector registers indicated by the micro-operation requests.
- 9. The execution unit of claim 8, further comprising a first correction module; the first correction module comprises first mask generation logic and AND logic; the first mask generation logic is configured to calculate a first mask according to a maximum effective element amount, the offset, and an amount to be shifted when the sliding type is downward sliding, wherein the amount to be shifted is an all-ones value whose data amount is equal to the maximum effective element amount; and the AND logic is configured to perform an AND operation on the first mask and the cyclic shift result to obtain a first correction shift result.
- 10. The execution unit of claim 9, wherein the first mask generation logic comprises: a first calculator configured to subtract the offset from the maximum effective element amount to obtain a left shift amount; a left shifter configured to shift the amount to be shifted left by the left shift amount to obtain first data; and an inverter configured to perform an inversion operation on the first data to obtain the first mask.
- 11. The execution unit of claim 9, wherein the execution unit further comprises a second correction module; and the second correction module is used for replacing the minimum index element in the cyclic shift result with an integer or floating point value when the sliding type is upward sliding, the offset is 1 and the initial sliding index is 0, so as to obtain a second correction shift result.
- 12. The execution unit of claim 11, wherein the second correction module comprises a first selector and a splicer; one input of the first selector is the minimum index element in the cyclic shift result, and the other input is the integer or floating point value; and the inputs of the splicer are the elements of the cyclic shift result other than the minimum index element and the output of the first selector, the splicer being configured to splice its inputs to obtain the second correction shift result.
- 13. The execution unit of claim 9, wherein the execution unit further comprises a third correction module; the third correction module comprises second mask generation logic, first mask processing logic, second mask processing logic, and OR logic; the second mask generation logic is configured to calculate a second mask and a third mask according to the maximum effective element amount, an actual effective element amount, and the amount to be shifted when the sliding type is downward sliding and the offset is 1, wherein the second mask indicates the index positions of the actual effective elements other than the maximum index; the first mask processing logic is configured to perform a mask operation on the cyclic shift result by using the second mask to obtain second data; the second mask processing logic is configured to perform a mask operation on an integer or floating point value by using the third mask to obtain third data; and the OR logic is configured to perform an OR operation on the second data and the third data to obtain a third correction shift result.
- 14. The execution unit of claim 13, wherein the second mask generation logic comprises: a second calculator configured to subtract the actual effective element amount from the maximum effective element amount to obtain a right shift amount; first right shift logic configured to shift the amount to be shifted right by the right shift amount to obtain fourth data; second right shift logic configured to shift the fourth data right by 1 bit to obtain the second mask; and exclusive-OR logic configured to perform an exclusive-OR operation on the fourth data and the second mask to obtain the third mask.
- 15. The execution unit of claim 9, wherein the execution unit further comprises a second selector; the second selector is connected to the cyclic shifter and the first mask generation logic, and is configured to select a cyclic shift result from an output of the cyclic shifter and an output of the first mask generation logic.
- 16. The execution unit of claim 8, wherein the cyclic shifter comprises a controller, a plurality of input modules, a third selector, and a shifter, the input modules having different element bit widths; the controller is configured to generate a first signal according to the element bit width and the sliding type in the sliding configuration information and send the first signal to the input modules, to generate a second signal according to the element bit width and send the second signal to the third selector, and to send the offset to the input modules, wherein the first signal indicates the effective input module and the sliding type; each input module is configured to perform steering processing on the offset, to convert the offset into a shift amount according to the element bit width of the input module when the first signal indicates that the input module is effective and the sliding type is upward sliding, to convert the offset into a shift amount according to the element bit width of the input module when the first signal indicates that the input module is effective and the sliding type is downward sliding, and to output the shift amount; the third selector is connected to the plurality of input modules and is configured to determine and output, according to the second signal, the shift amount of the effective input module; and the shifter is connected to the third selector and is configured to perform cyclic sliding processing on the spliced data according to the shift amount to obtain the cyclic shift result.
- 17. The execution unit of claim 16, wherein the input module comprises a first conversion circuit, a fourth selector, and a second conversion circuit, wherein the element bit widths of the fourth selectors are different; the first conversion circuit comprises a first branch and a second branch, an input end of the first branch being connected to an input end of the second branch and being used for inputting the offset; two input ends of the fourth selector are respectively connected to an output end of the first branch and an output end of the second branch; an output end of the fourth selector is connected to the second conversion circuit; a control end of the fourth selector is connected to the controller, and the fourth selector is configured to output the offset of the second branch when the first signal indicates that the input module is effective and the sliding type is upward sliding; and the second conversion circuit is connected to the fourth selector and is configured to convert the offset output by the fourth selector into the shift amount according to the element bit width of the input module.
- 18. The execution unit of claim 17, wherein the first branch comprises an inverter and an adder; the inverter is configured to invert the offset; and the adder is connected to the inverter and is configured to add 1 to the inverted data.
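For illustration only, the mask arithmetic recited in claims 4, 6, and 7 can be sketched in Python as follows. The sketch assumes one mask bit per element with element 0 at the least significant bit, that 0 <= offset <= VLMAX and 1 <= vl <= VLMAX, and that the offset-1 case of claim 6 is the downward slide. The function names and the list-of-elements representation are hypothetical and do not appear in the application.

```python
def first_mask(vlmax, offset):
    """Claim 4: invert (all-ones << (vlmax - offset)), kept to vlmax bits."""
    ones = (1 << vlmax) - 1                   # amount to be shifted: all ones
    left_shift = vlmax - offset               # left shift amount
    first_data = (ones << left_shift) & ones  # truncate to vlmax bits
    return ~first_data & ones                 # low (vlmax - offset) bits set


def second_and_third_masks(vlmax, vl):
    """Claim 7: vl is the actual effective element amount."""
    ones = (1 << vlmax) - 1
    right_shift = vlmax - vl                  # right shift amount
    fourth = ones >> right_shift              # low vl bits set
    second = fourth >> 1                      # low (vl - 1) bits set
    third = fourth ^ second                   # only bit (vl - 1) set
    return second, third


def third_correction(shifted, scalar, vlmax, vl):
    """Claim 6: (second mask AND shift result) OR (third mask AND scalar)."""
    second, third = second_and_third_masks(vlmax, vl)
    out = []
    for i, element in enumerate(shifted):
        kept = element if (second >> i) & 1 else 0
        inserted = scalar if (third >> i) & 1 else 0
        out.append(kept | inserted)
    return out
```

With vlmax = 8 and offset = 2, `first_mask` keeps the six lowest element positions, which are the positions that do not receive wrapped-around data after a downward slide by two. Elements at or beyond vl come out as zero in `third_correction` only because this sketch ignores tail handling.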
Description
RISC-V vector sliding method and execution unit
Technical Field
The application relates to the field of RISC-V architecture and provides a RISC-V vector sliding method and an execution unit.
Background
Vector slide instructions in the RISC-V architecture slide elements up or down within a vector register group, but the prior art does not disclose a specific implementation of element sliding in a vector register based on vector slide instructions.
Disclosure of Invention
The application provides a RISC-V vector sliding method and an execution unit to address the lack, in the prior art, of a specific implementation of element sliding in a vector register based on a vector sliding instruction.
A first aspect of the application provides a RISC-V vector sliding method, applied to an execution unit, comprising: receiving sliding configuration information and a plurality of micro-operation requests sent by an instruction scheduling module, wherein the micro-operation requests are obtained by an instruction decoding module splitting a vector sliding instruction and indicate a first vector register and a second vector register; splicing the data in the first vector registers indicated by all the micro-operation requests to obtain spliced data; and performing sliding processing on the spliced data by using a cyclic shifter according to the sliding configuration information to obtain a cyclic shift result, splitting the cyclic shift result, and storing the split cyclic shift result in the second vector registers indicated by the micro-operation requests.
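The splice, rotate, and split steps of the first aspect can be pictured with a minimal Python sketch. It is not the application's circuit: it assumes each vector register is modelled as an integer of `vlen` bits, that element 0 sits in the least significant bits, and that an upward slide corresponds to a rotation toward higher bit positions. All function and parameter names are hypothetical.

```python
def splice(regs, vlen):
    """Concatenate the source registers, register 0 in the low bits."""
    spliced = 0
    for i, reg in enumerate(regs):
        spliced |= (reg & ((1 << vlen) - 1)) << (i * vlen)
    return spliced


def rotate_left(value, shift, width):
    """Cyclic left shift of a width-bit value; shift is reduced mod width."""
    shift %= width
    mask = (1 << width) - 1
    return ((value << shift) | (value >> (width - shift))) & mask


def split(value, count, vlen):
    """Cut the rotated value back into count registers of vlen bits."""
    mask = (1 << vlen) - 1
    return [(value >> (i * vlen)) & mask for i in range(count)]


def slide(regs, vlen, sew, offset, up):
    """Splice, rotate by offset elements of sew bits, then split."""
    width = len(regs) * vlen
    shift = offset * sew
    if not up:
        shift = -shift  # a right rotation is a left rotation by -shift mod width
    rotated = rotate_left(splice(regs, vlen), shift, width)
    return split(rotated, len(regs), vlen)
```

For two 32-bit registers holding the 8-bit elements 0 to 7, `slide(..., offset=2, up=True)` moves element 0 to index 2. The two elements that wrap around to indices 0 and 1 are left uncorrected here; handling such positions is the role of the correction steps described in the claims.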
A second aspect of the application provides an execution unit, comprising: a receiving module configured to receive sliding configuration information and a plurality of micro-operation requests sent by an instruction scheduling module, wherein the micro-operation requests are obtained by an instruction decoding module splitting a vector sliding instruction and indicate a first vector register and a second vector register; a splicing module configured to splice the data in the first vector registers indicated by all the micro-operation requests to obtain spliced data; and a cyclic shifter configured to perform sliding processing on the spliced data according to the sliding configuration information to obtain a cyclic shift result, split the cyclic shift result, and store the split cyclic shift result in the second vector registers indicated by the micro-operation requests.
The application receives the sliding configuration information and the plurality of micro-operation requests sent by the instruction scheduling module, splices the data in the first vector registers indicated by the micro-operation requests to obtain spliced data, performs sliding processing on the spliced data according to the sliding configuration information by using the cyclic shifter to obtain a cyclic shift result, and splits the cyclic shift result and stores it in the second vector registers indicated by the micro-operation requests, thereby realizing RISC-V vector sliding. Realizing the sliding with a cyclic shifter reduces circuit area and allows the data to be processed in parallel, which improves system performance, and only 1 execution cycle is needed for different numbers of micro-operation requests.
The foregoing and other objects, features, and advantages of the application will be apparent from the following more particular description of preferred embodiments, as illustrated in the accompanying drawings.
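One way to see how a single cyclic shifter can serve both slide directions is the invert-and-add-one branch recited in claims 17 and 18: negating the offset modulo the element count turns a downward slide into an equivalent rotation in the upward direction. The Python sketch below is an illustration under stated assumptions, not the application's circuit. It assumes the total element count is a power of two, that the shifter rotates toward higher bit positions, and that the first branch is the one selected for downward sliding (claim 17 states only that the second branch is selected for upward sliding). All names are hypothetical.

```python
def rotate_left(value, shift, width):
    """Cyclic left shift of a width-bit value."""
    shift %= width
    mask = (1 << width) - 1
    return ((value << shift) | (value >> (width - shift))) & mask


def to_shift_amount(offset, sew_bits, total_bits, slide_up):
    """Convert an element offset into a left-rotation amount in bits."""
    n_elements = total_bits // sew_bits       # assumed to be a power of two
    index_mask = n_elements - 1
    passthrough = offset & index_mask         # second branch: offset unchanged
    negated = (~offset + 1) & index_mask      # first branch: invert, then add 1
    selected = passthrough if slide_up else negated  # role of the fourth selector
    return selected * sew_bits                # scale by the element bit width
```

For a 64-bit spliced value with 8-bit elements, a downward slide by 2 becomes a left rotation by 48 bits, which gives the same result as a right rotation by 16 bits, so no second shifter is required in this model.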
Drawings
To illustrate the embodiments of the application or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the application, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 shows a first flowchart of a RISC-V vector sliding method of an embodiment of the present application;
FIG. 2 shows a flowchart of a first correction process of an embodiment of the present application;
FIG. 3 shows a flowchart of a first mask calculation process of an embodiment of the present application;
FIG. 4 shows a flowchart of a third correction process of an embodiment of the present application;
FIG. 5 shows a flowchart of a second mask and third mask calculation process of an embodiment of the present application;
FIG. 6 is a schematic diagram of sliding up by two offsets in an embodiment of the present application;
FIG. 7 is a schematic diagram of sliding down by two offsets in an embodiment of the present application;
FIG. 8 is a diagram showing a slide up by an offset when the start slide index is 0 according to an