CN-122018986-A - Floating point disorder addition operation method and processor


Abstract

The embodiments of the application disclose a floating-point unordered addition operation method and a processor. The method obtains a vector source operand and performs precision expansion on each first floating point element of the vector source operand to obtain a first vector operand composed of a plurality of second floating point elements. The first vector operand is then written into a logical vector register, where vector-level unordered sum reduction is performed to obtain a vector reduction result of the vector source operand. Finally, the vector reduction result is accumulated with the element at the lowest position of a scalar register to obtain the operation result of the vector source operand. This improves the numerical stability of the reduction process, addresses the performance bottleneck of vector reduction operations in the RISC-V vector extension instruction set, and provides a hardware implementation scheme for vector floating-point accelerators that require high performance, high precision and high compatibility.

Inventors

XUE YUAN

Assignees

广东跃昉科技有限公司

Dates

Publication Date: 20260512
Application Date: 20260414

Claims (10)

1. A floating-point unordered addition operation method, comprising: obtaining a vector source operand, wherein the vector source operand comprises a plurality of first floating point elements; performing precision expansion on each first floating point element to obtain a first vector operand, wherein the first vector operand comprises a plurality of second floating point elements, and the ratio of the floating-point bit width of the second floating point elements to the floating-point bit width of the first floating point elements is an integer greater than or equal to 2; writing the first vector operand into a logical vector register, the logical vector register comprising at least one physical vector register; performing vector-level unordered sum reduction on the plurality of second floating point elements in the logical vector register to obtain a vector reduction result of the vector source operand; and accumulating the vector reduction result with the element at the lowest position of a scalar register to obtain an operation result of the vector source operand.
2. The floating-point unordered addition operation method of claim 1, wherein said performing precision expansion on each first floating point element to obtain a first vector operand comprises: performing offset adjustment on the exponent field of each first floating point element to obtain an expanded exponent field, and zero-extending the mantissa of each first floating point element at its least significant end to obtain an expanded mantissa; and combining the sign bit of the first floating point element, the expanded exponent field and the expanded mantissa to form the second floating point element.
3. The floating-point unordered addition operation method of claim 1, further comprising, prior to said writing the first vector operand into a logical vector register: determining the vector length multiplier that controls the logical vector register according to the vector bit width of the first vector operand and the physical bit width of the physical vector register; and selecting at least one physical vector register to form the logical vector register according to the vector length multiplier.
4. The floating-point unordered addition operation method of claim 3, wherein said writing the first vector operand into a logical vector register comprises: if the vector length multiplier is smaller than a preset length multiplier threshold, writing the first vector operand into one physical vector register; and if the vector length multiplier is greater than or equal to the length multiplier threshold, writing the first vector operand into a plurality of physical vector registers.
5. The floating-point unordered addition operation method of claim 4, wherein said writing the first vector operand into a plurality of physical vector registers comprises: splitting the first vector operand into a plurality of groups of floating point numbers based on the vector length multiplier, wherein each group of floating point numbers comprises a plurality of second floating point elements; and writing the groups of floating point numbers into the physical vector registers, wherein each group of floating point numbers corresponds to one physical vector register.
6. The floating-point unordered addition operation method of claim 4, wherein the logical vector register comprises a first physical vector register and a second physical vector register, the physical bit width of the first physical vector register being equal to the physical bit width of the second physical vector register; and said performing vector-level unordered sum reduction on the plurality of second floating point elements in the logical vector register to obtain a vector reduction result of the vector source operand comprises: if the vector length multiplier is greater than or equal to the length multiplier threshold, performing vector-level unordered sum reduction separately on the second floating point elements in the first physical vector register and in the second physical vector register to obtain a plurality of third floating point elements; and writing each third floating point element into a first buffer register, and performing vector-level unordered sum reduction on the third floating point elements in the first buffer register to obtain the vector reduction result of the vector source operand.
7. The floating-point unordered addition operation method of claim 4, wherein the logical vector register comprises a first physical vector register, a second physical vector register and a third physical vector register, the physical bit width of the first physical vector register being twice the physical bit width of the second physical vector register, and the physical bit width of the second physical vector register being equal to the physical bit width of the third physical vector register; and said performing vector-level unordered sum reduction on the plurality of second floating point elements in the logical vector register to obtain a vector reduction result of the vector source operand comprises: if the vector length multiplier is greater than or equal to the length multiplier threshold, performing a floating-point unordered addition operation on the second floating point elements in the first physical vector register to obtain a plurality of fourth floating point elements; writing each fourth floating point element into a second buffer register, wherein the physical bit width of the second buffer register is equal to that of the second physical vector register; performing a floating-point unordered addition operation on the fourth floating point elements in the second buffer register and the second floating point elements in the second physical vector register to obtain a plurality of fifth floating point elements; writing each fifth floating point element into a third buffer register, wherein the physical bit width of the third buffer register is equal to that of the second buffer register; and performing vector-level unordered sum reduction on the fifth floating point elements in the third buffer register and the second floating point elements in the third physical vector register to obtain the vector reduction result of the vector source operand.
8. The floating-point unordered addition operation method of claim 4, wherein said performing vector-level unordered sum reduction on the plurality of second floating point elements in the logical vector register to obtain a vector reduction result of the vector source operand comprises: if the vector length multiplier is smaller than the length multiplier threshold, performing vector-level unordered sum reduction on the plurality of second floating point elements in the logical vector register using a multi-stage parallel floating-point adder to obtain the vector reduction result.
9. The floating-point unordered addition operation method of any one of claims 1-8, further comprising, prior to said performing vector-level unordered sum reduction on the plurality of second floating point elements in the logical vector register to obtain a vector reduction result of the vector source operand: preprocessing the first vector operand to obtain a preprocessed first vector operand.
10. A processor, wherein the processor performs the floating-point unordered addition operation method of any one of claims 1-9 using a RISC-V vector unordered floating-point sum reduction instruction.
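The register-selection and grouped-reduction flow of claims 1, 3, 4 and 6 can be illustrated with a minimal software sketch. It is one possible reading of the claims, not the claimed hardware: the function names, the 128-bit register width, the threshold value of 2 and the use of a plain ceiling for the vector length multiplier are all assumptions made for illustration, and Python floats merely stand in for the widened second floating point elements.

```python
import math


def tree_sum(values):
    """Pairwise sum; the association order is one arbitrary 'unordered' choice."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an odd element is carried to the next level
            paired.append(level[-1])
        level = paired
    return level[0]


def vector_length_multiplier(num_elements, element_bits, vlen_bits):
    """Registers needed for the widened vector (assumption: plain ceiling)."""
    return max(1, math.ceil(num_elements * element_bits / vlen_bits))


def unordered_widening_sum(source, scalar_reg, element_bits=32,
                           vlen_bits=128, multiplier_threshold=2):
    """Widen, reduce in unspecified order, then add element 0 of scalar_reg."""
    wide = [float(x) for x in source]   # stand-in for precision expansion
    lmul = vector_length_multiplier(len(wide), element_bits, vlen_bits)
    if lmul < multiplier_threshold:
        vector_result = tree_sum(wide)  # operand fits in one physical register
    else:
        per_reg = vlen_bits // element_bits
        groups = [wide[i:i + per_reg] for i in range(0, len(wide), per_reg)]
        partials = [tree_sum(g) for g in groups]   # one partial sum per register
        vector_result = tree_sum(partials)         # reduce the buffered partials
    return vector_result + scalar_reg[0]           # accumulate with element 0
```

With eight 32-bit elements and a hypothetical 128-bit register, the multiplier is 2, so the sketch takes the multi-register path: each register is reduced on its own and the partial sums are reduced again before the scalar element is added.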

Description

Floating-point unordered addition operation method and processor

Technical Field

The present application relates to the field of computer technologies, and in particular to a floating-point unordered addition operation method and a processor.

Background

In parallel computing, vector reduction is a basic operation in high-performance computing, artificial intelligence and scientific computing, and is widely used in scenarios such as matrix operations, statistical analysis and neural network training. A vector reduction merges a plurality of operands step by step along a tree structure to compute results such as a sum, a maximum or a minimum efficiently. In the RISC-V Vector Extension (RVV), the vfredusum.vs instruction is the unordered sum reduction instruction; it allows an implementation to use a reduction tree that does not guarantee the order of operations, which reduces the operation depth and improves parallelism. However, when unordered sum reduction is performed with the vfredusum.vs instruction, the required hardware structure is complex, the timing path is long, scalability is poor and calculation precision is low, so that high-throughput and low-latency performance requirements are difficult to meet.

Disclosure of Invention

In view of the above, the application provides a floating-point unordered addition operation method and a processor, which can not only address the performance bottleneck of vector reduction operations in the RISC-V vector extension instruction set but also improve the stability and accuracy of the calculation result.
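The trade-off that the Background attributes to tree-structured reduction can be illustrated with a short sketch: a pairwise tree needs only about log2(n) addition levels, but because floating-point addition is not associative its result may differ from a strictly sequential sum. The function names below are illustrative and do not come from the application or from the RVV specification.

```python
def sequential_sum(values):
    """Ordered reduction: one long dependency chain of len(values) additions."""
    total = 0.0
    for v in values:
        total += v
    return total


def tree_reduce_sum(values):
    """Pairwise reduction; returns (sum, number of tree levels used)."""
    level = list(values)
    depth = 0
    while len(level) > 1:
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an unpaired element moves up unchanged
            paired.append(level[-1])
        level = paired
        depth += 1
    return (level[0] if level else 0.0), depth


data = [1e16, 1.0, -1e16, 1.0]
print(sequential_sum(data))    # 1.0
print(tree_reduce_sum(data))   # (0.0, 2)
```

Both results are valid roundings of some association order, which is exactly the freedom an unordered reduction grants; widening the elements before reducing, as the application proposes, shrinks the rounding error available to either order.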
In a first aspect, the present application provides a floating-point unordered addition operation method, including: obtaining a vector source operand, wherein the vector source operand comprises a plurality of first floating point elements; performing precision expansion on each first floating point element to obtain a first vector operand, wherein the first vector operand comprises a plurality of second floating point elements, and the ratio of the floating-point bit width of the second floating point elements to the floating-point bit width of the first floating point elements is an integer greater than or equal to 2; writing the first vector operand into a logical vector register, the logical vector register comprising at least one physical vector register; performing vector-level unordered sum reduction on the plurality of second floating point elements in the logical vector register to obtain a vector reduction result of the vector source operand; and accumulating the vector reduction result with the element at the lowest position of a scalar register to obtain the operation result of the vector source operand.

Further, in the floating-point unordered addition operation method provided by the present application, performing precision expansion on each first floating point element to obtain a first vector operand includes: performing offset adjustment on the exponent field of each first floating point element to obtain an expanded exponent field, and zero-extending the mantissa of each first floating point element at its least significant end to obtain an expanded mantissa; and combining the sign bit of the first floating point element, the expanded exponent field and the expanded mantissa to form a second floating point element.
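The precision expansion step described above can be sketched at the bit level. The example assumes that the first floating point elements are IEEE 754 binary16 values and the second floating point elements are binary32 values, which is only one format pair with a width ratio of 2; the text does not fix the formats. The handling of zeros, subnormals, infinities and NaNs is also an assumption, since the text describes only the exponent offset adjustment and the mantissa zero extension that apply to normal numbers.

```python
import struct


def widen_fp16_to_fp32_bits(h):
    """Widen a binary16 bit pattern (int) to a binary32 bit pattern (int)."""
    sign = (h >> 15) & 0x1
    exp = (h >> 10) & 0x1F
    frac = h & 0x3FF
    if exp == 0x1F:                 # infinity or NaN: exponent stays all ones
        exp32, frac32 = 0xFF, frac << 13
    elif exp != 0:                  # normal: re-bias exponent from 15 to 127
        exp32, frac32 = exp + (127 - 15), frac << 13   # append 13 zero bits
    elif frac == 0:                 # signed zero
        exp32, frac32 = 0, 0
    else:                           # subnormal: normalise, then re-bias
        top = frac.bit_length()     # position of the leading 1 (1..10)
        exp32 = top + 102           # (top - 25) + 127
        frac32 = (frac - (1 << (top - 1))) << (24 - top)
    return (sign << 31) | (exp32 << 23) | frac32


def fp16_bits(x):
    """Bit pattern of x rounded to binary16."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def fp32_value(bits):
    """Value of a binary32 bit pattern as a Python float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

For example, `fp32_value(widen_fp16_to_fp32_bits(fp16_bits(-2.75)))` yields `-2.75`. The widening itself is exact, because every binary16 value is representable in binary32; the precision gain comes from the extra mantissa bits available to the subsequent additions.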
Further, in the floating-point unordered addition operation method provided by the present application, before writing the first vector operand into the logical vector register, the method further includes: determining the vector length multiplier that controls the logical vector register according to the vector bit width of the first vector operand and the physical bit width of the physical vector register; and selecting at least one physical vector register to form the logical vector register according to the vector length multiplier. Furthermore, in the floating-point unordered addition operation method provided by the present application, writing the first vector operand into the logical vector register includes: if the vector length multiplier is smaller than a preset length multiplier threshold, writing the first vector operand into one physical vector register; and if the vector length multiplier is greater than or equal to the length multiplier threshold, writing the first vector operand into a plurality of physical vector registers. Furthermore, in the floating-point unordered addition operation method provided by the present application, writing the first vector operand into the plurality of physical vector registers includes: splitting the first vector operand into a plurality of groups of floating point numbers based on the vector length multiplier, wherein each