CN-122019949-A - Binary partial product-based high-precision random calculation method and multiplier
Abstract
The invention belongs to the field of heterogeneous approximation calculation, and relates to a binary partial product-based high-precision random calculation method and a multiplier. The high-precision random calculation method based on the binary partial product directly uses the weight of the bit corresponding to the binary number to generate a random bit stream. The bit streams are combined in a uniform distribution mode, the bit phases are mutually combined, and the phase-phase results are all added to obtain the random calculation multiplier with the optimal precision. The method combines binary weights, replaces addition by a shifting and splicing mode, realizes random calculation by partial product addition, and adopts an approximate parallel counter to optimize. The multiplier provided by the invention fully combines the advantages of random calculation and binary calculation, has hardware cost less than half of that of the binary multiplier, and has lower delay. When the proposed multiplier is applied in neural network estimation, the accuracy of model loss is almost negligible.
Inventors
- ZHAO YUDI
- HE QIANG
- ZHAO KAI
Assignees
- 北京信息科技大学
Dates
- Publication Date
- 20260512
- Application Date
- 20251224
Claims (8)
- 1. The high-precision random calculation method based on the binary partial product is characterized by comprising the following steps of: acquiring two input multipliers A and B, and correspondingly generating parallel random bit streams based on A, B binary weights; fixing the random bit stream generated by the multiplier A, combining the parallel random bit stream generated by the multiplier B with the random bit stream of the multiplier A according to a uniform distribution rule, and obtaining the optimal random calculation combination according to the bit and the random bit stream; Performing phase-separating on the optimal random calculation combination, shifting and splicing the same AND gate result to form binary numbers; the binary numbers are mapped to partial products of binary multiplication, and the partial products are added to obtain a multiplication result.
- 2. The method of claim 1, wherein the A, B-based binary weight correspondence generates parallel random bit streams, comprising: When A is In which Is given by the weight of I is 1,2,..n, the correspondingly generated random bit stream contains Personal (S) I.e. sequentially comprises Personal (S) , Personal (S) ,..., Personal (S) ; When B is In which Is given by the weight of I is 1,2,..n, the correspondingly generated random bit stream contains Personal (S) I.e. sequentially comprises Personal (S) , Personal (S) ,..., Personal (S) 。
- 3. The method of claim 1, wherein the uniform distribution rule comprises: When the random bit stream of A is fixed as In the case of B is I.e. A Personal (S) Respectively allocated to Personal (S) 、 Personal (S) ... 1.1 And 1 , Personal (S) Assigned to Personal (S) 、 Personal (S) ... 1.1 And 1 ,... Last 1 Assigned to 。
- 4. The method of claim 1, wherein adding the partial products to obtain the multiplication result comprises approximately accumulating a plurality of 1-weighted bits in the partial products using an approximately parallel counter that alternates with an AND gate and an OR gate in place of a full adder.
- 5. The method of claim 4, wherein the approximately parallel counter generates a 0 phase with a 0 phase, generates a "-1" error, generates a1 phase with a 0 phase, generates a "+1" error, and performs error compensation by alternating the occurrence of and or gates.
- 6. A binary partial product based high precision random computation multiplier comprising: the random bit stream generating module is responsible for acquiring two input multipliers A and B and correspondingly generating parallel random bit streams based on the binary weight of A, B; The partial product represents a random calculation module, which is used for fixing a random bit stream generated by a multiplier A, combining a parallel random bit stream generated by a multiplier B with the random bit stream of the multiplier A according to a uniform distribution rule, obtaining an optimal random calculation combination according to a bit sum, performing phase-mixing on the optimal random calculation combination, shifting and splicing the same AND gate result to form a binary number, corresponding the formed binary number with the partial product of binary multiplication, and obtaining a multiplication result by adding the partial product.
- 7. The random computing multiplier of claim 6, further comprising an approximate parallel counter module responsible for approximately accumulating a plurality of 1 weighted bits in the partial product using an approximate parallel counter alternating with and or gates instead of a full adder.
- 8. The random computing multiplier of claim 6, wherein the partial product of an n-bit binary multiplier Is that Wherein i is 1-8, and the following formula is adopted to represent an n-bit random calculation multiplier: Wherein, the Representing the result of the random calculation multiplication.
Description
Binary partial product-based high-precision random calculation method and multiplier Technical Field The invention belongs to the field of heterogeneous approximation calculation, and particularly relates to a binary partial product-based high-precision random calculation method and a multiplier. Background The random computation (Stochastic Computing) is an approximate computation method that uses a random bit stream with a weight of 1 to represent the magnitude of the value. As shown in fig. 1 (a), the random bit stream is obtained by comparing a random number generated by a random number generator (Stochastic Number Generator, SNG) with an actual binary number, and when the generated random number is smaller than the actual number, a random bit stream "1" is output, and the random bit stream indicates a value in which the number of "1" is divided by the bit stream length. In view of the above features, random computation is also called probability computation, and its input and output are expressed as probabilities that a random number sequence is smaller than an actual value. Thus, the longer the random bit stream, the closer the result is to the true probability and the higher the precision. The N-bit operation generally corresponds to a bit stream length of 2 N, and can be adjusted according to the required precision requirement in practical application. The random number sequences generated by different SNGs also affect the accuracy of random calculation, and low-difference SNGs are mainly used at present, so that higher accuracy can be achieved when the random bit stream is shorter. The operation unit for random calculation is extremely simple, AND multiplication can be completed by only one AND gate. Fig. 1 shows an example of multiplication based on random computation, in which a 3-bit binary number is converted into a random bit stream with a length of 8 to participate in the operation, for example, when a is 4, the corresponding bit stream is {1,0,1,0,0,0,1,1}, the corresponding probability value P (a) is 4/8, and the random bit stream of B is then summed to obtain P (AB) =p (a) & P (B) =4/8×4/8= {0,0,1,0,0,0,1,0} =2/8. Although random computation has the advantages of small area and low power consumption, in practical application, binary numbers are usually required to be converted into random bit streams, and then converted into binary numbers after the operation is completed. Therefore, the hardware cost of the conversion units such as SNG, the comparator, the counter and the like is also considered in practical application. Furthermore, the random bit stream length of the random calculation means a calculation period, and thus it takes a long time to complete one calculation. The above features make random computation challenging in practical applications. Disclosure of Invention In view of the above problems, the present invention aims to construct a multiplier that combines the advantages of binary computation and random computation, can directly use binary partial product addition to realize random computation, so that the multiplier can be directly applied to a binary computing system, and proposes a two-layer approximate parallel counter to further optimize hardware resources. In order to achieve the above object, the technical scheme of the present invention includes the following: a high-precision random calculation method based on binary partial product comprises the following steps: Acquiring two input multipliers A and B, and correspondingly generating parallel random bit streams based on binary weights of A, B respectively; Fixing the random bit stream generated by the multiplier A, combining the parallel random bit stream generated by the multiplier B with the random bit stream of the multiplier A according to a uniform distribution rule, and obtaining the optimal random calculation combination according to the bit and the bit; Performing phase-separating on the optimal random calculation combination, shifting and splicing the same AND gate result to form binary numbers; the binary numbers are mapped to partial products of binary multiplication, and the partial products are added to obtain a multiplication result. Further, when adding binary partial products, it is found that there are a large number of bit additions with a weight of 1, and an approximate parallel counter (using and gate and or gate alternation instead of full adder) can be used to further reduce the resource consumption. Further, the binary weight correspondence based on A, B generates parallel random bit streams, and generates (represented as a link in hardware) for weight links carried by bits directly according to multipliers. For example when A isAt this timeIs given by the weight of,Is given by the weight of,Is given by the weight of,Is given by the weight ofThe corresponding generated random bit stream is. In general, when A isIn whichIs given by the weight ofI is 1,2,..n, the correspondingly generated ran