US-20260128874-A1 - Secured Hardware Processing Device
Abstract
A hardware processing device is provided comprising (i) several MAC units arranged to be operable in a secure mode conducting at least one addition of a first value and a second value, wherein the first value is represented by a number of shares and the second value is represented by the same number of shares; and at least one multiplication of the first value and the second value based on their shares and a random number; (ii) a multiplexer to switch between the secure mode and a normal mode, wherein the several MAC units are arranged to operate in the normal mode on the first value and the second value instead of the shares of the first value and the shares of the second value.
Inventors
- Bernd Meyer
- Florian Mendel
Assignees
- INFINEON TECHNOLOGIES AG
Dates
- Publication Date: 20260507
- Application Date: 20251105
- Priority Date: 20241105
Claims (4)
- 1. A hardware processing device, comprising a plurality of multiply-accumulate units (MAC units), configured so as to conduct, in a secure mode, at least one addition of a first value and a second value, wherein the first value is represented by a number of shares and the second value is represented by the same number of shares; at least one multiplication of the first value and the second value based on their shares and a random number; a multiplexer to switch between the secure mode and a normal mode, wherein the plurality of MAC units are configured so as to, in the normal mode, operate on the first value and the second value instead of the shares of the first value and the shares of the second value.
- 2. The hardware processing device of claim 1, wherein the number of shares is two; wherein the first value is $x$ with a length $n$, represented by the shares $x_0$ and $x_1$ such that $x = x_0 + x_1 \bmod 2^n$; wherein the second value is $y$ with the length $n$, represented by the shares $y_0$ and $y_1$ such that $y = y_0 + y_1 \bmod 2^n$; wherein the addition is conducted according to $(x_0, x_1) + (y_0, y_1) = (x_0 + y_0 \bmod 2^n,\ x_1 + y_1 \bmod 2^n)$; wherein the multiplication is conducted according to $(x_0, x_1) \cdot (y_0, y_1) = (r + x_0 \cdot y_0 + x_0 \cdot y_1 \bmod 2^n,\ -r + x_1 \cdot y_1 + x_1 \cdot y_0 \bmod 2^n)$, with $r$ being the random number.
- 3. The hardware processing device of claim 1, further comprising a random generator configured to determine the random number.
- 4. The hardware processing device of claim 1, wherein the hardware processing device is a hardware accelerator for neural networks.
Description
TECHNICAL FIELD

The present disclosure is related to secure processing in hardware devices.

BACKGROUND

An Artificial Intelligence (AI) accelerator, deep learning processor or neural processing unit (NPU) is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and computer vision. An exemplary AI integrated circuit chip may contain tens of billions of MOSFETs. This sort of dedicated hardware is one particular example of a hardware processing device, also referred to herein as an accelerator. Such an accelerator is typically used to speed up the computation of a neural network during training or inference.

The accelerator may be subject to attacks, e.g., side channel analysis (SCA). For example, timing analysis (TA) and simple power analysis (SPA) may reveal at least a portion of the topology of the neural network. A differential power analysis (DPA) or differential fault analysis (DFA) may give away weights, bias constants and/or activation functions of the neural network. Moreover, SCA may also be used to extract or modify data processed by the accelerator during training or inference.

Existing approaches provide no or insufficient protection against attacks based on SCA, TA, SPA, DPA or DFA. Such attacks may also be referred to as side channel attacks. It is therefore an objective to secure or harden a hardware processing device, in particular said accelerator, against any such attack in a cost-efficient way.

SUMMARY

This objective may be achieved with the embodiments described herein. The examples suggested herein may be based on at least one of the following solutions. In particular, combinations of the following features could be utilized in order to reach a desired result.
A hardware processing device is suggested, comprising (i) several MAC units arranged to be operable in a secure mode conducting at least one addition of a first value and a second value, wherein the first value is represented by a number of shares and the second value is represented by the same number of shares, and at least one multiplication of the first value and the second value based on their shares and a random number; and (ii) a multiplexer to switch between the secure mode and a normal mode, wherein the several MAC units are arranged to operate in the normal mode on the first value and the second value instead of the shares of the first value and the shares of the second value.

It is noted that “random” or “randomized” used in the context of this application may in particular refer to true randomness, pseudo randomness or even to some deterministic approach that may introduce a sufficient level of entropy.

Toggling between the secure mode and the normal mode provides the flexibility to conduct in the secure mode only those operations that need to be obfuscated due to potential side channel attacks. This allows adjusting the efficiency of the hardware processing device according to a predefined need or demand.

According to an embodiment, the number of shares is two; the first value is $x$ with a length $n$, represented by the shares $x_0$ and $x_1$ such that $x = x_0 + x_1 \bmod 2^n$; the second value is $y$ with the length $n$, represented by the shares $y_0$ and $y_1$ such that $y = y_0 + y_1 \bmod 2^n$; the addition is conducted according to $(x_0, x_1) + (y_0, y_1) = (x_0 + y_0 \bmod 2^n,\ x_1 + y_1 \bmod 2^n)$; the multiplication is conducted according to $(x_0, x_1) \cdot (y_0, y_1) = (r + x_0 \cdot y_0 + x_0 \cdot y_1 \bmod 2^n,\ -r + x_1 \cdot y_1 + x_1 \cdot y_0 \bmod 2^n)$, with $r$ being the random number.

According to an embodiment, the hardware processing device further comprises a random generator determining the random number. The random generator mentioned herein may in particular provide a predefined level of entropy.
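The two-share arithmetic of this embodiment can be checked in software. The following Python sketch is illustrative only and does not describe the disclosed hardware: the word length `N_BITS`, the helper names (`share`, `unshare`, `masked_add`, `masked_mul`, `mac`) and the use of Python's `secrets` module as the random generator are assumptions made for this example. It relies on the observation that the two output shares of the multiplication sum to (x0 + x1)·(y0 + y1) = x·y mod 2^n, because r and -r cancel.

```python
import secrets

N_BITS = 16          # assumed word length n (hypothetical choice for this sketch)
MOD = 1 << N_BITS    # all arithmetic is carried out modulo 2^n


def share(x):
    """Split x into two shares (x0, x1) such that x = x0 + x1 mod 2^n."""
    x0 = secrets.randbelow(MOD)
    return x0, (x - x0) % MOD


def unshare(s):
    """Recombine a pair of shares into the value it represents."""
    return (s[0] + s[1]) % MOD


def masked_add(xs, ys):
    """Share-wise addition: (x0 + y0 mod 2^n, x1 + y1 mod 2^n)."""
    return (xs[0] + ys[0]) % MOD, (xs[1] + ys[1]) % MOD


def masked_mul(xs, ys, r):
    """Multiplication on shares, using the random number r as in the text."""
    x0, x1 = xs
    y0, y1 = ys
    z0 = (r + x0 * y0 + x0 * y1) % MOD
    z1 = (-r + x1 * y1 + x1 * y0) % MOD
    return z0, z1


def mac(acc, x, y, secure, r=0):
    """Multiply-accumulate acc + x*y; `secure` stands in for the multiplexer.

    Secure mode: acc, x and y are share pairs and a share pair is returned.
    Normal mode: acc, x and y are plain values and a plain value is returned.
    """
    if secure:
        return masked_add(acc, masked_mul(x, y, r))
    return (acc + x * y) % MOD
```

A possible usage: create shares with `share(...)`, draw `r = secrets.randbelow(MOD)`, call `mac(acc_shares, x_shares, y_shares, True, r)` and compare `unshare(...)` of the result with `mac(acc, x, y, False)`. Both modes should yield the same value modulo 2^n.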
According to an embodiment, the hardware processing device is a hardware accelerator for neural networks.

BRIEF DESCRIPTION OF THE FIGURES

Embodiments are shown and illustrated with reference to the drawings. The drawings serve to illustrate the basic principle, so that only aspects necessary for understanding the basic principle are illustrated. The drawings are not to scale. In the drawings the same reference characters denote like features.

FIG. 1 shows a block diagram visualizing how to implement a multiplication in a secure way.

FIG. 2 shows an exemplary implementation in an accelerator utilizing pipelining.

FIG. 3 shows a diagram of an alternative accelerator without pipelining.

DETAILED DESCRIPTION

Examples presented herein in particular allow for a randomized masking of data processed by an accelerator, which may be used for quantized neural network inference. An exemplary accelerator for inference is a DMA-capable (DMA: direct memory access) peripheral for autonomous evaluation of quantized neural networks. It may comprise a single-instruction-multiple-data (SIMD) concept. Several multiply-accumulate (MAC) units may work in parallel on integer data and fixed point or floating point data. Integer data may have a leng