Correlated Power Noise Generator as a Low Costs DPA Countermeasures to Secure Hardware AES Cipher
Najeh Masmoudi Kamoun, Lilian Bossuet, Adel Ghazel

To cite this version:

HAL Id: hal-00679934
https://hal.archives-ouvertes.fr/hal-00679934
Submitted on 16 Mar 2012

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
Correlated Power Noise Generator as a Low Cost DPA Countermeasure to Secure Hardware AES Cipher

Najeh Kamoun¹, Lilian Bossuet², and Adel Ghazel¹
¹CIRTA'COM, SUP'COM
Tunis, Tunisia
najeh.kamoun@insat.rnu.tn, adel.ghazel@supcom.rnu.tn
²IMS, University of Bordeaux
Bordeaux, France
lilian.bossuet@ims-bordeaux.fr

Abstract — To secure cryptography hardware implementation many works are focusing on side-channels attacks. For such attacks, several different countermeasures can be done at different levels abstraction. But all published countermeasures lead to a significant area and power consumption overhead. In this paper, we present a new countermeasure against DPA attack which also leads to very small implementation compared to existing countermeasures such as the most used: masking schemes. The proposed approach is to use a correlated power noise generator to remove the design power correlation with the secret key. Its efficiency is proved with a practical DPA attack realization on Actel Fusion FLASH FPGA and Xilinx Virtex 4 SRAM FPGA. With the proposed countermeasure, the full 128-bits AES implementation on Xilinx Virtex 4 has a smaller area overhead (12.78 times less) than masking scheme countermeasure.

Keywords: DPA, AES, FPGA, countermeasures, hardware security.

I. INTRODUCTION

The Advanced Encryption Standard (AES) [1] will replace the old DES algorithm [2] in a broad variety of applications. Now, AES has become a world-wide standard with an even wider use than the DES. During the AES selection process, the security of AES was evaluated with respect to all types of attacks.

While being resistant to the classical cryptanalytic methods, it turned out that so-called implementation attacks are a serious threat against naive implementations of the AES algorithm. Implementation attacks refer to a new class of cryptanalysis methods, which are aimed against implementations of cryptographic primitives. Power analysis attacks are passive non invasive attacks. Kocher et al. have shown they are very effective and relatively cheap to conduct in practice [3].

In this paper, we will focus on Differential Power Analysis DPA [3]. In this attack, an adversary predicts target device power consumption with an hypothetical model. Those predictions are then compared to the real, measured power consumption in purpose to recover secret key. DPA has been shown efficient to defeat processor, ASIC and even FPGA implementations [4–5].

To overcome this attack, countermeasures can be implemented to secure AES implementations. They can be at the gate, system, or algorithmic levels. The first ones are based on the logic styles that aim to achieve independence between secret key and the power consumption. The second ones focus on into deteriorating side-channel signal quality. The third ones mask the manipulated data to remove this dependence. At the gate level, they are the most costly for area and power consumption. In embedded design case, all this countermeasures are too area and power consuming. To answer to this issue, we propose a new low-cost architectural countermeasure.

In this paper, we investigate the possibility to counteract these attacks by using a new correlated power noise generator. This new countermeasure name is “interfering countermeasure”. Its basic idea is to add an interfering power signal which depends on the manipulated data and an interfering key. This key value has to be different from the secret key value.

Paper is organized as follows. In the second section we present some interesting previous works. We explain, in section 3, our new approach to secure AES implementation based on power interfering. Section 4 gives the experimental results that permit us to conclude to the countermeasure efficiency. In section 5, we discuss the FPGA implementation performance comparison between our solution and the masking scheme. Based on these results we draw some conclusions in section 6.

II. PREVIOUS WORKS

Power analysis attacks work because the cryptographic devices power consumption depends on the executed cryptographic algorithms intermediate values. Hence, countermeasure goal against such attacks is to make the power consumption independent of those intermediate values. Several ones are suggested in the literature [4]. At the security point of view, the more efficient countermeasures acts at the algorithmic level or at the logic level.

Algorithmic countermeasures can be restricted to the masking schemes. Its basic idea is to randomize the intermediate results that are produced during the computation of a cryptographic algorithm [6]. Such solutions need some modification in the design conception and a True Random Number Generator (TRNG) to generate the random masked [7]. Masking scheme is a stronger countermeasure against first order DPA. Nevertheless this countermeasure area overhead could be significant as we have shown in [8]. Moreover, such countermeasure is sensible to the glitch issue. It is inefficient if
the attacker obtain some post-layout information as demonstrated in [9].

Logic countermeasures are based on special leakage-resistant logic styles [10-11]. The power consumption of a circuit depends on the output transitions of the gates. If these transitions depend on secret information, the security of the implementation can be compromised. Hence, to make a circuit resistant against power analysis attacks, the power consumption should be independent of the secret information. The circuit-level approaches to achieve this can be divided into two categories: custom logic styles and standard logic styles. Custom logic styles are only applicable to custom ASIC implementations. Standard logic styles combine standard cells from existing libraries into new standard cells. In most of all cases these countermeasures are glitch sensitive. They need particular design of balanced architecture. Moreover such countermeasures are very area and power consuming [10-11].

In [12], Standaert et al. introduce architectural countermeasure. Their idea is to generate an additive power noise signal from the part of the AES blocs by using unrolling architecture. As the AES rounds input are uncorrelated, each round is a power noise generator for the other rounds point of view. Add a power noise to the instantaneous power consumption feel an interesting way to protect the cipher against the DPA. Nevertheless such countermeasure is inefficient. Actually, the averaging in DPA filters out uncorrelated noise from the differential power trace. With such countermeasure it is only arduous to perform the DPA but always possible. The main advantage of architecture countermeasure is the very low area cost and low power consumption without frequency decrease.

In order to take benefit of the architectural countermeasure with a high level of security, in the next section, we will introduce a new architectural countermeasure which uses a correlated power noise generator.

III. PRINCIPLE OF INTERFERENCE COUNTERMEASURE

In hardware implementation of AES, during the SubBytes step the power consumption traces are correlated with data computed in the design. These computing data are correlated with the AES secret key K used in the design. It seems like that K is printed in the power consumption traces. To be very quick and very efficient, DPA attacks take advantage of this security failure named Side Channel.

To eliminate the side channel, our idea is to interfere the AES cipher power signal with a power signal correlated with cipher input data Din and with an interfering random key Kinterf. This architecture level countermeasure is illustrated by figure 1. The input data Din is inserted simultaneously to two encryption cores: a classical AES core and an interference core used like a Correlated Power Noise Generator. The AES core performs the first AddRoundKey step with cipher data input Din and with the secret key K. At the same time, in the interference core, the cipher input data Din is provided to a similar module AddRoundKey but with the interfering key Kinterf. The outputs of the two AddRoundKey modules are applied to two similar SubBytes modules synchronously. The SubBytes modules outputs signals are two signals S and Sinterf.

Like the signal S switching, the signal Sinterf switching is correlated to Din. As the global power consumption is due to S and Sinterf switching both it is not only correlated to the secret K and the data Din. Actually, the global cipher power consumption is correlated to data Din and the couple of secret keys (K, Kinterf). Thereby, it is not possible to extract the secret key K by the classical DPA attacks as we will experimentally prove in the next section.

Nevertheless, to be a DPA efficient countermeasure, the only condition needed is to have different value for secret key K and interference secret key Kinterf as describe in the following rule:

\[ K_{interf} \neq K, K_{interf} \in [0..255] \]

To formally prove the benefit of the proposed countermeasure we use the Guilley et al. work [13]. In this work the authors give the first equation (1) that provides the formal justification to the DPA attacks with correlation analysis:

\[ E(P_{th} \times \left( (-1)^{j(i+1)} \right)) = \frac{\sum_{j \in J} \xi_j^\top \xi_j + \xi_j^\bot \xi_j^\bot}{2} \quad (1) \]

Where \( P_{th} \) is the chip power consumption, \((-1)^{j(i+1)}\) is the balanced hamming weight, \( J \) is the set of nets in the netlist, \( \xi_j^\top \) and \( \xi_j^\bot \) the fall and rise transition power of a single net.

This equation main assumption is that the manipulated data in the design are independent. In our case, we assume that the manipulated data are dependent. In conclusion, with our
interfering countermeasure the equation (1) is not valid. So it is impossible to recover the secret key \( K \) with classical first order DPA without cipher text use.

In the following section, we will experimentally prove the validity of this conclusion.

IV. EXPERIMENTAL VALIDATION AND RESULT

A. DPA experimentation

1) Methodology of validation

For DPA attack, the correlation analysis is chosen to evaluate the secret key. It consists in using Pearson coefficient between the measured power consumption \( V_M \) and the power model \( M_P \). In our work, we use the hamming weight for the 8 bit output of the SubBytes function.

\[
C(V_M, M_P(c_i)) = \frac{E(V_M, M_P(c_i)) - E(V_M)E(M_P(c_i))}{\sqrt{\text{var}(V_M) \cdot \text{var}(M_P(c_i))}}
\]

(2)

To validate our countermeasure, we used two experiments. The first demonstrates that with independent inputs the DPA attack is still achievable. The second shows unsuccessful DPA practical implementation in the secure design with the proposed countermeasure.

Before these two experiments, we have first attacked a sample design which only contain the two first steps of an AES cipher (AddRoundKey and SubBytes). The input data \( D_{in} \) is first added with a secret sub-key (8 bits) \( K \) with exclusive-or before to input the SubBytes module. To design the SubBytes module, we use the smallest S-Box up to date introduced by Canright [14] and we implement it [15].

With such DPA experimentation, figure 4 shows a successful correlation attack result with an unsecure cipher which uses the secret key \( K=43 \) with 1000 measurements. That confirms the DPA lab efficiency.
B. DPA Attack on two adjacent S-Box with independent input data

To validate our countermeasure, we first apply simultaneously two independent input data $D_{in1}$ and $D_{in2}$ to two AddRoundKey modules. These modules use two 8-bits distinct secret keys $K_1$ and $K_2$. Figure 5 shows the attacked design. As show by this figure, one part (driven by $D_{in2}$) of this dual architecture could be considered as an uncorrelated power noise generator for the other part (driven by $D_{in1}$).

We employ the same experimental DPA setup as explained in the previous subsection. The keys $K_1$ and $K_2$ are extracted with correlation power analysis as we can see on figure 6 ($K_1$ successful DPA attack) and figure 7 ($K_2$ successful DPA attack). The results are given with Actel Fusion FLASH FPGA,
however there are similarly testing with Xilinx Virtex4 SRAM FPGA.

As a conclusion we obtain the same results that Standaert’s work, the DPA filters out the uncorrelated power noise. An additive uncorrelated power noise is not an efficient DPA countermeasure DPA attack on secure AES S-Box with interference countermeasure.

1) DPA attack without considering the structure of the interference countermeasure

For the last experiment we use the proposed countermeasure. In this case, to generate a correlated power noise, the same input data \( D_m \) is used for the two modules (AddRoundKey and SubBytes) as figure 8 shows. So, as show by this figure, one part (which uses the interference secret key \( K_{interf} \)) of this dual architecture could be considered as a correlated power noise generator for the other part (which uses the interference secret key \( K \)).

Like for the validation test, we use 12 000 power traces measurements. The DPA doesn’t permit us to detect the secret key \( K \) as is shown in figure 9.

We increase the number of measurements up to 100 000. We do not detect the secret key \( K \). Figure 10 illustrates the result of unsuccessful attack with 20 480 measurements.

We have tested with all possible values (255 values corresponding to \( 2^8 \) values space without the secret key value) of interference 8-bits secret key \( K_{interf} \). The results are the same with all the possible keys.

2) DPA attack with considering the structure of the interference countermeasure

The structure of our interference countermeasure is taking in consideration for DPA attack. Indeed, it possible to take into account of the countermeasure by changing the DPA power consumption model \( PM \):

\[
PM = \mathbf{H}(S) + \mathbf{H}(S_{interf})
\]

Where \( \mathbf{H} \) is the Hamming weight function, \( S \) and \( S_{interf} \) are signals mentioned in figure 1. They are respectively the output of SubBytes with \( D_m \) as input and the useful key \( K \) and the interference key \( K_{interf} \).

We suppose that we doesn’t know the interference key \( K_{interf} \), we realize the DPA attack on our design with considering the power consumption model \( PM \) (equation (2)). The result illustrated by figure 11 shows an unsuccessful attack for our design with 20480 traces.

C. Conclusion

We are able to realize a successful DPA attack against an unsecured design with FPGA flash-based and SRAM-based implementation. The same experimental setup is used to try to attack a secure design with the proposed countermeasure. The results are shown that DPA doesn’t success to find the secret key \( K \). Such experimentation shows the proposed countermeasure robustness.

V. COMPARISON BETWEEN THE PERFORMANCE OF MASKING SCHEME AND THE PROPOSED COUNTERMEASURE

The proposed countermeasure uses the two operations SubBytes and AddRoundKey with 8-bit data. To compare its performance to the masking scheme, we implement the two operations with three ways: unsecured small S-Box proposed by [14], secure small S-Box with masking scheme proposed by [6] and secure small S-Box with our proposed interference countermeasure. Table I gives the implementation results with a Xilinx Virtex4 SRAM FPGA. Note that a Virtex4 slice is composed by two inputs Look Up Table, two D-Flip-Flop and one carry chain). All results are obtained by using Xilinx 8.2 ISE CAD tool version with default synthesis and place-and-route options.

This table clearly shows that the proposed countermeasure with interference method is very low area cost. The area overhead is only 16 Virtex4 slices for AddRoundKey and SubBytes implementation. It costs an area around 4 times smaller than the masking solution (without the TRNG area cost).

Unlike the masking solution, with our solution the critical path is not lengthened. Consequently, the maximal frequency for unsecure design and secure design with our proposed countermeasure is the same, whereas the maximal frequency is reduced with masking countermeasure.

We implemented the whole AES on the same FPGA. We use the 16 times S-Box for all the design. Table II summarizes the performance of the different implementations. We note that the implementation of the secure AES with masking do not take into consideration the implementation of the random generator to produce the mask (the same masks are used in all rounds). The area overhead of our secured implementation compared to the unsecured version is less than 5%. The masking solution gives an area overhead of about 60%.
In this paper, a new architectural countermeasure for DPA attacks against AES is introduced. This new low cost solution uses an efficient correlated power noise generator. As the power noise is correlated with the input data it is not possible to filter out it during DPA process. So this solution is DPA resistant as shown by our experimental result. Although, we have not test it with high order DPA, we assume that our solution is the lower cost first order DPA countermeasure. Indeed, some FPGA implementation results, with Xilinx Virtex4 device, have shown that comparing to existing masking scheme and secure logic countermeasure, our solution consume less area and power. Moreover, with an appropriate design, the proposed solution costs any frequency decrease. With the targeted FPGA, the secure AES cipher consumes only 5% more slice than the unsecure cipher. In the near future, we have to improve the proposed solution to secure the cipher against DPA on the last AES round. Such attack use encrypted texts as a prediction power consumption model input. Actually, the proposed countermeasure architecture (figure 1) could only prevent DPA against AES first round. Nevertheless, if the AES block encrypt output are never output of the chip, that is the case of many application, our solution is sufficient.

VI. CONCLUSION

In this paper, a new architectural countermeasure for DPA attacks against AES is introduced. This new low cost solution uses an efficient correlated power noise generator. As the power noise is correlated with the input data it is not possible to filter out it during DPA process. So this solution is DPA resistant as shown by our experimental result. Although, we have not test it with high order DPA, we assume that our solution is the lower cost first order DPA countermeasure. Indeed, some FPGA implementation results, with Xilinx Virtex4 device, have shown that comparing to existing masking scheme and secure logic countermeasure, our solution consume less area and power. Moreover, with an appropriate design, the proposed solution costs any frequency decrease. With the targeted FPGA, the secure AES cipher consumes only 5% more slice than the unsecure cipher.

In the near future, we have to improve the proposed solution to secure the cipher against DPA on the last AES round. Such attack use encrypted texts as a prediction power consumption model input. Actually, the proposed countermeasure architecture (figure 1) could only prevent DPA against AES first round. Nevertheless, if the AES block encrypt output are never output of the chip, that is the case of many application, our solution is sufficient.

VI. CONCLUSION

In this paper, a new architectural countermeasure for DPA attacks against AES is introduced. This new low cost solution uses an efficient correlated power noise generator. As the power noise is correlated with the input data it is not possible to filter out it during DPA process. So this solution is DPA resistant as shown by our experimental result. Although, we have not test it with high order DPA, we assume that our solution is the lower cost first order DPA countermeasure. Indeed, some FPGA implementation results, with Xilinx Virtex4 device, have shown that comparing to existing masking scheme and secure logic countermeasure, our solution consume less area and power. Moreover, with an appropriate design, the proposed solution costs any frequency decrease. With the targeted FPGA, the secure AES cipher consumes only 5% more slice than the unsecure cipher.

In the near future, we have to improve the proposed solution to secure the cipher against DPA on the last AES round. Such attack use encrypted texts as a prediction power consumption model input. Actually, the proposed countermeasure architecture (figure 1) could only prevent DPA against AES first round. Nevertheless, if the AES block encrypt output are never output of the chip, that is the case of many application, our solution is sufficient.

VI. CONCLUSION

In this paper, a new architectural countermeasure for DPA attacks against AES is introduced. This new low cost solution uses an efficient correlated power noise generator. As the power noise is correlated with the input data it is not possible to filter out it during DPA process. So this solution is DPA resistant as shown by our experimental result. Although, we have not test it with high order DPA, we assume that our solution is the lower cost first order DPA countermeasure. Indeed, some FPGA implementation results, with Xilinx Virtex4 device, have shown that comparing to existing masking scheme and secure logic countermeasure, our solution consume less area and power. Moreover, with an appropriate design, the proposed solution costs any frequency decrease. With the targeted FPGA, the secure AES cipher consumes only 5% more slice than the unsecure cipher.

In the near future, we have to improve the proposed solution to secure the cipher against DPA on the last AES round. Such attack use encrypted texts as a prediction power consumption model input. Actually, the proposed countermeasure architecture (figure 1) could only prevent DPA against AES first round. Nevertheless, if the AES block encrypt output are never output of the chip, that is the case of many application, our solution is sufficient.

VI. CONCLUSION

In this paper, a new architectural countermeasure for DPA attacks against AES is introduced. This new low cost solution uses an efficient correlated power noise generator. As the power noise is correlated with the input data it is not possible to filter out it during DPA process. So this solution is DPA resistant as shown by our experimental result. Although, we have not test it with high order DPA, we assume that our solution is the lower cost first order DPA countermeasure. Indeed, some FPGA implementation results, with Xilinx Virtex4 device, have shown that comparing to existing masking scheme and secure logic countermeasure, our solution consume less area and power. Moreover, with an appropriate design, the proposed solution costs any frequency decrease. With the targeted FPGA, the secure AES cipher consumes only 5% more slice than the unsecure cipher.

In the near future, we have to improve the proposed solution to secure the cipher against DPA on the last AES round. Such attack use encrypted texts as a prediction power consumption model input. Actually, the proposed countermeasure architecture (figure 1) could only prevent DPA against AES first round. Nevertheless, if the AES block encrypt output are never output of the chip, that is the case of many application, our solution is sufficient.

VI. CONCLUSION

In this paper, a new architectural countermeasure for DPA attacks against AES is introduced. This new low cost solution uses an efficient correlated power noise generator. As the power noise is correlated with the input data it is not possible to filter out it during DPA process. So this solution is DPA resistant as shown by our experimental result. Although, we have not test it with high order DPA, we assume that our solution is the lower cost first order DPA countermeasure. Indeed, some FPGA implementation results, with Xilinx Virtex4 device, have shown that comparing to existing masking scheme and secure logic countermeasure, our solution consume less area and power. Moreover, with an appropriate design, the proposed solution costs any frequency decrease. With the targeted FPGA, the secure AES cipher consumes only 5% more slice than the unsecure cipher.

In the near future, we have to improve the proposed solution to secure the cipher against DPA on the last AES round. Such attack use encrypted texts as a prediction power consumption model input. Actually, the proposed countermeasure architecture (figure 1) could only prevent DPA against AES first round. Nevertheless, if the AES block encrypt output are never output of the chip, that is the case of many application, our solution is sufficient.

VI. CONCLUSION

In this paper, a new architectural countermeasure for DPA attacks against AES is introduced. This new low cost solution uses an efficient correlated power noise generator. As the power noise is correlated with the input data it is not possible to filter out it during DPA process. So this solution is DPA resistant as shown by our experimental result. Although, we have not test it with high order DPA, we assume that our solution is the lower cost first order DPA countermeasure. Indeed, some FPGA implementation results, with Xilinx Virtex4 device, have shown that comparing to existing masking scheme and secure logic countermeasure, our solution consume less area and power. Moreover, with an appropriate design, the proposed solution costs any frequency decrease. With the targeted FPGA, the secure AES cipher consumes only 5% more slice than the unsecure cipher.

In the near future, we have to improve the proposed solution to secure the cipher against DPA on the last AES round. Such attack use encrypted texts as a prediction power consumption model input. Actually, the proposed countermeasure architecture (figure 1) could only prevent DPA against AES first round. Nevertheless, if the AES block encrypt output are never output of the chip, that is the case of many application, our solution is sufficient.

VI. CONCLUSION

In this paper, a new architectural countermeasure for DPA attacks against AES is introduced. This new low cost solution uses an efficient correlated power noise generator. As the power noise is correlated with the input data it is not possible to filter out it during DPA process. So this solution is DPA resistant as shown by our experimental result. Although, we have not test it with high order DPA, we assume that our solution is the lower cost first order DPA countermeasure. Indeed, some FPGA implementation results, with Xilinx Virtex4 device, have shown that comparing to existing masking scheme and secure logic countermeasure, our solution consume less area and power. Moreover, with an appropriate design, the proposed solution costs any frequency decrease. With the targeted FPGA, the secure AES cipher consumes only 5% more slice than the unsecure cipher.

In the near future, we have to improve the proposed solution to secure the cipher against DPA on the last AES round. Such attack use encrypted texts as a prediction power consumption model input. Actually, the proposed countermeasure architecture (figure 1) could only prevent DPA against AES first round. Nevertheless, if the AES block encrypt output are never output of the chip, that is the case of many application, our solution is sufficient.

VI. CONCLUSION

In this paper, a new architectural countermeasure for DPA attacks against AES is introduced. This new low cost solution uses an efficient correlated power noise generator. As the power noise is correlated with the input data it is not possible to filter out it during DPA process. So this solution is DPA resistant as shown by our experimental result. Although, we have not test it with high order DPA, we assume that our solution is the lower cost first order DPA countermeasure. Indeed, some FPGA implementation results, with Xilinx Virtex4 device, have shown that comparing to existing masking scheme and secure logic countermeasure, our solution consume less area and power. Moreover, with an appropriate design, the proposed solution costs any frequency decrease. With the targeted FPGA, the secure AES cipher consumes only 5% more slice than the unsecure cipher.

In the near future, we have to improve the proposed solution to secure the cipher against DPA on the last AES round. Such attack use encrypted texts as a prediction power consumption model input. Actually, the proposed countermeasure architecture (figure 1) could only prevent DPA against AES first round. Nevertheless, if the AES block encrypt output are never output of the chip, that is the case of many application, our solution is sufficient.