Output-Sensitive Information Flow Analysis

Constant-time programming is a countermeasure against cache-based attacks: programs must not perform memory accesses that depend on secrets. In some cases this policy can be safely relaxed if one can prove that the program does not leak more information than the public outputs of the computation. We propose a novel approach for verifying constant-time programming based on a new information flow property, called output-sensitive noninterference. Noninterference states that a public observer cannot learn anything about the private data. Since real systems need to intentionally declassify some information, this property is too strong in practice. In order to take public outputs into account we proceed as follows: instead of using complex explicit declassification policies, we partition variables into three sets: input, output and leakage variables. We then propose a typing system to statically check that leakage variables do not leak more information about the secret inputs than the public normal output does. The novelty of our approach is that we track the dependence of leakage variables not only with respect to the initial values of input variables (as in classical approaches to noninterference), but also with respect to the final values of output variables. We adapted this approach to the LLVM intermediate representation and developed a prototype to verify LLVM implementations.


Introduction
An important task of cryptographic research is to verify cryptographic implementations for security flaws, in particular to avoid so-called timing attacks. Such attacks consist in measuring the execution time of an implementation on its execution platform. For instance, Brumley and Boneh [12] showed that it was possible to mount remote timing attacks against OpenSSL's implementation of the RSA decryption operation and to recover the key. Albrecht and Paterson [3] showed that the two levels of protection offered against the Lucky 13 attack from [2] in the first release of the new implementation of TLS were imperfect. A related class of attacks are cache-based attacks, in which a malicious party observes cache accesses in order to obtain the memory addresses accessed by the target program, which may depend on secret data. Such attacks make it possible to recover complete AES keys [17].
A possible countermeasure is to follow a very strict programming discipline called constant-time programming. Its principle is to avoid branchings controlled by secret data as well as memory load/store operations indexed by secret data. Recent secure C libraries such as NaCl [10] or mbedTLS follow this programming discipline. Until recently, there was no rigorous proof that constant-time algorithms are protected against cache-based attacks. Moreover, many cryptographic implementations such as PolarSSL AES, DES, and RC4 make array accesses that depend on secret keys and are not constant time. Recent works [6,4,11] fill this gap and develop the first formal analyses that make it possible to verify whether programs conform to the constant-time paradigm.
An interesting extension was brought by Almeida et al. [4], who enriched the constant-time paradigm by "distinguishing not only between public and private input values, but also between private and publicly observable output values". This distinction raises interesting technical and theoretical challenges. Indeed, constant-time implementations in cryptographic libraries like OpenSSL include optimizations for which paths and addresses can depend not only on public input values, but also on publicly observable output values. Considering only input values as non-secret information would thus incorrectly characterize those implementations as non-constant-time. The work in [4] also develops a verification technique based on symbolic execution. However, the soundness of their approach depends in practice on the soundness of the underlying symbolic execution engine, which is very difficult to guarantee for real-world programs with loops. Moreover, their product construction can be very expensive in the worst case.
In this paper we deal with statically checking programs for output-sensitive constant-time correctness: programs may still perform branchings or memory accesses controlled by secret data, provided that the information that is leaked is subsumed by the normal output of the program.
To give more intuition about the property that we want to deal with, let us consider the example shown in Figure 2, where ct_eq is a constant-time function that compares its arguments. Branching on good is a benign optimization, since the value of good is, in any case, the normal output of the program. Hence, even if the function is not constant-time, it should be considered output-sensitive constant time with respect to its specification. Such optimization opportunities arise whenever the interface of the target application specifies what the publicly observable outputs are, and this information is sufficient to classify the extra leakage as benign [4].
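As a concrete illustration, this pattern can be sketched in Python (a sketch only; ct_eq, check and the byte-string arguments are hypothetical names, not the paper's code):

```python
def ct_eq(a: bytes, b: bytes) -> int:
    """Constant-time comparison of two equal-length byte strings: every
    byte is examined regardless of where the first mismatch occurs, so
    the memory access pattern does not depend on the secret."""
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return 1 if diff == 0 else 0  # 1 iff a == b

def check(candidate: bytes, secret: bytes) -> int:
    good = ct_eq(candidate, secret)
    # Branching on `good` leaks only the value of `good` itself, which is
    # the declared public output: benign under output-sensitive constant
    # time, although rejected by plain constant time.
    if good:
        print("accepted")
    return good
```

The point is that the only non-constant-time step branches on a value the caller learns anyway.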
The objective of this work is to propose a static method to check whether a program is output-sensitive side-channel secure. We emphasize that our goal is not to verify whether the legal output itself leaks "too much", but rather to ensure that the unintended (side-channel) output does not leak more than this legal output.
First, we propose a novel approach for verifying constant-time-like security based on a new information flow property, called output-sensitive noninterference. Information-flow security prevents confidential information from being leaked to public channels. Noninterference states that a public observer cannot learn anything about the private data. Since real systems need to intentionally declassify some information, this property is too strong. A possible alternative is relaxed noninterference, which allows explicit downgrading policies to be specified. For instance, in [27], the authors proposed the "delimited release" security property, which captures the leakage by a "declassify" primitive that makes it possible to state explicitly what may be leaked with respect to the initial state.
In order to take public outputs into account while staying independent of how programs intentionally declassify information, we develop an alternative solution: instead of using complex explicit policies for functions, we partition variables into three sets: input, output and leakage variables. Hence we distinguish between the legal public output and the information that can leak through side channels, the latter being expressed by adding fresh additional leakage variables. Then we propose a typing system that can statically check that leakage variables do not leak more secret information than the public normal output. Therefore, in our case, the legal leakage is implicitly defined by the normal output of the program and not by any dedicated language primitive. The novelty of our approach is that we track the dependence of leakage variables with respect to both the initial values of input variables (as is classically the case for noninterference) and the final values of output variables, which makes the analysis more challenging.
As an application of this new non-interference property, we show how to verify that a program written in a high-level language is output-sensitive constant time secure by using this typing system.
Since timing and cache-based attacks target the executions of programs, it is important to carry out this verification in a language close to the machine-executed assembly code. Hence, we adapt our approach to a generic unstructured assembly language inspired by LLVM, and we show how we can verify programs coded in LLVM. Finally, we developed a prototype tool implementing our type system and we show how it can be used to verify LLVM implementations.
To summarize, this work makes the following contributions:
• in Section 2 we reformulate output-sensitive constant-time as a new noninterference property and we provide a sound type system that guarantees that well-typed programs are output-sensitive noninterferent;
• in Section 3 we show that this general approach can be used to verify that programs written in a high-level language are output-sensitive constant time;
• in Section 4 we adapt our approach to the LLVM-IR language and we develop a prototype tool that can be used to verify LLVM implementations.

Output-sensitive non-interference
In this section we introduce output-sensitive noninterference as a new information flow property that aims to extend classical noninterference definitions in two directions:
• this property should be able to characterize general side-channel attacks;
• at the same time, it should be accurate enough to capture the precise side-channel information leaked beyond the regular output of the program.
In order to take public outputs into account while staying independent of how programs intentionally declassify information, we partition variables into three sets: input, output and leakage variables. Hence we distinguish between the legal public output and the information that can leak through side channels, the latter being expressed by adding fresh additional leakage variables. These leakage variables are updated each time some information is leaked to the environment. For instance, for some leakage variable x_l, if an adversary can get some knowledge about the branch that will be taken at a test condition if e then c_1 else c_2, we update x_l with a dependency on e. Similarly, for some assignment x := f(e_1, ..., e_n), if the evaluation of f is time-dependent on its arguments, then we update x_l with dependencies on e_1, ..., e_n. In this paper we focus on constant-time security, but our approach could be used as well for other side-channel attacks.
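The bookkeeping performed on a leakage variable can be sketched as follows (a toy Python model; run, secret_key and table are hypothetical names, and tuple extension stands in for the abstract concatenation @):

```python
# A minimal model of a leakage variable x_l: it accumulates, via an
# abstract concatenation @ (here: tuple extension), everything an
# observer of the branch unit and the cache could see.
def run(secret_key, table, xl=()):
    # branch controlled by a condition: the condition's value is leaked
    cond = secret_key & 1
    xl = xl + (("b", cond),)            # x_l := x_l @ b(cond)
    if cond:
        # memory read indexed by an expression: the index is leaked
        idx = secret_key % len(table)
        xl = xl + (("r", idx),)         # x_l := x_l @ r(idx)
        out = table[idx]
    else:
        out = 0
    return out, xl
```

The security question is then whether the final value of xl reveals more about secret_key than the returned out does.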
In order to precisely characterize the side-channel information leaked beyond the normal output of the program, we also consider output variables.
The novelty of our approach is that we track the dependency of leakage variables with respect to both the initial values of input variables (as is classically the case for noninterference) and the final values of output variables, which makes the analysis more challenging.
The rest of the section is structured as follows: first we introduce the While language and we formulate the definition of output-sensitive noninterference. Then we propose a typing system that can statically check that leakage variables do not leak more secret information than the public normal output. We conclude the section by proving the soundness of this typing system.
2.1. The While language and output-sensitive noninterference. In order to reason about the security of the code, we first develop our framework in While, a simple high-level structured programming language. In Section 3 we shall enrich this simple language with arrays, and in Section 4 we adapt our approach to a generic unstructured assembly language. The syntax of While programs is listed below:

c ::= skip | x := e | c_1 ; c_2 | If e then c_1 else c_2 fi | While e Do c oD

Meta-variables x, e and c range over the sets of program variables Var, expressions and programs, respectively. We leave the syntax of expressions unspecified, but we assume they are deterministic and side-effect free. The semantics is shown in Figure 1; for instance, loops unfold according to the rule (While e Do c oD, σ) −→ (If e then c; While e Do c oD else skip fi, σ). The reflexive and transitive closure of −→ is denoted by =⇒. A state σ maps variables to values, and we write σ(e) to denote the value of expression e in state σ. A configuration (c, σ) is a program c to be executed along with the current state σ.

Intuitively, if we want to model the security of some program c with respect to side-channel attacks, we can assume that there are three special subsets of variables: X_I the public input variables, X_O the public output variables and X_L the variables that leak information to some malicious adversary. Then, output-sensitive noninterference asks that every two complete executions starting with X_I-equivalent states and ending with X_O-equivalent final states must be indistinguishable with respect to the leakage variables X_L. In the next definition, =_X relates two states that coincide on all variables belonging to X.

Definition 2.1 (adapted from [4]). Let X_I, X_O, X_L ⊆ Var be three sets of variables, intended to represent the input, the output and the leakage of a program. A program c is (X_I, X_O, X_L)-output-sensitive noninterferent when all its executions starting with X_I-equivalent stores and leading to X_O-equivalent final stores give X_L-equivalent final stores.
Formally, for all σ, σ′, ρ, ρ′: if (c, σ) =⇒ σ′ and (c, ρ) =⇒ ρ′ and σ =_{X_I} ρ and σ′ =_{X_O} ρ′, then σ′ =_{X_L} ρ′.
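On finite toy programs, Definition 2.1 can be checked by brute force over all pairs of runs; the following helper is a hypothetical sketch, not the paper's type system:

```python
from itertools import product

def output_sensitive_ni(prog, variables, values, XI, XO, XL):
    """Exhaustively check Definition 2.1: any two runs that agree on the
    inputs X_I and whose final states agree on the outputs X_O must also
    agree on the leakage variables X_L.  `prog` maps a state dict to the
    final state dict of the (deterministic, terminating) program."""
    states = [dict(zip(variables, vs))
              for vs in product(values, repeat=len(variables))]
    for s, r in product(states, repeat=2):
        if all(s[x] == r[x] for x in XI):
            fs, fr = prog(dict(s)), prog(dict(r))
            if all(fs[x] == fr[x] for x in XO) and \
               any(fs[x] != fr[x] for x in XL):
                return False
    return True
```

For example, a program whose leakage equals its public output is accepted, while one leaking a secret alongside a constant output is rejected.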
Example 2.2. To illustrate the usefulness of the output-sensitive noninterference property, let us consider again the example given in the previous section, shown on Figure 2. In order to reduce its output-sensitive constant-time security to the output-sensitive noninterference property (as in Definition 2.1), we first add a fresh variable x_l to accumulate the leakage information (using the abstract concatenation operator @). Then, variable x_l is updated with the boolean condition of each branching instruction (i.e., at lines 3, 7, 9, 12 and 16) and with each expression used as an array index (i.e., at lines 5 and 14). The result of this transformation is given on Figure 3 (we denote by ct_eq_ni the function obtained by applying recursively the same transformation to ct_eq). In Section 3 we will show that this transformation is general and, moreover, that whenever the transformed program satisfies Definition 2.1, the initial program is output-sensitive constant time, i.e., it can still perform branchings or memory accesses controlled by secret data if the information that is leaked is subsumed by its normal output.

2.2. Preliminary definitions and notations. As usual, we consider a flow lattice of security types L (also called security levels). An element x of L is an atom if x ≠ ⊥ and there exists no element y ∈ L such that ⊥ ⊏ y ⊏ x. A lattice is called atomistic if every element x ∈ L is the join of the set of atoms below it [25]. This set is denoted by At(x).
Assumption 2.3. Let (L, ⊑, ⊔, ⊥, ⊤) be an atomistic bounded lattice. As usual, we write t_1 ⊑ t_2 iff t_2 = t_1 ⊔ t_2. We assume that there exists a distinguished subset T_O ⊆ L of atoms. Hence, from the above assumption, for any τ_o, τ_o′ ∈ T_O and for any t_1, t_2 ∈ L: τ_o ⊑ t_1 ⊔ t_2 implies τ_o ⊑ t_1 or τ_o ⊑ t_2, and τ_o ⊑ τ_o′ implies τ_o = τ_o′. The assumption that our set L of security types is an atomistic lattice provides a general structure which is sufficient for our purposes: it ensures the existence of decompositions into atoms for any element in the lattice, and the ability to syntactically replace an atomic type by another (not necessarily atomic) type.
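A simple instance of such an atomistic lattice is the powerset of a finite set of atoms, with join as union; this is also the concrete lattice used later in Section 2.7. A minimal sketch:

```python
# Types as finite sets of atoms: join is union, the order ⊑ is set
# inclusion, and ⊥ is the empty set.  An element is an atom iff it is a
# singleton, and every element is the join (union) of the atoms below
# it, so the lattice is atomistic by construction.
def join(*ts):
    return frozenset().union(*ts)

def leq(t1, t2):          # t1 ⊑ t2
    return t1 <= t2

def atoms(t):             # At(t): the atoms below t
    return {frozenset([a]) for a in t}
```

With this encoding, the property of Assumption 2.3 holds trivially: a singleton is included in a union iff it is included in one of the operands.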
A type environment Γ : Var → L describes the security levels of variables and their dependency with respect to the current values of the variables in X_O. In order to capture dependencies with respect to current values of output variables, we associate to each output variable o ∈ X_O a fixed and unique symbolic type α(o) ∈ T_O. For example, if some variable x ∈ Var has the type Γ(x) = Low ⊔ α(o), it means that the value of x depends only on public input and on the current value of the output variable o ∈ X_O.
Hence, we assume that there is a fixed injective mapping α : X_O → T_O. We extend mappings Γ and α to sets of variables in the usual way: given A ⊆ Var and B ⊆ X_O, we write Γ(A) = ⊔_{x∈A} Γ(x) and α(B) = ⊔_{o∈B} α(o). Our type system aims to satisfy the following output-sensitive noninterference condition: if the final values of the output variables in X_O remain the same, only changes to initial inputs with types ⊑ t should be visible to leakage outputs with type ⊑ t ⊔ α(X_O). More precisely, given a derivation α Γ{c}Γ′, the final value of a variable x with final type Γ′(x) ⊑ t ⊔ α(A) for some t ∈ L and A ⊆ X_O should depend at most on the initial values of those variables y with initial types Γ(y) ⊑ t and on the final values of the variables in A. We call "real dependencies" the dependencies with respect to initial values of variables and "symbolic dependencies" the dependencies with respect to the current values of output variables. Following [19], we formalize the noninterference condition satisfied by the typing system using reflexive and symmetric relations.
We write =_{A_0} for the relation which relates mappings that are equal on all values in A_0, i.e., for two mappings f_1, f_2 : A → B and A_0 ⊆ A, f_1 =_{A_0} f_2 iff ∀a ∈ A_0, f_1(a) = f_2(a). For any mappings f_1 : A_1 → B and f_2 : A_2 → B, we write f_1[f_2] for the mapping which coincides with f_2 on A_2 and with f_1 on A_1 \ A_2. Given Γ : Var → L, X ⊆ Var and t ∈ L, we write =_{Γ,X,t} for the reflexive and symmetric relation which relates states that are equal on all variables having type ⊑ t in environment Γ, provided that they are equal on all variables in X: σ =_{Γ,X,t} ρ iff σ =_X ρ implies σ(x) = ρ(x) for all x with Γ(x) ⊑ t. When X = ∅, we omit it, hence we write =_{Γ,t} instead of =_{Γ,∅,t}.
Definition 2.4 [20]. Let R and S be reflexive and symmetric relations on states. We say that program c maps R into S, written c : R =⇒ S, when for all σ, ρ such that (c, σ) =⇒ σ′ and (c, ρ) =⇒ ρ′, (σ, ρ) ∈ R implies (σ′, ρ′) ∈ S. The type system we propose enjoys the following useful property, an immediate consequence of Theorem 2.17: in order to prove that a program c is output-sensitive noninterferent according to Definition 2.1, it is enough to check that Γ′(x_l) ⊑ Γ(X_I) ⊔ α(X_O) for all x_l ∈ X_L, where α Γ{c}Γ′. Two executions of the program c starting from initial states that coincide on the input variables X_I, and ending in final states that coincide on the output variables X_O, will then also coincide on the leaking variables X_L.
We now formally introduce our typing system. Due to assignments, values and types of variables change dynamically. For example, let us assume that at some point during the execution, the value of x depends on the initial value of some variable y and the current value of some output variable o (which itself depends on the initial value of some variable h), formally captured by an environment Γ where Γ(o) = Γ_0(h) and Γ(x) = Γ_0(y) ⊔ α(o), where Γ_0 represents the initial environment. If the next instruction to be executed is some assignment to o, then the current value of o will change, so we have to mirror this in the new type of x: even if the value of x does not change, its new type will be Γ′(x) = Γ_0(y) ⊔ Γ_0(h) (assuming that α(o) is not below Γ_0(y)). Hence Γ′(x) is obtained by replacing in Γ(x) the symbolic dependency α(o) with the real dependency Γ(o). The following definition formalizes this operation, which replaces an atom t_0 by another type t′ in a type t (seen as the join of the atoms of its decomposition).
Definition 2.5. If t_0 ∈ T_O is an atom and t, t′ ∈ L are arbitrary types, then we denote by t[t′/t_0] the type obtained by replacing (if any) the occurrence of t_0 by t′ in the decomposition of t into atoms. Now we extend this definition to environments: let x ∈ X_O. Then Γ_1 = Γ α x represents the environment where the symbolic dependency on the last value of x of every variable is replaced by the real type of x: Γ_1(y) = Γ(y)[Γ(x)/α(x)] for all y ∈ Var.

2.3. Useful basic lemmas. The following lemma is an immediate consequence of Assumption 2.3 and Definition 2.5.
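With set-of-atoms types, both t[t′/t_0] and the environment operation Γ α x become simple set operations; the following sketch (hypothetical encoding: each α(o) is just a distinguished atom name) illustrates Definition 2.5:

```python
def subst(t, t0, tp):
    """t[t'/t0]: replace atom t0 by the type t' in the decomposition of t."""
    return (t - {t0}) | tp if t0 in t else t

def collapse(gamma, x, alpha):
    """The environment Gamma after collapsing output variable x: every
    symbolic dependency alpha(x) on the current value of x is replaced
    by the real dependencies Gamma(x)."""
    ax = alpha[x]
    return {y: subst(t, ax, gamma[x]) for y, t in gamma.items()}
```

Here alpha maps each output variable to its distinguished atom, and gamma maps variables to frozensets of atoms.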
We now want to extend the above definition from a single output variable x to subsets X ⊆ X_O. Our typing system will ensure that each generated environment Γ does not contain circular symbolic dependencies between output variables, i.e., the graph G(Γ) whose vertices are the output variables, with an edge from x_1 to x_2 whenever α(x_1) ⊑ Γ(x_2), is acyclic. For acyclic graphs G(Γ), we define a preorder over X_O, denoted ⪯_Γ, as the transitive closure of the edge relation. The following lemma can be proved by induction on the size of X using Lemma 2.6.
For all variables x ∈ X and all variables y ∈ Var, α(x) ⋢ Γ_2(y).
The next lemma gives a precise characterization of the new preorder induced by the application of the α operator.
Then Γ_1 and Γ_2 are well formed. Moreover:

Proof. The key remark is that any edge of G(Γ_1), where Γ_1 = Γ α x, corresponds either to an edge or to a path of length two in G(Γ). Indeed, let x_1, x_2 ∈ X_O be such that there exists an edge from x_1 to x_2 in G(Γ_1). Then either α(x_1) ⊑ Γ(x_2), or α(x) ⊑ Γ(x_2) and α(x_1) ⊑ Γ(x). Hence either there is an edge from x_1 to x_2 in G(Γ), or there must exist edges from x_1 to x and from x to x_2 (and hence a path of length two from x_1 to x_2) in G(Γ). Now the assertion of the lemma is an immediate consequence of the above remark and Lemma 2.6.

Let def(c) be the set of assigned variables in a program c. We define the ordering ⊑ over environments pointwise, as usual: Γ_1 ⊑ Γ_2 iff Γ_1(x) ⊑ Γ_2(x) for all x ∈ Var. We also define a restricted ordering ⊑_r over environments. Intuitively, when enriching an environment using ⊑_r, we have the right to add only "real dependencies" (and not "symbolic" dependencies with respect to variables in X_O). We adapt this definition for elements t_1, t_2 ∈ L as well: we denote t_1 ⊑_r t_2 when t_1 ⊑ t_2 and, for all o ∈ X_O, α(o) ⊑ t_2 implies α(o) ⊑ t_1. The next lemma is immediate from the definitions.
The last assertion of the lemma is a consequence of the remark that Γ_1 α X ⊑_r Γ_3 implies that Γ_3 does not contain more "symbolic dependencies" than Γ_1 α X, and Γ_1 α X does not contain any "symbolic dependencies" with respect to variables in X. Obviously, since ⊑_r ⊆ ⊑, all inequalities also hold when the premise Γ_1 ⊑ Γ_2 is replaced by Γ_1 ⊑_r Γ_2.
2.4. Typing rules. For a command c, judgements have the form p α Γ{c}Γ′, where p ∈ L and Γ, Γ′ are well-formed type environments. The inference rules are shown in Figure 4. The idea is that if Γ describes the security levels of variables before the execution of c, then Γ′ describes the security levels of those variables after the execution of c. The type p represents the usual program counter level and serves to eliminate indirect information flows; the derivation rules ensure that all variables that can be changed by c will end up (in Γ′) with types greater than or equal to p. As usual, whenever p = ⊥ we drop it and write α Γ{c}Γ′ instead of ⊥ α Γ{c}Γ′. Throughout this paper the type of an expression e is defined simply by taking the lub of the types of its free variables, Γ[α](fv(e)); for example, the type of x + y is Γ[α](x) ⊔ Γ[α](y). This is consistent with the typing rules used in many systems, though more sophisticated typing rules for expressions would be possible in principle.
Let us explain some of the typing rules:
• The rule As1 is the standard rule for assignment: the new value of x depends on the variables occurring in the right-hand side e. Since x is not an output variable, we do not need to update the types of the other variables. Moreover, notice that taking the type of an expression to be Γ[α](fv(e)) instead of Γ(fv(e)) allows us to capture the dependencies with respect to the current values of output variables.
• The rule As2 captures the fact that when assigning an output variable x we need to update the types of all the other variables depending on the last previous value of x: Γ_1 = Γ α x expresses that symbolic dependencies with respect to the last previous value of x should be replaced by real dependencies with respect to the initial types of variables.
• The rule As3 is similar to As2, for the case where the assigned variable x also occurs on the right-hand side.
• The rule Sub is the standard subtyping rule. Notice that we use the relation ⊑_r instead of ⊑ in order to prevent introducing circular dependencies.
• The rule If deals with conditional statements. In an If statement, the program counter level changes for the typing of each branch in order to take into account the indirect information flow. Moreover, at the end of the If command, we take the join of the two environments obtained after the two branches; but in order to prevent cycles, we first replace the "symbolic" dependencies by the corresponding "real" dependencies for each output variable that is assigned by the other branch.
In order to give some intuition about the rules, we present a simple example in Figure 5.
Since the types of the variables x, u and o_3 do not change, we omit them in the following. We highlight the changes with respect to the previous environment. After the first assignment, the type of o_1 becomes X, meaning that the current value of o_1 depends on the initial value of x. After the assignment y := o_1 + z, the type of y becomes O_1 ⊔ Z, meaning that the current value of y depends on the initial value of z and the current value of o_1. After the assignment o_1 := u, the type of y becomes X ⊔ Z: as o_1 changed, we have to mirror this in the dependencies of y.
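The environment evolution of this example can be replayed mechanically; the sketch below (hypothetical encoding of rules As1/As2 over set-of-atoms types, with "A_o1" standing for α(o_1)) reproduces the three steps:

```python
def assign_output(gamma, x, rhs_type, alpha):
    """Typing an assignment to output variable x (rule As2): first replace
    alpha(x) by the previous Gamma(x) everywhere, then retype x."""
    ax, old = alpha[x], gamma[x]
    g = {y: (t - {ax}) | old if ax in t else t for y, t in gamma.items()}
    g[x] = rhs_type
    return g

alpha = {"o1": "A_o1"}
g = {"x": {"X"}, "z": {"Z"}, "u": {"U"}, "o1": set(), "y": set()}
g = assign_output(g, "o1", {"X"}, alpha)   # o1 := x       (rule As2)
g["y"] = {"A_o1"} | g["z"]                 # y := o1 + z   (rule As1)
g = assign_output(g, "o1", g["u"], alpha)  # o1 := u       (rule As2)
```

After the last step, the type of y is indeed X ⊔ Z: the symbolic atom A_o1 has been replaced by the previous real type of o1.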
2.5. Well-formed environments. In this section we prove that, if the initial environment is well-formed, then all the environments generated by the typing system are well-formed too. To this end, the following lemma states some useful properties, where for any p ∈ L we denote by atomO(p) the set of output variables o ∈ X_O such that α(o) ∈ At(p).

Proof. The proof is by induction on the derivation of p α Γ{c}Γ′, for all assertions at the same time. We do a case analysis according to the last rule applied (the case Skip is trivial). (Ass1) c is an assignment x := e for some x ∉ X_O. Since x ∉ X_O, it follows that G(Γ′) = G(Γ); obviously, for any y ∉ def(c), Γ(y) = Γ′(y), and WF(Γ) implies WF(Γ′).
By Lemma 2.8, WF(Γ_1), and using Lemma 2.6 we get that G(Γ_1) does not contain any edge with origin o, and hence WF(Γ′). The second part follows from a similar remark, using Lemma 2.8.
(Seq) Using the induction hypothesis for p α Γ{c_1}Γ_1 we get that WF(Γ_1), and using the induction hypothesis for p α Γ_1{c_2}Γ_2 we get that WF(Γ_2), together with the corresponding inclusions for any o ∈ atomO(p). Finally, for any x ∈ def(c), we have (Γ α (U_1 ∪ U_2))(x) ⊑_r Γ_2(x). We used Lemma 2.11 whenever necessary; in addition, in (1) we used that (Γ α U_1)(x) does not depend on variables in U_1 and that, by the induction hypothesis, (Γ α U_1)(v) ⊑_r Γ_1(v) for all variables v ∈ U_1; in (2) we used that, by the induction hypothesis, (Γ α U_1)(x) ⊑_r Γ_1(x); in (3) we used that x ∉ U_2, so that the induction hypothesis applies.
(If) Using the induction hypothesis for p ⊔ p′ α Γ{c_1}Γ_1 we get that WF(Γ_1), and using the induction hypothesis for p ⊔ p′ α Γ{c_2}Γ_2 we get that WF(Γ_2), and moreover: for any o ∈ X_O \ (U_2 ∪ atomO(p ⊔ p′)), GT_{Γ_2}(o) ⊆ GT_Γ(o); for any o ∈ atomO(p ⊔ p′), GT_{Γ_2}(o) ⊆ GT_Γ(o) ∪ U_2; and for any o ∈ U_2, GT_{Γ_2}(o) ⊆ U_2. In addition, for any x ∈ U_2, (Γ α U_2)(x) ⊑_r Γ_2(x). For any o ∈ U_1, GT_{Γ_2 α U_1}(o) = ∅. Now if we assume by contradiction that ¬WF(Γ′), we get that there must exist x_1, x_2 ∈ X_O such that x_1 ⪯_{Γ_1 ⊔ Γ_2} x_2 and x_2 ⪯_{Γ_1 ⊔ Γ_2} x_1. We proceed by case analysis. If GT_{Γ_2}(x_1) = ∅, then for any x ∈ GT_{Γ_1}(x_1) it holds that x ∈ U_1 and hence GT_{Γ_2}(x) = ∅; this implies that x_2 ⪯_Γ x_1, a contradiction with WF(Γ). The remaining cases are symmetric to the previous ones.
(While) Similar to the rule (If). (Sub) Trivial from the premises of the rule, using the induction hypothesis, the transitivity of ⊑_r, and the fact that Γ_1 ⊑_r Γ_2 implies G(Γ_1) = G(Γ_2).
2.6. Soundness of the typing system. As already stated above, our type system aims to capture the following noninterference condition: given a derivation p α Γ{c}Γ′, the final value of a variable x with final type ⊑ t ⊔ α(X_O) should depend at most on the initial values of those variables y with initial types Γ(y) ⊑ t and on the final values of the variables in X_O. In other words, executing a program c on two initial states σ and ρ such that σ(y) = ρ(y) for all y with Γ(y) ⊑ t, and ending with two final states σ′ and ρ′ such that σ′(o) = ρ′(o) for all o ∈ X_O, will satisfy σ′(x) = ρ′(x) for all x with Γ′(x) ⊑ t ⊔ α(X_O). In order to prove the soundness of the typing system, we need a stronger invariant, denoted I(t, Γ): intuitively, (σ, ρ) ∈ I(t, Γ) means that for each variable x and each A ⊆ X_O, if σ =_A ρ and Γ(x) ⊑ t ⊔ α(A), then σ(x) = ρ(x). Formally, given t ∈ L and Γ : Var → L, we define I(t, Γ) = {(σ, ρ) | ∀A ⊆ X_O: σ =_A ρ implies ∀x ∈ Var, Γ(x) ⊑ t ⊔ α(A) implies σ(x) = ρ(x)}. The following lemmas provide some useful properties satisfied by the invariant I(t, Γ). Lemma 2.12. If Γ_1 ⊑ Γ_2 then for all t ∈ L, I(t, Γ_1) ⊆ I(t, Γ_2).
Proof. We prove only the first inclusion, the second one can be easily proved by induction using the first one.
We do a case analysis according to the last rule applied (the case Skip is trivial). (Ass1) c is an assignment x := e for some x ∉ X_O. If y ≢ x, then Γ(y) = Γ′(y) ⊑ α(A) ⊔ t, and since σ =_{Γ,A,α(A)⊔t} ρ, we get σ′(y) = σ(y) = ρ(y) = ρ′(y). Let us now assume that c is y := e for some e and y ∈ X_O.
2.7. Soundness w.r.t. output-sensitive noninterference. In this section we show how we can use the typing system in order to prove that a program c is output-sensitive noninterferent. For each subset A ⊆ Var we consider a type τ_A; we denote ⊥ = τ_∅ and ⊤ = τ_Var, and we consider the lattice (L, ⊑, ⊔, ⊥, ⊤) with τ_A ⊔ τ_{A′} = τ_{A∪A′} and τ_A ⊑ τ_{A′} iff A ⊆ A′. Obviously, L is a bounded atomistic lattice, its set of atoms being the types τ_{{x}} for x ∈ Var. The following theorem is a consequence of Definition 2.1 and Theorem 2.16.

Proof. Let t = Γ(X_I). First, we prove that if σ =_{X_I} ρ, then (σ, ρ) ∈ I(t, Γ). Let A ⊆ X_O be such that σ =_A ρ, and let y ∈ Var be such that Γ(y) ⊑ α(A) ⊔ t = α(A) ⊔ Γ(X_I). This implies that y ∈ X_I, and since σ =_{X_I} ρ, we get σ(y) = ρ(y). Now let σ, σ′, ρ, ρ′ be such that (c, σ) =⇒ σ′ and (c, ρ) =⇒ ρ′ and σ =_{X_I} ρ and σ′ =_{X_O} ρ′. Let x_l ∈ X_L. We have to prove that σ′ =_{x_l} ρ′. Let us apply Theorem 2.16 with t = Γ(X_I). Since α Γ{c}Γ′ and (σ, ρ) ∈ I(t, Γ), we get that (σ′, ρ′) ∈ I(t, Γ′). It means that σ′ =_{Γ′, X_O, α(X_O) ⊔ Γ(X_I)} ρ′. Since by hypothesis we have σ′ =_{X_O} ρ′ and Γ′(x_l) ⊑ α(X_O) ⊔ Γ(X_I), we get σ′ =_{x_l} ρ′.
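The final check extracted from this theorem is a single lattice comparison per leakage variable; a sketch over set-of-atoms types (hypothetical encoding, with alpha mapping each output variable to its distinguished atom):

```python
def secure(gamma_final, gamma_init, XI, XO, XL, alpha):
    """Check the conclusion of the soundness theorem on concrete
    environments: every leakage variable's final type must be below
    alpha(X_O) joined with the initial types of the inputs X_I."""
    bound = {alpha[o] for o in XO}          # alpha(X_O)
    for x in XI:
        bound |= gamma_init[x]              # ... joined with Gamma(X_I)
    return all(gamma_final[xl] <= bound for xl in XL)
```

A leakage variable whose final type contains an atom of a secret input (here "H") fails the comparison.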

Output-sensitive constant-time
In this section we illustrate how our approach can be applied to a more realistic setting, considering a specific side-channel leakage due to the cache usage. However, this approach could be applied to any other side-channel setting as soon as one can model the leakage produced by each command.
Following [1,4], we consider two types of cache-based information leaks: (i) disclosures that happen when secret data determine which parts of the program are executed; (ii) disclosures that happen when secret data determine which memory locations are accessed. To simplify notations, we assume that array indexes e_1 are basic expressions (not referring to arrays) and that X_O does not contain arrays. Moreover, as in [4], a state or store σ maps array variables v and indices i ∈ N to values σ(v, i). The labeled semantics of While programs is listed in Figure 6. The labels on the execution steps correspond to the information which is leaked to the environment (r(·) for a read access to memory, w(·) for a write access and b(·) for a branch operation). In the rule for (If), the valuations of branch conditions are leaked. Also, all indexes of program variables read and written at each statement are leaked too. Remark that in cache-based attacks, only the offsets are leaked and not the base variable addresses/values. When there is no label on a step, the step is considered to be invisible.
We give in Figure 7 the new typing rules. As above, we denote by f⃗ = (f_i)_i the set of all indexes occurring in e. We add a fresh variable x_l, not used in programs, in order to capture the unintended leakage. Its type only ever grows, and it mirrors the information leaked so far. We can reduce the (X_I, X_O)-constant-time security of a command c to the (X_I, X_O, {x_l})-security (see Section 2.7) of a corresponding command ω(c), obtained by adding the fresh variable x_l to the program variables fv(c), and then adding recursively, before each assignment and each boolean condition predicate, a new assignment to the leakage variable x_l that mirrors the leaked information. Let @, b(·), r(·) and w(·) be some new abstract operators. The construction of the instrumentation ω(·) is shown in Figure 8.
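The instrumentation ω can be sketched over a toy AST; the encoding below (hypothetical tuple-based commands, with a single leak operator standing in for the b(·)/r(·)/w(·) updates of Figure 8) follows the same recursive scheme:

```python
def omega(c):
    """Instrument a toy While AST with leakage assignments to x_l.
    Commands: ("skip",), ("assign", x, e), ("seq", c1, c2),
    ("if", e, c1, c2), ("while", e, c).  Expressions are left opaque."""
    tag = c[0]
    if tag == "skip":
        return c
    if tag == "assign":
        # leak the indexes read/written, then perform the assignment
        return ("seq", ("leak", ("rw", c[2])), c)
    if tag == "seq":
        return ("seq", omega(c[1]), omega(c[2]))
    if tag == "if":
        # leak the branch condition, then instrument both branches
        return ("seq", ("leak", ("b", c[1])),
                       ("if", c[1], omega(c[2]), omega(c[3])))
    if tag == "while":
        # leak the condition before the loop and after each iteration
        return ("seq", ("leak", ("b", c[1])),
                       ("while", c[1], ("seq", omega(c[2]),
                                               ("leak", ("b", c[1])))))
    raise ValueError(tag)
```

The transformed program leaks exactly what the labeled semantics of Figure 6 makes observable, now as ordinary assignments the type system can track.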
First we extend the While language with array variables; then we extend the typing system from Section 2 with a rule corresponding to the new rule (Ast). The following lemma then gives the relationship between the type of a program c using the new typing system and the type of the instrumented program ω(c) using the extended typing system from the previous section.

Application to low-level code
We show in this section how the type system we proposed to express output-sensitive constant-time noninterference on the While language can be lifted to a low-level program representation such as LLVM bytecode [21].

LLVM-IR.
We consider a simplified LLVM-IR representation with four kinds of instructions: assignments from an expression (register or immediate value) or from a memory block (load), writing to a memory block (store), and (un)conditional jumps. We assume that the program control flow is represented by a control-flow graph (CFG) G = (B, →_E, b_init, b_end), where B is the set of basic blocks, →_E the set of edges connecting the basic blocks, b_init ∈ B the entry point and b_end ∈ B the ending point. We denote by Reach(b, b') the predicate indicating that node b' is reachable from node b, i.e., that there exists a path in G from b to b'. A program is then a (partial) map from control points (b, n) ∈ B × N to instructions, where each basic block is terminated by a jump instruction. The memory model consists of a set of registers or temporary variables R and a set of memory blocks M (including the execution stack). Val is the set of values and memory block addresses. The informal semantics of our simplified LLVM-IR is given in Figure 9, where r ∈ R and v ∈ R ∪ Val is a register or an immediate value.
In the formal operational semantics, execution steps are labeled with the leaked data, i.e., the addresses of store and load operations and the branching conditions. This formal semantics is defined in Figure 10. It is implicitly parameterized by the program p; a configuration is a tuple ((b, n), ρ, µ), where (b, n) ∈ B × N is the control point and ρ : R → Val (resp. µ : M → Val) denotes the content of registers (resp. memory).

4.2. Type system. First, we introduce the following notations for an LLVM-IR program represented by a CFG G = (B, →_E, b_init, b_end): (1) The function dep : B → 2^B associates to each basic block its set of "depending blocks", i.e., b' ∈ dep(b) iff b' dominates b and there is no block b'' between b' and b such that b'' post-dominates b'. We recall that a node b_1 dominates (resp. post-dominates) a node b_2 iff every path from the entry node b_init to b_2 (resp. every path from b_2 to the ending node b_end) goes through b_1. (2) The partial function br : B → R returns the "branching register", i.e., the register r used to compute the branching condition leading outside b (b being terminated by an instruction cond(r, b_then, b_else)). Note that in LLVM branching registers are always fresh and assigned only once before being used. We now define a type system (Figures 11 to 14) that allows us to express the output-sensitive constant-time property for LLVM-IR-like programs. The main difference with respect to the rules given at the source level (Figures 4 and 7) is that the control flow is explicitly given by the CFG, not by the language syntax. For an LLVM-like program, an environment Γ : R ∪ M → L associates security types to registers and memory blocks. We use the notation ⊢_α (b, n) : Γ ⇒ Γ', inspired by the one used in [6]. The intuitive meaning of this judgment is the following: if I is the instruction at control point (b, n) and τ_0 is the join of the types of all the branching conditions dominating the current basic block b, then ⊢_α (b, n) : Γ ⇒ Γ' is equivalent to the notation τ_0 ⊢_α Γ{I}Γ' used in Section 2.4, i.e., it transforms the previous environment Γ into the new environment Γ'.
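The dep(·) function can be computed from dominators and post-dominators, both obtained with the classic iterative data-flow algorithm. The sketch below (illustrative, not the prototype's code) runs it on a diamond-shaped CFG, where only the branch blocks depend on the entry block's condition.

```python
# Sketch of dep(.): b1 ∈ dep(b) iff b1 dominates b and no block strictly after
# b1 on a b1-to-b path post-dominates b1. CFG and block names are illustrative.

def dominators(blocks, preds, entry):
    """Iterative computation: dom(b) = {b} ∪ intersection of dom over preds."""
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            new = set(blocks)
            for p in preds[b]:
                new &= dom[p]
            new |= {b}
            if new != dom[b]:
                dom[b], changed = new, True
    return dom

def reachable(succs, start):
    seen, stack = set(), [start]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(succs[n])
    return seen

def dep(b, blocks, succs, dom, pdom):
    reach = {n: reachable(succs, n) for n in blocks}
    out = set()
    for b1 in dom[b] - {b}:
        # blocks lying strictly after b1 on some path from b1 to b (b included)
        between = {b2 for b2 in blocks
                   if b2 != b1 and b2 in reach[b1] and b in reach[b2]}
        if not any(b2 in pdom[b1] for b2 in between):
            out.add(b1)
    return out

# Diamond CFG: entry branches to then/else, which both join at exit.
succs = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
blocks = list(succs)
preds = {b: [p for p in blocks if b in succs[p]] for b in blocks}
dom = dominators(blocks, preds, "entry")
pdom = dominators(blocks, succs, "exit")   # post-dominators: reversed CFG

print(dep("then", blocks, succs, dom, pdom))   # {'entry'}: guarded by the branch
print(dep("exit", blocks, succs, dom, pdom))   # set(): both branches reach exit
```

Post-dominators are simply dominators of the reversed CFG, which is why the same fixpoint routine is reused with successors playing the role of predecessors.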
Assignment from an operation Op (Figure 11). In rules Op1 and Op2, the new type τ of the assigned register r is the join of the types of the operands and of the types of all the branching conditions dominating the current basic block (τ_0). Note that since branching registers are assigned only once in LLVM, there is no need to update their dependencies on output variables (using the α operator), Γ(r) being never changed once r has been assigned.

Assignment from a load expression (Figure 12). Rules Ld1 and Ld2 update Γ in a similar way to Op1 and Op2, the main difference being that, since some of the memory locations accessed when dereferencing v (i.e., PtsTo(b, n)(v)) are in A_m (i.e., potentially in the cache), the dependencies of v are added to the type of the leakage variable x_l.
In the above definition, the set A is needed in order to prevent dependency cycles between variables in X_O.
The following theorem is the counterpart of Theorem 3.3: it shows the soundness of our type system for LLVM-IR programs with respect to output-sensitive constant-time.

4.4. Example. We illustrate below the effect of the LLVM-IR typing rules on a short example. The C code of this example is given in Figure 15, and the corresponding (simplified) LLVM-IR in Figure 16. We do not define specific typing rules for some of these instructions (such as the allocations at lines 1-4 of Figure 16); they are taken into account only when building the initial environment. We assume that x_l denotes the leakage variable and that the contents of the C variables p, q, x and y are stored in memory blocks b_0 to b_3. We then consider the following initial environment (produced by lines 1-4 in Figure 16): Γ_0(@p) = Γ_0(@q) = Γ_0(%x) = Γ_0(%y) = ⊥. This initial environment captures the idea that the values of @p, @q, %x and %y are addresses (of the memory blocks b_0 to b_3 corresponding to the "high-level" C variables p, q, x and y), hence their security type is ⊥. Moreover, initially, nothing is leaked yet.
We then update Γ_0 by applying our typing rules in sequence to each instruction of the LLVM-IR representation. Note that the getelementptr instruction, which is specific to LLVM, computes an address corresponding to an indexed access in a buffer; it is therefore treated by our typing system as an arithmetic (Op) instruction.

Applying the rules to each instruction in turn (from %1 = load %y down to the final store %3, %5) and making all the replacements, we obtain the final environment Γ_6, in which the variable x_l depends on the initial types X and Y assigned to the memory blocks b_2 and b_3. This means that the addresses accessed when reading (resp. writing) buffer p (resp. q) may leak to an attacker. Hence, if one of the variables x or y is a secret, since neither x nor y is an output value, this program is not output-sensitive constant-time, which may lead to a security issue.
4.5. Implementation. We are developing a prototype tool implementing the type system for LLVM programs. This type system amounts to computing flow-sensitive dependency relations between program variables. Definition 4.1 provides the necessary conditions under which the obtained result is sound (Theorem 4.2). We give below some technical indications regarding our implementation.
Output variables X_O are defined as function return values and global variables; we do not currently consider arrays or pointers in X_O. Control dependencies cannot be deduced at the syntactic LLVM level, so we need to explicitly compute the dominance relation between basic blocks of the CFG (the dep function). Definition 4.1 requires the construction of a set A ⊆ X_O used to update the environment produced at each control location, in order to avoid circular dependencies (when output variables are assigned in alternative execution paths). To identify the set of basic blocks belonging to such alternative execution paths leading to a given block, we use the notion of Hammock regions [15]. More precisely, we compute the function Reg : (B × B × →_E) → 2^B, returning the Hammock region between a basic block b and its immediate dominator b' with respect to an incoming edge e_i of b. Thus, Reg(b', b, (c, b)) is the set of blocks belonging to CFG paths going from b' to b without traversing the edge e_i = (c, b).
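The Reg computation can be sketched as two reachability passes that both avoid the designated incoming edge: blocks reachable forward from the immediate dominator, intersected with blocks that can still reach b backward. The code below is an illustrative sketch on a diamond CFG, not the prototype's actual implementation.

```python
# Sketch of Reg(bp, b, (c, b)): the blocks lying on CFG paths from bp (the
# immediate dominator of b) to b that avoid the incoming edge (c, b).
# CFG and block names are illustrative.

def reg(succs, bp, b, avoided_edge):
    # forward reachability from bp, never traversing the avoided edge
    fwd, stack = set(), [bp]
    while stack:
        n = stack.pop()
        if n in fwd:
            continue
        fwd.add(n)
        stack.extend(m for m in succs[n] if (n, m) != avoided_edge)
    # backward reachability from b, still avoiding the edge
    preds = {n: [p for p in succs if n in succs[p] and (p, n) != avoided_edge]
             for n in succs}
    bwd, stack = set(), [b]
    while stack:
        n = stack.pop()
        if n in bwd:
            continue
        bwd.add(n)
        stack.extend(preds[n])
    return fwd & bwd        # blocks on some bp-to-b path avoiding the edge

# Diamond CFG: the immediate dominator of "exit" is "entry".
succs = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
print(sorted(reg(succs, "entry", "exit", ("then", "exit"))))
# ['else', 'entry', 'exit']: the alternative path avoiding edge (then, exit)
```

For the join block of the diamond, the region associated with one incoming edge is exactly the alternative branch, i.e., the blocks whose assignments to output variables must be guarded by the set A.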

Related Work
Information flow. There is a large body of work on language-based security aiming to prevent undesired information flows using type systems (see [27]). An information-flow security type system statically ensures noninterference, i.e., that sensitive data may not flow directly or indirectly to public channels [31,24,30,29]. The typing system presented in Section 2 builds on ideas from Hunt and Sands' flow-sensitive type system. As attractive as it is, noninterference is too strict to be useful in practice, as it prevents confidential data from having any influence on observable public output: even a simple password checker violates noninterference. Relaxed definitions of noninterference have been proposed in order to support such intentional downward information flows [28]. Li and Zdancewic [22] proposed an expressive mechanism, called relaxed noninterference, for declassification policies that supports the extensional specification of secrets and their intended declassification. A declassification policy is a function that captures the precise information about a confidential value that can be declassified. For the password checker example, the declassification policy λp.λx. h(p) == x allows an equality comparison with the hash of the password to be declassified (and made public), but disallows arbitrary declassifications such as revealing the password itself.
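The password-checker policy can be made concrete. In the sketch below (illustrative names; h is instantiated with SHA-256 for the sake of the example), the only fact released about the secret is the boolean h(p) == x, matching the declassification policy λp.λx. h(p) == x.

```python
# Sketch of the password-checker declassification policy: the public result is
# exactly the comparison h(p) == x, never the password p itself.
import hashlib

def h(p: str) -> str:
    """The hash used by the policy; SHA-256 is an illustrative choice."""
    return hashlib.sha256(p.encode()).hexdigest()

def check(secret_password: str, supplied_hash: str) -> bool:
    # The only declassified fact is this single boolean.
    return h(secret_password) == supplied_hash

print(check("hunter2", h("hunter2")))   # True
print(check("hunter2", h("guess")))     # False
```

Under strict noninterference this function is already rejected, since its public result depends on the secret; relaxed noninterference accepts it precisely because the dependency factors through the declared policy.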
The problem of information-flow security has also been studied for low-level languages. Barthe and Rezk [8,9] provide a flow-sensitive type system for a sequential bytecode language. As is the case for most analyses, implicit flows are forbidden, and hence modifications of parts of the environment with a lower security type than the current context are not allowed. Genaim and Spoto present in [16] a compositional information-flow analysis for full Java bytecode.
Information flow applied to detecting side-channel leakages. Information-flow analyses track the flow of information through the program but often ignore information flows through side channels. Side-channel attacks extract sensitive information about a program's state through its observable use of resources such as time or memory. Several approaches in language-based security use security type systems to detect timing side channels [1,18]. Agat [1] presents a timing-sensitive type system for a small While language, together with a transformation that takes a program and turns it into an equivalent program without timing leaks. Molnar et al. [23] introduce the program counter model, which is equivalent to path noninterference, and give a program transformation for making programs secure in this model.
FlowTracker [26] statically detects time-based side channels in LLVM programs. Relying on the assumption that LLVM code is in SSA form, it computes control dependencies using a sparse analysis [13], without building the whole Program Dependency Graph. Leakage at the assembly level is also considered in [6]: the authors propose a fine-grained information-flow analysis for checking that assembly programs generated by CompCert are constant-time. Moreover, they consider a stronger adversary, which controls the scheduler and the cache.
None of the above works considers publicly observable outputs. The work that is closest to ours is [4], where the authors develop a formal model for constant-time programming policies. The novelty of their approach is that it distinguishes not only between public and private input values, but also between private and publicly observable output values. As they state, this distinction poses interesting technical and theoretical challenges. Moreover, constant-time implementations in cryptographic libraries like OpenSSL include optimizations for which paths and addresses can depend not only on public input values, but also on publicly observable output values; considering only input values as non-secret information would thus incorrectly characterize those implementations as non-constant-time. They also develop a verification technique based on self-composition [7]: the constant-time security of a program P is reduced to the safety of a product program Q that simulates two parallel executions of P. Their tool operates at the LLVM bytecode level: the obtained bytecode program is transformed into a product program, which is verified by the Boogie verifier [5] and its SMT tool suite. Their approach is complete only if the public output is ignored. Otherwise, their construction relies on identifying the branches whose conditions can only be declared benign when public outputs are considered. For all such branches, the verifier needs to consider separate paths for the two simulated executions, rather than a single synchronized path, and in the worst case this can deteriorate into an expensive product construction.
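The self-composition idea can be sketched in a few lines. The toy model below (illustrative, not the actual Boogie/SMT encoding of [4]) runs two copies of a program on inputs that agree on the public part and, in the output-sensitive variant, compares their leakage traces only when their public outputs also agree.

```python
# Sketch of verification by self-composition: constant-time security of P is
# reduced to a safety property of a product running two copies of P.
# The program, its trace model, and all names are illustrative.

def run(pub, sec):
    """Toy program: leaks the branch on `sec` through its trace."""
    trace = []
    if sec > 0:              # secret-dependent branch
        trace.append(("b", True))
    else:
        trace.append(("b", False))
    return pub + 1, trace    # (public output, leakage trace)

def product_is_safe(pub, sec1, sec2):
    out1, t1 = run(pub, sec1)
    out2, t2 = run(pub, sec2)
    # output-sensitive variant: only runs with equal public outputs must agree
    if out1 != out2:
        return True
    return t1 == t2

print(product_is_safe(0, 1, -1))  # False: equal outputs, different traces
```

A violation is a pair of executions with equal public inputs and outputs but different leakage traces; an SMT-based tool searches for such a pair symbolically rather than by enumeration.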

Conclusion and Perspectives
In this paper we proposed a static approach to check whether a program is output-sensitive constant-time, i.e., whether the leakage induced through branchings and/or memory accesses does not exceed the information revealed by the (regular) observable outputs. Our verification technique is based on a so-called output-sensitive noninterference property, allowing us to compute the dependencies of a leakage variable on both the initial values of the program inputs and the final values of its outputs. We developed a type system on a high-level While language and proved its soundness. We then lifted this type system to a basic LLVM-IR and developed a prototype tool operating on this intermediate representation, showing the applicability of our technique. This work could be continued in several directions. One limitation of our method arising in practice is that even though the two snippets x_l = h; o = h and o = h; x_l = o are equivalent, only the latter can be typed by our typing system. We are currently extending our approach by also considering an under-approximation β(·) of the dependencies between variables, and by using "symbolic dependencies" also for non-output variables. Then the safety condition from Theorem 2.17 could be improved to something like "∃V ⊆ X_O such that (Γ′(x_l) ⊑ α(V)) ∧ (Γ(X_I) ⊑ α(V)) ∧ (β′(X_O) ⊒ α(V))". In the above example, we would obtain Γ′(x_l) = α(h) = β′(o), meaning that the unwanted maximal leakage Γ′(x_l) is bounded by the minimal leakage β′(o) due to the normal output. From the implementation point of view, further developments are needed in order to extend our prototype into a complete tool able to deal with real-life case studies. This may require refining our notion of arrays and taking arrays and pointers into account as output variables. We could also consider applying a sparse analysis, as in FlowTracker [26]. It may happen that such a purely static analysis would be too strict, rejecting too many "correct" implementations.
To address this issue, a solution would be to combine it with the dynamic verification technique proposed in [4]: our analysis could be used to automatically find which branching conditions are benign in the output-sensitive sense, which could reduce the product construction of [4]. Finally, another interesting direction would be to adapt our work to the context of quantitative analysis of program leakage, as in [14].