Cryptographic protocols and libraries are today's trusted cryptographic base for a secure internet infrastructure. Unfortunately, neither currently deployed protocols, nor their implementations in libraries are bug-free. The never-ending stream of CVE advisories regarding crypto weaknesses is an obvious testimony to the fact that our cryptographic infrastructure is badly broken. The question is: why is that so, and what can we do about it? Obviously, we need to deploy formally verified protocols as parts of new standards, and we need formally verified cryptographic library implementations.
In this series, we'll present tools for the automatic verification of cryptographic protocols. Furthermore, we'll explore the implementation process for verified cryptographic libraries. The common goal is to create a secure cryptographic stack which is formally verified from the specification phase down to the assembly level. We will focus on the work of the Prosecco group[Prosecco] at Inria which is instrumental in the quality assurance of the upcoming TLS 1.3 protocol standardization, and which aims with project Everest[ProjectEverest] to rebuild the whole HTTPS stack from scratch with fully verified components.
Covered Topics (in Progress)
- Why Verified Crypto? (work in progress)
- The Dolev-Yao vs Computational Model (work in progress)
- Symbolic Verification of Cryptographic Protocols (work in progress)
- ProVerif (work in progress)
- CryptoVerif (work in progress)
- EasyCrypt (work in progress)
- The HACL* Library (work in progress)
- Sequence of Games
- (... and more)
Why is Real-World Crypto Broken?
- Cryptographic Weaknesses in Legacy Constructions: Weak hash functions, weak DH groups, short block ciphers, leaky PKCS#11v1.5 padding, ...
- Logical Flaws in Protocol: Cross-Protocol Attacks, Downgrade Attacks, Transcript Synchronization/Collision Attacks
- Implementation Bugs in TLS libraries: Bugs in crypto library, Buffer overflows in packet parsing, Composition bugs in state machines, Bad configurations
and sometimes, a mix of all of all of them.
The point being[B18, Slide 102], that all too often, vulnerabilites in less-studied modes (e.g. of TLS) can break strong provably secure modes of the protocol: they are way too many modes and corner cases to prove by hand in classical current TLS/SSL implementations. There is therefore a need for automated protocol verification. Additionally, there is a need for verified crypto libraries that are provably secure. That's the topic of present series.
Verifying Cryptographic Protocols
In the series on key exchange protocols, we saw how tricky it can be to design robust cryptographic protocols that are immune against all kinds of attacks. Nowadays, it should be self-evident that every new cryptographic protocol is to be accompanied with a rigorous proof of (its) security. Such proofs are traditionally done by hand, with pencil and paper, and a lot of head scratching and patience. Unfortunately, when such protocols are being considered for inclusion in a standard like, say, the upcoming TLS 1.3, this method quickly becomes cumbersome. Why is that so? That's because of the iterative nature of the standardization process. With each new iteration, the protocol is modified according to the wishes of the standardization committee, thus potentially breaking security in subtle and non-so-subtle ways, and therefore needs to be verified again. It is not uncommon to have like 10 to 20 or more iterations. Since hand-written proofs are slow, error-prone, and very time-consuming tasks, the question arose whether it is feasible to automate this task. It turns out, that it is... within limits.
There are two approaches to automate the verification of cryptographic protocols. The easier one uses the Dolev-Yao model[DY83] of perfect cryptographic primitives. In this case, we can reason in the formal model on a symbolic level about the protocols that make use of them. For example, in a symmetric encryption scheme \((G, E, D)\), if the key generation algorithm \(G\), the encryption algorithm \(E\) and the decryption algorithm \(D\) are defined with the correctness property equation
\[ \forall k \leftarrow G(1^n) \colon D(k, E(k, m)) = m \]
then we can reason about protocols using this scheme using principles of logic programming (satisfiability of Horn clauses). This is exactly what the verification tools ProVerif and Tamarin do. Tools of this kind are the first to be used to weed out obvious and non-obvious logical protocol errors. They are mature (at least ProVerif and Tamarin are), easy to use, and every protocol designer should be able to understand them.
Unfortunately, the Dolev-Yao model isn't a realistic representation of the real world, where cryptographic components can fail with hopefully only negligible probability. While this may seem like a non-issue, it actually is: in the real world, those components are used iteratively for a lot of data, they are combined in various unpredictable ways etc., potentially yielding a non-negligible probability of security failure. It should be obvious that we need to prove such constructions in the computational model (recall semantic security). Such proofs are typically done with Bellare and Rogaway's game-based methodology[BR04, S06]. A typical security proof in this methodology consists of a sequence of games, starting from the construction under scrutiny, and ending with a game of an obviously secure construction. Each intermediate game is a modification of the previous one, introducing an additional advantage for the adversary. The point is that at the end of the sequence, all those advantages add up. If the total advantage remains negligible, the proof was successful. This is where logic solvers come into play[BP06, B11, B16]: they search the whole space of attack game modifications automatically, until they reach a secure construction (an axiom if you wish). The verification tools CryptoVerif and EasyCrypt do just that. They read a protocol specification, and output upon success a sequence of attack games which is a security proof.
Limitations of protocol verifiers: It should be stressed that due to the nature of logic solvers, tools like ProVerif and Tamarin can detect and report security bugs and possible attacks or output a security proof, but that they may not always terminate. In this case, a non-terminating verifier doesn't mean that there are no attacks and that the protocol is secure. Furthermore, just like e.g. PROLOG needs human help in pruning the search tree with judiciously placed cuts, protocol verifiers also often need human (expert) help in guiding their search. While ProVerif and Tamarin are still easy to use even by amateurs, CryptoVerif and EasyCrypt are tools best placed in the hands of experienced cryptanalysts.
Verified Cryptographic Libraries
As already said, a never-ending stream of crypto-related CVE advisories clearly demonstrates that even popular crypto libraries like OpenSSL are riddled with bugs. Such bugs include mundane errors like violations of memory safety (buffer overruns) and violations of functional correctness (e.g. errors in bignum libraries). What's worse, inexperienced coders often unwittingly introduce violations of secret independence[ZBPB17] opening the door to timing and power side-channel attacks, which could be easily mitigated by following some heuristics[BLS12], especially:
- No data flow from secrets to load addresses
- No data flow from secrets to branch conditions
There are two ways to deal with this problem:
- Running some verification tools on existing code bases
- Creating a new verified code base from scratch
The first approach has been tried, but since the problem of automatically verifying some properties of existing programs is provably impossible in all cases, those verifiers are bound to fail finding all possible bugs, e.g. by not terminating.
The second approach is much more promising, and is one subject of this series. The main idea[ZBPB17] employed by the Prosecco group in writing their HACL* library is to write specifications in pure F*, a subset of a high-level description and verification language F* that is able to automatically verify the stated constraints using the Z3 SMT prover[Z3]. Then, the implementation is written in a subset of F* called Low*, which translates naturally to readable C with the KreMLin tool[KreMLin]. This translation only succeeds it the implementation in Low* matches the specification in pure F*. This match is established by the F* SMT solver in the form of proofs. Those proofs are then "thrown away", since they are not included in the generated C code. In other words, the proofs may slow down HACL* at the verification phase, but they won't impact performance during the compilation phase and run-time phase of the resulting C code.
Background on F*
The F* language used by ProVerif, CryptoVerif, and HACL* is an interesting beast of its own[H18]. It is an extension of OCaml, a variant of the ML family of 2nd generation functional programming languages. It doesn't make use of OCaml's object-oriented extensions. But what F* introduces are so called refinement types. These are constraints that extend the ML type system, and that are enforced during verification time by Z3, a sophisticated SMT prover[Z3]. This verification uses and generates proofs.
On top of F*, ProVerif defines a language representing the π-calculus (not to be confused with the λ-calculus), a formalism representing communicating parties, which is worth studying on its own too. Tamarin too models its actors with the Pi calculus.
The main inspiration for this series on verified crypto is a series of talks by Karthikeyan Bhargavan at
- [ProjectEverest] Project Everest
- [Prosecco] Prosecco: Programming Securely with Cryptography
- [Z3] The Z3 Prover
- [KreMLin] The KreMLin tool
- [DY83] Danny Dolev, Andrew C. Yao: On the Security of Public Key Protocols. In: IEEE Transactions on Information Theory, Vol. IT-29, No. 2, March 1983. (full pdf)
- [B16] Bruno Blanchet: Automatic Verification of Security Protocols, ProVerif and CryptoVerif. May 2016. (slides pdf)
- [BR04] Mihir Bellare, Philipp Rogaway: The Game-Playing Technique. December 11, 2004 (Draft 0.4). (full pdf, alternative source)
- [BP06] Bruno Blanchet, David Pointcheval: Automated Security Proofs with Sequences of Games. In: Lecture Notes in Computer Science Vol. 4117, Advances in Cryptology. CRYPTO 2006. Pages 537-554. (full pdf via link.springer.com)
- [S06] Victor Shoup: Sequence of Games, A Tool for Taming Complexity in Security Proofs. January 18, 2006. (full pdf)
- [BR08] Mihir Bellare, Phillip Rogaway: Code-Based Game-Playing Proofs and the Security of Triple Encryption. November 27, 2008 (Draft 3.0). (iacr.org 2004/331, full pdf; revised pdf in the new 2006 version)
- [B11] Bruno Blanchet: Mechanizing Game-Based Proofs of Security Protocols. 2011. (full pdf)
- [ZBPB17] Jean Karim Zinzindohoué, Karthikeyan Bhargavan, Jonathan Protzenko, Benjamin Beurdouche: HACL*: A Verified Modern Cryptographic Library. (iacr.org 2017/536, full pdf)
- [BLS12] Daniel J. Bernstein, Tanja Lange, Peter Schwabe: The security impact of a new cryptographic library. (full pdf)
- [H18] Cătălin Hriţcu: The Formal Semantics and Evolution of the F* Verification System. October 11, 2018. (full pdf)
- [B18] Karthikeyan Bhargavan: Formal Methods for Analyzing Crypto Protocols, from attacks to proofs. (slides pdf via BIU)