In the last part we discussed about logging. In this part we are giving a rudimentary introduction to relevant cryptography concepts.
Introduction to https
The purpose of the HTTPS protocol is to enable secure web browsing. From the user point of view, it is important to know whether she connects to a website over an unsecure HTTP connection or over a secure HTTPS connection. Different browsers indicate this in different ways.
Next we see how an example of how a particular browser provides the indication. You should try to figure out how the indication works in all different browsers that you use.
The indicator of https protocol is a small lock next to the url:
Let’s click the lock, then the right arrow and go for more information. We get something that looks like this:
This tab provides lots of information. We will take a closer look at many strange acronyms that appear under ”Technical details”.
What do these letters mean and why are they needed?
First thing that may come to your mind when you hear the word ”cryptography” is about transforming normal messages into secret messages and back.
This is not a false impression but there is quite much more in cryptography as well.
But let us start with the basic setting and define some basic terms.
We call the original message that is supposed to be kept hidden the plaintext. The idea of ciphering is to use some encryption function and some encryption key to get the cryptotext. That is the message in the hidden form. For the reverse direction, we need a decrypting function, a method that is used together with a decryption key in order to retrieve the original plaintext.
Encryption is used to protect confidentiality of the data.
An important building block in cryptography is a one-way function (OWF). An OWF is a function that, from the one hand, can be efficiently evaluated. This means it is easy to calculate y = f(x) when x is given as an input. In this context ”easy” means something that is easily doable with the help of any kind of computer. On the other hand, it should be hard to invert the OWF. This means it is very difficult to find out x, if you only know f and y.
Cryptographic hash function is a special type of an OWF such that
- the output y is always of fixed length
- it is hard to find collisions.
A collision is a situation where two different inputs x and x’ give the same output: f(x) = f(x’).
SHA-256 and SHA-3 are two commonly used cryptographic hash functions. It you calculate the SHA-256 value of any binary string, the result is a random-looking string of 256 bits.
Because the length of the input is not restricted, it is obvious that many different inputs must exist that have the same output. But what is important is that no such collisions are known for the SHA-256 function.
The output of different variants of SHA-3 are of different length. However, there are no variants with a very short output length.
Q: Why a cryptographic hash function cannot have an output length of 3 bits?
All details of hash functions are publicly known and, in principle, anybody can calculate the hash value of any input. Sometimes it is useful that only a person who knows a secret key is able to calculate the hash value. Functions like this are called message authentication codes (MAC) and the value of the function is called the MAC value.
Let us assume that you know some secret key and you have received a message and a MAC value. If you now use the received message and the secret key as inputs to the functions and find out that the output of the function is equal to the received MAC value, then you can be reasonably sure that somebody who knows the secret key must have calculated the MAC value. In addition, you are able to conclude that the message has not been changed after the calculation of the MAC value on the sending side.
Summary: Message authentication codes are used to protect integrity of the data.
One of the commonly used methods for constructing a MAC function is hmac. It is built on top of some hash function, e.g. SHA-256 or SHA-3.
Our HTTPS example (TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384_256 bit keys,TLS 1.2) makes use of a SHA384 hash function.
All classical (i.e. more than 40 years old) ciphering methods are symmetric. This means that anybody who knows the function and the secret key needed for encrypting messages is also able to decrypt messages. In other words, encryption and decryption are in symmetric roles. The encryption function and the decryption function have very similar structures; sometimes they could even be equal to each other. The key used for encryption is the same key that is used for decryption; or at least each key can be derived from the other by some simple transformation.
There are nowadays also encryption methods that are not symmetric, i.e., being able to encrypt does not guarantee that you are able to decrypt even your own message. This kind of asymmetry is useful in many situations and we will discuss those later.
Symmetric encryption methods are still useful in many situations, and they are typically much faster than asymmetric ones. Therefore, symmetric encryption is in wide use, and new symmetric methods are developed. One-time pad (OTP) and AES are examples of symmetric encrypting schemes.
One of the oldest encryption methods is substituting every instance of a letter with some other letter. A cipher like this is called a substitution cipher. An example is the CAESAR cipher which is an encryption algorithm where you get the cryptotext by rotating every letter in the plaintext three positions forward in the alphabet. Decryption is done by rotating every letter of the cryptotext three positions backwards.
One-time pad (OTP) is one of the simplest encryption methods. To encrypt a message of, say, 140 bits you need a secret key of 140 bits. You compute the XOR of each message bit with the corresponding key bit, and you get 140 cryptotext bits:
C = M xor K.
The decrypting process is exactly the same as the encryption process. If you know the secret key you can decrypt the message by XORing bit-by-bit the cryptotext message and the key:
M = C xor K.
This works because
K xor K = 0
regardless of whether K=0 or K=1.
Although OTP is very simple and fast, it possesses some very good security properties. We are going to discuss those later.
The only downside of the OTP is that the key must be as long as the actual plaintext message. And you should not use the key more than once, hence the name ”one-time pad”.
Advanced Encryption Standard (AES) is a family of modern block ciphers. AES-256 has key size of 256 bits and it turns 128-bit plaintext blocks into 128-bit cryptotext blocks, and vice versa. This algorithm is quite fast even when implemented in software and considered secure enough for almost all uses. Many modern processors provide hardware support for AES.
Cryptanalysis is the study of cryptographic algorithms and methods. The basic goal is to break cryptographic algorithms in one way or another. For example, the attacker may have acquired knowledge of some cryptotexts and the corresponding plaintexts. Then he tries to use this information for finding the used key. In another setting, we assume that the attacker has only some cryptotexts based on which the key should be found. It is always assumed that the attacker knows all details of the encryption and decryption processes, only the secret key is not known. If this cannot be assumed then the encryption method is considered to be very weak.
Let us take a look at the substitution cipher. Now every occurrence of the letter ’e’ in the plaintext is encrypted as the same letter in the cryptotext, lets say ’Å’. Because ’e’ is the most common letter in the English language, ’Å’ should be one of the most common letters of the cryptotext. So we may assume that the common letters in the cryptotext correspond to common letters in the language of the plaintext and use this information for making educated guesses about how each letter is encrypted. This cryptanalytic approach is called frequency analysis.
The same approach cannot be used against OTP. If the key is chosen randomly, both 0 and 1 appear as often in the cryptotext, in average. This happens regardless of how common 1 is in the plaintext. In fact, any cryptotext could result from any plaintext with a suitable key. Assume the known cryptotext bit is C. Now the corresponding plaintext bit could be either C (which happens if the key bit is zero) or 1-C (which happens if the key bit is one). Therefore, knowing the cryptotext does not provide any new information about the plaintext to the attacker. This means the OTP is unconditionally secure.
For the OTP, we applied the setting where the attacker knows only the cryptotext. If the attacker knows both the cryptotext and the corresponding plaintext, then he can easily recover the used key. However, breaking OTP in this setting is not relevant because the recovered key is not used to encrypt anything else than the plaintext that the attacker already knows.
Modes of operation
A block cipher, like AES, is used to encrypt blocks of a certain size. What should be done if the message is longer than that size?
The simplest way to encrypt a long message is to take the first block, encrypt it using the key to produce the first cryptotext block, then take the second block, encrypt it using the same key to produce the second cryptotext block etc. This approach is one of the block cipher modes of operation, called Electronic Codebook (ECB). It is the simplest way but often could be broken by frequency analysis. The attacker notices if you encrypt the same plaintext block twice because the two cryptotexts are also the same. This happens, for example, if plaintext contains some commonly used short pattern, like 'OK'.
Other modes of operation avoid this problem by using extra input in addition to plaintext and key. For instance, previously computed cryptotext blocks or counters could be used for this purpose. Our HTTPS example (TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384_256 bit keys,TLS 1.2) makes use of AES algorithm in Galois/Counter Mode. The key length is 256 bits.
In symmetric encryption the same key is used both for encryption and decryption. In public key cryptography we have a separate key for encryption and another one for decryption. This makes sense if it is hard to determine the decryption key even if the encryption key is known.
If this is the case then you could tell everybody how to encrypt messages they want to send to you but still only you would know how to decrypt these messages. This explains the name: encryption key could be made public while the corresponding decryption key still remains secret.
Typically the keys are longer and the algorithms are slower in public-key cryptography than in the symmetric encryption. But with public-key cryptography you are able to do things which cannot be done with symmetric encryption.
The above mentioned possibility for even strangers to send confidential messages to you is one such thing. Digital signature is another such thing: Using public-key cryptography it is possible to sign messages electronically and to authenticate users and their messages.
At this point we cannot avoid getting a little bit mathematical. Let us denote by a mod b the remainder when a is divided by b. The b here is called the modulus. For example, 12 mod 5 = 2 and 77 mod 8 = 5.
Also, we make use of prime numbers, shortly primes. An integer, larger than 1, is a prime if it is only divisible by 1 and itself. For example, 7, 101 and 7823 are primes but 6, 15 and 100 are not.
Given integers a, b and c, it is easy (at least in principle) to calculate ab mod c. This modular exponentiation is relatively easy to do even when the integers a, b and c are very big. With big a and b the power ab becomes huge. Luckily we do not have to compute this huge number at all. Instead, it is possible to compute the remainder for each intermediate value before continuing. Let us take a simple example: a=2, b=60, c=1003. Now ab = (210)6.
We first compute 210=1024. Then we take the remainder of this intermediate result: 1024 mod 1003 = 21. Next we raise 21 to the power of 3. The result is 9261. Taking the remainder again with the modulus 1003 we get 234. The final step is to raise 234 to the power of 2: the result is 54756. Taking the remainder again we get the final result: ab mod c = 594.
On the other hand, when arbitrary integers a, c and d are given, it is very hard to find an integer b such that ab mod c = d. This problem is called the discrete logarithm problem. We use the notation
loga d mod c = b.
Because modular exponentiation can be done relatively fast even for large numbers but the inverse problem of finding the discrete logarithm becomes very hard for large numbers, modular exponentiation with large numbers is an example of a one-way function.
Let us take a closer look at the discrete logarithm with small numbers. We choose c = 101 and below see a graph depicting 2x mod 101.
Note that the graph is not smooth and it is not growing like the ’normal’ exponential function defined for real numbers. Instead, the graph seems to make random-looking moves.
A key agreement protocol is a method for two parties, say Alice and Bob, to agree on a shared secret key over an insecure channel, but still in such way that nobody else who is eavesdropping on the channel is able to learn the shared secret. An example of such method is Diffie-Hellman key agreement protocol.
The Diffie-Hellman key exchange begins by Alice and Bob agreeing on a modulus p and a number g. Then Alice picks a random number a and sends Bob the number ga mod p = u. Similarly, Bob picks a random number b and sends Alice the number gb mod p = v. Now Alice can calculate va mod p and Bob can calculate ub mod p. These two values are both equal to gab mod p and that value can now be used as the new shared secret key.
Our HTTPS example (TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384_256 bit keys,TLS 1.2) makes use of a certain variant of the Diffie-Hellman key exchange.
Next we take a look at how public-key cryptosystems look like. The most used system is called RSA. It is based on the following mathematical fact:
if p and q are primes, n = p·q and e·d mod (p-1)(q-1) = 1
then xe·d mod n = x for all values of x.
To set up RSA keys, Alice first needs to find two large primes p and q. Both primes should be at least 1000 bits long. She then needs to pick an integer e such that it is possible to find another integer d for which e·d mod (p-1)(q-1) = 1 holds. The public key of Alice is now the pair (e,n), where n = p·q. The private key that Alice needs to decrypt messages is the pair (d,n). Alice should not reveal the primes p and q to anybody.
Let us assume that Alice has given her public key to Bob. Bob can now encrypt any "message" x that is (encoded as) an integer between 0 and n by calculating xe mod n = y. Bob would send the result to Alice. Alice can decrypt the cryptotext y because she knows the secret decrypting exponent d. She calculates yd mod n = x.
A traditional signature is some ink on a paper. It is based on the assumption that only one person can write a signature in a certain way. A digital signature is always different for every message (otherwise you could just copy-paste it to any document) and only a person who has the signing key can calculate it correctly. The signature can be verified by anybody who has the verification key.
RSA can be used also for digital signing and verification. Alice reveals the verification key (e,n) and uses herself the signing key (d,n). If Alice wants to sign message x, she calculates the signature as xd mod n = s. When Bob receives the message x and the signature s, he can check that se mod n = x. If this is true then the signature must be from Alice because nobody else knows the parameter d thats is necessarily needed to calculate Alice’s signatures.
Usually the signature is not computed for the entire message but, instead, a hash of the message is computed first, and then the hash value is signed. This guarantees that the value to be signed does not become too big. On the verification side, similarly, the received message is first given as an input to the hash function, and parallel to that, the verification key (e,n) is applied to the signature. If outputs from each operation are equal then the signature is accepted. The hash value of the message is called its fingerprint, and it can be used instead of the entire message if no collisions have been found for the hash function.
Our HTTPS example (TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384_256 bit keys,TLS 1.2) makes use of RSA for digital signatures.
One important part of the digital signature process was explained above as follows: Let us assume that Alice has given her public key to Bob. But this imposes quite a big restriction: Bob can only verify signatures from such parties with whom he has already been in contact beforehand. It would be better if Bob could get Alice’s public key just when a need to verify her signature arises, without any preparations in advance.
A straight-forward solution would be that Alice provides her public key together with the message that she has digitally signed. But now we encounter a ”chicken-and-egg” problem because the whole point of the digital signature is to verify that the message is really coming from Alice, and not from somebody who just pretends to be Alice. If the public key of Alice is delivered together with the message, and this public key should be used to verify the origin of the message, then an imposter could just include a fake public key to the delivery, i.e., a key for which the corresponding private key is in the possession of the imposter, not Alice.
One way out of the ”chicken-and egg” situation is found if Alice’s public key is signed by somebody whose signatures Bob is able to verify, e.g., by some authority. Such digitally signed information is called a certificate. Note that the authority is just endorsing Alice’s public key, the authority does not have to take any stand on messages that Alice is sending.
A typical example where certificates are needed is when a user wants to get some assurance of the validity of the web page he visits. HTTPS protocol enables this and the authentication of the web server is carried out by digital signatures, based on a certificate.
All browsers provide information about the certificate. In our example, the information can be found as shown below.
What has been signed is the fingerprint of the information in the certificate. The authority who has signed the certificate is ”DigiCert SHA2 Extended Validation Server CA”.
Our solution to the ”chicken-and-egg” problem did not remove all restrictions. We still assumed that
- Authority A has verified somehow Alice’s identity and, consequently, has signed Alice’s public key;
- Bob has somehow acquired the public key of the same authority A.
Such an authority A is called a Certificate Authority (CA). But it should be possible for Alice and Bob to be in contact with different authorities. Otherwise, the CA easily becomes a bottleneck in the system.
A typical solution for this problem is to arrange a hierarchy of authorities. For example, Bob could have a public key of a root CA, e.g., in-built into his browser. This root CA would then provide certificates for other CAs that are one level lower in the hierarchy. These CAs could provide certificates for public keys of regional CAs, and Alice could be in contact with one such regional CA.
Bob would need to verify all certificates in a certificate chain. Once all certificates are verified, Bob is finally able to verify Alice’s signature.
All CAs, together with some other authorities (that are needed, e.g., for managing revoked certificates), form a public-key infrastructure (PKI).
In our HTTPS example, the certificate chain has a length of two. ”DigiCert SHA2 Extended Validation Server CA” has provided a certificate for the F-Secure website, and ”DigiCert High Assurance EV Root CA” has provided a certificate for ”DigiCert SHA2 Extended Validation Server CA”. The root CA public key needs to be embedded in the browser for successful authentication of the website.
The actual signature value of the certificate is provided as shown below.