# Cryptanalysis

*Cryptanalysis* is the study of cryptographic algorithms and methods. The basic goal is to *break* cryptographic algorithms in one way or another. For example, the attacker may have acquired knowledge of some cryptotexts and the corresponding plaintexts. Then he tries to use this information for finding the used key. In another setting, we assume that the attacker has only some cryptotexts based on which the key should be found. It is always assumed that the attacker knows all details of the encryption and decryption processes, only the secret key is not known. If this cannot be assumed then the encryption method is considered to be very weak.

Let us take a look at the substitution cipher. Now every occurrence of the letter ’e’ in the plaintext is encrypted as the same letter in the cryptotext, lets say ’Å’. Because ’e’ is the most common letter in the English language, ’Å’ should be one of the most common letters of the cryptotext. So we may assume that the common letters in the cryptotext correspond to common letters in the language of the plaintext and use this information for making educated guesses about how each letter is encrypted. This cryptanalytic approach is called *frequency analysis*.

The same approach cannot be used against OTP. If the key is chosen randomly, both 0 and 1 appear as often in the cryptotext, in average. This happens regardless of how common 1 is in the plaintext. In fact, any cryptotext could result from any plaintext with a suitable key. Assume the known cryptotext bit is C. Now the corresponding plaintext bit could be either C (which happens if the key bit is zero) or 1-C (which happens if the key bit is one). Therefore, knowing the cryptotext does not provide any new information about the plaintext to the attacker. This means the OTP is *unconditionally* secure.

For the OTP, we applied the setting where the attacker knows only the cryptotext. If the attacker knows both the cryptotext and the corresponding plaintext, then he can easily recover the used key. However, breaking OTP in this setting is not relevant because the recovered key is not used to encrypt anything else than the plaintext that the attacker already knows.

## Modes of operation

A block cipher, like AES, is used to encrypt blocks of a certain size. What should be done if the message is longer than that size?

The simplest way to encrypt a long message is to take the first block, encrypt it using the key to produce the first cryptotext block, then take the second block, encrypt it using the same key to produce the second cryptotext block etc. This approach is one of the block cipher modes of operation, called *Electronic Codebook* (ECB). It is the simplest way but often could be broken by frequency analysis. The attacker notices if you encrypt the same plaintext block twice because the two cryptotexts are also the same. This happens, for example, if plaintext contains some commonly used short pattern, like 'OK'.

Other modes of operation avoid this problem by using extra input in addition to plaintext and key. For instance, previously computed cryptotext blocks or *counters* could be used for this purpose. Our HTTPS example (`TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384_256 bit keys,TLS 1.2`

) makes use of AES algorithm in *Galois/Counter Mode*. The key length is 256 bits.

The next 3 exercises should be done in order.

### Padding oracle attack

Padding oracle attack shows that a tiny amount of additional information can be enough to break the cipher.

Earlier versions of CBC decipher implementations would return an error message to the sender if the padding of the sent message was correct. This information, assuming that we have access to submit our own messages to the decipher, is enough to break CBC encryption! Furthermore, the breakage doesn't depend on the underlying block cipher.

Assume that we have an oracle that given a ciphertext will tell whether the decrypted message has a valid padding. Note that we don't get to see the decrypted message, we only observe whether the padding is valid.

Assume that we have two blocks of ciphertext $C_1$ and $C_2$, each of length 8. Write $A = decrypt(C_2)$. We know from previous exercise that $P_2 = C_1 \oplus A$. If we can solve $A$, then we can solve $P_2$.

Let us focus on solving the last byte of $A$. There are 256 possible values for $A[8]$. Consider 256 different ciphertexts of form $(M_i, C_2)$, where $M_i$ is an array of zeros, except for the last entry where $M_i[8] = i$. Let $Q_i = M_i \oplus A$ be the 2nd block of the decrypted message $(M_i, C_2)$.

Note that $Q_i[8]$ is different for every $i$; there is an index $c$ for which $Q_c[8] = 1$. If we know $c$, then we can deduce $A[8]$ since $A[8] = c \oplus 1$.

We don't know $c$ but we know that, since $Q_c[8] = 1$, $(M_c, C_2)$ has a valid padding! So to find $c$ we can test every $(M_i, C_2)$ with an oracle to see which ciphertexts produce a valid padding.

There is a small complication as the oracle can find multiple messages with a valid padding. For example, if $Q_j[7] = Q_j[8] = 2$ or $Q_j[6] = Q_j[7] = Q_j[8] = 3$, then $(M_j, C_2)$ has a valid padding. Note that $Q_i[7]$ does not change as we vary $i$. This means that the oracle can find at most 2 messages with a valid padding (can you see why?).

This gives us the following approach. We test 256 ciphertexts $(M_i, C_2)$ to see which ones have valid padding. If there is only one, then we have found $c$ and we can solve $A[8]$. If there are two, say $c$ and $j$, we need to figure out which one is which. This can be done by changing $M_c[7]$ and $M_j[7]$ to a different value (for example by xoring with 1). This will change the 7th byte of the decrypted message. Let $M'_c$ and $M'_j$ be the modified ciphers. Then $(M'_c, C_2)$ will still produce a valid padding while $(M'_j, C_2)$ no longer has a valid padding since the last and the second last byte in the decrypted message no longer match. In summary, we can find $c$ and solve $A[8]$.

Once we have solved $A[8]$ we can move to solving $A[7]$. This is done by setting $M_i[8] = A[8] \oplus 2$ and $M_i[7] = i$. Using the oracle we can find the index $c$ for which $Q_c[7] = A[7] \oplus c = 2$, resulting in a valid padding. Note that, unlike in the case for the last byte, the oracle will find only index (can you see why?). Finding $c$ allows us to solve $A[7]$. We can now move to to $A[6]$ by setting $M_i[8] = A[8] \oplus 3$, $M_i[7] = A[7] \oplus 3$, and $M_i[6] = i$. We continue until we have solved $A$, which will give us $P_2$.

If the cipher has more than two blocks, say, $C_1, \ldots, C_n$ we can decrypt each individual block, say $C_j$, by running the previous procedure for each pair $(C_{j - 1}, C_j)$.

Remember to check your points from the ball on the bottom-right corner of the material!