# If you drew random values between 0 and 2256 − 1 inclusive, how many attempts on average it would take before you find a value below the target?

If you drew random values between 0 and 2256 − 1 inclusive, how many attempts on average it would take before you find a value below the target? Compare this number to the answer obtained in part (b). What does it tell you about SHA-256COMP2300/COMP6300 Applied Cryptography

Objectives

This assignment has been designed to test your knowledge of public-key cryptography and application of cryptography in cryptocurrencies.

Notes

⦁ Assumptions (if any) must be stated clearly in your answers.

⦁ Please upload two files only: A text file (Word or PDF) answering the descriptive/numerical parts of the questions, and a .java or .py file containing the programming parts (Question 3, parts (a) and (b)). Combine the code for Question 3, parts (a) and (b) into one program.

⦁ Scanned copies will not be accepted.

Question 1 (10 marks)

The aim of this question is to show that there are some groups in which the discrete logarithm problem (DLP) is easy. In this example, we will consider the multiplicative group G whose elements are exactly the set where p is a prime and the multiplication operation is multiplication modulo p. In particular, p = 2t + 1 for some positive integer t ≥ 2. The number of elements in , i.e., the order of the group, is 2t.

Recall that under DLP, we are given g and h such that gx ≡ h (mod p) for some unknown x, and we need to find x. We will assume that g is a generator of this group.

As an example, you may consider p = 28+1 = 257. Then g = 3 is a generator of this group. (Hint: It might be helpful to run parts (a) through (d) with these example values first to understand what they mean.)

⦁ Show that gt ≡ 1 (mod p). (2 marks)

⦁ Show that the square root of g2t modulo p, i.e., (2 marks)

Since x is a number in the set {0,1,…,2t}, we can write x in binary as:

x = b0 · 20 + b1 · 2 + b2 · 22 + ··· + bt · 2t, (1)

where bi are bits. If b0 = 0, then

x = b1 · 21 + b2 · 22 + ··· + bt · 2t = 2y,

for some integer y, i.e., x is an even number. On the other hand, if b0 = 1, then

x = 1 + b1 · 21 + b2 · 22 + ··· + bt · 2t = 2y + 1,

for some integer y, i.e., x is an odd number. Let m = 2t−1.

⦁ Show that if b0 = 0, then (gx)m ≡ 1 (mod p). (2 marks)

⦁ Show that if b0 = 1, then (gx)m ≡ p − 1 (mod p). (2 marks)

⦁ Let p = 257, let g = 3 be the generator of the group . Let gx ≡ h ≡ 130 (mod 257) be given. Using the above method, find the least significant bit (LSB) of x. Show your steps. (2 marks)

Fortunately, such primes, called Fermat primes, are very rare. So, we need not worry about them in practice. See, for example, this Wikipedia article: https://en.wikipedia.org/wiki/Fermat_number.

Question 2 (10 marks)

For reasons, that will become clear, the version of RSA encryption covered in the lecture is called “naive RSA” in [Sma16]. In practice, it is recommended that RSA be used with some sort of message padding. A standard way is using optimal asymmetric encryption padding (OAEP), as indicated in the lecture. In this question, we will cover a simpler version, which at least makes the scheme semantically secure under chosen plaintext attacks.

The CEO of the organization XYZ decides to hold a vote to decide whether employees should be allowed to work from home (WFH) all five days of the work week. All 5 employees of XYZ (excluding the CEO), need to vote either “WFH yes” or “WFH no.” To ensure privacy, the CEO asks employees to send their votes by emailing them to the CEO. Furthermore, the CEO tells them to encrypt their votes using her RSA public key e. The messages are encoded as:

WFH yes 7→ 100, WFH no 7→ 200.

The CEO recieves the sequence of ciphertexts:

c1,c2,c3,c4,c5

where ci is the ciphertext from employee i. For completeness, assume the employees are in the order:

Alice,Bob,Charlie,Dave,Ellen

So, e.g., Bob’s ciphertext is c2.

⦁ Assume you, the eavesdropper, gets the sequence:

37013,104476,104476,37013,37013

What are the possible sequences of votes? (2 marks)

⦁ Suppose the CEO’s RSA public key is (N,e) = (157439,5). What is Ellen’s vote? How did you come to this conclusion? Your answer should not involve any decryption. (2 marks)

The issue raised above is that naive RSA is deterministic. We would want different ciphertexts, even if the same plaintext is encrypted twice. To do this, instead of letting m be from the set {0,1,…,N − 1} = ZN, we will restrict it to a smaller set, and then use the remaining bits for a random number r. We will then encrypt r||m instead of m. Let r ∈ {0,1}`. Let |N| be the number of bits in the binary representation of N.

Then, encryption of the message m in this “padded” version of RSA is:

⦁ Choose a random r ∈ {0,1}`.

⦁ Encrypt as (r||m)e (mod N).

Decryption after receiving c is:

⦁ Decrypt as (c)d (mod N).

⦁ Retrieve the last |N| − ` bits as the message m.

⦁ Suppose ` = 2, so r can be from the set {0,1,2,3}. Assume you, the eavesdropper, gets the sequence:

145471,25245,14596,143468,153125

encrypted via the padded RSA scheme using (N,e) = (157439,5). Without decrypting, find the sequence of votes. (2 marks)

⦁ Suppose now that ` = 100, so r can be from the set {0,1}100. What is the time complexity of your attack from part (c)? Is it feasible? If you were an employee, would you recommend this scheme to the CEO? (2 marks)

⦁ Suppose a university decides to release an “anonymized” dataset of its students’ campus transactions. Anonymization is done by replacing student names by their naive RSA encryptions, and removing any other obvious identifiers (e.g., gender, age, address). Very briefly, argue how data released in this way can breach privacy. (2 marks)

A real-world account of a similar form of data release using (deterministically) encrypted IDs (instead of names) can be found here: https://arxiv.org/abs/1712.05627.

Question 3 (10 marks)

Recall hash puzzles from Lecture 10. In this question, we will see how they work in practice. This question requires programming in a language that has an implementation of SHA-256 hash function. You could use Java to program, although, you are free to use any other language with a library containing SHA-256, e.g., Python’s hashlib (which maybe easier to implement). Create an integer s containing all the digits in your student ID. For example, if my student ID is 12345678, then s = 12345678. Let “str” denote the string function, i.e., given any integer s, the function str(s) casts it into a string. For example, s = 12345678 becomes str(s) =“12345678”. Set the target t to:

Let H be the SHA-256 hash function. Let r be a counter starting from 1. Finally, let “||” denote string concatenation.

⦁ Implement a program that tries successive values of r, i.e., r = 1,2,3,…, computes H(str(s)||str(r)), compares it with t and halts whenever H(str(s)||str(r)) < t, with the output r. You need to provide the program and the output r. (5 marks)

⦁ Let us call the program from part (a) as: PuzzleFinder(s,t). Write a program that calls PuzzleFinder with successive inputs (s + i,t), for i = 0 to 999, and records the output for each of these 1,000 runs. What is the average number of attempts before you found the target? You need to provide your program. (3 marks)

⦁ If you drew random values between 0 and 2256 − 1 inclusive, how many attempts on average it would take before you find a value below the target? Compare this number to the answer obtained in part (b). What does it tell you about SHA-256? (2 marks)