Introduction / overview
Authors: Zademn Reviewed by:
Introduction
Another desired propriety of our cryptographic protocols is data / message integrity. This propriety assures that during a data transfer the data has not been modified.
Suppose Alice has a new favourite game and wants to send it to Bob. How can Bob be sure that the file he receives is the same as the one Alice intended to send? One would say to run the game and see. But what if the game is a malware? What if there are changes that are undetectable to the human eye?
Hashes are efficient algorithms to check if two files are the same based on the data they contain. The slightest change (a single bit) would change the hash completely.
On the internet, when you download, files you often see a number near the download button called the hash of that file. If you download that file, recalculate the hash locally and obtain the same hash you can be sure that the data you downloaded is the intended one.
Another use for hashes is storing passwords. We don't want to store plaintext passwords because in case of a breach the attacker will know our password. If we hash them he will have to reverse the hash (or find a collision) to use our password. Luckily the hashes are very hard to reverse and collision resistant by definition and construction.
Note that hashes need a secure channel for communication. Alice must have a secure way to send her hash to Bob. If Eve intercepts Alice's message and hash she can impersonate Alice by changing the file, computing the hash and sending them to Bob. Hashes do not provide authenticity.
Definitions and Formalism
Definition - Hash
Desired proprieties
Deterministic
Fast to compute
Small changes change the hash completely
Preimage, second preimage and collision resistance (Explained below)
How to use a hash:
Suppose you want to check if Alice and Bob have the same version of some file (File integrity)
Proprieties
Preimage Image Resistance
Second Preimage resistance
Resistant to collisions
1. Preimage Resistance
Intuition
This propriety prevents an attacker to find the original message from a hash
2. Second Preimage Resistance
Attack game
This propriety prevents an attacker to substitute a message with another and get the same hash
3. Hash Collisions
Intuition
A hash collision happens when we have two different messages that have the same hash
Why do we care about hash collisions?
Since hashes are used to fastly verify a message integrity if two messages have the same hash then we can replace one with another => We can play with data
Yet, we don't want hash collisions to be computable
We don't want an attacker to be able to craft collisions or find collisions given a message
Let's throw some definitions
Attack game
Security
Intuition
We know hash collisions exist (therefore an efficient adversary must exist) and that is easy to prove therefore we request an explicit algorithm that finds these collisions
This propriety makes it difficult for an attacker to find 2 input values with the same hash
Difference from 2nd preimage
There is a fundamental difference in how hard it is to break collision resistance and second-preimage resistance.
Breaking collision-resistance is like inviting more people into the room until the room contains 2 people with the same birthday.
Breaking second-preimage resistance is like inviting more people into the room until the room contains another person with your birthday.
One of these fundamentally takes longer than the other
Implications
Lemma 1
Note
Provisional implication
Lemma 2
Assuming a function is second preimage resistant is a weaker assumption than assuming it is collision resistant.
Resources
https://en.wikipedia.org/wiki/Cryptographic_hash_function - Wikipedia entry
https://www.youtube.com/watch?v=b4b8ktEV4Bg - Computerphile
https://www.cs.ucdavis.edu/~rogaway/papers/relates.pdf - Good read for more details
Bibliography
Figure 1 - Wikipedia
Last updated