Exploiting SHA-1-signed messages

March 4, 2011

Context ¶

A few weeks ago, I was asked to review a coworker’s PHP code, which provides an API for a web game. The specifications for this API were sent to a mobile development company, in charge of making an iPhone app for the game. All requests from API clients had to be signed using the SHA-1 hash function, prefixing the message with a salt for security reasons. I pointed out that this was probably not safe, and here is a detailed explanation of why. Please note that I am not a cryptographer, and that you should take my advice with a grain of salt.

Signing with SHA-1 ¶

Let’s consider the simple case of a user who wants to transfer money to a friend. The API will need 3 parameters:

from=123&to=456&amount=50

The salt was invariant, embedded as a constant string in the mobile app. That vulnerability is too easy to exploit, so let’s pretend we only intercepted the packet on a public WiFi and focus on the problem with SHA-1. This is how the signature is checked on the server:

$salt = "s3cRe7-#!@~";
$signature = sha1($salt.$msg);
if($signature == $x_signature_header) { // sent as an HTTP header
	// correct
} else {
	// discard message
}

In the case of our transaction, the correct signature for the message is:

sha1(“s3cRe7-#!@~from=123&to=456&amount=50”) = “18d2fccddeba1b08304fb53e849be5711d55b2e0"

This is included in the packet we’ve intercepted.

How SHA-1 is computed ¶

The SHA-1 algorithm is simple; it starts by padding the message with zeros up to a multiple of 64 bytes, and processes each 64-byte block of the message in sequence using the hash of each chunk to compute the next. It also includes the message’s length as a 64-bit integer at the end of the last 64-byte chunk.

In our case, this is what the first and only chunk looks like:

73 33 63 52 65 37 2d 23 21 40 7e 66 72 6f 6d 3d    s3cRe7-#!@~from=
31 32 33 26 74 6f 3d 34 35 36 26 61 6d 6f 75 6e    123&to=456&amoun
74 3d 35 30 80 00 00 00 00 00 00 00 00 00 00 00    t=50............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 20    ...............

This chunk is 64 bytes long, it starts with the salt, followed by the message, followed by a single “1” bit (0x80 is 10000000 in binary), followed by padding zeros and the length in bits of our input: len(salt) + len(msg) = 11 + 25 = 36 bytes = 288 bits = 0x120 bits.

This single chunk c0 is hashed using SHA-1’s default seed values h0,h1,h2,h3,h4:
0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, 0xc3d2e1f0.

Injecting data into a new chunk ¶

We can add data to this message by creating a second chunk. This means considering that first chunk as the beginning of a new message, and injecting data only after the first 64 padded bytes. Because SHA-1 reuses the output of the first chunk as the initial h0..h4 values to process the second chunk, we can compute the hash of (c0 + injected data) by reusing the initial hash and only hashing the second chunk. Our new message will contain the paddings zeros and the first original size, but that’s not really a problem with a PHP target.

The data we want to inject redirects the transaction to another account, and adds a generous tip:

&to=666&amount=99999

The injected string is 20 bytes long, which means that the complete message is 64 + 20 = 84 bytes = 0x2a0 bits.

With this information we create the second chunk:

26 74 6f 3d 36 36 36 26 61 6d 6f 75 6e 74 3d 39    &to=666&amount=9
39 39 39 39 80 00 00 00 00 00 00 00 00 00 00 00    9999............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 a0    ................

As we saw, hashing c0 produces 18d2fccddeba1b08304fb53e849be5711d55b2e0, which is split into h0..h4: 0x18d2fccd, 0xdeba1b08, 0x304fb53e, 0x849be571, 0x1d55b2e0. Those values will be used as the input of the main SHA-1 loop, in order to process the second chunk as if it was the continuation of the first.

Processing that second chunk with the custom h0..h4 values generates the following hash: 5a77fe8376811dbf7533509cceb6c8c53f98e253.
This result is the hash of (chunk 0 + injected data), which is (salt + message + padding + size). We want to recreate this new input in order to send a fake order without knowing the salt. As far as I know, we can only guess its size. If we guess that the salt is 11 bytes long, here is the message we’ll send:

66 72 6f 6d 3d 31 32 33 26 74 6f 3d 34 35 36 26    from=123&to=456&
61 6d 6f 75 6e 74 3d 35 30 80 00 00 00 00 00 00    amount=50.......
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00 00 00 01 20 26 74 6f 3d 36 36 36 26 61 6d 6f    .... &to=666&amo
75 6e 74 3d 39 39 39 39 39                         unt=99999

This message is made of the end of the first chunk with our guess of an 11-byte salt, to which we have added our new data.

Signature check on the server side ¶

On the server side, the following code reads the input and prints out its signature:

/* PHP */
$salt = "s3cRe7-#!@~";
$msg = urldecode($_SERVER['QUERY_STRING']); // read input.

$signature = sha1($salt . $msg);    // compute signature.

The sha1() call will process the following chunks:

The first chunk is hashed with the default h0..h4 = (0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, 0xc3d2e1f0):

73 33 63 52 65 37 2d 23 21 40 7e 66 72 6f 6d 3d    s3cRe7-#!@~from=
31 32 33 26 74 6f 3d 34 35 36 26 61 6d 6f 75 6e    123&to=456&amoun
74 3d 35 30 80 00 00 00 00 00 00 00 00 00 00 00     t=50............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 20     ...............

Producing h0..h4 = (0x18d2fccd, 0xdeba1b08, 0x304fb53e, 0x849be571, 0x1d55b2e0), which are then used to hash the second chunk:

26 74 6f 3d 36 36 36 26 61 6d 6f 75 6e 74 3d 39    &to=666&amount=9
39 39 39 39 80 00 00 00 00 00 00 00 00 00 00 00    9999............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 a0    ................

Producing 5a77fe8376811dbf7533509cceb6c8c53f98e253, which is what we had predicted. We have added data to our message, and used the original hash to sign the new chunk.

PHP is happy with the input, the later values affected to the to and amount variables replacing the earlier ones which contained binary data:

Input = array(3) {
	["from"]=> string(3) "123"
	["to"]=> string(3) "666"
	["amount"]=> string(5) "99999"
}

Hash = string(40) "5a77fe8376811dbf7533509cceb6c8c53f98e253"

And the signature is correct, too! The transaction is executed.

Message signature ¶

Don’t use a hash function as a symmetric signature.
Don’t roll your own.
Use HMAC.

I highly recommend this 1-hour presentation on cryptography by Colin Percival, which includes specific dos and don’ts for different use cases: (PDF), (video), (Hacker News discussion).

Code ¶

Demo code for this article is available on GitHub.

If you enjoyed reading this article, you should follow me on twitter.

Vélib’

December 1, 2009

Vélib’ is a bike rental service offered by the Paris City Council. Bikes are parked in 1,217 stations in and around Paris, and a simple swipe of an RFID card unlocks a bike that you can ride for free for 30 minutes and leave at another station.

On the official website, a Google Maps mashup gives riders the location and details of each station; the page uses an HTTP-based API to get real-time info on the service.

The Vélib’ XML API ¶

$ curl http://www.velib.paris.fr/service/carto
...
<marker name="07025 - SUFFREN TOUR EIFFEL"
    number="7025"
    address="2 AVENUE OCTAVE CREARD -"
    fullAddress="2 AVENUE OCTAVE CREARD - 75007 PARIS"
    lat="48.856449937"
    lng="2.29304397362"
    open="1"
    bonus="0" />
<marker ...

A second URL gives the details for each station:

$ curl http://www.velib.paris.fr/service/stationdetails/7025
<xml version="1.0" encoding="UTF-8"?>
<station>
  <available>10</available>
  <free>17</free>
  <total>33</total>
  <ticket>1</ticket>
</station>

This station has 10 parked bikes, 17 available slots to attach a bike to, and 33 slots in total. 6 of these slots are either locked, broken, or holding broken bikes.

I have built a bot that queries each station periodically and writes the data to a log file, recording the numbers for the whole of Paris in a key frame every 10 minutes. A simple script then reads the key frames and interpolates the data in order to get a frame per minute.

A Tuesday in Paris ¶

Paris at 4 AM: there are bikes all over the place, scattered without a clear pattern.

At 8, 8:30 AM, people start cycling to work. There is a clear trend of bikes leaving the center of Paris and arriving in the rest of the city. My own interpretation of this displacement is that people arrive at the central Chatelêt station and grab a bike there. It is a very large station with connections to many metro and train lines.

At 9 AM, many bikes have left the center.

At 6 PM, they start leaving work to go home (or to the Chatelêt metro). The service slowly returns to a more uniform distribution during the evening and beginning of the night.

The frames have a small graph in the top-left corner that resembles a “sparkline”: it represents the number of bikes in circulation. There are two peaks, the first one almost exactly at 9 AM as people arrive at work, and the second peaking at 7:30 PM as they leave work. The evening peak is more spread out in time. There is also a slight bump a bit before midnight, possibly from people going home after an evening out.

The raw data is available here in CSV format. The file contains 7 columns: hour, minute, latitude, longitude, free bikes, available bikes, total number of bikes. Sadly I’m missing data after 2 AM so there are only 22 hours instead of 24. That said, this period is quiet and the data doesn’t change much anyway.