Thursday, June 16, 2011

Explaining Bitcoin and Cryptography, Part 2

UPDATE: This was actually posted ~8:15 am CST, 6/25/11. For some reason, the date shown is that of an earlier draft. Blame Blogger/Blogspot.

Now that you've gotten your feet wet with my masterful explanations of some of the cryptographic prerequisites of Bitcoin, you're ready for a more detailed explanation that removes some of the simplifications I used last time. I will focus more on the cryptography here, telling it as I wish someone had told me when I was learning. So without further ado...

"Bitcoin really uses no encryption at all?"

The protocol itself does not involve encrypted messages, contrary to what many news outlets report. Rather, the protocol is based on everyone seeing every message, unencrypted. However, some consider hashing a text to be encrypting it. And the address you use to send and receive is actually a hash of your public key rather than the public key itself (the signature protocol used only requires the verifier to have a hash of the public key). So, in that sense, there is encryption.

Also, as an optional (but recommended) technique, you can encrypt the "wallet file" that stores your private (and public) keys so that if someone gets control of your computer, they can't use your private keys to sign away your bitcoins.

So be careful: just because a protocol uses "cryptography" ("In cryptography we trust" being an unofficial motto of Bitcoin), doesn't mean it's actually encrypting anything, just that it's using a technique studied in the field of cryptography.
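To make the "address is a hash of your public key" point concrete, here is a minimal sketch using Python's standard hashlib. The key bytes and the function name address_like_hash are made up for illustration, and a real Bitcoin address involves further hashing and encoding steps that are not shown here. The thing to notice is that no key is involved anywhere: anyone can recompute the digest, which is why hashing isn't encryption in the usual sense.

```python
import hashlib

def address_like_hash(public_key: bytes) -> str:
    """Return the SHA-256 digest of a public key as hex (a stand-in for a real address)."""
    return hashlib.sha256(public_key).hexdigest()

# Hypothetical public key bytes, invented purely for this example.
public_key = bytes.fromhex("04" + "11" * 64)

address = address_like_hash(public_key)

# There is no secret key and no "decrypt" step: anyone holding the public key
# can recompute the digest and check that it matches the published hash.
print(address)
print(address == address_like_hash(public_key))  # same input, same digest
print(len(address) * 4, "bits")                  # 64 hex characters = 256 bits
```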

You don't usually sign an entire message in public key signatures.

I simplified: normally you just need to sign a hash of the message. Given the properties of hash functions, this is just as good as signing the message: it doesn't introduce a new weakest link, and signing a hash is computationally easier than signing the full message.
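A quick way to see why signing the hash is cheaper: the digest has the same size no matter how long the message is. This is only a sketch with made-up messages, using SHA-256 from Python's standard library.

```python
import hashlib

short_message = b"I hereby transfer $10 to Bob"
long_message = short_message * 100_000  # the same text repeated, a few megabytes

short_digest = hashlib.sha256(short_message).digest()
long_digest = hashlib.sha256(long_message).digest()

# The signature algorithm only ever has to process the fixed-size digest.
print(len(short_message), "bytes ->", len(short_digest), "byte digest")
print(len(long_message), "bytes ->", len(long_digest), "byte digest")
```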

Now, you might argue that, "But there are infinitely many messages (preimages) that hash to the same digest! You said so yourself! How could I not be introducing a weakness by only signing the message digest? That allows someone to claim that I signed every preimage that hashes to that digest! I don't want to take responsibility for signing all those unknown messages!"

Calm down. For one thing, those second pre-images are, by design, very difficult to find, even despite the huge numbers of them (remember first and second pre-image resistance?). Don't let the infinite size deceive you. If the digest is 256 bits long (as in the case of the hash function bitcoin uses, SHA-256), then that means that only 1 in 2^256 (about 10^77) of all messages will "collide" with yours. That means that, on average, an attacker has to look through about 2^256 candidate messages to find one that matches your particular message. That's a lot of work! Even finding an arbitrary pair of messages that collide with each other takes about 2^128 (about 3*10^38) tries. (The "birthday paradox" ensures that you only have to search a space whose size is the square root of the space of digests: sqrt(2^256) = 2^128.)
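A scaled-down experiment gives a feel for the two search sizes mentioned here: the full digest space and its square root. The sketch below is an assumption-laden toy, not an attack on SHA-256. It keeps only the first 16 bits of the digest (the names toy_hash, BITS and the "msg-N" candidates are invented for the example) so that both searches finish quickly. Finding any two messages that collide with each other typically takes a number of tries on the order of the square root of the digest space, while finding a message that collides with one particular message typically takes on the order of the whole space.

```python
import hashlib
from itertools import count

BITS = 16  # toy digest size; real SHA-256 digests have 256 bits

def toy_hash(message: bytes) -> int:
    """First BITS bits of the SHA-256 digest, as an integer."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - BITS)

def candidates():
    """Endless stream of distinct candidate messages: b'msg-0', b'msg-1', ..."""
    for i in count():
        yield f"msg-{i}".encode()

def find_any_collision():
    """Birthday search: stop as soon as any two candidates share a digest."""
    seen = {}
    for tries, message in enumerate(candidates(), start=1):
        h = toy_hash(message)
        if h in seen:
            return seen[h], message, tries
        seen[h] = message

def find_second_preimage(target: bytes):
    """Search for a different message with the same digest as one given message."""
    wanted = toy_hash(target)
    for tries, message in enumerate(candidates(), start=1):
        if message != target and toy_hash(message) == wanted:
            return message, tries

original = b"I hereby transfer $10 to Bob"
a, b, birthday_tries = find_any_collision()
forgery, preimage_tries = find_second_preimage(original)

print("any collision:", a, b, "after", birthday_tries, "tries")
print("second preimage:", forgery, "after", preimage_tries, "tries")
print("digest space:", 2**BITS, "| square root:", 2**(BITS // 2))
```

Scaling BITS from 16 up to 256 is what turns these two counts into roughly 2^128 and 2^256.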

And remember, cryptographic hash functions "look random" -- meaning there's no simple relationship between two preimages that collide. So let's say that your message is, "I hereby transfer $10 to Bob", and you sign the SHA-256 digest of that message. And let's even assume that an attacker did a lot of work and found their first collision, entitling them to claim you signed a different message, since it hashes to the same digest. Danger! Well, no, no danger. Because of the pseudo-randomness of hash functions, that "colliding message" won't be something neat and useful for the attacker, like "I hereby transfer $1 million to Bob."

Rather, in all likelihood, their second pre-image (i.e. purported alternate message) will look something like, "n02nS+TH/4dXcuPasQQn4". Doesn't seem to get the attacker very far, does it? All it lets them do is say, "Hey, I have proof that Silas sent the message 'n02nS+TH/4dXcuPasQQn4', and yes, I durn well do have the signature, derived from Silas's public/private keypair, which matches the hash of that message. Checkmate!"

See the problem? "Um, excuse me Mr. Mallory, but what does 'n02nS+TH/4dXcuPasQQn4' actually mean? What is Silas transferring to you with that statement? It just looks like garbled text. I doubt Silas actually signed something like that ... hey, it looks like he *did* sign the hash of this other message, which actually makes sense. You can buzz off now, Mallory."
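The "look random" property is easy to observe directly. In this small sketch (the second message is just my example with one character changed), two nearly identical messages produce digests with no visible relationship; for a well-behaved hash you would expect roughly half of the 256 bits to differ.

```python
import hashlib

def sha256_int(message: bytes) -> int:
    """SHA-256 digest of the message, interpreted as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")

m1 = b"I hereby transfer $10 to Bob"
m2 = b"I hereby transfer $11 to Bob"  # one character changed

# XOR leaves a 1 bit wherever the two digests disagree.
differing_bits = bin(sha256_int(m1) ^ sha256_int(m2)).count("1")

print(hashlib.sha256(m1).hexdigest())
print(hashlib.sha256(m2).hexdigest())
print(differing_bits, "of 256 bits differ")
```

This is why an attacker can't steer a collision toward a message of their choosing: small, meaningful edits to the text scramble the digest completely.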

(Note: this may be a moot point, as I don't know if the Bitcoin protocol requires you to sign a hash or the original message, since the latter is already short.)

"But how do public key signature algorithms actually work?"

Those of you with a scientific or rational mindset will rightly object that I didn't actually tell you how to digitally sign a message. I really just gave you the vocabulary for discussing public key signatures and asked you to take on faith my claim that the relationships hold (i.e. which parts of the protocol are "hard" and which are "easy"). I certainly didn't tell you enough to go out and create your own digital signature scheme (be it weak or strong), and this probably bothered some readers.

Well, I still won't! But I invite you to read about RSA, a commonly used public key algorithm (with both an encryption and signature protocol). It's fairly easy to understand, and will shed some light on how it's possible to introduce the critical asymmetries, such as how the private key can be difficult to infer from the public key, making it hard for anyone but the private key holder to generate a signature.

"And what do trapdoor functions have to do with public key signatures, again?"

When I mentioned the use of trapdoor one-way functions (TOWF) as underlying public key algorithms, I didn't make it clear how you turn a TOWF into a public key signature method. In the comment section of the last post, Boxo spelled out the mapping. I'll phrase it in a slightly different way. Remember that a TOWF is a function meeting the following criteria:

1) Given x, it's easy to compute f(x).

2) Given a value V equal to f(x1), it's hard to infer x1 (or any other x such that f(x) = V).

3) But if you have some "trapdoor knowledge", it's easy to find that x1 given V.

So if you have a TOWF, here's how you can sign a message. First you pick a particular instance, f1(x), of the function class to which your TOWF belongs. The information that identifies f1(x) out of the function class is your public key. The trapdoor information is your private key.

Once you generate a message M, you let that M (or some hash of M) take the role of V in item 2) of the description above. Because you have the "trapdoor knowledge" (item 3), you can find x1 easily, where f1(x1) = M. Then x1 is your signature, and you attach it to the message.

Others can verify your signature by checking that f1(x1) really does equal M (or the hash of M). This is the "mathematical relationship for verifying a signature" that I kept mentioning in the last post. Per item 1, this computation is easy.
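The mapping above can be sketched with textbook RSA standing in as the TOWF. Treat this strictly as a toy: the primes are tiny by cryptographic standards, there is no padding, and the digest is simply reduced mod n, all of which are my simplifying assumptions and would be insecure in practice. It is also not what Bitcoin itself uses. Here f1(x) = x^e mod n, the pair (n, e) is the public key, and the exponent d, which is easy to compute only if you know the prime factors of n, is the trapdoor. (The three-argument pow with a negative exponent needs Python 3.8 or later.)

```python
import hashlib

# Key generation: choose one instance f1 out of the function class.
p, q = 2**61 - 1, 2**31 - 1        # two known primes, far too small for real use
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))  # trapdoor: needs p and q to compute

public_key = (n, e)  # identifies f1
private_key = d      # the trapdoor knowledge

def f1(x: int) -> int:
    """Item 1: easy for anyone to compute from the public key."""
    return pow(x, e, n)

def digest(message: bytes) -> int:
    """Hash of M reduced mod n; plays the role of V (toy simplification)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    """Item 3: use the trapdoor to find x1 such that f1(x1) == digest(M)."""
    return pow(digest(message), private_key, n)

def verify(message: bytes, signature: int) -> bool:
    """Anyone can check the relationship f1(x1) == digest(M)."""
    return f1(signature) == digest(message)

message = b"I hereby transfer $10 to Bob"
x1 = sign(message)

print(verify(message, x1))
print(verify(b"I hereby transfer $1 million to Bob", x1))
```

The first check succeeds because x1 was built with the trapdoor for exactly this message; the second fails because the same x1 does not map to the digest of the altered message.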

Hope you found this helpful!

4 comments:

digital certificates said...

I never knew that Bitcoin uses no encryption. The optional technique that you suggested of encrypting the wallet file seems useful, but is there no other way?

thomblake said...

>The "birthday paradox" ensures that you only have to search a space whose size is the square root of the space of digests: sqrt(2^256) = 2^128.

This might be confusing - it looked to me on first reading like you were saying they'd only need to look through 2^128 messages to find one that matches with *your particular message*, while the birthday paradox merely entails that you should be able to find an arbitrary 2 matching messages in that time. Or am I mistaken?

Silas Barta said...

Yeah, I wasn't thinking there. The point I was just trying to get across is that the security of the hash is regarded as half the bit length of its size.
