Saturday, July 16, 2011

Bitcoin overview: proofs and common knowledge

In previous posts, I gave an explanation of the cryptographic building blocks of Bitcoin. Now I'll give a more "big picture" overview of how the overall system works. As before, I expect this to be easier to follow than the explanations I had to read to get to my current level of understanding.

Let's start from the general problems that a decentralized, anonymous (or pseudonymous) currency system has to solve. The most fundamental problem, is that of achieving "common knowledge" of the currency ownership. Specifically, everyone has to know not only who is the valid owner of any currency unit (so as to prevent double-spends); they must also know that everyone else knows the same answer. And they must know that you know that they know (and so on) this information. (This level of knowledge is known in the literature as common knowledge, but with the definition I just gave, not the conventional one.)

In other words, it's not enough that I know the current ownership status of any coin; I must count on others agreeing with me and knowing I agree with them. If you could accomplish this, you could get everyone to use and depend on the same record, thereby resolving disagreements about who is the current owner of what -- without trusting any one person. It is this problem that required the "key" innovation behind Bitcoin, as it has normally needed a trusted authority to solve it.

So what is this key innovation to solving that problem? The first insight is that it's possible to prove how many computing cycles were spent working on something. And with a system that implements such a "proof protocol", you can have a transaction record that provably has a certain number of past computing cycles spent on it. Then, you just need most of the users of a system to agree that they'll "go along with" whatever transaction record has the most computing cycles spent on it. Then, you know what the "real" global ledger is -- and you can trust that everyone else is using it too! (And they can trust that you're using it, etc.)

And there you have it: proof of ownership, without a central authority.

With that problem and solution in mind, a lot of the complexity of Bitcoin starts to make sense.

Remember how I had previously mentioned that bitcoins are initially doled out based on who can solve complex mathematical problem? Well, that math problem doesn't just exist to get initial bitcoins widely distributed -- that's not even the most important function of the problem. The main purpose, rather, is to prove that the the largest number of computer cycles were spent on a given transaction record. You see, if you start from the last known solution (which itself has the transaction record up to a point in time), you are starting from a record with, so far, the biggest number of cycles spent on it. (And the Bitcoin protocol specifies that you should start from the biggest one, though its in your own interest, as you will see.)

If you publish an "update" -- the previous ledger plus more recent transaction -- with the next solution, then the other users know that your purported ledger has all the cycles you spent on it plus all the cumulative cycles spent up to the last solution. Therefore, if you want to claim credit for the latest solution (entitling you to the 50 BTC bounty), you should start from the ledger in the latest solution.

So, let's step back and summarize. Here is a simplified version of what goes on in the Bitcoin network:

1) Whenever users want to transfer their bitcoins over to someone else, they broadcast a message describing the transfer and sign it with their private key.

2) Whenever a user receives a message indicating a transfer, they first check that the signature is valid (see previous post on digital signatures), and that the address doesn't spend more than the latest "confirmed" ledger shows it as having. If it checks out, they keep the message and propagate it to others.

3) All users wishing to claim the reward for a solution (aka "miners") bundle up all transactions they know of (i.e., new ones plus those in the latest confirmed ledger), and convert it into a math problem unique to that transaction set. They then work on solving that problem.

4) When someone finds a solution, they broadcast it, with their bundle of known transactions (new latest ledger), to all other users. Like with individual transactions, anyone who receives one of these checks it, and if valid, broadcasts it to others.

5) Miners who receive a new valid solution quit their current search for a solution, then take the latest ledger as definitive. Again, as in 3), they bundle up new transactions they hear of, add them to this new ledger, and try to solve a new math problem unique to the new transaction set, and the process begins anew.

In practice, sometimes different users will simultaneously find a solution, or solutions will propagate through different parts of the network at different speed. So miners will typically hold on to the 4-5 last latest ledgers, in case one of them is extended and becomes definitive. Users, for their part, will wait for several new ledger solutions before accepting their transaction is firmly in the network.

Oh, and as for the relevant jargon? A new solution, with its bundle of old and new transactions, is called a block. The complete transaction record, with each solution along the way, showing how the build off of each other, is called the block chain -- because each block "chains" off a previous ledger.

Now, I'm leaving out a lot of details, but I hope that explains the overall system and the different roles played. In the future, I'll go into more detail on:

- How you prove you spent X computing cycles on something.
- How you prevent situations where miners constantly find solutions at the same time.
- How you minimize storage requirements for the transaction record.
- How overlapping solutions get resolved.
- And much more.