Okay, I had some spare time last night, so I figured I’d sit down and write up an explanation of some of
Bitcoin’s workings. The chief problem in explaining this to the layman is that, as a prerequisite, you need to understand the basics of public key cryptography (aka asymmetric cryptography), which, for the average person, is quite a tall order in itself. But since I’m the master at this kind of thing, here’s how I would put it:
First, to get something out of the way:
nothing in Bitcoin is actually encrypted. Rather, it works, and works robustly, without centralization, specifically because all transactions are visible. The privacy comes in how the entities trading the coins are referred to in this transaction database, purely by their Bitcoin address (a string of numbers and letters, like 1mVQtx6rn…), which is like one of those supposed Swiss bank accounts you hear about that are only known by a number. (So yes, if you publicly and believably reveal that, "Hey, I own address 152zpfu5b20gh29...!", then people can see what you do via the address 152zpfu...) So rather than anonymous, Bitcoin is best described as pseudonymous (sue-DONN-i-MUS).
The reason that you need to know the basics of public key cryptography, rather, is that a lot of its "primitives" (building blocks) are used in Bitcoin, and the protocols used are heavily studied by professional cryptographers.
First primitive: public key-based digital signatures
How do you accomplish signatures in a digital world, where anyone can put any data on any storage medium? Like a physical signature, a digital one needs to meet the following characteristics:
A) Proof of identity: only you can produce your signature, so seeing your signature is proof that you endorse what you signed.
B) Non-repudiation: after giving your signature, you can't plausibly deny having signed it.
C) Non-transferability: your signature on Document1 can't be "moved" to a different Document2, implying your endorsement of the latter
Quite surprisingly, you can accomplish these goals with a kind of signature in the digital world. Here's the trick: you generate a keypair -- a "public key" and a corresponding "private key". You keep the private key secret, and tell everyone in the world your public key. You then use a "public key algorithm" (PKA) that takes as an input:
1) the message, M1, that you want to sign
2) your private key, SK1
and outputs a signature, SIG1. PKAs are designed so that computing this algorithm and generating this signature is quick and easy.
Then, if someone wants to verify that you really did sign message M1, they just verify that a certain mathematical relationship (corresponding to the particular PKA used) holds among your public key (which, remember, they know), your message M1, and your signature SIG1. Again, this process is designed to be quick and easy for the verifier.
So, how does this provide the desired qualities A through C above? A and B are satisfied by the fact that it is extremely difficult and time-consuming to produce SIG1 *unless* you know the private key SK1. (Inferring the private key from the public key is likewise too time-consuming to be finished anytime in the next few centuries.) So, the fact that you were able to (quickly) compute SIG1 is proof that you hold the private key corresponding to the public key, AND that (with a few caveats) you chose to use that key to generate the signature for M1.
This protocol satisfies criterion C (non-transferability) because, as you recall, SIG1 is partly a function of the message itself. This means that your signature will be different for each message you could conceivably want to sign. So someone can't take SIG1 and cite it as proof that you signed a different message M2 -- because the protocol's specified mathematical relationship will *not* hold for {M2, SIG1, public key} -- it will only hold for {M1, SIG1, Bob's public key}. To "forge" a signature, they would need to produce {M2, SIG2, Bob's public key}. But like I said above, it's way too hard for them to figure out what SIG2 would be unless they know your private key.
I'm deliberately leaving off the specific algorithms used for such systems so that this does not become unbearably long. Suffice to say, there are algorithms that accomplish this, and they mainly rely on
modular arithmetic and prime numbers. I will only add that the class of function needed to produce such a PKA is known as a "trapdoor one-way function". That is any function f(x) such that:
- Given x, it's easy to compute f(x).
- Given a value V equal to f(x) for some unknown x, it's hard to find an x such that f(x) = V. (i.e., it's hard to invert f)
- But, if you know a specific piece of information particular to f, called the "trapdoor knowledge" (in the exposition above, this is the part played by the private key), it is *easy* to invert f
What role do public key signatures play in Bitcoin? They are used to prove to the network that the owner of address A1 (A1 also functioning as a public key!) really did authorize the transfer of certain coins to the next address. Other nodes in the network, in turn, are able to easily verify that the owner of A1 signed off on the transfer by checking that the mathematical relationship I mentioned above holds among the A1 public key, the message indicating the transfer, and the signature on the transfer. And if this relationship doesn't hold, the other nodes (per the Bitcoin protocol) ignore the purported transfer, acting like it didn't exist, and refuse to tell other nodes about it.
Second primitive: (cryptographically secure) hash functions
A hash function (in cryptography) is a function that takes an input of any length, and deterministically computes a fixed-length output based on it, such that the relationship between input and output "seems random", and there's no quicker way to compute the output, or otherwise learn *anything *about what the output will look like, than to churn through the hash function itself. I will make this make a bit more sense. For simplicity, call the input to a hash function its "preimage", and the output of a hash function its "digest" (the output is also referred to as the checksum or the [digital] fingerprint).
An example of a (weak) hash function most people are familiar with is the kids' game where you find out your "Star Wars" name or your "stage name” by doing something like, "Take the first syllable of the street where you grew up, and add on the last syllable of your middle name, plus the first syllable of where you were born." This name is a hash of all that data about yourself.
However, cryptographically-secure hash functions have to meet more stringent requirements. Like I said above, it must be really hard to make inferences about the relationships between classes of input and classes of outputs without actually grinding through the function for each input in the class. So, for example, you can't have a hash function where "small changes in the input (preimage) lead to small changes in the output (digest)". Rather, they are designed so that a tiny change in the preimage will *significantly *change the digest. More formally, cryptographically secure hash functions must meet the following characteristics:
- Given a digest, it's hard to find a preimage that hashes to that digest. This is called "[first] preimage resistance". (Note: because preimages can be any length and the hash length is fixed, there are an infinite number of preimages that hash to any given digest.)
- Given a preimage, it's hard to find another preimage that hashes to the same digest. This is called "second preimage resistance".
- It's generally hard to find *any* preimages (given or not), that hash to the same digest. Such instances are known as "collisions", and this trait is called, obviously, “collision resistance”.
(Exercise for the reader: how the Star Wars name game described above fail all of these?)
The function of hashes: in everyday data security, they serve the function of obscuring data in a way that limits its malicious uses. For example, websites don't actually store your password (if they know anything about security whatsoever). Rather, they store a *hash* of your password. That way, they can still verify you by password (Check: does the hash of the password given match the hash we have on record?), but if someone breaks into their database, all they get are the hashes. Because the hash function has first preimage resistance
(see above), the list is much less useful to the attacker because they have to accomplish the difficult task of finding preimages for the hashes they found.
Hashes are where the "miners" come into play: initial bitcoins are generated and allocated (and still are) based on who can solve a mathematical problem. That problem is similar to the one of breaking a hash function's (first) preimage resistance. But rather than having to find a preimage with a *specific* digest, the problem is to find a preimage whose hash is a *partial* match (for some specific number of digits) with a target digest string. So, it's like an easier version of breaking preimage resistance, though still requiring the ability to do lots of (parallel) calculations – because there is, by design, no shortcut to solving this but to try as many preimages as you can.
Anyway, that's about all for now, something for you to chew on and get some understanding of the whole thing. There’s still a lot left, but that should cover the pre-requisites.