NOTE: Previously, I wrote an introductory article on cryptocurrencies and other applications of blockchain, The article can be found here: http://twisri.blogspot.com/2017/05/summary-of-bitcoin-and-its-underlying.html. The article below builds on this by explaining the actual “nuts and bolts” of the bitcoin blockchain. Our goal is to have the reader walk away with a better, more technical understanding of the bitcoin blockchain. If the reader is not familiar with terms and concepts such as: blockchain, cryptocurrencies, distributed ledger, bitcoin mining, etc. It may be best to read the introductory article first or obtain the basic knowledge from other sources.)
Many people have a rough idea of how the bitcoin blockchain works,
but few understand precisely how it
works. This article intends to explain the blockchain in a clear, simple, and
visual way.
Part 1 The Hash and the Blockchain [1]
Before we talk about blockchain, let’s talk about a hash [2]. The hash is the foundation of blockchain technology.
A
hash is simply a product of a hash
function [3]. So what’s a hash function? It is a computer program that
takes in data of any size, length, format, type, and returns an output with a specific
format, length, size, type, etc. This output is known as a hash.
For example, let’s say you have a hash function that returns a
3-digit value; input like “adfoadihfopahfeoapfjda” will return a 3-digit value (“391”);
input such as “&(*!&$@” will return a 3-digit value (“126”), and an
input such as “ ” will still return a
3-digit value (“792”). The point is, no
matter the randomness of the input, the output is always one specific thing.
Note that hash functions are not unique, there are many different
versions! The one bitcoin uses is called the SHA-256 Function and it returns hashes that are 256 digits-long hexadecimals
[4]) The picture at right is a visual explanation of what this means:
FIG 1. Above
On the left side are random input (data) differing in size, shape,
category, etc. On the right side is a specifically formatted output, or hash, in this case, it is a five-digit code.
Hashes are unique, one
hash represents one specific input. A hash never represents two or more inputs.
This makes a hash an effective identifier of a specific input. The
SHA-256 hash function used by bitcoin generates hashes of 256 bits. Each bit
represents either a 0 or a 1. You can form 2^256 different hashes, more than
enough to identify every single thing in the universe!
Hashes are asymmetrical,
there is no “inverse hash function”. In other words, you can apply the hash
formula on some input to return some output. But there is no reverse formula that lets you reverse the output and get back to
the original input. The only way to do it would be to use brute force - applying
millions/billions of different inputs and see which one returns the correct
output.
Hashes are random, if you
change the input by a tiny, tiny, little bit, the output will be completely
different. This means that you can not tell whether two inputs are correlated
based on their similarity in hash. You’ll see why this is a favorable characteristic
in a second.
So how is all this tied to the blockchain and transactions?
Say we have a transaction described as this: “Jane gave Allen 57.2
bitcoins on September 29, 2017 at 4:35 pm.” If we apply a hash function to it, we get a specifically
formatted hash, such as something like this: “123910912” (random example,
the actual bitcoin hash is a much longer).
This random number “123910912” is a hash: it uniquely identifies this specific transaction statement. It cannot be traced back to the transaction
statement, and it cannot be inferred
by knowing what a similar hashes stand for (there is absolutely no correlation
between “123910912” and “123910913”. For example, “123910913”could stand for
“Helen received 25 bitcoins through mining on January 6, 2002 at 1:12 am”).
Now things start getting cool:
Hash functions doesn’t have to be 1-input-to-1-output. It can be 2-input-to-1-output
or 3-input-to-1-output. In other
words, hash functions can take in multiple inputs and still generate 1 output.
Hashes and hashes can
combine to form new hashes. Again, a picture will help
clearly illustrate the point.
Fig 2. above
It’s the same idea as in Fig. 1, except this time, previous hashes are the inputs.
This is “blending the hashes.” What exactly does “blending the
hashes” do? Take hash123: 83513 from the picture as an example.
First, it is a unique
identifier of the three-input-combination.
(the ball, the ball with a mark, and the square), it is the proof
of existence of these three pieces of
data.
Second, if all you see is the hash123: 83513, there’s no way for you to figure out what the
inputs are.
And third, hash123: 83513 is not inferable.
83514, 83515, or 83516 does not tell you anything about 83513.
As shown, you can keep on combining and combining hashes until you
get to one final hash. This final hash is cryptographically secure and it’s a proof that all the data went into it
exists. The visual outline of this process is known as the Merkle Tree [5], the ultimate, single
hash is known as the Merkle Root [6].
(the picture illustrates a 2 branch Merkle Tree, but it could have up to as
many branches as needed).
This is how transactions are recorded in a block. Take, say 100
transaction statements, hash them into 100 unique, uninterpretable,
un-patterned code, combine them down to 50, then 25, then 13…7…4…2…and finally
down to 1; that 1 hash is a proof that ALL 100 transactions took place, and is
infinitely close to being 100% secure!
To understand the important role that hashes play in blockchain,
imagine…
- Q. What happens if one hash represents two transactions or two bundles of transactions?
- A. For example: What if transaction statement: “Jane gave Allen 57.2 bitcoins on September 29, 2017 at 4:35 pm.”, and transaction statement: “Helen received 25 bitcoins through mining on January 6, 2002 at 1:12 am” both are identified by hash: “123910912”? The system breaks down.
- Q. What happens if you can trace back to the transaction from the hash?
- A. For example: by applying some formula to “123910912”and conclude “Jane gave Allen 57.2 bitcoins on September 29, 2017 at 4:35 pm.” You have no privacy, and the information might be used to cause harm.
- Q. What happens if there was a pattern to the hashes?
- A. For example, knowing what “123910912” is gives you some idea of what “123910911” and “123910913” are? You can pretty much infer a whole bunch of things with one piece of knowledge, which is undesirable.
This is how privacy and integrity of the blockchain can be
maintained at once.
Part 2 The miners and the blockchain
A miner creates hashes that represent whole blocks and attaches them
to the blockchain (we’ll refer to these hashes as “complete-hashes”). The
first miner that successfully attaches the next complete-hash wins a prize of some
number of bitcoins. The process repeats until all bitcoins have been mined
(21 million).
There are three components that make up the complete-hash. The
Merkle root we just described, representing all pending transactions generated by the network after the consolidation of the previous block, the hash of the
previous block, and the nonce. We
define this term below.
[ In the bitcoin network, a transaction is considered valid if that transaction has been
authorized by the transferrer through his or her digital signature.
Because of the way blocks are setup (All transactions
come down to one hash), there is no way for you to validate the past transactions by flipping through the old blocks-because
you can’t reverse the hashes of those blocks! So, instead, you “wax” them
together, creating self-evident proof. Here’s what I mean in a picture
Fig 3. (Click on picture to enlarge).
The mere existence of the most current block is the proof that all the transactions before it are valid, since you cannot get here without this being the case. Notice that we made a very important assumption: WE ASSUME THAT ALL “CURRENT” TRANSACTIONS WERE VALID IN THE FIRST PLACE. How do you know that? This leads to the discussion of the last component: the nonce. ]
NOTE: (what I said in the brackets [] could be wrong, due to my misunderstanding. I would love some feedback on this.)
NOTE: (what I said in the brackets [] could be wrong, due to my misunderstanding. I would love some feedback on this.)
We need to discuss two things before we get to the nonce:
- The consensus issue and
- How to commit fraud.
The consensus issue
Blockchains exist as files on a network of computers connected via the internet or another network protocol. For blockchain to work, every computer on the network needs to have an exact copy of the newest blockchain. When one computer announces an updated version of the blockchain, this “work” is publicized and all the other computers on the network verify this “work”.
The newest, most recent blockchain needs to be consistent- all transactions must make sense. This insure that it is impossible to have some extra bitcoins on someone’s account that can’t be traced back to being mined. The newest blockchain also needs to be accepted by the majority of the computers online (51% of the network participants). After consensus is reached, the newest version of blockchain is accepted by all computers.
When offline computers come online, they’ll compare the copy of the blockchain they hold with the newest copy on the network. If the newest copy is longer and consistent, they’ll automatically update their copy up to that version.
Fraud
Blockchain network computers only verify that updated transactions are consistent. Most types of fraud are denied because they cause inconsistencies. Note that computers fall short of detecting fraud that is consistent with the transactions (you’ll see what I mean in a bit). If a hacker commits this type of fraud, and becomes the first one to announce a fraudulent and consistent update, other computers will accept his work as valid. Once the consensus reach 51% of all participants, the fraud is permanently in the record and unsolvable.
Let’s talk about how fraud is committed. Imagine you are a malicious, greedy hacker (just for a while). You want to cheat the system. There are three ways you can cheat:
- Create some bitcoins out of nowhere;
- Steal some bitcoins from…say your friend Bob, and
- Purchase something from Bob with bitcoins, and after receiving that something, create a fake transaction that reverses the initial transfer of bitcoins.
As for option 3, you can’t wait too long. Once those transactions are consolidated into the blockchain, you’re back in the ‘option 2’ situation. So, your only option is to reverse the transactions before they get incorporated into the blockchain. What you want to do then, is create a fake reversal transaction, include it in the Merkle root, and be the first one to complete the hash. Other computers will check your hash, make sure it is consistent, and accept all those transactions as valid.
Now you might not be the first one to complete the hash, since many other computers are competing with you. But, based on today’s computer processing power, there is a significant chance that you’ll be successful. And nothing stops you from keep trying.
Enter the nonce….
Remember the randomness property of a hash? It states that by changing the input a tiny tiny bit, the output hash becomes completely different. A nonce is just a number generated by the computer. Since the final hash is the product of a hash function that takes in: 1. the Merkle root, 2. the previous hash, and 3. the nonce. By changing the nonce a little bit, say from 1 to 2, you end up with a totally different result.
Now, here’s the thing: you can force the result to be a certain way, or satisfy certain criteria, before it can be attached onto the blockchain.
For example, you can require the size of the output hash to be below a certain threshold value. By doing this, you are essentially forcing the computer to do a lot of computational work, as it must try many nonces until it is able to find one that satisfies the criteria.
Now what’s the point of this? The answer is that it make it exceptionally, exceptionally hard for hackers to commit the type of fraud described in 3. above (known as transaction-consistent fraud). It becomes impractical.
One important build-in mechanism of the blockchain algorithm is that the more people participating in the activity of extending the blockchain/fighting over the rewarded bitcoins, or so-called mining, the harder the hash function mathematical problem becomes.
The average time to add a new block onto the existing blockchain
doesn’t change (for bitcoin, it’s 10 minutes), but the probability that any
particular participant solving out the puzzle diminishes. So, the more people
involved, the more unlikely that a hacker will be able fraud the system. As of
right now, it is practically impossible to the system.
Besides that, hackers today face another huge challenge: the group
known as the “mining experts”.” These people that have the most powerful mining
equipment on the network-huge processors that take up a room, and specifically
designed to mine bitcoins. The probability that experts mine bitcoin is much
higher now than the probability that a regular person will be a miner. Of
course, a hacker can become a mining expert, but he’ll soon find that mining
equipment costs much more than he can recover from reversing transactions.
Fig. 4 (Click on picture to enlarge).
Forking
Here’s a side issue I want to bring up: it is possible for two or more computers to simultaneously solve out the mathematical hash function puzzle, broadcasting two or more versions of the newest blockchain. In this circumstance, all the computers are going to receive all the versions of the blockchain but in a different chronological order. Under this circumstance, blockchain network computers work on whichever one it receives first, and put the other one aside. The tie is broken when one of the versions gets updated before all other versions (either it’s the one currently working on or not); All the computers then unify on this version and abandon all the rest.
Glossary
Blockchain: a digital ledger in which transactions are recorded chronologically and made available publicly or to a network.
Hash/Hash Function: A hash function is any mathematical rule that can be used to map data of arbitrary size to data of fixed size. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes.
Hexadecimal: In mathematics and computing, hexadecimal (also base 16, or hex) is a positional numeral system with a radix, or base, of 16. It uses sixteen distinct symbols, most often the symbols 0–9 to represent values zero to nine, and A, B, C, D, E, F (or alternatively a, b, c, d, e, f) to represent values ten to fifteen.
Merkle Tree: a tree in which every non-leaf node is labelled with the hash of the labels or values.
Merkle Root: the hash of all the hashes of all the transactions in the block.
Nonce: an arbitrary number that may only be used once.