Demystifying Bitcoins – Pt.2 Blockchains & Bitcoin Mining

Statutory Warning: For the sake of understanding, I’ve simplified the following concepts to a certain degree. Although the basic idea remains the same, it should be noted that its actual implementation in the Bitcoin network will be slightly more complicated.

This time around, I’d like to explain what is going on with Blockchains and Bitcoin Mining. Just to refresh your memory, Pt.1 of the guide talked about the need for Cryptocurrencies and what Decentralized ledgers are. If you’ve not read it yet, please do check it out here, as it’d help make more sense of the content provided below. This is a slightly longer read, but I promise it’d be just as simple to understand. Without further ado, here we go-

Hashing

Before we get to the juicy bits, we will need to understand what ‘Hashing’ is.

Don’t worry!! We’ll simplify it so that it’s easily understandable!

Imagine we have the world’s most powerful blender (or a Mixie in desi terms). It’s of such advanced technology that whatever you put in, regardless of Size, Shape or Quantity, it’ll grind it and produce exactly one glass of finely ground material as output. Depending on what you put in, only the colour of the output differs; the quantity remains the same at one glass. Let this setting slowly sink in.

The magic here is that regardless of whether you put in just one or two apples, or the world’s biggest skyscraper if you can fit it in, the mixie will still produce exactly one glass of finely ground material (fixed-length output). Needless to say, it’d be impossible to reconstruct the skyscraper or apples back from the glass of resulting dust (irreversible). Another factor is that if the ingredients used are the same, then the resulting ground material will also be of the same colour (repeatable output). This means each time you put in two apples, the ground mixture will be of the same red colour.

This, my friends, is quite close to how hashing works. Hashing is an irreversible mathematical function that magically (for the sake of keeping it simple) gives out an output of a fixed length regardless of whether you feed it a 10KB input or a 10GB input. The output from a hash function cannot be reversed to get back its inputs. In fact, even the smallest change in the inputs will result in a completely different output when run through a hash function.

Terminology: The result of a hashing function is also called a Digest.

The following table from Wikipedia shows an example hash function and how different inputs produce different outputs of the same lengths. It can also be seen that even the smallest of change (such as spelling of ‘Over’ in the examples) can result in entirely different outputs.
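If you'd like to see this behaviour for yourself, here is a small sketch using SHA-256 from Python's standard `hashlib` module. The `digest` helper and the sample sentences are my own picks for illustration, not the ones from the Wikipedia table.

```python
import hashlib

def digest(text: str) -> str:
    """Return the SHA-256 digest of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

short_input = "The quick brown fox jumps over the lazy dog"
tweaked_input = "The quick brown fox jumps ouer the lazy dog"  # one letter changed
huge_input = "apple" * 1_000_000  # roughly 5 MB of text

samples = [("short", short_input), ("tweaked", tweaked_input), ("huge", huge_input)]
for label, text in samples:
    # Every digest printed here is 64 hexadecimal characters long.
    print(f"{label:8} {len(text):>9} chars -> {digest(text)}")
```

Whatever the size of the input, the digest has the same length, the same input always gives the same digest, and the one-letter change produces a digest that looks nothing like the original.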

One of the common real-life uses of hashing is storing passwords. Nowadays passwords are not stored as plain text; instead a hash, typically of the password combined with a random per-user value known as a salt, is stored.

This ensures that when the right username and password are provided by the user, the hash function output for them will match the one available in the database. The advantage here is that even if the database is hacked and the hashes are stolen, unlike with plain-text passwords, the hacker will not be able to regenerate a password out of the hash. This is why nowadays when you click on the ‘Forgot Password’ option, you only get a ‘Password Reset’ link and not a copy of the actual password, as even the administrators themselves will not be able to retrieve the forgotten password from the hash.
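Here is a minimal sketch of that login flow using only Python's standard library. The function names and the in-memory `users` dictionary are made up for illustration, and mixing a random per-user salt into the hash via PBKDF2 is one common choice that I'm assuming here, not something every website does.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a fixed-length hash from the password and a salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def register(database: dict, username: str, password: str) -> None:
    """Store only the salt and the hash, never the password itself."""
    salt = os.urandom(16)
    database[username] = (salt, hash_password(password, salt))

def login(database: dict, username: str, password: str) -> bool:
    """Re-hash the attempt and compare it with the stored hash."""
    if username not in database:
        return False
    salt, stored_hash = database[username]
    return hmac.compare_digest(stored_hash, hash_password(password, salt))

users = {}  # hypothetical stand-in for a real database table
register(users, "alice", "correct horse")
print(login(users, "alice", "correct horse"))  # True
print(login(users, "alice", "wrong guess"))    # False
```

Notice that nothing in `users` lets the site owner read the password back, which is exactly why a reset link is the only option.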

Blockchains

Remember we talked about Ledger books maintained by the users in the Bitcoin network? Just like everything else in the world, these ledger books are finite in their storage capacity, and thus the number of entries each can store is also limited. Let’s take an arbitrary value and say one ledger book can store 100 transactions. What do you do when the book fills up? Put it back safely on a shelf and start a new one, obviously. Once the second book is full, that is also moved to the shelf on top of the first one, and this goes on and on. As per the ‘Decentralized Ledgers’ philosophy (discussed in the previous part), everyone in the network will have this shelf of ledger books stacked in chronological order (from earliest transaction to the latest). These ledger books can be considered as ‘Blocks’ and this ‘chain or stack’ of books stored in chronological order is known as a ‘Blockchain’. Theoretically, one can go through these books and back-track each coin through all its transactions to day one, when it was created.

A challenge here is to ensure that the ledger-books which are closed need to be tamper-proof. Think of it as locking the book down using a wax seal and a number written down on it, so that if anyone tries to change data in it, the seal breaks. Note that the seal is not supposed to stop users from looking inside the closed ledgers, but to ensure they don’t try and modify the data inside. They’re free to read the data as many times as required since everyone needs to see the ledger entries.

The number used to seal the blocks is of very high importance, as it is with the help of this number that the network figures out whether there has been any foul play. How this is done, we will see shortly.

At the minimum, each block in the blockchain will contain a set of transactions, a sealing number and the sealing number of the block right before the current one.
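As a rough mental picture, a block could be modelled like this in Python. The `Block` class, its field names and the placeholder seal value are all hypothetical; a real Bitcoin block carries more fields than these three.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Block:
    transactions: List[str]  # the entries written in this ledger book
    previous_seal: str       # sealing number of the block right before this one
    seal: str = ""           # this block's own sealing number, filled in once found

# The very first book has no predecessor, so we use a dummy value of all zeros.
first = Block(transactions=["Network creates 50 coins for Alice"], previous_seal="0" * 64)
first.seal = "placeholder-seal-1"  # stand-in value; real seals come from mining

# Each new book records the seal of the book before it, which is what forms the chain.
second = Block(transactions=["Alice pays Bob 5"], previous_seal=first.seal)
print(second.previous_seal == first.seal)  # True
```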

There is no easy way of calculating this sealing number for each block. In fact, the Bitcoin network gets help from the users to calculate this number. This process of calculating the sealing number is called Bitcoin Mining.

Bitcoin Mining

To arrive at this sealing number, the Bitcoin network presents the users, or miners, with a mathematical puzzle in the form of a hash function accompanied by two known inputs (say X and Y), an unknown input (say Z) and a known result (say R).

X is the hash of all the transactions available in the current ledger book about to be sealed.

Y is the hash of the previous ledger-book, right before the current one.

Z is unknown as of now, and this is for the miner to mine out.

R is just a random output arbitrarily decided by the network.

The network then requests the users to find a number to replace Z so that when X, Y, and Z are fed into the hash function, it produces the known result R. Since hash functions are irreversible, the only way to find the third input is by trial and error. So the miners start to randomly assign values to Z, move it through the hash function and compare the resulting output with the known result R until a match is found. Once a match is found, this number can be used as the seal to close the book. How this sealing number makes ledgers tamper-proof, we shall see in the last section.

Terminology: In Bitcoin terms, this unknown input Z is called a Nonce. Miners keep changing the value of the nonce and running the hash function until the resulting hash is equal to R. Once this match is found, it comes to be known as Proof of Work.
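The trial-and-error search can be sketched in a few lines of Python. A couple of assumptions to flag: matching a complete 64-character R by guessing would take practically forever, so this toy version only asks for the first three characters of the hash to match; it tries Z = 0, 1, 2 and so on instead of picking values at random; and the names `find_nonce`, `sha256_hex` and the sample transactions are invented for illustration.

```python
import hashlib
from itertools import count

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def find_nonce(x: str, y: str, target_prefix: str) -> int:
    """Try Z = 0, 1, 2, ... until hash(X, Y, Z) starts with target_prefix."""
    for z in count():
        if sha256_hex(f"{x}{y}{z}").startswith(target_prefix):
            return z

x = sha256_hex("Alice pays Bob 5; Bob pays Carol 2")    # X: hash of the current transactions
y = sha256_hex("contents of the previous ledger book")  # Y: hash of the previous book
r_prefix = "abc"                                        # toy stand-in for the known result R

nonce = find_nonce(x, y, r_prefix)
print("nonce found:", nonce)
print("resulting hash:", sha256_hex(f"{x}{y}{nonce}"))
```

Finding the nonce takes thousands of guesses, but anyone can check the answer with a single hash, which is why the rest of the network can verify a miner's claim so quickly.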

Whoever finds this sealing number first will announce it to the whole network, and once it is verified to be correct by other users in the network, everyone will close their ledger book with this Proof of Work. With agreement from the Bitcoin network, the person who finds the number first is rewarded with newly created Bitcoins (out of thin air), by just adding a new credit entry into their ledger. This is how new Bitcoins are introduced into the system.

Terminology: This reward entry to the ledger adding new coins into the network is called a ‘Coinbase’ transaction.

This reward is put in place because finding the nonce requires a lot of computational power and this reward incentivizes more users to come forward and become miners, thereby maintaining a healthy Bitcoin network. This process of finding the nonce and sealing ledgers is called Bitcoin mining. It is quite evident here that Bitcoin mining is less of mining and more of mathematical puzzle solving.

So in short, Bitcoin miners just keep gathering the valid transactions broadcast in the network, and once their current ledger book runs out, it’s a race to see who calculates the sealing number first and broadcasts it. Once a block is sealed, the miners move on to find the nonce for the next block, and so on and so forth.

When Bitcoin had just started, the predetermined result R was simple enough that normal home PCs could calculate the nonce easily. Nowadays a lot of specialized hardware is designed just to carry out mining, which makes each guess far faster. So, to prevent a sudden influx of coins into the system, the Bitcoin network dynamically changes the difficulty of solving the puzzle: roughly every two weeks (every 2,016 blocks) it readjusts so that a new block is still found about every ten minutes on average. Think of our blender example from the hashing section. If at first the network used to just say – ‘Mix three things and give me a glass of Blue material’, which seemed simple enough, slowly the network starts increasing the difficulty and starts to make more specific requests such as a glass of ‘Baby Blue’ or ‘Indigo Blue’ material, instead of ‘General Blue’. In the Bitcoin world, this is done by setting the number of leading 0s in the output ‘R’. If the network initially wanted an output of ‘00********’, where each ‘*’ could be anything, then as time goes on the number of leading 0s increases, to something like ‘000000****’. The former is easier to find, as it is a more general output than the latter, which is more specific with more 0s at the start.
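To get a feel for how quickly the puzzle gets harder, here is a small Python experiment that counts the guesses needed as the required number of 0s at the start of the hash grows. The function name `attempts_needed` and the sample data string are made up, and the real network expresses difficulty a little differently, so treat this as a sketch of the idea only.

```python
import hashlib
from itertools import count

def attempts_needed(data: str, leading_zeros: int) -> int:
    """Count how many nonces are tried before the hash starts with that many 0s."""
    prefix = "0" * leading_zeros
    for nonce in count():
        digest = hashlib.sha256(f"{data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce + 1  # nonces start at 0, so add 1 to get the number of tries

for zeros in range(1, 5):
    print(f"{zeros} zero(s) at the start: {attempts_needed('ledger book #42', zeros)} attempts")
```

Since each character of the hexadecimal output can take 16 values, every extra 0 demanded makes the search about 16 times longer on average.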

As time goes on, the network also slowly reduces the number of Bitcoins earned as a reward for finding the nonce. Although the initial reward was 50 Bitcoins, the number of Bitcoins generated per block is set to decrease geometrically, with a 50% reduction every 210,000 blocks, or approximately four years. If you do the math, you can see that the total number of Bitcoins that will ever come into existence is limited to just under 21 Million!!
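The math works out because the rewards form a halving series: 210,000 × 50 × (1 + 1/2 + 1/4 + ...) approaches 210,000 × 50 × 2 = 21,000,000. The short Python sketch below adds it up, assuming rewards are counted in the smallest unit of a bitcoin (one hundred-millionth, called a satoshi) and rounded down at each halving, so the reward eventually reaches zero. The variable names are my own.

```python
BLOCKS_PER_HALVING = 210_000
SATOSHIS_PER_BITCOIN = 100_000_000  # smallest unit of a bitcoin

reward = 50 * SATOSHIS_PER_BITCOIN  # initial reward per block
total = 0
reward_periods = 0

while reward > 0:
    total += reward * BLOCKS_PER_HALVING
    reward //= 2  # halve and round down, so the reward eventually hits zero
    reward_periods += 1

print(f"reward periods before the reward reaches zero: {reward_periods}")
print(f"total supply: {total / SATOSHIS_PER_BITCOIN:,.4f} BTC")
```

Because of the rounding down, the total lands slightly below the 21 million mark rather than exactly on it.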

Why Jump Through All These Hoops for the Seal?

Why are we spending so much computational power on calculating this seal? How does the seal work against tampering? Let’s try and see how this seal ensures that the ledgers are tamper-proof.

Consider a situation where a user tries to go back and edit the transactions in his previous ledger-book which was just sealed.

This editing will result in a change in the value for X as X is the hash of the existing transactions in the book which he just modified. This will also mean that if the value of X is different, the output of hashing X, Y and Z will be different from R. Thus the network will know the ledger-book has been tampered with and will reject it.

What if he tries to tamper with a much older ledger deep in the system?

Here too, since one of our inputs, Y, is the hash of the previous ledger, modifying an older block anywhere in the chain will have a cascading effect (as the hash result of every block is linked to the hash of its previous block as well). The value of Y will change for the block that follows, which in turn will cause the hash function to fail to produce a matching output R.
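The cascading effect is easy to demonstrate with a toy chain in Python. To keep it short, this sketch leaves out the nonce and the target R entirely and just uses the hash of a book's transactions plus the previous seal as the seal; the helper names and sample transactions are hypothetical.

```python
import hashlib

GENESIS_SEAL = "0" * 64  # dummy "previous seal" for the very first book

def seal(transactions, previous_seal):
    """Toy seal: hash of the book's transactions plus the previous book's seal."""
    text = "|".join(transactions) + previous_seal
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def build_chain(books):
    chain, previous = [], GENESIS_SEAL
    for transactions in books:
        block = {"transactions": list(transactions), "previous_seal": previous}
        block["seal"] = seal(block["transactions"], previous)
        chain.append(block)
        previous = block["seal"]
    return chain

def first_broken_block(chain):
    """Return the index of the first book whose seal no longer checks out, or None."""
    previous = GENESIS_SEAL
    for index, block in enumerate(chain):
        seal_ok = seal(block["transactions"], block["previous_seal"]) == block["seal"]
        if block["previous_seal"] != previous or not seal_ok:
            return index
        previous = block["seal"]
    return None

chain = build_chain([["Alice pays Bob 5"], ["Bob pays Carol 2"], ["Carol pays Dave 1"]])
print(first_broken_block(chain))  # None: the untouched chain verifies

chain[0]["transactions"][0] = "Alice pays Bob 500"  # tamper with the oldest book
print(first_broken_block(chain))  # 0: its seal no longer matches its contents

# Even if the cheater re-seals book 0, the next book still points at the old seal.
chain[0]["seal"] = seal(chain[0]["transactions"], chain[0]["previous_seal"])
print(first_broken_block(chain))  # 1: the break has moved one book down the chain
```

To get away with the edit, the cheater would have to re-seal every later book as well, and in the real network each of those seals costs a full round of mining.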

This way the Bitcoin network ensures that all sealed ledgers, regardless of their position in the Blockchain, are tamper-proof!

Although quite a long read, these I feel are some of the basic concepts that will help you develop a deeper understanding of how Bitcoins work. In case you are interested to know further, I highly recommend you go through the original Bitcoin Whitepaper written by the inventor of Bitcoin – Satoshi Nakamoto.

In the next part, I’ll try to explain what you need to do to get your hands dirty and start buying yourself some bitcoins.

PS: Fun Fact – No one knows who Satoshi Nakamoto is. It’s rumoured that Satoshi is in fact not a single person, but a group of people who collectively designed Bitcoin.
