How does blockchain technically work?
In the last article, What is actually blockchain?, we tried to understand blockchain from a non-technical perspective. If you enjoyed that and want more technical detail, this article is for you. I have still tried to keep it from getting too technical and have simplified in many places, but it goes deeper than the last article. If you want to go further still after reading this, you can search online and refer to any technical material; this article will prepare you for that. If you have landed directly on this article but prefer to understand things only at a high level, go back to What is actually blockchain? first and come back here if it sparks your interest in knowing more.
Now let’s understand blockchain in detail. Before that, we need to understand a few nuts and bolts of blockchain technology.
Hashing – Let’s start with the hash, or hash value. Hashing is a process that converts an input of any length into a fixed-width output. The size of the output is defined by the function we use. For our discussion, let’s take a very commonly used hash function, SHA256. It is a mathematical function that converts a digital input of any length into a 256-bit value (usually written as a string of 64 hexadecimal characters), which is called the hash. A few properties of hash functions –
1. The same input used with the same hash function will always give the same hash value.
2. Just by looking at a hash value, you cannot figure out what the input was.
3. Irrespective of the size of the input, the hash value will always be the same size (256 bits in the case of SHA256).
4. The number of possible outputs of SHA256 is 2 to the power 256. Written that way it may look small, but it is huge: roughly 1.16 × 10^77, a 78-digit number.
How large is that number? It is comparable to common estimates of the number of atoms in the observable universe. So the possibility of two different inputs having the same output is negligible, almost impossible in practice.
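The properties above can be tried out with a few lines of Python. This is only a small sketch using the standard hashlib module; the helper name sha256_hex and the sample inputs are made up for illustration.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

short_hash = sha256_hex("hello")
long_hash = sha256_hex("hello" * 10_000)   # a much longer input
changed_hash = sha256_hex("hellp")         # one letter changed

print(short_hash == sha256_hex("hello"))   # property 1: same input, same hash
print(len(short_hash), len(long_hash))     # property 3: both are 64 hex characters (256 bits)
print(short_hash == changed_hash)          # a tiny change gives a different hash
print(2 ** 256)                            # property 4: number of possible outputs
```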
So how is a hash function used? Let’s understand it with an example. If I send you a file and, separately, its hash value, you can compute the hash of the file you received and compare it with the hash I sent. If the two match, nothing in the file has changed. When we look at how blockchain works, we will see in more detail how hash functions are used.
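Here is a minimal sketch of that file check. To keep it self-contained, the "file" is just a few bytes held in memory, and the names file_hash and is_unchanged are hypothetical helpers, not part of any library.

```python
import hashlib

def file_hash(contents: bytes) -> str:
    """Hash the raw bytes of a file with SHA-256."""
    return hashlib.sha256(contents).hexdigest()

def is_unchanged(received: bytes, expected_hash: str) -> bool:
    """Recompute the hash of what was received and compare it with the hash sent separately."""
    return file_hash(received) == expected_hash

original = b"Pay Mr. B 5 coins"
sent_hash = file_hash(original)   # sent to the receiver separately from the file

print(is_unchanged(original, sent_hash))               # True
print(is_unchanged(b"Pay Mr. B 50 coins", sent_hash))  # False
```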
Encryption – The next thing we need to understand is encryption. Encryption means changing text or other input into an unreadable format using a rule (called a key), in such a way that only a person who knows the rule or key can change it back and read it. The simplest example is to shift by two. Say we write a sentence and replace every letter with the letter two places after it, e.g. replace a with c, b with d, c with e, and so on. This is one of the simplest forms of encryption.
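The shift-by-two rule can be sketched as follows. Two details are assumptions added for the example: letters wrap around at the end of the alphabet (y becomes a, z becomes b), and anything that is not a lowercase letter is left as it is.

```python
import string

ALPHABET = string.ascii_lowercase

def shift(text: str, key: int) -> str:
    """Replace each lowercase letter with the letter `key` places after it."""
    result = []
    for ch in text:
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            result.append(ch)   # spaces, digits and punctuation stay unchanged
    return "".join(result)

secret = shift("abc xyz", 2)
print(secret)              # cde zab
print(shift(secret, -2))   # abc xyz
```

Decrypting is just shifting back by the same key, which is exactly why anyone who guesses the key can read the message.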
The problem with this kind of encryption is that an unintended person can guess the key or rule with some effort and then decrypt the message. One answer is to make the rule or key very complicated, which makes decryption by an unintended person very difficult, even if not impossible. This doesn’t solve the problem completely, because however complicated the rule is, we still need to communicate it to the intended person so that he can decrypt the message. If an unintended person gets hold of the rule on the way, he can decrypt it too.
So, to make it secure, we need a system where we don’t have to communicate the key or rule to the intended person, and he can still decrypt the message without knowing the key we used to encrypt it.
This may sound impossible, but it can be achieved through an asymmetric key system. In this system two keys are generated, and the two keys work as a pair. A message encrypted using one key can be decrypted using the other key, and vice versa. The user makes one key the public key and the other the private key. The public key is published and known to everyone. The private key is kept secret by the user. So, when we need to send a message to someone, we encrypt it with his public key. The message can now be decrypted only with his private key, which is known only to him. The message is secure, the only risk being that his private key gets leaked.
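To see a key pair in action, here is a toy sketch in the style of RSA, one well-known asymmetric system. Everything about it is scaled down for illustration: the primes 61 and 53, the exponent 17 and the message 65 are example values only, and a key this small offers no security at all.

```python
# Toy key pair built from tiny primes; real keys use numbers hundreds of digits long.
p, q = 61, 53
n = p * q                       # part of both keys
phi = (p - 1) * (q - 1)
public_exponent = 17            # published to everyone
private_exponent = pow(public_exponent, -1, phi)   # kept secret (needs Python 3.8+)

def apply_key(value: int, exponent: int) -> int:
    """Encrypt or decrypt a number smaller than n with one half of the key pair."""
    return pow(value, exponent, n)

message = 65

# Encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(message, public_exponent)
print(apply_key(ciphertext, private_exponent) == message)   # True

# And vice versa: what the private key locks, the public key unlocks.
locked = apply_key(message, private_exponent)
print(apply_key(locked, public_exponent) == message)        # True
```

The second direction, locking with the private key, is the one digital signatures rely on.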
Digital Signature – Now we need to understand the third term, digital signature, which builds on the first two, hash and encryption. First let’s understand what a normal signature is. A signature is a verification by a person. Anyone can read your signature and figure out that it was signed by you, but they cannot produce your signature. A digital signature works the same way.
When we want to sign a document digitally, we first create the hash of the message or document we want to sign. Then we encrypt this hash using our private key. This encrypted hash is called the digital signature. It is sent along with the original message to the receiver. The receiver can now verify whether the message was signed by the right person. He decrypts the encrypted hash using that person’s public key. Then he creates the hash of the message he received and compares it with the decrypted hash. If the two match, it means the message was signed by the person it claims to be signed by, and also that nothing in the message has changed since it was signed, i.e. the message received is the same as the message signed (because the hashes of both matched). A digital signature created for one message or document cannot be reused on another, because the signature is the encryption of the hash of one specific message, and it is practically impossible for two different messages to have the same hash.
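The sign-and-verify steps can be sketched with the same kind of toy key pair. The key values (3233, 17, 2753) are tiny example numbers, and reducing the hash so that it fits this tiny key is a shortcut taken only for this illustration; real signature schemes do not work with numbers this small.

```python
import hashlib

# Toy key pair for illustration only (far too small to be secure).
N = 3233
PUBLIC_EXPONENT = 17
PRIVATE_EXPONENT = 2753

def digest(message: str) -> int:
    """SHA-256 hash of the message, reduced so that the toy key can handle it."""
    full = int.from_bytes(hashlib.sha256(message.encode("utf-8")).digest(), "big")
    return full % N

def sign(message: str) -> int:
    """Encrypt the hash of the message with the private key."""
    return pow(digest(message), PRIVATE_EXPONENT, N)

def verify(message: str, signature: int) -> bool:
    """Decrypt the signature with the public key and compare it with a fresh hash."""
    return pow(signature, PUBLIC_EXPONENT, N) == digest(message)

signature = sign("A pays B 5 coins")
print(verify("A pays B 5 coins", signature))             # True
print(verify("A pays B 5 coins", (signature + 1) % N))   # False: a forged signature is rejected
print(verify("A pays B 50 coins", signature))            # a changed message should fail too
```

With a key this small, a changed message has a tiny chance of passing by coincidence, which is one reason real keys are so large.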
Now that we understand these three basic terms, hash, encryption and digital signature, we can start understanding how blockchain technology works.
Let’s first understand that a blockchain is nothing but a data structure, i.e. a way to arrange data. Data is arranged in blocks which are chained together sequentially. Once blocks are chained there is no way to change the earlier ones; you can only add further blocks. A block is nothing but a set of transactions collected together, along with some more information about this block and the previous block that chains the two together.
If you have been with us from the beginning, you know that blockchain is nothing but a ledger, so it’s not difficult to understand what a transaction is. To summarize, a group of transactions forms a block, and blocks chained together form a blockchain. So far, the most successful use case of blockchain is bitcoin. When we talk about the working of blockchain from here on, we are basically talking about how bitcoin implements blockchain. Other blockchains may change parts of it based on their use case.
Before we look at the working of the bitcoin blockchain, let’s quickly understand what bitcoin is. Bitcoin is a digital currency whose ownership is recorded in a blockchain. If, as per the records in that blockchain (called the bitcoin blockchain), you have a balance in your account (either received from someone or created; we will see how that works in detail), you own bitcoin. You don’t hold anything physical, only a record in that public ledger saying that you own those bitcoins. Now let’s see how it works.
Let’s start with transactions – the bitcoin blockchain can have two kinds of transactions, i.e. transfer-coins transactions and create-coins transactions. Let’s first talk about transfer transactions, which are by far the more numerous. When Mr. A wants to transfer his coins to Mr. B, he creates the transaction and sends it to bitcoin nodes. Nodes are computers which hold copies of the blockchain.
This transaction has three parts: header data, input and output. Header data primarily consists of the transaction ID, time, size etc. Input data consists of:
– A reference to the previous transaction through which Mr. A received the bitcoin he wants to transfer
– Mr. A’s address or identity; this is his public key
– The amount of bitcoin received in the previous transaction
Output data consists of:
– Mr. B’s address or identity; this is his public key
– The amount of bitcoin being transferred
Before it is sent, the transaction is digitally signed by Mr. A using his private key.
A few things to keep in mind: the amount of bitcoin in the output can never be more than the amount in the input. If it is, you are trying to transfer more bitcoins than you have; the transaction is invalid and will not be processed. If the amount in the output is less than the amount in the input, it is assumed that you are giving the remaining bitcoins as a transaction fee, which is paid to the miner (we will see in a while who the miner is). So, if you want to transfer less than you received in the input transaction, you need to allocate the remaining bitcoins back to yourself. That means the output of the above transaction will have two lines, allocating some bitcoins to Mr. B and the rest back to Mr. A.
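The transaction layout and the input/output rule can be sketched like this. The class names, field names and whole-number amounts are assumptions made for the example; bitcoin’s real transaction format carries more detail than this.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TxInput:
    previous_tx_id: str     # the transaction through which the sender received the coins
    owner_public_key: str   # the sender's address or identity
    amount: int             # amount received in that previous transaction

@dataclass
class TxOutput:
    recipient_public_key: str
    amount: int

@dataclass
class Transaction:
    inputs: List[TxInput]
    outputs: List[TxOutput]

    def fee(self) -> int:
        """Whatever the inputs hold beyond the outputs goes to the miner as a fee."""
        total_in = sum(i.amount for i in self.inputs)
        total_out = sum(o.amount for o in self.outputs)
        if total_out > total_in:
            raise ValueError("invalid: outputs are more than inputs")
        return total_in - total_out

# Mr. A received 10, pays 7 to Mr. B, takes 2 back as change and leaves 1 as the fee.
tx = Transaction(
    inputs=[TxInput("tx1", "A_public_key", 10)],
    outputs=[TxOutput("B_public_key", 7), TxOutput("A_public_key", 2)],
)
print(tx.fee())   # 1
```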
Now that Mr. A has written the transaction and digitally signed it, he sends it to bitcoin nodes. Bitcoin nodes are nothing but computers all around the world which hold a copy of the bitcoin blockchain and run the bitcoin protocol. The nearest node that receives this transaction first runs a few basic checks on it and then starts executing it. Executing means first checking that it is coming from the person (public identity or key) it claims to be coming from. You will recall that the transaction was digitally signed by Mr. A using his private key. If the node is able to verify the signature using Mr. A’s public key, that proves it was indeed sent by Mr. A. The next thing the node has to check is whether Mr. A really has the bitcoin he claims to have. Remember that every node has a copy of the bitcoin blockchain, which is a record of all transactions ever made, and that Mr. A has given a reference to the transaction through which he received the bitcoin he is trying to transfer. The node can check whether the reference given by Mr. A is correct, and also whether Mr. A has already spent those bitcoins in another transaction or still has them.
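The balance check can be sketched with a small table of unspent transactions. This is a deliberately reduced picture: the dictionary layout and function names are invented for the example, the whole amount is transferred with no change or fee, and the signature check is left out.

```python
def is_valid(unspent: dict, tx: dict) -> bool:
    """`unspent` maps a transaction id to (owner, amount) for coins not yet spent."""
    ref = tx["previous_tx_id"]
    if ref not in unspent:
        return False   # the reference is wrong, or those coins were already spent
    owner, amount = unspent[ref]
    return owner == tx["sender"] and tx["amount"] <= amount

def process(unspent: dict, tx: dict, new_tx_id: str) -> bool:
    """Apply a valid transfer: the old entry is used up and the recipient gets a new one."""
    if not is_valid(unspent, tx):
        return False
    del unspent[tx["previous_tx_id"]]
    unspent[new_tx_id] = (tx["recipient"], tx["amount"])
    return True

unspent = {"tx1": ("A", 5)}
to_b = {"previous_tx_id": "tx1", "sender": "A", "recipient": "B", "amount": 5}
to_c = {"previous_tx_id": "tx1", "sender": "A", "recipient": "C", "amount": 5}

first = process(unspent, to_b, "tx2")
second = process(unspent, to_c, "tx3")
print(first, second)   # True False
```

The second transfer fails because the coins referenced by tx1 were used up by the first one.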
If the node finds that the transaction is valid, it relays it to other nearby nodes. It also tries to add it to the blockchain, in a block together with other transactions. This process of adding a block to the blockchain is called mining; we will see in a while why it is called that. The process makes sure everyone has the same, correct version of the blockchain. If it works perfectly, no wrong transaction can be executed and the blockchain will always be correct. There are certain nuances to this process which we should address quickly before we move forward.
Earlier we said that bitcoin can have two kinds of transactions: transferring bitcoin, which we have already talked about, and creating bitcoin. A create-bitcoin transaction cannot be made by just anyone. When you, as a node, add a block to the blockchain, you are awarded a certain number of bitcoins. The number of bitcoins awarded for adding a block is halved roughly every four years. At the time of writing it is 12.5 bitcoins. This process of adding blocks, or mining, is the only way to create bitcoin. That’s why it is called mining. It is one of the unique features that keeps a lot of people connected to the bitcoin blockchain, wanting to validate transactions and add them to it.
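The shrinking reward can be written as a short calculation. The figures used here come from bitcoin’s published rules rather than from this article: the reward started at 50 bitcoins and is halved every 210,000 blocks, which works out to roughly four years. The function name is made up for the example.

```python
HALVING_INTERVAL = 210_000          # blocks between halvings (about four years)
SATOSHI_PER_BITCOIN = 100_000_000   # bitcoin amounts are counted in these small units
INITIAL_REWARD = 50 * SATOSHI_PER_BITCOIN

def block_reward(block_no: int) -> int:
    """Reward, in satoshi, for adding the block with this serial number."""
    halvings = block_no // HALVING_INTERVAL
    if halvings >= 64:
        return 0                    # after enough halvings nothing is left
    return INITIAL_REWARD >> halvings   # each halving divides the reward by two

for block_no in (0, 210_000, 420_000, 630_000):
    print(block_no, block_reward(block_no) / SATOSHI_PER_BITCOIN)
# 0 50.0
# 210000 25.0
# 420000 12.5
# 630000 6.25
```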
If you get bitcoins by adding valid transactions to the blockchain, will every node that adds the transaction between Mr. A and Mr. B get bitcoins? The answer is no. Let’s see this in a little more detail.
Firstly, blocks are added to the blockchain, not individual transactions. All nodes keep accumulating the transactions they have validated and try to add a block once they have collected enough. The size of a block is limited to 1 MB. When you have accumulated validated transactions up to that size, you can try to add the block.
Secondly, you must have noticed that I am talking about trying to add a block, not about adding a block. The bitcoin protocol is designed so that not everyone can add a block; only those who put in enough effort are able to. This is called the Proof of Work concept. Without going into much detail, the node that solves a cryptographic puzzle first gets to add the block. The difficulty of the puzzle is set so that it takes around 10 minutes to add a block, and the difficulty keeps being adjusted to maintain those 10 minutes. Only the node that adds the block by solving the puzzle gets the block reward (12.5 bitcoins at the time of writing).
Let’s talk a bit about this cryptographic puzzle. If you want to avoid more technical details, you can skip this part; you will not miss much. You know that no one controls or runs the bitcoin blockchain centrally, so the obvious question is who decides what the puzzle is and who has solved it. This is all built into the bitcoin protocol. First, let’s understand how blocks are added to the blockchain technically. Each block has the following contents –
1. Previous block ID (Hash Value of previous block)
2. Block ID (Hash value of this block)
3. Block No.
4. Nonce – a random number
5. Transactions in the block
Having the previous block ID in each block ensures that blocks stay in sequence and nothing can be inserted in between.
The block ID is the hash value of items 3, 4 and 5 combined. (This is a simplification: in bitcoin itself the hashed data also includes the previous block ID, so changing an old block changes the ID of every block after it.) The block no. is the serial number of the block. The nonce is a number the miner is free to choose; we will talk about it in a while. So, if you are a node that wants to add a block to the blockchain, you create the block this way: get the block ID of the previous block from the blockchain, then take the block no., a nonce and the transactions you want to add, and hash them to get the block ID of this block. Now you have all five parts of the block and you can add it to the blockchain.
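The five-part block described above can be sketched as follows. It follows the simplified layout of this article (the block ID is the hash of the block no., nonce and transactions); the function names, the JSON encoding and the all-zero ID used for the first block are choices made only for the example.

```python
import hashlib
import json

def block_id(block_no: int, nonce: int, transactions: list) -> str:
    """Hash of items 3, 4 and 5 of the block."""
    payload = json.dumps([block_no, nonce, transactions])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def make_block(previous_block_id: str, block_no: int, nonce: int, transactions: list) -> dict:
    return {
        "previous_block_id": previous_block_id,
        "block_id": block_id(block_no, nonce, transactions),
        "block_no": block_no,
        "nonce": nonce,
        "transactions": transactions,
    }

def chain_is_intact(chain: list) -> bool:
    """Every block ID must match its contents, and every block must point at the one before it."""
    for i, block in enumerate(chain):
        if block["block_id"] != block_id(block["block_no"], block["nonce"], block["transactions"]):
            return False
        if i > 0 and block["previous_block_id"] != chain[i - 1]["block_id"]:
            return False
    return True

first_block = make_block("0" * 64, 1, 0, ["A pays B 5"])
second_block = make_block(first_block["block_id"], 2, 0, ["B pays C 2"])
chain = [first_block, second_block]
print(chain_is_intact(chain))                   # True

first_block["transactions"][0] = "A pays B 500" # tamper with an old block
print(chain_is_intact(chain))                   # False
```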
To make this difficult, and to make sure that only a node which puts in effort is able to add a block, the bitcoin protocol enforces some rules about the block ID, i.e. the hash value of the block.
At any point in time, the block ID is required to begin with a certain number of zeros. But remember that the block ID is the hash of three items: block no., nonce and transactions. So, to make the hash value meet the criterion (in this case, starting with a certain number of zeros), you need to change the nonce until you get such a value. There is no way to calculate it backwards; you (or your program) have to try different nonce values one after another. This process takes a lot of computing power (or work) and is called Proof of Work. The number of zeros required is set so that it takes around 10 minutes to mine a block, given the computing power involved in mining globally. If blocks start taking more or less time than that, the protocol adjusts the difficulty level (the number of zeros) about every two weeks (in bitcoin, every 2,016 blocks) to bring it back to 10 minutes.
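The trial-and-error search for a nonce can be sketched in a few lines. The difficulty here is only three leading zeros, counted in the hexadecimal text of the hash, so that the loop finishes almost instantly; the function names and the way the three items are joined are assumptions made for the example, and bitcoin’s real difficulty is vastly higher.

```python
import hashlib

def block_hash(block_no: int, nonce: int, transactions: str) -> str:
    """Hash of block no., nonce and transactions, as hexadecimal text."""
    payload = f"{block_no}|{nonce}|{transactions}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def mine(block_no: int, transactions: str, zeros: int):
    """Try nonce values one by one until the hash starts with enough zeros."""
    wanted_prefix = "0" * zeros
    nonce = 0
    while True:
        candidate = block_hash(block_no, nonce, transactions)
        if candidate.startswith(wanted_prefix):
            return nonce, candidate
        nonce += 1

nonce, found = mine(1, "A pays B 5", zeros=3)
print(nonce, found)
```

Finding the nonce takes thousands of attempts on average even at this toy difficulty, while checking it takes a single hash. Each extra hexadecimal zero makes the search about sixteen times longer.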
Now let’s talk about some practical problems. What if Mr. A tries to spend the bitcoins he owns twice? Suppose he tries to transfer to Mr. C the same bitcoins that he has already transferred to Mr. B.
As we know, the bitcoin blockchain has a record of all transactions. So when he sends a transaction to spend those bitcoins again, miners check it against his balance in the blockchain, realize that he doesn’t have those bitcoins any more, and the transaction is not processed.
What if he sends two different transactions spending the same bitcoins to two different nodes at the same time?
As neither node knows about the other transaction and both see the balance in his account, they will each treat the transaction as correct and process it. But both nodes will not be able to add their blocks at the same time. As we know, only one node can add a block at any given point, based on proof of work, i.e. whoever solves the cryptographic puzzle first. So only one of these two transactions will make it into the blockchain, and the other one will become invalid because the first transaction has already used up the balance.
What if Mr. A knows some miner (or has the computing power to mine himself) and tries to get both his transactions entered into the blockchain?
First, it is very difficult to know whether your block will be added to the blockchain, because you don’t know whether you will be the one to solve the puzzle. Second, even if he is able to add a block with both transactions, blockchain has an inbuilt mechanism to address this situation. As blockchain doesn’t have any central verifying authority, it relies on an internal verification process called the consensus mechanism.
Let’s take the same example to understand how it works. If Mr. A is able to add a block with both transactions, other miners will find the error and figure out that the block is not valid because of it. When adding the next block they will reject this block and build on the previous version of the blockchain instead. So now we have two versions of the blockchain, the one that Mr. A has and the one that the other miners have. Gradually all miners will figure out the error in Mr. A’s version and start ignoring it. More participants now hold the other version, and it becomes the accepted version. So, there is no way Mr. A can double spend his bitcoins. As there is no central authority to certify which version of the blockchain is correct, this consensus (most accepted version) mechanism takes care of validating the blockchain. As for the miner who helped Mr. A get the wrong transaction in, he loses his mining reward despite spending so much computing power and solving the puzzle. So the mechanism itself discourages cheating, and it works very well. There are cases where serious conflicts arise, which are called forks; those are beyond our discussion here.
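The idea of miners ignoring an invalid version can be sketched as a choice between candidate chains. This is a heavily reduced model: a block is just a list of (coin, recipient) pairs, the function names are made up, and chain length stands in for acceptance, whereas bitcoin actually prefers the valid chain with the most accumulated proof of work.

```python
def chain_is_valid(chain: list) -> bool:
    """A chain is rejected if any coin is spent more than once anywhere in it."""
    spent = set()
    for block in chain:
        for coin_id, _recipient in block:
            if coin_id in spent:
                return False
            spent.add(coin_id)
    return True

def accepted_chain(candidates: list):
    """Among the valid candidates, honest miners build on the longest one."""
    valid = [chain for chain in candidates if chain_is_valid(chain)]
    return max(valid, key=len, default=None)

honest_version = [
    [("coin1", "B")],
    [("coin2", "D")],
]
mr_a_version = [
    [("coin1", "B"), ("coin1", "C")],   # the same coin is spent twice in this block
    [("coin2", "D")],
    [("coin3", "E")],
]

print(chain_is_valid(mr_a_version))                                     # False
print(accepted_chain([honest_version, mr_a_version]) is honest_version) # True
```

Mr. A’s version is longer here, yet it is never chosen, because a block that breaks the rules disqualifies the whole version.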
In the above example, what about Mr. B and Mr. C, who have sold goods to Mr. A on the basis of his transactions without knowing that he has double spent? Whichever of them has his transaction in the long-term accepted version of the blockchain will benefit; the other will lose out. So Mr. B and Mr. C should have waited to see the bitcoins in their accounts before handing over the goods. How long should they have waited? There is no definite number, as it’s all about consensus, but it is generally accepted that once 6 blocks have been added after the block containing your transaction, the transaction has been verified and forms part of the accepted version of the blockchain. Many bitcoin wallets show bitcoins as available in your account only after that. Bitcoin wallets are a way to store your private keys; the technical details of bitcoin wallets and exchanges are beyond the scope of this discussion.
Now that we have seen how blockchain works in practice, let’s summarize the key features of blockchain –
- Distributed Ledger – As we discussed earlier, blockchain is nothing but a ledger, but what makes it different from a traditional ledger is that there is no central location where it is stored. The ledger is distributed among all its participants. All participants (called nodes) have the same copy of the ledger, which makes it trustworthy and tamper-resistant. Anyone wanting to corrupt this distributed ledger would need to tamper with a great many copies at the same time, and as we have already seen, tampering in the way required is almost impossible with current computing power.
- Consensus – Blockchain technology works on a consensus mechanism. There is no central authority deciding which transaction or block is valid to add to the blockchain. The protocol is set up in such a way that there is an incentive to add valid transactions. What is valid is decided by what the majority of nodes agree on (this may be implemented differently in different blockchains). This feature makes it truly decentralized.
- Transparent – Blockchain is a decentralized and distributed ledger, so every participating node has a copy of it. Although blockchain uses cryptographic methods to check that transactions are valid, and uses a public key address as identity instead of a real identity, the transactions themselves are not encrypted or hidden. This is done on purpose so that anyone can check the validity of the transactions being processed. So, transparency is at the heart of blockchain technology.
- Immutable – Once a record (transaction and block) is added to the blockchain and verified through consensus, it cannot practically be altered. It becomes a permanent part of the blockchain. We can say there is one truth which everyone refers to. More records can be added after it, but existing records stay there forever.
- No Central Authority – Although the first two points above, distributed ledger and consensus, already establish this, it is worth stating that there is no central authority governing even the rules of the blockchain; those too are decided and altered based on consensus.
- No Double Spend – Although this feature matters mainly for cryptocurrencies, it is an essential feature of blockchain technology and the one that made it the best fit for digital currencies. You have already seen in the example above that it is not possible to double spend your bitcoins. The double spend problem was the biggest obstacle that stopped all earlier attempts at a digital currency; it became possible to solve it successfully only with blockchain technology.
- Smart Contract – Although this is a relatively new feature, introduced in Ethereum and not present in bitcoin, it is being called revolutionary. In essence it means that self-executing smart contracts can be built on a blockchain. This is a topic for a larger discussion.