Blockchain explained – part 2

In Part 1 I explained the very basics of what a blockchain is with a simple example of kids trading toys. In this part I’ll explain how to fix problems which exist because of the distributed nature of the blockchain.

 

If Adam wants to trust Jake when Jake arrives at his house to trade, Adam needs to go through all the new transactions that are new to him (that Jake brought) and check the new transactions are valid.

If you remember from Part 1 every kid signed their transaction to prevent forgeries. In the blockchain world the way to sign a transaction is via PKI ( Public key infrastructure ). Every kid needs to have a pair of keys – a public key and a private key. The private key is used to sign a transaction. The magic of PKI is that any other kid can quickly verify the signature of another kid by simply checking the signature on the transaction against the public key of whoever signed the transaction.

Double spend

So signing your transaction prevents Jake or anyone else to change any transaction in the blockchain. However it does not prevent Jake from going around and showing other kids the old version of his ledger and spending what he has multiple times.

To prevent double spend anyone who wants to trade using a blockchain needs to verify that the latest ledger they have is actually the latest one. In our case anyone trading with Jake must check he did not go around trading with other kids.

Before Adam trades with Jake he could call up a all the other kids and have them verify the latest ledger Jake brought with him. This would obviously take time and might not even be possible if some kids are not around. Same happens in a distributed blockchain network where some participants might simply be offline.

Adam could also check with a small number of kids of his own choosing accepting the fact that the more kids he checks with the more sure he can be Jake is not trying to cheat him.

Proof of work

There is however another problem commonly known as the Sybil attack where you cheat in a distributed system by creating many fake identities. Remember that before Adam trades with Jake he needs to call a certain number of kids that participate in the blockchain to verify Jake did not trade with them and spent his balance. If it is cheap for Jake to create fake kids identities ( with fake phone numbers or email addresses ) then he could easily fool Adam. There would be a good chance Adam would be calling fake kids Jake created.

We could simply make it expensive to fake identities. But that would mean each kid who wants to join our toy trading needs to pay some fee to join.

Instead of asking for a fee to join the blockchain – we could make it so that when someone claims a transaction is valid they need to perform some work and prove they did that work. Many blockchain implementations choose to do just that and call it proof of work. Each kid not only signs their transaction but also offers a small reward – say a piece of candy – for anyone who performs a mathematical task that verifies the transaction. This math tasks needs to take some effort to do but once performed should be very cheap to verify.

If every transaction took a lot of effort to verify then other kids might not want to do it for a piece of candy each. But what we could do is allow kids to perform this math task for a whole block of transactions.

Hashing

This math task is usually based on what is called hashing. This is a mathematical way of turning some data – say a sentence, or a transaction – “Adam owes Alice two toy cars” into a big number. This process of hashing makes it very easy to convert a sentence into a number but very hard to go from having a number to finding out what sentence it represented.

We can ask people to perform this math task by telling them to take our sentence and “hash” it or perform this math task of turning our transaction into a number. But there is a catch. We want that final number to have the first digit be for example the number 5. We allow people to add as many letters or words to our original sentence ( that represents our transaction – “Adam owes Alice two toy cars” ) until they get a number that starts with 5.

What you need they need to do is try many sentences like:

"Adam owes Alice two toy cars a" --> 67299
"Adam owes Alice two toy cars b" --> 87322
"Adam owes Alice two toy cars c" --> 52277 Bingo, we got 5 as first digit.

If we ask for even more first digits, say we want two fives to be first digits, the harder this task is. You see whoever does the hashing would have to try out adding a lot more letters or words until they get lucky and get a number that starts with even more first digits we want.

This is harder because if you remember the task of hashing is cheap but there is no way to know what number you will get once you are done. And you cannot start with a number and tweak it until you get the sentence you need. Commonly in the blockchain world this is called difficulty. You might have heard that for example Bitcoin which is implemented using blockchain changes difficulty automatically when there are more people participating in mining bitcoins. What they are doing is simply asking for more specific digits in their proof of work.

Attack is too expensive

Once a kid performs this math task of verifying a block they would simply add a transaction where they claim the “fee” or candy for all transactions in that block, sign it with their private key to prevent forgeries and add their transaction as the latest in the ledger. They would also broadcast to all other kids that they verified all the transactions in a block and did the work – proving it by writing what sentence they used and what number they got.

Remember, it is hard to do the work but easy to check if the work was performed. Adam or anyone else trading with Jake would simply wait for their transaction to be confirmed.

If Adam waited only on one block to be confirmed there is a chance Jake used one of his fake kids identities and performed the work. So if Adam waits for more blocks to be confirmed he would be more certain he is not getting cheated.

If Jake double spent his balance with someone else, then Adam would see that the next confirmed block does not include the transaction between him and Jake. Jake could go to great lengths and both confirm the next block and confirm more blocks after the one Adam is waiting on. That would however be even more expensive ( worth more than two toy cars he is getting from Adam) for Jake so it would not make sense for Jake to even try.

Chain split

Since blockchain is distributed there is a chance that different people confirm the same block at the same time. This would cause a “fork” in the chain and new blocks being added to multiple parts of the chain. I’ll write about how this is resolved in Part 3.