This is the second of a two-part series on blockchains. If you haven't read the rip-roaring success that was Part 1, you can find it here:
Read it? Cool. In that article we learned about hashes, their properties and how they can help us add security and verifiability to our data structures, in the form of a blockchain. We also stepped through a worked example of a blockchain's resilience to tampering with my good self playing the part of a villainous digital thief.
This time, we'll build on that slightly contrived example and look at how blockchains can and do provide very real security and accountability, not just in financial transactions but in more generalised digital dealings. This'll take us through the grand-daddy of blockchains, Bitcoin, all the way to Ethereum, the blockchain that's currently doing its best to shake up governance and investments worldwide.
It's a wild ride, but as always you've got this. Relax, learn, and by the time we're finished you'll be a font of delicious blockchainy knowledge.
We're going to jump right in and learn about Bitcoin, since I know that's half the reason you're here. Before I can really define Bitcoin for you, though, we need to learn about a couple of concepts.
Cast your mind back to the example at the end of the previous article. The setup was this: a bank stores its transaction information in a blockchain, and I (a thief) managed to get access to one of the blocks, perhaps on file somewhere. I modified that block to forge a new transaction that gave me lots and lots of money, but I didn't get away with it thanks to the blockchain's hash mechanism ensuring data integrity.
However, if you thought to yourself "this doesn't seem like a particularly realistic situation", you weren't wrong. Let's try to flesh this out into a more believable context.
We're going to invent a cryptocurrency.
We'll call it XYZCoin, with the symbol "XYZ" - if I own two XYZCoin, I can write this as "2XYZ". By invent a cryptocurrency, I mean that we're going to try to design a blockchain which satisfies a few requirements:
Requirement 1) People have some way of sending XYZCoin to each other, or between their own accounts.
Requirement 2) People can't spend more money than they have.
Requirement 3) No central authority / bank needs to be involved.
Sensible requirements, I think you'll agree. 1 and 2 are good for any currency, and it just wouldn't be a proper cryptocurrency without the distrust of central authorities that comes with 3. There are some other things that you'd want in a real, honest-to-goodness practical currency, but for the purposes of learning about the relevant concepts this will serve us just fine.
The blockchain from last time pretty much already satisfies requirement 1, but let's formalise a little bit on how we write down a transaction. If we're going to send XYZ between people, we need a unique way to refer to everyone. In normal bank-backed currencies, this is your account / IBAN number. In XYZ, everyone can also have an ID, but we're not going to require that anyone give it to them - anyone can feel free to make up a new one at any time for themselves, as long as no-one else is using it. We'll call this ID a wallet, since it represents the place in which a person stores their XYZ and from which they send it.
If John has the wallet with ID "JOHN" and Mary has "MARY", then we'll write the following to mean "John sent 10XYZ to Mary":
(JOHN->10->MARY)
We'll use these transactions as the data content for our blocks. That's requirement 1, sorted. Easy.
Requirement 2 is much harder. Especially doing it in a way which satisfies requirement 3.
To make sure someone can't spend money they don't have, we need to secure the chain against a few different types of fraud. Firstly, Mary needs a way to check that John actually has enough XYZCoin to send to her. In the modern currency world, this is usually done by Point of Sale devices, ATMs and the like communicating with the bank, generally over the internet. Without such a bank, we need to create a situation in which everyone can see how much is contained in everyone's wallets.
We do this by having anyone who's interested in the contents of the blockchain keep a copy of it. We distribute the blockchain, in which we've stored a ledger of all XYZCoin transactions. The ledger is distributed. It's a Distributed Ledger.
Everybody who cares can run a bit of software called a node which maintains a copy of the entire blockchain and talks to other nodes to stay up to date. An addition to the blockchain can be suggested to the network by any node, others will validate it, and if it passes then the change will be propagated through the network and become the new state. Invalid additions to the chain will be rejected by the validating nodes and won't propagate. It works a lot like a peer-to-peer file sharing network swapping new episodes of Game of Thrones.
Now, when anyone wants to know the wallet JOHN's balance, they just have to grab their local copy of the chain, step through every block and look for transactions involving JOHN. He starts with 0XYZ, and if the rest of the transactions involving the wallet look like:
(MIKE->100->JOHN) (JOHN->10->ALAN) (JOHN->15->JANE)
...then he has 75XYZ. No need to store the value "75" anywhere, we can just work it out as required from the ledger. This is a common feature of this kind of system: balances are defined entirely by the history of transactions. More generally, system state is defined entirely by the history of changes to it.
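Here's a minimal sketch of that idea in Python, assuming a transaction is just a (sender, amount, recipient) triple. The names Transaction, balance and can_spend are made up for illustration; they aren't part of any real cryptocurrency software.

```python
from typing import NamedTuple


class Transaction(NamedTuple):
    sender: str
    amount: int
    recipient: str


def balance(ledger, wallet):
    """Replay every transaction in order; every wallet starts at 0XYZ."""
    total = 0
    for tx in ledger:
        if tx.recipient == wallet:
            total += tx.amount
        if tx.sender == wallet:
            total -= tx.amount
    return total


def can_spend(ledger, wallet, amount):
    """Requirement 2: a wallet can't send more than its history says it holds."""
    return balance(ledger, wallet) >= amount


# The three transactions involving JOHN from the example above.
ledger = [
    Transaction("MIKE", 100, "JOHN"),
    Transaction("JOHN", 10, "ALAN"),
    Transaction("JOHN", 15, "JANE"),
]

print(balance(ledger, "JOHN"))        # 75
print(can_spend(ledger, "JOHN", 80))  # False
```

Notice that the number 75 is never stored anywhere: it falls out of replaying the ledger, which is exactly why every node needs its own copy of the chain.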
We also need to make sure that somebody can't spend money twice. This is a problem slightly alien to the world of cash (although arguably spending forged currency falls into this definition) - once you hand over your ten pound note, you don't get it back to spend again. However, if you're running a node on the network, it's a trick you can get away with.
Just follow this three-step process. Send somebody some XYZCoin, wait for the transaction to be propagated through the network, then have your node start advertising a tampered-with version of the chain that simply doesn't include your transaction. As long as the chain still checks out as valid, you can get your version accepted by the network and it'll be like you never made the first transaction. Since your balance is nothing more than the sum of your transactions on the chain, bingo - you now have your money back! Or rather, you never spent it. This is known as double-spending.
We can solve this with an idea that lies at the heart of Bitcoin and many other blockchains: proof of work. This is pretty much exactly what it sounds like. We make one change to our blockchain: a block will only be accepted as valid (and therefore help to compose a whole valid chain), if it can be proven that somebody put a lot of effort into creating it.
Each block will contain an extra piece of information that makes it possible to prove this (we'll get to how in a second).
Just... consider this for a moment. It's probably the most important concept in all of blockchain technology. We're asking for it to be provably the case that somebody has committed a lot of time and / or energy to making any block in the chain.
If you can't think of how on Earth this works in practice, don't feel bad. This is the nice, elegant icing on the cake of distributed ledgers and blockchains and it's the bit that took a real, honest-to-goodness leap of imagination and intelligence to invent.
How do you add a single piece of data to the blocks of your blockchain that let every node on the network look at them and say "oh, yeah, someone must've tried reaaaal hard to make this block"?
The answer (and have a gold star if you saw this coming) lies with our good old friend, hashes.
Let me give you a little challenge. Below, I've included the SHA-256 widget from the previous article. My challenge to you is this: find something you can type into the box that gives a hash that starts with lots of zeroes. You'll be able to see 0's more easily since they show as black squares. Here, I'll get you started with six zeroes:
Spoiler: unless you get very very lucky, you're not going to find something that beats my six zeroes without writing a computer program and running it for quite a while. That's how I found my example. I generated millions of unique strings of increasing length, hashed them, and kept going until I found one with a hash starting with six zeroes. On my MacBook Pro, this process took a few minutes and had to try millions of hashes before it found a good one. The reason I chose six rather than seven is that I tried looking for a hash with seven zeroes and my laptop hadn't found one after quite a long time. The difficulty of finding such hashes increases drastically as you ask for more zeroes.
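If you'd like to try the search yourself, here's a rough Python sketch of that kind of program. It isn't the program I used: it simply hashes the strings "0", "1", "2" and so on, and it assumes the hash is displayed as hexadecimal digits. Under that assumption each extra zero makes a match 16 times less likely, so on average you need around 16 to the power of n attempts for n zeroes. The function name find_input_with_zero_prefix is made up for this example.

```python
import hashlib
from itertools import count


def find_input_with_zero_prefix(zeroes):
    """Hash "0", "1", "2", ... until a SHA-256 hex digest starts with `zeroes` zeroes."""
    prefix = "0" * zeroes
    for attempts, n in enumerate(count(), start=1):
        candidate = str(n)
        digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return candidate, digest, attempts


# Four zeroes keeps the run short (about 16**4 = 65,536 attempts on average).
# Six zeroes would need roughly 16**6 = 16,777,216 attempts on average.
candidate, digest, attempts = find_input_with_zero_prefix(4)
print(candidate, digest, attempts)
```

Change the 4 to a 5 or a 6 and watch how much longer it takes.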
Now think about a very similar challenge. For the sake of a common reference point, let's say the following few blocks make up the beginning of XYZCoin's chain:
Here's the challenge - given a block from our blockchain, find a string that you can add to it so that its hash starts with at least... six zeroes. The "six" here is pretty arbitrary, and I could easily say "seven" if I thought you'd started to find it too easy. As a reminder, when we hash a block, we hash the combined string of the block's number, its previous hash and its data:
I'm now proposing that we try to find a fourth string that you can add to this specific block such that the whole block's hash starts with at least six zeroes. We'll put this new string in after the block number but before the previous hash. Again, I have a solution that I managed to get my computer to find, to show you the idea:
We call the additional string such as "XmHY" in the example a nonce (a "nonce word" is "a word invented for the occasion", which is a definition that makes a lot of sense here - the right nonce is only right for the specific block it's used in). Let me extend the definition of our blockchain a little bit by adding an area for the nonce, and adding an appropriate nonce to each block to ensure our "six leading zeroes" condition:
Have a look at those "previous hash" readouts - every single one begins with a run of six zeroes. The nonce of each block has been carefully chosen to fit with the rest of that block's content to give this result. The same nonce wouldn't work with any other block.
This is what people mean when they talk about proof of work in blockchains like Bitcoin's: finding the appropriate nonce that lets you create a block that'll be accepted as valid by the rest of the network.
A good illustration of the point is that even coming up with the small example chain above and making sure I had valid nonces for the three blocks was difficult, and took me time. I had to use the program I wrote to crunch through nonces and find those ones that worked. There is no other way I could just "come up" with good nonces, even for an example in my own chain. This is how proof of work proves work.
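For the curious, a nonce-crunching program can be sketched in a few lines of Python. This is an illustration rather than the program I actually ran. It assumes the block is hashed as one plain string made of the block number, the nonce, the previous hash and the data, in that order, and it uses counting numbers as nonces rather than letter strings like "XmHY". The names block_hash and mine are hypothetical.

```python
import hashlib
from itertools import count


def block_hash(number, nonce, previous_hash, data):
    # Assumed layout: block number, then nonce, then previous hash, then data.
    combined = f"{number}{nonce}{previous_hash}{data}"
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()


def mine(number, previous_hash, data, zeroes):
    """Try nonces "0", "1", "2", ... until the block's hash has enough leading zeroes."""
    prefix = "0" * zeroes
    for n in count():
        nonce = str(n)
        digest = block_hash(number, nonce, previous_hash, data)
        if digest.startswith(prefix):
            return nonce, digest


# Four zeroes instead of six, so this finishes quickly.
previous_hash = "0" * 64  # placeholder "previous hash" for a first block
nonce, digest = mine(1, previous_hash, "(JOHN->10->MARY)", zeroes=4)
print(nonce, digest)
```

There's no shortcut inside mine: the only strategy is to keep guessing, which is the whole point.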
I encourage you to take a look at this randomly chosen recent block from the real-life Bitcoin blockchain (live link):
Notice a few things: this block's hash starts with a load of zeroes, and there's a field down there called "nonce". Bitcoin works on pretty much the exact same proof of work strategy I've been describing here, except they want you to find a nonce that gives a lot more than six zeroes and it's very very difficult to do so.
We're finally in a position to talk about what Bitcoin is, but let me just wrap up our example here.
With our new proof of work-based distributed ledger system, what have we gained in terms of security from unwanted edits on our blockchain?
New blocks which are accepted as valid and legitimate by the majority of nodes are happily accepted and propagate through the network, becoming the new state. These validations can be performed very quickly, since they're based on calculating a few hashes, and we saw last time that that's a fast operation. It's a trivial matter to run through a chain that another node is suggesting to you and make sure that every nonce and every previous hash checks out. Note that we're making an assumption here - that the majority of the network is friendly and wants the chain to remain legitimate.
A malicious actor would find it impractically difficult to edit a block in the middle of the chain. Not only would the attacker have to work out a new nonce for the edited block, they'd then have to go through every block after it in the chain and find a valid nonce for those too. Worse still, they'd have to do it one by one, in order, since the correct nonce is always dependent on the state of the chain before the current block. This is the true power of proof-of-work: once a block is found with a valid nonce and accepted into the chain, it becomes near-impossible to tamper with any part of the chain.
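Both points can be seen in a small Python sketch. It's a toy under stated assumptions: blocks are plain dictionaries, the block is hashed as the number, nonce, previous hash and data joined together, and the first block uses a made-up placeholder for its previous hash. All function names here are hypothetical.

```python
import hashlib
from itertools import count

ZEROES = 3  # kept small so this runs in a moment; the example above uses six
GENESIS_PREVIOUS_HASH = "0" * 64  # assumed placeholder for the first block


def block_hash(block):
    # Assumed layout: number, nonce, previous hash and data joined into one string.
    combined = f"{block['number']}{block['nonce']}{block['previous_hash']}{block['data']}"
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()


def mine_block(number, previous_hash, data):
    """Slow part: try nonces until the block's hash has enough leading zeroes."""
    for n in count():
        block = {"number": number, "nonce": str(n),
                 "previous_hash": previous_hash, "data": data}
        if block_hash(block).startswith("0" * ZEROES):
            return block


def build_chain(entries):
    chain, previous_hash = [], GENESIS_PREVIOUS_HASH
    for number, data in enumerate(entries, start=1):
        block = mine_block(number, previous_hash, data)
        chain.append(block)
        previous_hash = block_hash(block)
    return chain


def is_valid(chain):
    """Fast part: one hash per block checks both the nonce and the link."""
    previous_hash = GENESIS_PREVIOUS_HASH
    for block in chain:
        digest = block_hash(block)
        if block["previous_hash"] != previous_hash or not digest.startswith("0" * ZEROES):
            return False
        previous_hash = digest
    return True


def expected_hashes_to_rewrite(blocks_after_edit, zeroes):
    """Average hashes to re-mine the edited block plus every block after it."""
    return (blocks_after_edit + 1) * 16 ** zeroes


chain = build_chain(["(MIKE->100->JOHN)", "(JOHN->10->ALAN)", "(JOHN->15->JANE)"])
print(is_valid(chain))  # True

chain[1]["data"] = "(JOHN->10000->ALAN)"  # tamper with the middle block
print(is_valid(chain))  # False

print(expected_hashes_to_rewrite(blocks_after_edit=100, zeroes=6))  # 1694498816
```

Checking a chain costs one hash per block, while rewriting it costs, on average, millions of hashes per block, and those have to be done in order.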
Here we go then. Deep breath.
The examples that we've worked through up till now give you all the tools to understand Bitcoin, so let's jump in and describe it properly.
Bitcoin is a blockchain. It's a specific blockchain, the "Bitcoin blockchain". It's quite a lot like our example blockchain with a few more bells and whistles, which I'll variously go into in more depth and not talk about at all, as appropriate.
The first thing to know is what is actually stored inside a Bitcoin block. Whereas our blockchain has short strings such as "Tom paid £110 to Alice", Bitcoin blocks contain records of many, many Bitcoin transactions. The exact number varies (until recently the limit was 1MB of transaction data per block, but there was controversy around that measure and it has since been updated to a more complex scheme), but it's in the region of a few thousand transactions per block. The example block in the screenshot above included 2254 transactions.
Alongside the previous block's hash, a nonce and all of the transactions, we also hash a Bitcoin version number and a target difficulty. This "difficulty" is a measure of how many zeroes we want people to find in their hashes before we'll accept the next block as valid. The fact that the difficulty of the proof of work is also on the chain means that the chain can reactively change the difficulty in response to how easy people seem to be finding it to come across valid nonces.
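To make that a little more concrete, here's a simplified Python sketch of the adjustment. "Number of zeroes" is a convenient approximation: Bitcoin really treats the hash as one big number and requires it to be no larger than a target, so a smaller target means more leading zeroes and more work. The constants below (a ten-minute block time, an adjustment every 2016 blocks, a change capped at a factor of four) are Bitcoin's commonly documented values rather than anything from my example chain, and the function names are made up.

```python
TARGET_BLOCK_SECONDS = 10 * 60   # aim for one block roughly every ten minutes
RETARGET_INTERVAL = 2016         # blocks between difficulty adjustments
EXPECTED_TIMESPAN = TARGET_BLOCK_SECONDS * RETARGET_INTERVAL  # about two weeks


def retarget(old_target, actual_timespan):
    """Scale the target by how long the last interval really took.

    Blocks found too quickly -> smaller target -> harder puzzle, and vice versa.
    The measured timespan is clamped so one adjustment can't exceed a factor of 4.
    """
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN


def meets_target(block_hash_hex, target):
    """A block is acceptable when its hash, read as a number, is at most the target."""
    return int(block_hash_hex, 16) <= target


six_zero_target = 2 ** (256 - 24) - 1  # largest hash with six leading hex zeroes

# Miners were twice as fast as intended, so the target halves (twice as hard).
harder = retarget(six_zero_target, EXPECTED_TIMESPAN // 2)
print(harder < six_zero_target)  # True

print(meets_target("000000" + "f" * 58, six_zero_target))  # True
print(meets_target("00000f" + "0" * 58, six_zero_target))  # False
```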
As with our example blockchain, there is no "bank". The network is entirely made up of nodes that don't have to trust each other in the slightest. The act of making sure a node can prove work (via a valid nonce) for a new block it suggests to the rest of the network provides a consensus on updates to the chain (i.e. new transactions being rolled in). As long as there isn't a majority of the network trying to push a tampered-with version of the chain, consensus on the forgery won't be reached and the changes won't be accepted. An attempt by such a majority (strictly, a majority of the network's mining power rather than a simple head-count of nodes) to force through its own version is known as a 51% attack, and is assumed to be infeasible to organise given the size of the network and the fact that anybody can run a node.
Given that proof of work is essential to keeping the chain legitimate and secure, and that it's also by definition a difficult and expensive task, the Bitcoin blockchain incentivises nodes to carry out the work. If your node can be the one that wins the race and successfully finds a nonce that goes with a packet of pending transactions, and your new block gets accepted onto the chain by the network, you stand to profit! It is accepted by the rest of the network that such a node is entitled to a reward in Bitcoin. In the block example above it was 12.5BTC, which is worth a rather tidy $102,000 at time of writing.
Digest that for a second - successfully finding a nonce to go with a set of new transactions and having your new block accepted by the network is worth over one hundred thousand dollars in Bitcoin. It's incredibly lucrative, and there's a huge amount of competition out there - farms of thousands upon thousands of machines all crunching through absurd quantities of potential nonces, hashing every single one, until they strike gold.
That "strike gold" description is why the process of trying to find a good nonce for a new block is referred to as Bitcoin mining - when you find one, you're very much in the money. Bitcoin mining is performed by incredibly high-performance dedicated electronics these days, designed to do nothing but calculate SHA-256 hashes in astronomical quantities at speed.
When a lucky miner finally mines a new block into the chain, the race has to start again - the right nonce for the next block won't be the same as it was for the previous one, because the block number and transaction data contained in the next one will be different. The unlucky miners all have to throw out their previous task, choose a set of new transactions and set to work finding a valid nonce again.
This is already a long and winding tale, so I'm going to keep this brief. Consider this last section a little taster as to the other blockchain delicacies out there right now - there's a lot more on offer than "just currency".
Ethereum is a totally different blockchain to Bitcoin. It also runs as a distributed ledger, uses proof of work (at the time of writing) and has a concept of a cryptocurrency baked in - the "Ether" or ETH. However, there's a big difference between this and Bitcoin. It lies in what they refer to as a "transaction".
Bitcoin transactions have a very simple format: address X sent Y Bitcoins to address Z. Ethereum transactions have a little more going on - each transaction still includes an amount of Ether moving between addresses, but importantly it has another field, called input data.
Ethereum transactions' input data enable its flagship feature - smart contracts. A smart contract is a piece of code that can be "deployed" (stored) on the blockchain and given an "address", which is just a unique ID number as far as it matters.
This code can provide a number of different actions (or calls) that can be performed by anybody on the blockchain. Let me give you an illustration of the idea without using the actual programming language involved, since that'll complicate things needlessly. If you're interested in having a go at programming a smart contract after reading this, you need to look into Solidity, the actual programming language used for Ethereum smart contracts.
Let's say I've written a really simple contract for a shiny new tradeable token / coin in my contract programming language (this is the kind of thing that's behind all those "ICOs" or Initial Coin Offerings you may have heard of - tradable tokens largely built using Ethereum's contracts):
contract UnwttngToken:
    state: balances

    sendTokens(amount, fromAddress, toAddress):
        if currentAddress != fromAddress: fail()
        if balances(fromAddress) < amount: fail()
        balances(fromAddress).remove(amount)
        balances(toAddress).add(amount)

    giftTokens(amount, address):
        if currentAddress != adminAddress: fail()
        balances(address).add(amount)
As I said, this is contract pseudo-code and not real Solidity, the language of contracts on the Ethereum blockchain. I hope you can roughly see what this simple contract tries to provide though, even if you're not a programmer. There are two calls defined here - sendTokens and giftTokens.
The first allows anyone to send UnwttngTokens from their own wallet address to another address, as long as they have enough to send. The second allows the administrator of the contract to simply add to the balance of an address, as a gift.
I've tried to build in a few features here that are common in real contracts: a balance associated with each address, and checks within each call that the call is valid. We can't have people sending tokens to each other that they don't have, and we can't have just anybody able to create tokens as gifts.
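If you'd like something you can actually run, here's the same idea as an ordinary Python class. To be clear, this is a simulation on your own machine and not how contracts are written for Ethereum (that's Solidity's job). The caller argument stands in for the address that sent the transaction, and the check for negative amounts is my own addition that isn't in the pseudo-code.

```python
class ContractError(Exception):
    """Raised when a call fails one of the contract's checks."""


class UnwttngToken:
    def __init__(self, admin_address):
        self.admin_address = admin_address
        self.balances = {}  # address -> number of tokens

    def send_tokens(self, caller, amount, from_address, to_address):
        if caller != from_address:
            raise ContractError("you can only send from your own address")
        if amount < 0:  # extra safety check, not in the pseudo-code
            raise ContractError("amount must not be negative")
        if self.balances.get(from_address, 0) < amount:
            raise ContractError("insufficient balance")
        self.balances[from_address] = self.balances.get(from_address, 0) - amount
        self.balances[to_address] = self.balances.get(to_address, 0) + amount

    def gift_tokens(self, caller, amount, address):
        if caller != self.admin_address:
            raise ContractError("only the administrator can gift tokens")
        self.balances[address] = self.balances.get(address, 0) + amount


token = UnwttngToken(admin_address="0xabcde")
token.gift_tokens("0xabcde", 1000, "0xabcde")
token.send_tokens("0xabcde", 500, "0xabcde", "0x99999")
print(token.balances)  # {'0xabcde': 500, '0x99999': 500}

try:
    token.gift_tokens("0x99999", 1000000, "0x99999")  # not the administrator
except ContractError as error:
    print("rejected:", error)
```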
Anyway, the content of the contract isn't super important. The important thing is how Ethereum makes it usable. Now that I've written my contract I need to deploy it to the Ethereum blockchain. I do this by sending a transaction to the network with a specially constructed input data field.
This input data effectively says "hey network, I would like to deploy a new contract, and this is the code for it". Validating nodes on the network will check it over, and if the code is good, they'll accept this transaction. Voila, the contract now has an address on the chain. Remember, this all happened within a single transaction on Ethereum's blockchain. As I said, we're not just moving currency on this chain.
Aside: the contract code isn't actually sent as readable code like the above, it goes through a compilation step to a machine-readable format. If you'd like to see a real-world example, check out this Ethereum transaction, which deployed a popular token named EOS to the Ethereum chain. Notice that the input data is a bunch of gibberish like 0x606060405260.... This is the compiled contract code.
OK, the contract's deployed, let's say at the address 0x12345 (that "0xSomething" format is the standard in Ethereum). How do people on the network make use of it now? As an example, I'll attempt to gift myself some tokens at my wallet address 0xabcde.
Remember transaction input data? I'll use it again here. I'll make another transaction, and I'll declare that I want to send 0 Ether to the contract address 0x12345. In the transaction's input data, though, I'll include something like:
giftTokens
amount: 1000
address: 0xabcde
Here's where the magic happens. Nodes on the network that come to validate this transaction have to go through a couple of steps:
1) Get the contract code. This transaction is telling the node it wants to call a function on the contract at 0x12345, so the node fetches the code for that contract (the code we wrote and deployed above). Remember, this doesn't take much time since each node has a copy of the entire chain to hand.
2) Run the contract code. The node takes the values given to it in the transaction (1000 tokens, to address 0xabcde), and actually executes the function code of giftTokens. Since that function, once it has checked that the sender is the contract's administrator, simply increases the balance of 0xabcde by 1000, that's what the node does. Once this transaction's been accepted by the network, and a miner has included it in a block, it forms part of the history of the contract. From now on, whenever anybody wants to know the UnwttngToken balance of address 0xabcde, they can walk through the chain's history and see "oh, here's a transaction that gifted that address 1000 tokens", or "oh, here's a transaction that transferred 500 tokens to a different address", and get a full picture of the contract's state in that way.
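That "walk through the history" idea can be sketched in Python too. This is a deliberate simplification: transactions are plain dictionaries, the input data is shown as readable fields rather than compiled gibberish, and I'm assuming 0xabcde is the contract's administrator. Real nodes are cleverer about keeping track of state, but the principle is the same. The names apply_call and replay are hypothetical.

```python
CONTRACT_ADDRESS = "0x12345"
ADMIN_ADDRESS = "0xabcde"  # assumption: the deployer is the administrator


def apply_call(balances, sender, call):
    """Run one contract call against `balances`; return False if a check fails."""
    if call["call"] == "giftTokens":
        if sender != ADMIN_ADDRESS:
            return False
        balances[call["address"]] = balances.get(call["address"], 0) + call["amount"]
        return True
    if call["call"] == "sendTokens":
        source, target, amount = call["fromAddress"], call["toAddress"], call["amount"]
        if sender != source or amount < 0 or balances.get(source, 0) < amount:
            return False
        balances[source] = balances.get(source, 0) - amount
        balances[target] = balances.get(target, 0) + amount
        return True
    return False  # unknown call


def replay(transactions):
    """Rebuild the token balances by running every call to the contract in order."""
    balances = {}
    for tx in transactions:
        if tx["to"] == CONTRACT_ADDRESS:
            apply_call(balances, tx["from"], tx["input"])
    return balances


history = [
    {"from": "0xabcde", "to": "0x12345", "value": 0,
     "input": {"call": "giftTokens", "amount": 1000, "address": "0xabcde"}},
    {"from": "0xabcde", "to": "0x12345", "value": 0,
     "input": {"call": "sendTokens", "amount": 500,
               "fromAddress": "0xabcde", "toAddress": "0x99999"}},
    # Someone who isn't the administrator tries to gift themselves tokens.
    {"from": "0x99999", "to": "0x12345", "value": 0,
     "input": {"call": "giftTokens", "amount": 1000000, "address": "0x99999"}},
]

print(replay(history))  # {'0xabcde': 500, '0x99999': 500}
```

Every node that replays the same history arrives at the same balances, which is what lets the network agree on the contract's state.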
Check out a real-life example here - this transaction moves 10000 TRONIX tokens from one address to another. Pay special attention to the text in the "Input Data" box at the bottom.
In this way, Ethereum's managed to do an incredible thing - enable an abstraction (the smart contract), which can effectively have a real, persistent state (in our case, balances, but it could be anything), which is validated and stored on the chain. The Ether currency itself is tradable and valuable just as Bitcoin is, but the fact that anybody can create a totally new token, or small piece of software, and have the code that governs it run and checked by the nodes of the network is a huge enabler of generic innovation on-blockchain.
Rather than incentivising block miners simply to validate transactions, as Bitcoin's chain does, Ethereum incentivises them to execute arbitrary code and agree on the output, all in a decentralised manner. I hope you can see that this represents a much more flexible platform for all kinds of innovations.
I think we'll need to stop there. From the basic workings of a blockchain, through all of distributed, consensus-based ledgers and proof-of-work, to Bitcoin and Ethereum's smart contracts, we've covered a lot of ground.
The hope is that you can now go forth into the world of blockchains, cryptocurrencies and smart contracts feeling a bit more confident in what you're talking about.
When the news talks about Bitcoin and tells you "it works by people's computers solving difficult puzzles" you can say to yourself "I know what's really happening - miners are racing to find nonces so that their new blocks have a very specific type of hash".
When you see people talking about new and valuable Ethereum-based tokens you can say to yourself "Aha, that's a new contract on the Ethereum blockchain that lets people transfer tokens between themselves".
However, I said "everything you wanted to know about blockchains", and I'm aware I may still be falling short on that one. There's a lot to want to know about blockchains, and I've probably only encouraged you to think of more. With that in mind I thought I'd include some further reading to whet your appetite and give you some interesting food to continue your journey.
1) Chain explorers - for any well-adopted blockchain, there're a hundred websites allowing you to browse it. For Bitcoin, the go-to is blockchain.info, and for Ethereum it's etherscan.io. I encourage you to go and have a gander - on both of these sites you can browse any block, look at the details of any transaction, and in the case of Ethereum inspect contracts.
2) Exchanges - if you get into cryptocurrencies, and acquire some for yourself, you're going to want to trade them sometimes. Websites like Poloniex, Bitstamp and Coinbase let you do this, and Coinbase lets you easily buy Bitcoin and Ether for your Dollars, Pounds or Euros.
3) Energy usage - one of the hottest topics in blockchain technology right now is its energy usage. A recent Motherboard article looked into the astronomical energy costs of mining Bitcoin transactions. These energy costs are still seen as worth paying (financially at least) thanks to the very high value of Bitcoin. Similarly, this article compared the network's consumption for mining to the total energy consumption of entire nations. Spoiler: it's higher than quite a lot of them. This is a vast price to pay simply to uphold proof of work and protect the validity of the blockchain. For this reason, hopes are high for a less hungry method of securing the chain than hash-based proofs of work, which leads us nicely on to...
4) Proof of stake and alternative proofs of work - an alternative concept to using proof of work for securing the chain, proof of stake allows those who can prove they own a lot of a certain currency to create new blocks preferentially to someone who has less of it. This is many times more energy-efficient, but comes with its own set of controversies. One currency based heavily on proof of stake is Peercoin. Other blockchains exist which attempt to at least harness proof of work for useful (outside their own context) means: for example, Primecoin's proof of work is based on finding long chains of prime numbers, with the hope that incentivising innovation in finding such things is bound to lead to Good Things.
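As a very rough illustration of the proof of stake idea (and not a description of how Peercoin or any other real currency implements it), here's a Python sketch in which the right to create the next block is a lottery weighted by how much each wallet holds. The wallet names and amounts are borrowed from the XYZCoin example, and choose_block_creator is a made-up name.

```python
import random


def choose_block_creator(stakes, rng):
    """Pick one wallet at random, with odds proportional to its stake."""
    wallets = list(stakes)
    weights = [stakes[wallet] for wallet in wallets]
    return rng.choices(wallets, weights=weights, k=1)[0]


stakes = {"JOHN": 75, "ALAN": 10, "JANE": 15}
rng = random.Random(42)  # fixed seed so repeated runs behave the same

wins = {wallet: 0 for wallet in stakes}
for _ in range(10_000):
    wins[choose_block_creator(stakes, rng)] += 1

# JOHN holds 75% of the stake, so he should win roughly 75% of the time.
print(wins)
```

No hashing race is needed to pick a winner, which is where the energy saving comes from.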
Thanks so much for reading, and I hope you've got something out of this two-parter. As I said, I'm sure you have more questions - feel free to reach out to me on Twitter at @unwttng via the link below.
Until next time.