A Guide To Merkle Trees

Have you heard about Merkle trees when discussing blockchain technology? That’s because the Merkle tree is at the core of the technology itself. 

So, what exactly is it?

In this article, we will go through the Merkle tree in depth and look at its role in making blockchain technology a success.

Let’s get started.

What is A Merkle Tree?

A Merkle tree is a data structure for securely and efficiently verifying data in a large body of content. It lets a participant check that a piece of data belongs to a larger set without inspecting the whole set.

Ethereum and Bitcoin both utilize Merkle Trees.

The Problem: In a centralized network, data can be accessed from a single copy, so storing and accessing it takes little effort. In a decentralized blockchain network, things get harder because every piece of data is copied across the nodes. Accessing data efficiently becomes a challenge, as does copying data and sharing it among nodes. On top of that, each receiving node needs to verify the data that was shared with it.

The Solution: Merkle trees enable decentralized blockchains to share data, verify it, and trust it. They organize data in such a way that little processing power is needed to share and verify it. They also help keep transactions secure through the use of cryptographic hash functions.

Satoshi Nakamoto was the first to implement Merkle trees in blockchain technology, via Bitcoin. That design showed how participants can agree on shared data without a centralized authority, and Bitcoin relies on Merkle trees extensively.

The concept itself is older than Bitcoin. It was introduced by Ralph Merkle, who filed a patent for it in 1979, and it is named after him.

Cryptographic Hash Functions

Before we discuss Merkle trees, we need a better understanding of cryptographic hash functions.

A hash function maps arbitrary data of any length to a fixed-size output. A cryptographic hash function adds security guarantees on top of that, which is why it is widely used in cryptography.

Hash functions are efficient to compute and are best known for one property: the function cannot be reversed. It is a one-way function by design, so the input cannot practically be recovered from the output.

Hashing has multiple uses, including:

  • Password protection
  • File integrity checks and verification
  • Cryptocurrency

There are multiple hash families out there, including Message Digest (MD), Secure Hash Algorithm (SHA), and RIPE Message Digest (RIPEMD).

If you pass 101Blockchains as input to the SHA-256 hash algorithm, you will get the following output:

fbffd63a60374a31aa9811cbc80b577e23925a5874e86a17f712bab874f33ac9

To sum it up, the key properties of hash functions include:

  • Deterministic
  • Pre-Image Resistant
  • Computationally Efficient
  • Cannot be Reverse Engineered
  • Collision Resistant
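
A few of these properties can be observed directly with Python's standard hashlib module. The snippet below is a minimal sketch; the helper name sha256_hex and the sample inputs are illustrative assumptions, not part of the original article.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = sha256_hex("merkle")
second = sha256_hex("merkle")
changed = sha256_hex("Merkle")  # only the case of one letter differs
long_input = sha256_hex("x" * 10_000)

# Deterministic: the same input always gives the same digest.
print(first == second)

# Fixed-size output: 256 bits is 64 hex characters, whatever the input length.
print(len(first), len(long_input))

# A tiny change in the input produces a completely different digest.
print(first == changed)
```

Pre-image resistance and collision resistance cannot be shown by running code. They are security claims about how hard it is to find a matching input.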

If you want to learn more about the Cryptographic Hash Functions, then check out the detailed articles here: 

How Do Merkle Trees Work?

Now that we have a reasonable understanding of hash functions, it is time to learn more about Merkle trees.

Technically, a Merkle tree is a tree data structure in which every non-leaf node holds the hash of its child nodes.

This also means that the Merkle tree is drawn upside down: the root sits at the top and the leaf nodes form the lowest level.

To get a better understanding of what I am trying to convey, let’s take a look at the Merkle tree example:

[Figure: Merkle tree example. Source: Wikipedia]

To understand Merkle trees, we need to learn three important terms:

  • Merkle Root
  • Leaf Nodes
  • Non-Leaf Nodes

Taken as a whole, a Merkle tree is an upside-down tree whose root summarizes an entire set of transactions. This means that a user can verify whether or not a transaction is part of a block.

Merkle trees work by hashing. Pairs of nodes are hashed repeatedly until only one hash value is left. That remaining hash value is known as the Merkle Root or the Root Hash. The tree is built from the bottom up, starting from the hashes of the individual transactions, which are also known as Transaction IDs.

The leaf nodes contain the hashes of the transaction data. Each non-leaf node stores the hash of the two hashes below it.

Another important property of Merkle trees is that they are binary. Every level therefore needs an even number of nodes so that the nodes can be paired. If a level has an odd number of nodes, the last hash is simply duplicated to make the count even.
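
The bottom-up construction described above can be sketched in a few lines of Python. This is a minimal illustration, assuming SHA-256 as the hash function and simple concatenation of the two child hashes. The function name merkle_root is hypothetical, and real systems differ in details such as byte order and how leaves are encoded.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash the leaves, then hash pairs level by level until one hash is left."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of nodes: duplicate the last hash to make the count even.
            level.append(level[-1])
        # Each parent is the hash of its two children concatenated.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"tx-a", b"tx-b", b"tx-c"])
print(root.hex())
```

With three leaves, the third hash is duplicated, so the result is the same as for the four leaves tx-a, tx-b, tx-c, tx-c.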

 

An Example

Let’s try to understand it by taking an example.

[Figure: Merkle Tree Example]

Here, we see that four transactions have taken place in the block. These transactions are named X, Y, Z, and W. Each transaction is hashed, and the results are stored in leaf nodes that we call Hash X, Hash Y, Hash Z, and Hash W.

Next, the leaf nodes are hashed in pairs: Hash X and Hash Y are combined into Hash XY, and Hash Z and Hash W into Hash ZW. Finally, these two hashes are hashed together to create the Merkle Root or Root Hash.
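
The four-transaction example can be written out step by step. The sketch below assumes SHA-256 and uses the placeholder byte strings X, Y, Z, and W as stand-ins for real transaction data. The function name root_of_four is hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def root_of_four(x: bytes, y: bytes, z: bytes, w: bytes) -> bytes:
    # Leaf level: one hash per transaction.
    hash_x, hash_y, hash_z, hash_w = sha256(x), sha256(y), sha256(z), sha256(w)
    # Middle level: each parent hashes its two children.
    hash_xy = sha256(hash_x + hash_y)
    hash_zw = sha256(hash_z + hash_w)
    # Top level: the Merkle root.
    return sha256(hash_xy + hash_zw)


root = root_of_four(b"X", b"Y", b"Z", b"W")
tampered_root = root_of_four(b"X", b"Y", b"Z", b"W!")

print(root.hex())
print(tampered_root.hex())  # differs from the line above
```

Changing a single transaction changes its leaf hash, then Hash ZW, and finally the root. This is why the root can stand in for the whole set.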

The same hashing process scales to very large data sets, which makes the Merkle tree data structure useful in decentralized networks.

The choice of hashing algorithm depends on the implementation. One of the most common choices is the SHA-2 family of cryptographic hash functions.

Because every hash in the tree depends on the hashes below it, changing any single transaction changes the Merkle root. A transaction can therefore be verified by recomputing the hashes on its path to the root and comparing the result with the known root.

 

What About Data Integrity?

A Merkle tree is ideal for checking data integrity. There is no need to go through the whole set of transactions to verify one of them: a transaction can be verified against the Merkle root stored in the block header. The Merkle root depends on every transaction in the block, so it changes if any of them changes.

This also means that any tampering produces a different root value, so it can be detected almost instantly.

All of this may sound similar to a hash list, but there is an important difference. With a hash list, you need to download the full list to verify transactions or data.

With a Merkle tree, you can download just one branch and use it to verify a transaction.

There is no need to download the whole tree. The tree can be divided into small pieces of data that are used to verify transactions all across the network. This concept is known as a Merkle proof.
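
A Merkle proof can be sketched as the list of sibling hashes on the path from a leaf to the root. The code below is one possible illustration, assuming SHA-256 and the duplicate-last-hash rule for odd levels. The function names and the (hash, sibling_is_left) proof format are assumptions made for this example.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, from the leaf hashes up to the root."""
    level = [sha256(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels by duplicating the last hash
            levels[-1] = level
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) for each level below the root."""
    proof = []
    for level in build_levels(leaves)[:-1]:
        sibling = index ^ 1  # the other node in the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path to the root using only the leaf and its proof."""
    current = sha256(leaf)
    for sibling, sibling_is_left in proof:
        current = sha256(sibling + current) if sibling_is_left else sha256(current + sibling)
    return current == root


transactions = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
root = build_levels(transactions)[-1][0]
proof = merkle_proof(transactions, 2)
print(verify_proof(b"tx2", proof, root))
print(verify_proof(b"forged", proof, root))
```

The verifier needs only the leaf, the proof, and a trusted root. It never sees the other transactions.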

You can also check out Merkle tree python, an article on a Merkle tree implementation in Python.

 

How Merkle Trees Work in Bitcoin

Bitcoin was the first cryptocurrency to employ Merkle trees effectively. To ensure that hash values cannot practically be reversed, it uses the well-known Secure Hash Algorithm SHA-256, applying it twice in a row when building the tree. Each hash value is therefore 256 bits long. In Bitcoin, Merkle trees are used to summarize the transactions in a block and to allow old transactions to be pruned.
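
As a rough sketch of how this looks in practice, the code below computes a Bitcoin-style Merkle root from a list of transaction IDs. It assumes the commonly described conventions: SHA-256 applied twice, and transaction IDs shown in hex with their bytes reversed relative to the internal order. The function name bitcoin_merkle_root and the sample IDs are made up for illustration and do not refer to real transactions.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bitcoin_merkle_root(txids_hex: list[str]) -> str:
    if not txids_hex:
        raise ValueError("a block contains at least one transaction")
    # Assumption: displayed txids are byte-reversed relative to the internal order.
    level = [bytes.fromhex(txid)[::-1] for txid in txids_hex]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [double_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    # Reverse again so the result is in the same display order as the inputs.
    return level[0][::-1].hex()


# Made-up transaction IDs, used only to exercise the function.
sample_txids = [hashlib.sha256(str(n).encode()).hexdigest() for n in range(3)]
print(bitcoin_merkle_root(sample_txids))
```

With a single transaction, no pairing takes place, so the root is simply that transaction's ID.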

In Bitcoin, each block is connected to the previous block through hash values. This is how the whole blockchain is formed. Each block has a block header that contains important information such as:

  • Merkle Root Hash
  • Block Version Number
  • Timestamp
  • Nonce
  • Mining Difficulty Target
  • Previous Block Hash
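
To make the list concrete, here is a minimal sketch of a header holding those six fields. It assumes the widely documented 80-byte layout (version, previous block hash, Merkle root, timestamp, difficulty bits, nonce, all little-endian) and a block hash computed as a double SHA-256 of the serialized header. The class name BlockHeader and all field values below are hypothetical.

```python
import hashlib
import struct
from dataclasses import dataclass


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


@dataclass
class BlockHeader:
    version: int
    previous_block_hash: bytes  # 32 bytes
    merkle_root: bytes          # 32 bytes
    timestamp: int              # seconds since the Unix epoch
    difficulty_bits: int        # compact encoding of the difficulty target
    nonce: int

    def serialize(self) -> bytes:
        # 4 + 32 + 32 + 4 + 4 + 4 = 80 bytes, little-endian integers.
        return struct.pack(
            "<l32s32sLLL",
            self.version,
            self.previous_block_hash,
            self.merkle_root,
            self.timestamp,
            self.difficulty_bits,
            self.nonce,
        )

    def block_hash(self) -> bytes:
        return double_sha256(self.serialize())


# Hypothetical values, not taken from a real block.
header = BlockHeader(
    version=1,
    previous_block_hash=bytes(32),
    merkle_root=hashlib.sha256(b"example transactions").digest(),
    timestamp=1_700_000_000,
    difficulty_bits=0x1D00FFFF,
    nonce=42,
)
print(len(header.serialize()))
print(header.block_hash().hex())
```

Because the Merkle root is part of the hashed header, changing any transaction in the block changes the block hash as well.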

To get a better understanding, let’s take a look at the diagram below. It is taken from the Bitcoin whitepaper itself.

[Figure: Merkle trees in Bitcoin]

As you can see, miners include transactions in a block. Each transaction is hashed and becomes part of the block’s Merkle tree.

Using Merkle trees this way brings multiple benefits. One notable benefit is Simplified Payment Verification (SPV). SPV nodes are also known as lightweight clients. So, what do they do? They download only the block headers of the longest chain, and hence do not have to download the whole blockchain. To verify a transaction, an SPV node only needs to hold the block headers of the longest chain and obtain the Merkle branch that links the transaction to its block. This is how Bitcoin puts Merkle trees to use.

In the end, an SPV node can use this Merkle proof to verify a transaction against the Merkle root hash stored in the block header.

How Merkle Trees Are Used in Ethereum

The Ethereum blockchain also utilizes Merkle trees. However, its approach differs from Bitcoin’s. Ethereum uses the Merkle Patricia Tree (also called a Merkle Patricia trie), a more complex variant of the Merkle tree. Ethereum needs the extra complexity because, in addition to transactions, it has to track state such as account balances and contract storage, which changes frequently.

If you want to learn more about how Merkle trees work in Ethereum, then check out the detailed post here.

Other Merkle Tree Implementations: Use Cases

There are, of course, other Merkle tree implementations out there. One of the most popular is Git, a distributed version control system. It is used by programmers all over the world to manage their projects.

Another useful implementation is found in the InterPlanetary File System (IPFS), a peer-to-peer distributed protocol. It is open source and enables computing devices to join and use a ubiquitous file system.

Even certificate authorities use Merkle trees to their advantage, in the mechanism that creates verifiable Certificate Transparency logs. Because these logs are huge, Merkle trees let computers verify them without wasting too much time and effort.

The last use case that we are going to discuss is database systems such as Amazon DynamoDB and Apache Cassandra. These distributed NoSQL databases use Merkle trees to detect inconsistencies between replicas during data replication. If replicas disagree, the database can update or repair the data using the anti-entropy repair process.
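
The idea behind this kind of comparison can be sketched as follows: two replicas build a tree over the same key ranges, compare roots, and descend only into subtrees whose hashes differ. This is a simplified illustration and not the actual algorithm of any particular database. The bucket layout, the power-of-two restriction, and the function names are assumptions.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(buckets: list[bytes]) -> list[list[bytes]]:
    """Return tree levels ordered from the root (index 0) down to the leaves."""
    count = len(buckets)
    if count == 0 or count & (count - 1) != 0:
        raise ValueError("this sketch assumes a power-of-two number of buckets")
    level = [sha256(bucket) for bucket in buckets]
    levels = [level]
    while len(level) > 1:
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels[::-1]


def differing_buckets(replica_a: list[bytes], replica_b: list[bytes]) -> list[int]:
    """Find the bucket indexes whose contents differ between two replicas."""
    if len(replica_a) != len(replica_b):
        raise ValueError("both replicas must use the same bucket layout")
    tree_a, tree_b = build_levels(replica_a), build_levels(replica_b)
    suspects = [0]  # start at the root
    for depth in range(len(tree_a)):
        # Keep only nodes whose hashes disagree; matching subtrees are skipped.
        suspects = [i for i in suspects if tree_a[depth][i] != tree_b[depth][i]]
        if depth + 1 < len(tree_a):
            suspects = [child for i in suspects for child in (2 * i, 2 * i + 1)]
    return suspects


replica_a = [f"range-{n}:value".encode() for n in range(8)]
replica_b = list(replica_a)
replica_b[5] = b"range-5:stale-value"
print(differing_buckets(replica_a, replica_b))  # [5]
```

Only the buckets reported as different need to be exchanged and repaired. If the roots match, the replicas are in sync and nothing else has to be compared.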

In short, the use cases of Merkle trees include:

  • Data synchronization
  • Data verification
  • Consistency verification

Merkle Tree Benefits

In this section, we will take a quick look at the benefits of Merkle trees.

  • Validates the integrity of data: A Merkle tree can be used to check that data has not been altered.
  • Takes little disk space: A verifier only needs to store the root hash rather than the full data set.
  • Tiny information across networks: A Merkle tree can be divided into small branches, so only a little information has to be sent over the network for verification.
  • Efficient verification: Verifying a piece of data takes only a number of hash computations proportional to the height of the tree.
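
A quick calculation shows how small a proof stays as the data grows. The sketch below assumes 32-byte hashes (as with SHA-256) and a binary tree that pads odd levels by duplication, so a proof holds one hash per level below the root. The function name proof_size_bytes is hypothetical.

```python
def proof_size_bytes(num_leaves: int, hash_size: int = 32) -> int:
    """Bytes of sibling hashes needed to prove one leaf in a binary Merkle tree."""
    if num_leaves < 1:
        raise ValueError("the tree needs at least one leaf")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1: the tree height.
    height = (num_leaves - 1).bit_length()
    return height * hash_size


for leaves in (1_024, 1_000_000):
    full_list = leaves * 32  # a hash list needs every hash
    print(leaves, proof_size_bytes(leaves), full_list)
# 1024 leaves: 320 bytes for a proof versus 32768 bytes for the full hash list
# 1000000 leaves: 640 bytes for a proof versus 32000000 bytes for the full hash list
```

The proof grows with the logarithm of the number of leaves, while a hash list grows linearly.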

Conclusion

The Merkle tree is one of the important concepts in computer science. It has many use cases, and its use in cryptocurrency has given rise to a revolutionary technology: blockchain.

So, what do you know about Merkle trees? Comment below and let us know.


About Author

Nitish holds a BSc in computer engineering. He is a blockchain enthusiast and, in his spare time, likes to read about the moon. His articles have been published on Dzone, InfoWorld, and Hongkiat.
