Decentralized Storage

Technical Overview

In this module, we’ll explore how IPFS stores files. IPFS, or the InterPlanetary File System, is a peer-to-peer network in which participants share files directly with one another. IPFS itself is not a blockchain: it identifies each piece of content by its cryptographic hash, which lets any node detect bad data. Because IPFS has no central server, long-term file storage relies on a companion blockchain-based incentive layer, such as Filecoin, to reward nodes for keeping data available and to support safe and reliable communication.

Incentives and Blockchains

In order for a decentralized network to operate, it needs two things:
  1. A common source of truth
  2. A way to reward good behaviour

In the case of decentralized storage, a blockchain is used to track network activity, and participants must typically stake a deposit, which is returned if their activity is deemed non-malicious.
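
The staking idea can be illustrated with a minimal sketch. The `StakeLedger` class, its method names, and the all-or-nothing refund rule are hypothetical and chosen only for illustration; real networks define their own staking and penalty rules.

```python
from dataclasses import dataclass, field


@dataclass
class StakeLedger:
    """Toy ledger (hypothetical): tracks balances and locked stakes per node."""
    balances: dict = field(default_factory=dict)
    stakes: dict = field(default_factory=dict)

    def deposit_stake(self, node: str, amount: int) -> None:
        # Lock part of the node's balance before it may participate.
        if self.balances.get(node, 0) < amount:
            raise ValueError("insufficient balance to stake")
        self.balances[node] -= amount
        self.stakes[node] = self.stakes.get(node, 0) + amount

    def settle(self, node: str, behaved_honestly: bool) -> int:
        # Return the stake to an honest node; a malicious node forfeits it.
        stake = self.stakes.pop(node, 0)
        refund = stake if behaved_honestly else 0
        self.balances[node] = self.balances.get(node, 0) + refund
        return refund


ledger = StakeLedger(balances={"alice": 100, "mallory": 100})
ledger.deposit_stake("alice", 40)
ledger.deposit_stake("mallory", 40)
ledger.settle("alice", behaved_honestly=True)     # alice gets her 40 back
ledger.settle("mallory", behaved_honestly=False)  # mallory loses her 40
```

In this toy model the honest node ends with its full balance restored, while the misbehaving node ends 40 units poorer, which is the incentive the stake is meant to create.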

This course has mostly focused on IPFS, but a number of other projects are entering the decentralized storage space, and each has chosen to accomplish this goal in a slightly different way. In general, the framework for a decentralized storage network can be summarized as shown below. If you would like a deeper dive into this technology, check out our full course on IPFS.


1: The data (A) is broken into many pieces, or shards (B)
2: Each shard is encrypted (C) using the public key of the user who wants to store the file
3: A hash (D) is generated for each shard
4: The encrypted shards (C) are distributed to the peer nodes for storage
5: The encrypted shards are replicated across many peer nodes, each of which holds a copy of the common ledger (F)
6: The shard hashes are recorded to the blockchain (E) for reference during retrieval

While not depicted above, the creators of IPFS have also built Filecoin, a separate blockchain network with its own token, which rewards participants who reliably store shards and provide the correct data when it is requested.
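
The numbered steps above can be sketched in a few lines of Python. This is a simplified illustration, not the implementation of IPFS or any other network: the shard size, the replica count, the round-robin placement, and the use of plain dictionaries for nodes and a list for the ledger are all assumptions. Encryption (step 2) is omitted because Python's standard library has no public-key encryption; in a real system each shard would be encrypted before it is hashed.

```python
import hashlib

SHARD_SIZE = 4  # bytes per shard; unrealistically small, for demonstration only
REPLICAS = 2    # assumed number of copies kept of each shard


def split_into_shards(data: bytes, size: int = SHARD_SIZE) -> list:
    """Step 1: break the data into fixed-size pieces (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def shard_hash(shard: bytes) -> str:
    """Step 3: compute a SHA-256 fingerprint of one shard."""
    return hashlib.sha256(shard).hexdigest()


def distribute(shards: list, nodes: list, replicas: int = REPLICAS) -> list:
    """Steps 4-6: store each shard on several nodes and record its hash.

    Each node is modelled as a dict mapping hash -> shard bytes.
    The returned list stands in for the hashes recorded on the ledger.
    """
    ledger = []
    for index, shard in enumerate(shards):
        digest = shard_hash(shard)
        ledger.append(digest)
        for r in range(replicas):
            # Simple round-robin placement (an assumption, not a real policy).
            node = nodes[(index + r) % len(nodes)]
            node[digest] = shard
    return ledger


data = b"hello decentralized storage"
shards = split_into_shards(data)
nodes = [{}, {}, {}]
ledger = distribute(shards, nodes)
```

Note that only the hashes end up in `ledger`; the shard contents live solely on the storage nodes.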

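Because a blockchain must be stored on every full node of the network, keeping data on it is expensive. As a result, only a hash of each piece of a file is recorded on the blockchain, where it acts as a compact fingerprint of that piece. The pieces themselves can then be distributed to storage nodes, and any substitution or tampering is detectable because an altered piece no longer matches its recorded hash.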
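
A short sketch shows both points: the hash is tiny compared with the data it represents, and a substituted piece fails verification. SHA-256 is used here as an assumed hash function, and the helper names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_shard(shard: bytes, recorded_hash: str) -> bool:
    """Check a shard returned by a storage node against the recorded hash."""
    return sha256_hex(shard) == recorded_hash


# The user records only the hash; the shard itself goes to storage nodes.
original = b"shard contents as stored by the user"
recorded_hash = sha256_hex(original)

# Later, two nodes answer a retrieval request.
honest_reply = original
tampered_reply = b"shard contents as altered by a bad node"

honest_ok = verify_shard(honest_reply, recorded_hash)      # matches
tampered_ok = verify_shard(tampered_reply, recorded_hash)  # does not match

# Size comparison: a 1 MiB shard is represented by a 32-byte digest.
big_shard = bytes(1024 * 1024)
digest_size = len(hashlib.sha256(big_shard).digest())
```

The digest length stays fixed no matter how large the shard is, which is why recording hashes rather than data keeps the ledger small.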