The following two lessons were written to provide a good foundation for building or working with Bitcoin wallets. We’ll cover a bit of history and then the current de facto standard for Bitcoin wallets, which consists of four BIPs: 39, 32, 43 and 44.
These BIPs cover mnemonic seeds and hierarchical deterministic (HD) wallets.
Cryptocurrency wallets have evolved substantially since the first wallet, which was included in the original version of the Bitcoin Core client. Let’s first discuss some of the history of Bitcoin wallets to understand why these improvements came about.
That first wallet was of a type often called JBOK (“Just a Bunch Of Keys”): simply a collection of randomly generated private keys with no particular relation to one another.
While these wallets worked, backing up the wallet data was a problem. For greater privacy, addresses were not reused, which meant that new addresses, and thus new keys, were generated regularly. A new address was created for each receive transaction, with a cap of 100 addresses/keys kept at any time.
An early method for backing up a wallet was simply to copy the wallet.dat file that contained the private keys and store this file in a secure location. However, unless backups were taken regularly, the copy would quickly become outdated.
If a user attempted to restore a heavily used JBOK wallet, they might find that the keys in their backed-up wallet.dat file no longer controlled any value on the blockchain. Essentially, a backup could quickly become useless.
Also, backing up a wallet meant storing what was essentially a database file. This made backups not at all user-friendly.
To solve these problems, mnemonic seeds and HD wallets were created.
The process outlined in BIP 39 is the current best practice for creating wallets with mnemonic seeds.
So, what is a mnemonic seed?
It is a set of data encoded as a series of words, usually 12 or 24, which can be used to restore an entire wallet.
HD wallets, which we will cover in depth in the next lesson, have a parent private key which can be used to derive many child keys. In this way, one set of data, which is called the “seed”, can be used to migrate or restore a wallet with many keys and addresses.
It is, however, important to note that the “seed” provides more than just the parent private key. It also provides the “chain code”, a piece of data that is necessary for deriving the child keys, which we’ll discuss in the next lesson.
What follows is a simplified version of the process, meant simply to give you a starting point. For a more in-depth explanation, please see the BIP 39 repo and chapter 5 of Mastering Bitcoin, 2nd Edition.
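First, the wallet should have some source of entropy, which is used to generate 128 to 256 bits of random data.
A short checksum is appended to this data, and the result is split into 11-bit groups. Each group is an index into a predefined dictionary of 2048 words (2^11 = 2048). This is how the 12 to 24 words that make up the mnemonic phrase are selected.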
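The mapping can be sketched as follows. Per BIP 39, a checksum (the first ENT/32 bits of the SHA-256 hash of the entropy) is appended before the bits are cut into 11-bit word indices. The function name `entropy_to_word_indices` is made up for this illustration, and the sketch stops at the indices rather than embedding the 2048-word list; a real wallet would look each index up in the BIP 39 word list.

```python
import hashlib
import secrets
from typing import List


def entropy_to_word_indices(entropy: bytes) -> List[int]:
    """Turn 128-256 bits of entropy into BIP 39 word-list indices (0..2047)."""
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("entropy must be 128, 160, 192, 224 or 256 bits")

    # Checksum length is ENT / 32 bits, taken from the start of SHA-256(entropy).
    cs_bits = ent_bits // 32
    first_hash_byte = hashlib.sha256(entropy).digest()[0]
    checksum = first_hash_byte >> (8 - cs_bits)

    # Concatenate entropy and checksum as one big integer.
    value = (int.from_bytes(entropy, "big") << cs_bits) | checksum

    # Cut the result into 11-bit groups, most significant group first.
    n_words = (ent_bits + cs_bits) // 11
    return [(value >> (11 * (n_words - 1 - i))) & 0x7FF for i in range(n_words)]


# Hypothetical usage: 128 bits of fresh entropy give 12 indices.
indices = entropy_to_word_indices(secrets.token_bytes(16))
print(len(indices), indices)
```

With 128 bits of entropy the checksum is 4 bits, giving 132 bits or 12 words; with 256 bits it is 8 bits, giving 264 bits or 24 words.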
Once the mnemonic words have been selected, a “salt” is built, optionally including a password (BIP 39 calls it a passphrase). In BIP 39 compatible wallets, the salt is automatically set to the string “mnemonic” followed by the password, if one was added.
Once the mnemonic phrase and salt have been selected, they are run through a “key stretching function” (PBKDF2 with HMAC-SHA512). This function hashes the data 2048 times, which produces the 512-bit wallet seed.
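As a minimal sketch of this step, Python’s standard library can perform the key stretching directly. BIP 39 specifies PBKDF2 with HMAC-SHA512, 2048 iterations, and a 64-byte output, with both inputs normalized to Unicode NFKD. The function name `mnemonic_to_seed` and the example phrase are chosen for this illustration only.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch a mnemonic phrase into the 512-bit (64-byte) wallet seed."""
    # BIP 39 requires NFKD normalization of both the phrase and the salt.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    # The salt is the literal string "mnemonic" followed by the optional passphrase.
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # 2048 rounds of HMAC-SHA512, producing 64 bytes.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


# Example phrase from the published BIP 39 test vectors; never use it for real funds.
phrase = " ".join(["abandon"] * 11 + ["about"])
seed = mnemonic_to_seed(phrase)
print(len(seed) * 8, "bit seed:", seed.hex())
```

Note that the same phrase with a different passphrase yields a completely different seed, which is why a forgotten passphrase makes the wallet unrecoverable even when the words are known.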
The mnemonic phrase, the parent private key and the seed are not the same thing.
The seed is built from the mnemonic phrase and the salt, and it is the data from which both the parent private key and the chain code are obtained.
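To make the relationship concrete, here is a small sketch of how BIP 32 obtains the parent (master) private key and the chain code from the seed: the seed is passed through HMAC-SHA512 with the key “Bitcoin seed”, and the 64-byte result is split in half. The function name `seed_to_master` is hypothetical, and the sketch omits the validity check BIP 32 requires on the key half.

```python
import hashlib
import hmac
from typing import Tuple


def seed_to_master(seed: bytes) -> Tuple[bytes, bytes]:
    """Split HMAC-SHA512("Bitcoin seed", seed) into (private key, chain code)."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    master_private_key = digest[:32]  # left 32 bytes
    chain_code = digest[32:]          # right 32 bytes
    # BIP 32 also rejects a key that is zero or not below the secp256k1
    # curve order; that check is left out of this sketch.
    return master_private_key, chain_code


# Seed taken from BIP 32 test vector 1 (public test data, not a real wallet).
key, chain_code = seed_to_master(bytes.fromhex("000102030405060708090a0b0c0d0e0f"))
print("private key:", key.hex())
print("chain code: ", chain_code.hex())
```

Both halves are needed to derive child keys, which is why backing up the parent private key alone is not enough to restore an HD wallet.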
The seed contains all the components necessary to migrate or restore an HD wallet.
Before we move on, let’s take a quick look at wallets in the context of internet security and usage. Often, when wallet types are being discussed, the speaker is referring to the device on which the wallet seed is stored or from which it is accessed. This matters both for accessing funds and for keeping them safe from theft.
We’ll have a look at chain codes and HD wallets in the next lesson.