Introduction to Bitcoin

Posted by Balaji Srinivasan

This document provides a fast-paced interactive introduction to basic Bitcoin concepts. The purpose is to give you enough knowledge of the Bitcoin protocol to write micropayments-capable apps and demos. Let’s dive in!

How to learn (enough) Bitcoin in one week

Bitcoin is a highly interconnected subject, and it can be tricky to explain one piece without assuming some knowledge of the others. The approach we recommend is as follows:

  • Step 1: First, go and read the original paper by Satoshi Nakamoto (bitcoin.pdf).

  • Step 2: Next, set up your development environment using the instructions below.

  • Step 3: Now go through the Introduction to Bitcoin section and read through the questions to get a high-level overview of Bitcoin.

  • Step 4: Next, do the Interactive Introduction section and type in the commands. You won’t understand everything, but you will learn by doing, kind of like immersion in a foreign language.

  • Step 5: Finally, skim the reference section if you see fit.

You will then know enough Bitcoin to write Bitcoin apps!

How we’ve abstracted away Bitcoin guts so you can build Bitcoin apps

Note: don't worry if you don't fully understand everything you type in this first lab. The goal is to learn by doing, and you actually won't need to know much in the way of Bitcoin internals to do the projects in this class. As an analogy, you don't need to keep in mind the full seven layer model of the internet to simply perform an HTTP request. In the same way, we have abstracted away a big chunk of Bitcoin knowledge behind a little snippet of Python code that allows you to turn any web-accessible URL into a machine-payable endpoint. Here's a simple example of how that works (you don't need to run this now):

#!/usr/bin/env python3
from subprocess import call
from uuid import uuid4

from flask import Flask
from flask import request
from flask import send_from_directory

# Import from the 21 Bitcoin Library
from two1.wallet import Wallet
from two1.bitserv.flask import Payment

# Configure the app and wallet
app = Flask(__name__)
wallet = Wallet()
payment = Payment(app, wallet)

# Charge a fixed fee of 3000 satoshis per request to the
# /tts endpoint
@app.route('/tts')
@payment.required(3000)
def tts():
    text = str(request.args.get('text'))
    file = str(uuid4()) + '.wav'
    call(['espeak', '-w', '/tmp/' + file, text])
    return send_from_directory('/tmp', file, as_attachment=True)

if __name__ == '__main__':
    app.run(host='::')

The key line there is the @payment.required(3000), which turns the call to this text to speech API into something which costs 3000 Satoshis per request. It does this by taking the existing tts function and wrapping it with a Python decorator. As you may already know, a decorator is a function that takes another function as an argument and modifies it by adding some shared setup or teardown code. In this case it modifies the tts function by requiring a Bitcoin micropayment for the function to execute.

Again, though, the tricky parts of doing that are mostly hidden from you; for the purposes of this class, you should be able to just add a payment.required decorator before an endpoint to make it micropayments-capable. Our focus after next week is going to be at a higher level of the stack: more on the micropayments apps themselves than on the mechanics of how the payment is completed.
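To make the decorator idea concrete, here is a minimal, self-contained sketch of a payment-gating decorator. It is not how the 21 library implements payment.required; the names FakeWallet and require_payment are hypothetical and exist only for illustration.

```python
from functools import wraps


class FakeWallet:
    """Hypothetical stand-in for a wallet; it only tracks a satoshi balance."""

    def __init__(self, balance):
        self.balance = balance

    def pay(self, amount):
        # Refuse the payment if the balance is too low.
        if amount > self.balance:
            return False
        self.balance -= amount
        return True


def require_payment(wallet, price):
    """Return a decorator that charges `price` satoshis before each call."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Shared setup code: take the payment before running the function.
            if not wallet.pay(price):
                return 'Payment Required'
            return func(*args, **kwargs)
        return wrapper
    return decorator


customer = FakeWallet(balance=5000)


@require_payment(customer, 3000)
def shout(text):
    return text.upper()


first = shout('hello')    # paid: the balance drops from 5000 to 2000
second = shout('again')   # refused: only 2000 satoshis are left
print(first, '|', second, '|', customer.balance)
```

The wrapped function body never changes; all of the payment logic lives in the wrapper, which is why a single decorator line is enough to make an endpoint payable.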

Step 1: Read the original Bitcoin paper and skim the codebase

Satoshi Nakamoto's original manuscript (bitcoin.org/bitcoin.pdf) is a classic of technical exposition. It describes the core concepts behind Bitcoin in lucid prose in just 9 pages. Of course, in its own right the paper is not a full implementation of the Bitcoin protocol; if you need definitive reference information about the Bitcoin protocol, you'll want to search the C++ codebase for the core implementation at github.com/bitcoin/bitcoin.

Step 2: Set up your Bitcoin development environment

The tutorial Introduction to 21 provides a complete set of instructions to help you set up a Bitcoin development environment.

Step 3: Introduction to Bitcoin

Let's go through Bitcoin in an interactive way, starting by answering questions.

Why is Bitcoin interesting?

To motivate the ideas behind Bitcoin, let's start with physical cash.

  • Physical cash. Suppose that Alice gives Bob a $1 bill. Then Alice no longer has that $1 bill and Bob does have it. Moreover, both Alice and Bob know that the bill has been transferred from Alice to Bob. This is such an intrinsic property of physical cash that we take it for granted - namely, that only one person can possess a particular dollar bill at a time.
  • Naive digital cash. As a next step, suppose that Alice wants to send Bob $1 over the internet. An extremely simple way of doing this would be for Alice to look at the bill, find the unique serial number, and then email that number to Bob. If Bob tries to treat this number itself as cash, a problem arises: Alice still has a copy of the number. Bob cannot know for sure that Alice has deleted her serial number. Alice can, if she so chooses, attempt to double spend the bill.
  • Centralized digital cash. This double spending problem is the key issue for digital cash. The only way to solve it before Bitcoin was by interposing trusted third parties - banks - between Alice and Bob. Even if Alice and Bob don't know or trust each other, if they both trust Charlie at Wells Fargo, then the transfer of $1 from Alice to Bob can occur as follows:
    • Alice tells Charlie that she wants to pay Bob $1
    • Charlie debits Alice's account by $1
    • Charlie credits Bob's account by $1

    Charlie maintains the ledger of who has what money. The fact that Alice and Bob both trust Charlie to hand out credits and debits - to validate transactions by updating a ledger - is the way that scarcity can be re-introduced into the digital realm. This is the way the double spend problem is typically handled today. The problem, of course, is that Charlie must be trusted with immense power: the power over Alice and Bob's bank accounts. It would be elegant to remove the need for this power while preserving the ability to transfer cash digitally.
  • Decentralized digital cash. And that's where Bitcoin comes in. It solves the double spend problem in a different way: with a decentralized network of transaction processors called miners. Transaction validation in Bitcoin happens not by introducing trust, but by (partially and temporarily) removing privacy. That is, suppose that:
    • Alice and Bob took the pseudonyms 1F2gspw7 and 3MVBBzaD
    • 1F2gspw7 then broadcast to the entire Internet (not just Charlie) that it was paying 1 BTC (1 bitcoin) to 3MVBBzaD
    • A miner listening to the network hears that transaction, ensures 1F2gspw7 owns at least 1 BTC, and adds a record to a database (the Blockchain) such that 1F2gspw7's account is debited by 1 BTC and 3MVBBzaD is credited by 1 BTC
    • The miner then in turn broadcasts this database update to all other miners, such that their transaction databases are now in sync

That is roughly how Bitcoin works. By removing the taken-for-granted concept that only the trusted intermediary Charlie knows that Alice is paying Bob, it allows anyone with an internet connection to listen for the transaction between 1F2gspw7 and 3MVBBzaD and then process that transaction if they have sufficient computational power to mine it. Processed (mined) transactions are stored in the Bitcoin Blockchain, the database of every Bitcoin transaction that has ever happened, which every node in the Bitcoin system validates and stores locally.

There's another wrinkle: in return for spending the compute power to mine Bitcoin transactions, a miner will periodically get the chance to earn a "block subsidy" with a chunk of new bitcoin. This "block subsidy" is the means by which new bitcoins are introduced into the system, and it successively decreases over time such that only approximately 21 million Bitcoin will ever be mined.
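The "approximately 21 million" figure can be checked with a few lines of arithmetic. The sketch below relies on two consensus parameters that are not stated in the text above: the subsidy started at 50 BTC, and it halves every 210,000 blocks. Amounts are tracked in satoshis (100 million per bitcoin) so that the halving is integer division.

```python
# Assumptions (Bitcoin consensus parameters, not stated in the text above):
# the subsidy started at 50 BTC and halves every 210,000 blocks.
SATOSHIS_PER_BTC = 100_000_000
HALVING_INTERVAL = 210_000


def total_supply_satoshis():
    subsidy = 50 * SATOSHIS_PER_BTC
    total = 0
    while subsidy > 0:
        total += HALVING_INTERVAL * subsidy
        subsidy //= 2  # integer halving, since amounts are whole satoshis
    return total


total = total_supply_satoshis()
print(total / SATOSHIS_PER_BTC)
```

Because the subsidy is rounded down at each halving, the sum lands slightly below 21 million rather than exactly on it.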

We'll get into some of the details of how mining works, but just think about that for a second: the decentralized network of Bitcoin miners - people running computers around the world - replaces the need for banks. That's pretty amazing! This is the problem Bitcoin is intended to solve: decentralized digital cash without the need for a trusted third party intermediary.

What is Bitcoin?

Now we have some sense of why Bitcoin was invented: to solve the double spend problem, to create a true decentralized digital cash. But what is Bitcoin in practice? That is, going beyond the headlines and news articles, how is it actually implemented? Here's the simple way to think about the guts of Bitcoin:

  • There is a C++ codebase at github.com/bitcoin/bitcoin that is the "standard" implementation of the Bitcoin protocol.

  • This is called "Bitcoin Core" and it's available for download at bitcoin.org

  • You can download and run that software on any Mac, Windows, or Linux computer with an internet connection

  • This software provides:

    • the ability to discover and connect to peer nodes running the Bitcoin software

    • the ability to download, validate, and propagate blocks from peers. These blocks are updates of the blockchain, the distributed database of all Bitcoin transactions.

    • the ability to update the blockchain by processing transactions via Bitcoin mining

    • the ability to create, sign, and send transactions to the Bitcoin network, via a wallet that holds the private keys (like passwords) for each of your public bitcoin addresses (like disposable email addresses you can send money to)

There are many qualifications one can append to this simple description:

  • Beyond the C++ codebase, there are now several other good implementations of the protocol, including BitcoinJ (in Java), btcd (in Go), and libbitcoin (in C++)

  • While the code is functional, the Bitcoin Core client isn't anywhere near capable of mining blocks anymore on a typical computer, as Bitcoin mining (transaction processing) now requires very fast custom hardware

  • Many of the services bundled into the Bitcoin Core client have been split up and distributed into individual pieces of software - wallets, full nodes, miners, and the like all now abound as specialized pieces of software

But this overview of Bitcoin Core (the C++ codebase developed at github.com/bitcoin/bitcoin and downloadable at bitcoin.org) gives you a good sense of how the Bitcoin protocol is actually implemented in practice.

What is Bitcoin useful for?

Depending on how you measure, as of early 2016 there are a few million Bitcoin holders. Daily use in the sense of transaction volume has been ticking up, but right now there is arguably only one legal killer app for Bitcoin: buy Bitcoin and hope the price of Bitcoin increases. This was actually an excellent application in most recent years (2010, 2011, 2012, 2013, and 2015) other than 2014.

However, when thinking about future applications, Bitcoin has the potential to be good for transactions that are:

  • very large
  • very small
  • very fast
  • very international
  • and/or very automated

To understand this in more detail:

  • very large: you can send $1M or more easily with Bitcoin (see here for Wences's demo)

  • very small: you can also send fractions of a cent (feasible with payment channels)

  • very fast: you can send money and settle it such that the other party has full custody and can spend it within 60 minutes (much faster than typical SWIFT times, especially for international transfers)

  • very international: you can send money across borders between any two parties with an internet connection

  • very automated: you can easily programmatically send money without setting up a bank account

If you are dealing with a use case in which two, three, or more of these aspects are in play at the same time, you start to have something which would be difficult or impossible to do within the confines of the legacy financial system. For example, suppose that you were running an international crowdfunding site that took in micropayments from any country. You'd need to collect very small payments from people in many nations, and collect them fast enough to keep the counter increasing at a good clip. That use case combines the “very small”, “very international”, and “very fast” properties.

What is the Bitcoin Blockchain?

One word that keeps coming up above is the blockchain (and you've probably heard this in the popular press as well). In the context of Bitcoin, the blockchain is the database of all past Bitcoin transactions. It's updated on average roughly every 10 minutes by Bitcoin miners, who add on a new block of transactions to the "chain" of all past "blocks".

A good way of thinking about the Bitcoin blockchain is that it's the replayable, auditable history of the entire Bitcoin economy, stored in a single downloadable database. Anyone can download it from the internet, and 21 Bitcoin Computers ship with a recent copy in the ~/.bitcoin/blocks directory.

The Blockchain is like an accounting ledger on steroids. As we'll see, because the blockchain is based on cryptographic hashing and is broadcast/replicated worldwide, even a tiny difference of 1 thousandth of a cent in this ledger is detectable by anyone on the Bitcoin network.
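As a small illustration of why hashing makes tiny differences detectable, the sketch below hashes two ledger lines that differ only in the last digit. The ledger text is made up for this example and is not a real Bitcoin data format; only the use of SHA-256 reflects Bitcoin itself.

```python
import hashlib

# Two hypothetical ledger lines that differ by a single satoshi.
ledger_a = b'1F2gspw7 pays 3MVBBzaD 1.00000000 BTC'
ledger_b = b'1F2gspw7 pays 3MVBBzaD 1.00000001 BTC'

digest_a = hashlib.sha256(ledger_a).hexdigest()
digest_b = hashlib.sha256(ledger_b).hexdigest()

print(digest_a)
print(digest_b)
print('ledgers match:', digest_a == digest_b)
```

The two digests look unrelated even though the inputs differ by one character, so anyone comparing hashes notices the discrepancy immediately.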

What does it mean when people talk about blockchains outside of Bitcoin? Frankly, the concept is still somewhat inchoate, but it essentially means a database that has support for blockchain-like data structures but can be updated within a trusted community without the so-called Bitcoin mining process. Over time this may lead to an intranet/internet-like division, where organizations set up trusted intranets/blockchains within a (potentially large) circle of trusted peers, and then periodically spend BTC to broadcast their transactions to the Bitcoin Blockchain.

What is Bitcoin Mining (and why mine?)

Mining is the part of Bitcoin that is perhaps the most different from the existing financial system and hardest to understand. Gold mining might be the closest analog, as gold miners compete over a relatively fixed supply of gold. But the key is to understand the problem that mining is intended to solve: to distribute the power of processing Bitcoin transactions so that no one party can block/reverse transactions (thereby freezing accounts/seizing funds) or award themselves large amounts of bitcoin out of nowhere (print money).

To do this you need:

  • some way for an arbitrary party to onboard as a transaction processor without anyone else's approval (no licensing)

  • but also some way to limit their power and make it expensive for a new party to just waltz in and process a transaction that awards themselves a billion dollars (tamper resistance)

  • and, if you do make it expensive in this way, some incentive for legitimate miners to take on this expensive business of transaction approval

Bitcoin mining solves these three interrelated issues by marrying transaction processing with seigniorage (money printing). Anyone who runs Bitcoin mining software, listens to the network for pending transactions, and successfully processes a block of transactions gains the ability to award themselves a so-called block subsidy of 25 BTC [as of early 2016]. At $400/BTC, this is worth about $10k per block, representing a fairly good incentive to mine.

Mining thus allows anyone in the world with sufficient computational power to immediately onboard and start validating transactions, without pre-approval from anyone else in the Bitcoin network.

What is a Bitcoin transaction?

Recall that above we described Alice (1F2gspw7) broadcasting her intent to send 1 BTC to Bob (3MVBBzaD) to the network of miners. How does that actually work? The answer is as follows:

  • Alice learns about Bob's public Bitcoin address 3MVBBzaD, in the same way she might learn Bob's public email address

  • Alice then creates a data structure called a transaction that expresses her intent to transfer 1 BTC from one of her so-called "Bitcoin addresses" (in this case 1F2gspw7) to one of Bob's Bitcoin addresses (in this case 3MVBBzaD)

  • Alice signs this transaction with her so-called private key, which is like a password for her public 1F2gspw7 Bitcoin address

  • Alice sends this transaction out to the Bitcoin network

  • A miner processes the transaction, confirming that Alice's signature is valid and that her private key does indeed give her the authority to transfer the 1 BTC from the 1F2gspw7 Bitcoin address to Bob

  • Everyone else in Bitcoin has an opportunity to check the miner's work. If the miner did the work correctly, the miner is allowed by others in the Bitcoin Network to spend the 25 BTC block subsidy.

Note that Alice essentially needs to use a password (a private key) to prove to the Bitcoin network (as represented by the miners doing transaction processing) that she owns the 1 BTC that she is trying to transfer to Bob.

While the relationship between a private key and a public Bitcoin address is similar to the relationship between a private password and a public email address, there is one crucial difference: the private key/public Bitcoin address relationship is a mathematical one in the sense that the private key completely determines the public Bitcoin address, while the private password/public email address relationship is an arbitrary one in that you pick the password and username separately and join them on the server side database.

Understanding the full details of how public and private keys work is outside the scope of this class, but go and read here about public key cryptography in the context of Bitcoin.

What is a Bitcoin wallet? What does it mean to "have bitcoin"?

A Bitcoin wallet is a container for the above mentioned private keys. Each wallet contains multiple private keys corresponding to multiple Bitcoin addresses, kind of like you can have multiple bank accounts as a customer at the same bank.

Because Bitcoin is a digital currency, it's not immediately obvious what it means to say that you "have 10 BTC". In practice, if you (and only you) have custody of the private key for the public Bitcoin address that contains a 10 BTC balance, you have possession of those 10 bitcoin. It's analogous to email in that you "have the foobar Gmail account" if you have the private password for the public email address foobar@gmail.com.

Step 4: An Interactive Introduction to Bitcoin

Now that you’ve set up your device (Step 2) and we’ve gone through some of the high level concepts behind Bitcoin (Step 3), we’re going to go through several interactive exercises that will help you understand how Bitcoin actually works. Specifically, we go through the following concepts:

  • What does a block look like on disk?

  • How does one parse a block in the blockchain?

  • How does mining work at a high level?

  • How do I mine a block of transactions?

  • How do I just quickly mine some bitcoin for programming?

For each of these concepts, you will type in some commands and immediately see the guts of Bitcoin - how blocks actually look on disk, what the major data structures look like in Bitcoin, how mining actually works at a low level, and so on.

Exercise 1: What does a block look like on disk?

What you'll learn

As you may recall from the introduction above, the Bitcoin Blockchain is the record of every transaction that has ever happened in Bitcoin. It is a new kind of database that represents the replayable history of a rapidly growing economy. On average, every 10 minutes, a new block of transactions is incorporated into the blockchain. You can see this happening in realtime here.

You’ve heard of the blockchain. But what does the first block in the blockchain look like? All the way back at the beginning of Bitcoin on January 3, 2009, Satoshi Nakamoto mined the so-called Genesis Block. In this tutorial we’ll look at the hexadecimal representation of the Genesis Block and learn where the Blockchain is stored on disk by Bitcoin Core.

The Genesis Block

The first block on the Bitcoin Blockchain, block 0, is called the Genesis Block. According to the timestamp in the block header, it was mined on 3 Jan 2009 at 18:15:05 UTC. Let's take a look at it on disk by running the following commands:

cd ~/.bitcoin/blocks
hexdump -n 255 -C blk00000.dat

The cd command changes your directory into the Bitcoin Core block data directory. The hexdump command above displays the first 255 bytes of the file. It should look like this:

  00000000  f9 be b4 d9 1d 01 00 00  01 00 00 00 00 00 00 00  |................|
  00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  00000020  00 00 00 00 00 00 00 00  00 00 00 00 3b a3 ed fd  |............;...|
  00000030  7a 7b 12 b2 7a c7 2c 3e  67 76 8f 61 7f c8 1b c3  |z{..z.,>gv.a....|
  00000040  88 8a 51 32 3a 9f b8 aa  4b 1e 5e 4a 29 ab 5f 49  |..Q2:...K.^J)._I|
  00000050  ff ff 00 1d 1d ac 2b 7c  01 01 00 00 00 01 00 00  |......+|........|
  00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  00000070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
  00000080  ff ff 4d 04 ff ff 00 1d  01 04 45 54 68 65 20 54  |..M.......EThe T|
  00000090  69 6d 65 73 20 30 33 2f  4a 61 6e 2f 32 30 30 39  |imes 03/Jan/2009|
  000000a0  20 43 68 61 6e 63 65 6c  6c 6f 72 20 6f 6e 20 62  | Chancellor on b|
  000000b0  72 69 6e 6b 20 6f 66 20  73 65 63 6f 6e 64 20 62  |rink of second b|
  000000c0  61 69 6c 6f 75 74 20 66  6f 72 20 62 61 6e 6b 73  |ailout for banks|
  000000d0  ff ff ff ff 01 00 f2 05  2a 01 00 00 00 43 41 04  |........*....CA.|
  000000e0  67 8a fd b0 fe 55 48 27  19 67 f1 a6 71 30 b7 10  |g....UH'.g..q0..|
  000000f0  5c d6 a8 28 e0 39 09 a6  79 62 e0 ea 1f 61 de     |\..(.9..yb...a.|
  000000ff

The output has three columns:

  1. The left side is the byte count; it starts at zero and increases by 16 bytes on each line except the last.

  2. The middle part is each byte with a space between it and the next byte. Bytes are represented here in hexadecimal which takes two characters to represent a single byte. When we refer to byte sequences below, we'll use the common prefix of 0x. For example, the first byte in the output is 0xf9.

  3. On the right is the ASCII text representation of the hexadecimal data. Bytes that don't map to displayable ASCII are shown as periods. Almost none of the data in the blockchain is ASCII, but Nakamoto left us a surprise here (see below for details).

Note the first four bytes, 0xf9beb4d9. This is called "Bitcoin's magic number", although it isn't really magical. It's just four arbitrary bytes chosen by Nakamoto as a byte sequence otherwise unlikely to appear in a Bitcoin datastream. In the block data files, these bytes identify the start of a new block, but they aren't part of the block itself. These bytes also start each message in the Bitcoin peer-to-peer network protocol to help clients tell when one message ends and another begins.

You'll note that the early part of the block contains many zero bytes (0x00). That's because the Genesis Block, unlike every other Bitcoin block, doesn't reference a previous block; where the previous block's hash would go, it just contains a long string of zeroes. This is truly the beginning of the Bitcoin blockchain.
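The first 88 bytes of the dump can be decoded with Python's standard struct module. This is a sketch rather than a full block parser: the bytes are copied from the hexdump above, and the field layout (magic bytes, a 4-byte block size, then an 80-byte header holding version, previous block hash, merkle root, timestamp, difficulty bits, and nonce) is the standard block file layout, which the text above only partly describes.

```python
import struct
from datetime import datetime, timezone

# First 88 bytes of blk00000.dat, copied from the hexdump above.
hex_fields = [
    'f9beb4d9',                          # magic bytes
    '1d010000',                          # size of the block that follows
    '01000000',                          # header: version
    '00' * 32,                           # header: previous block hash
    '3ba3edfd7a7b12b27ac72c3e67768f61'   # header: merkle root
    '7fc81bc3888a51323a9fb8aa4b1e5e4a',
    '29ab5f49',                          # header: timestamp
    'ffff001d',                          # header: difficulty bits
    '1dac2b7c',                          # header: nonce
]
raw = bytes.fromhex(''.join(hex_fields))

magic = raw[:4]
# '<' selects little-endian with no padding; 'I' is an unsigned 32-bit int.
(block_size,) = struct.unpack('<I', raw[4:8])
version, prev_hash, merkle_root, timestamp, bits, nonce = struct.unpack(
    '<I32s32sIII', raw[8:88])

mined_at = datetime.fromtimestamp(timestamp, tz=timezone.utc)
print(magic.hex(), block_size, version, mined_at.isoformat(), nonce)
```

Note that every integer is stored little-endian, which is why the version appears on disk as 01 00 00 00 rather than 00 00 00 01.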

Finally, in the ASCII text section, you'll see the message Nakamoto left us:

The Times 03/Jan/2009 Chancellor on brink of second bailout for banks

This was part of a special field in the special first transaction of a block, called a coinbase transaction. In Nakamoto's day, this field allowed miners to include up to 100 bytes of arbitrary data in each block they made. Today, that's been reduced to 97 bytes, but still most miners include a message of some sort in every block they make.
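The message can be recovered directly from the hex column of the dump. In the sketch below, the bytes are the 69 bytes that follow the 0x45 byte at offset 0x8a in the hexdump above; reading 0x45 (decimal 69) as the length of the data that follows is standard Bitcoin Script behavior rather than something spelled out in the text.

```python
# The 69 (0x45) bytes that follow the 0x45 length byte in the hexdump above.
message_hex = (
    '5468652054'
    '696d65732030332f4a616e2f32303039'
    '204368616e63656c6c6f72206f6e2062'
    '72696e6b206f66207365636f6e642062'
    '61696c6f757420666f722062616e6b73'
)
message = bytes.fromhex(message_hex).decode('ascii')
print(len(message), message)
```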

Nakamoto's message is especially important. It's the actual headline (although slightly shortened) of a real newspaper, The Times of London, from the indicated date. This message, which may also express some of Nakamoto's frustration with the pre-Bitcoin financial system, provides extremely strong evidence that Nakamoto couldn't have mined his block earlier than the morning of January 3rd.

(Based on the second block on the blockchain, dated January 8th, it's possible that Nakamoto didn't actually finish mining the Genesis Block until a few days later.)

Why is this important? We also know that Nakamoto announced that the Bitcoin software was available on the 8th, and that there was another notable user (cryptographer Hal Finney) later that day, so the proof that he didn't start mining until the 3rd shows that Nakamoto didn't attempt to mine all the early blocks himself in an attempt to create some sort of Ponzi scheme.

If you want to see more of the blockchain, replace the number 255 in the command above with a higher number, like 50000 or so:

hexdump -n 50000 -C blk00000.dat

...but beware, the blockchain is pretty boring for its first few tens of thousands of blocks!

Exercise 2: How to parse the blockchain

What you'll learn

You just saw what a single block (the Genesis Block) in the Blockchain looks like on disk in terms of its raw hexadecimal representation. However, the hexadecimal representation alone is only understandable once we understand how to parse that binary data and turn it into useful data structures. In this exercise you will go through the data structures in the blockchain.

How the blockchain is organized

At a high level, the blockchain is organized on disk as follows:

  • The blockchain is kept in the ~/.bitcoin/blocks directory

  • The blockchain is made up of files with names of the form blkNNNNN.dat, starting with blk00000.dat

One of the best ways to start understanding Bitcoin is to just learn how to parse the Blockchain, which is the record of every transaction that’s ever happened in Bitcoin. Your Bitcoin Computer has a recent copy of the blockchain in ~/.bitcoin/blocks.

In this section you will get hands on experience with every single data structure in the blockchain.

Although some of the most interesting features in Bitcoin are part of blocks, none of them has any purpose if people aren’t making transactions, so let's start by importing a raw transaction from hex. Start python3:

python3

And enter the code from lines with prompts (>>>):

>>> from two1.bitcoin import txn

>>> tx = txn.Transaction.from_hex('01000000014dff4050dcee16672e48d755c6dd25d324492b5ea306f85a3ab23b4df26e16e9000000008c493046022100cb6dc911ef0bae0ab0e6265a45f25e081fc7ea4975517c9f848f82bc2b80a909022100e30fb6bb4fb64f414c351ed3abaed7491b8f0b1b9bcd75286036df8bfabc3ea5014104b70574006425b61867d2cbb8de7c26095fbc00ba4041b061cf75b85699cb2b449c6758741f640adffa356406632610efb267cb1efa0442c207059dd7fd652eeaffffffff020049d971020000001976a91461cf5af7bb84348df3fd695672e53c7d5b3f3db988ac30601c0c060000001976a914fd4ed114ef85d350d6d40ed3f6dc23743f8f99c488ac00000000')

The first 4 bytes of a transaction are its version; in this case, version one. In these examples, we'll place the raw data in hex as a comment above each line that displays the parsed information:

## 01000000
>>> tx.version
1
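The bytes 01 00 00 00 decode to 1 because the field is stored in little-endian order (least significant byte first). A minimal sketch using only the standard library, independent of the two1 parser:

```python
import struct

raw_version = bytes.fromhex('01000000')

# '<I' means a little-endian unsigned 32-bit integer.
(version,) = struct.unpack('<I', raw_version)

# int.from_bytes gives the same answer without the struct module.
same_version = int.from_bytes(raw_version, 'little')

# Reading the same bytes as big-endian would give a very different number.
wrong_version = int.from_bytes(raw_version, 'big')

print(version, same_version, wrong_version)
```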

Each transaction needs to contain a list of inputs that indicate where the bitcoins used in this transaction came from, similar to a list of dollar bills used to pay a cash transaction. Just like how you sometimes pay cash with multiple bills (two $10 bills instead of a $20), you sometimes create Bitcoin transactions with multiple inputs.

The next byte of the transaction indicates how many inputs it has.

## 01
>>> tx.num_inputs
1

We only have one input in this transaction. Although this is only one byte here, it can be up to 9 bytes, as Bitcoin transactions use a custom form of variable-length integer for this field called compactSize. For numbers up to 252, a single byte is used.
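Here is a sketch of a compactSize decoder. The function name read_compact_size is our own. The encoding it implements is the standard one: a first byte below 0xfd is the value itself, while 0xfd, 0xfe, and 0xff announce that a 2-, 4-, or 8-byte little-endian integer follows.

```python
def read_compact_size(data, offset=0):
    """Decode a compactSize integer; return (value, number_of_bytes_used)."""
    first = data[offset]
    if first < 0xfd:
        # Values 0 through 252 fit in the single prefix byte.
        return first, 1
    # 0xfd, 0xfe and 0xff announce a 2-, 4- or 8-byte little-endian integer.
    width = {0xfd: 2, 0xfe: 4, 0xff: 8}[first]
    value = int.from_bytes(data[offset + 1:offset + 1 + width], 'little')
    return value, 1 + width


# The input count of the transaction above is the single byte 0x01.
num_inputs, used = read_compact_size(bytes.fromhex('01'))
print(num_inputs, used)
```

Returning the number of bytes consumed lets a caller advance its read position past the integer, whichever width was used.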

Let’s grab the input:

>>> input = tx.inputs[0]

The first part of the input is called an outpoint. It’s similar to the serial number on a dollar bill in indicating where this transaction came from. Outpoints have two parts: the first part is a 32-byte hash of the transaction that last spent these bitcoins; the second part is a zero-based index number indicating where in that transaction we can find information about who’s authorized to spend these bitcoins.

## 4dff4050dcee16672e48d755c6dd25d324492b5ea306f85a3ab23b4df26e16e9
>>> import codecs
>>> codecs.encode(input.outpoint.__bytes__(), 'hex')
b'4dff4050dcee16672e48d755c6dd25d324492b5ea306f85a3ab23b4df26e16e9'

## 00000000
>>> input.outpoint_index
0

When you receive bitcoins, they’re secured by a Bitcoin Script which has two parts: the first part is a function definition (pubkey script) and the second part is a set of inputs to that function (signature script). Anyone who can provide an input to the function that causes it to return True can spend the bitcoins secured by that function.

In the input section of a transaction, we’re spending bitcoins, so we see the signature script inputs used to satisfy a previously-created pubkey script function. We start with another compactSize integer (1 to 9 bytes) indicating the size of this signature script.

## 8c

Then we have the signature script itself; in this case, it consists of two pieces of data:

## 493046022100cb6dc911ef0bae0ab0e6265a45f25e081fc7ea4975517c9f848f
## 82bc2b80a909022100e30fb6bb4fb64f414c351ed3abaed7491b8f0b1b9bcd75
## 286036df8bfabc3ea5014104b70574006425b61867d2cbb8de7c26095fbc00ba
## 4041b061cf75b85699cb2b449c6758741f640adffa356406632610efb267cb1e
## fa0442c207059dd7fd652eea
>>> str(input.script)
'0x3046022100cb6dc911ef0bae0ab0e6265a45f25e081fc7ea4975517c9f848f82bc2b80a909022100e30fb6bb4fb64f414c351ed3abaed7491b8f0b1b9bcd75286036df8bfabc3ea501 0x04b70574006425b61867d2cbb8de7c26095fbc00ba4041b061cf75b85699cb2b449c6758741f640adffa356406632610efb267cb1efa0442c207059dd7fd652eea'

We won’t go into detail about this data here so that we can move onto the rest of the transaction.
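Even without going into the cryptography, it is easy to see how the 140 (0x8c) bytes split into the two pieces of data. In this signature script, each piece is preceded by a single byte giving its length: 0x49 (73 bytes) and 0x41 (65 bytes). The helper below is a sketch with a name of our own choosing, and it only handles these short direct pushes, not the full Script language.

```python
# The 140-byte signature script, copied from the raw hex shown above.
script = bytes.fromhex(
    '493046022100cb6dc911ef0bae0ab0e6265a45f25e081fc7ea4975517c9f848f'
    '82bc2b80a909022100e30fb6bb4fb64f414c351ed3abaed7491b8f0b1b9bcd75'
    '286036df8bfabc3ea5014104b70574006425b61867d2cbb8de7c26095fbc00ba'
    '4041b061cf75b85699cb2b449c6758741f640adffa356406632610efb267cb1e'
    'fa0442c207059dd7fd652eea'
)


def split_pushes(script):
    """Split a script made only of direct pushes (length bytes 0x01-0x4b)."""
    items, pos = [], 0
    while pos < len(script):
        length = script[pos]
        if not 0x01 <= length <= 0x4b:
            raise ValueError('not a direct data push: 0x%02x' % length)
        items.append(script[pos + 1:pos + 1 + length])
        pos += 1 + length
    return items


signature, public_key = split_pushes(script)
print(len(script), len(signature), len(public_key))
```

The first item is the signature and the second is the public key, matching the two values printed by str(input.script) above.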

The final element of each input is a sequence number. Sequence numbers were added by Satoshi Nakamoto to allow one version of an unconfirmed transaction to be replaced by a later version of the same transaction. This feature is only now starting to enter common use; most transactions set their sequence number to the max, 0xffffffff (4,294,967,295), to indicate that they shouldn’t be replaced.

## ffffffff
>>> input.sequence_num
4294967295

Now we move to the output section of the transaction. Outputs specify how much to pay and what conditions need to be fulfilled before their bitcoins can be spent. We start with the number of outputs in this transaction:

## 02
>>> tx.num_outputs
2

A great many transactions have two outputs. When you spend an input, its entire value gets used up in the transaction. Some of it you pay to the recipient; some of it you typically pay back to yourself as “change” (like the change a cashier gives you), and the rest gets paid to the miner who confirms your transaction. If you don’t include change, the miner gets all of the leftover bitcoins.
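The miner's share is never written down in the transaction; it is simply whatever is left over. The sketch below uses the two output values of the transaction being parsed in this exercise (they are displayed further on), but the input value is hypothetical: a transaction does not record the value of its own inputs, so the figure here is invented purely to show the arithmetic.

```python
# Output values, in satoshis, of the transaction parsed in this exercise.
output0_value = 10_500_000_000
output1_value = 25_972_990_000

# Hypothetical: invented for illustration. The real value would have to be
# looked up in the earlier transaction that the outpoint refers to.
input_value = 36_473_000_000

fee = input_value - (output0_value + output1_value)
print(fee)
```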

The first field in an output is its value. Although we talk about bitcoins, Bitcoin's atomic unit of value is satoshis (named in honor of Nakamoto after he left development). There are one hundred million satoshis for each bitcoin, so the output below pays 105 bitcoins.

>>> output0 = tx.outputs[0]

## 0049d97102000000
>>> output0.value
10500000000
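The value field is an 8-byte little-endian integer counting satoshis. A minimal sketch of decoding it by hand and converting to bitcoins, using integer arithmetic so that no rounding error creeps in:

```python
import struct

SATOSHIS_PER_BTC = 100_000_000

raw_value = bytes.fromhex('0049d97102000000')

# '<Q' means a little-endian unsigned 64-bit integer.
(satoshis,) = struct.unpack('<Q', raw_value)

# divmod keeps whole bitcoins and leftover satoshis as exact integers.
whole, remainder = divmod(satoshis, SATOSHIS_PER_BTC)
print('%d satoshis = %d.%08d BTC' % (satoshis, whole, remainder))
```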

The next field is the number of bytes in the pubkey script, the function that defines how the bitcoins can be spent:

## 19

The script itself follows:

## 76a91461cf5af7bb84348df3fd695672e53c7d5b3f3db988ac
>>> str(output0.script)
'OP_DUP OP_HASH160 0x61cf5af7bb84348df3fd695672e53c7d5b3f3db9 OP_EQUALVERIFY OP_CHECKSIG'

The OP_ parameters are opcodes in the Bitcoin Script programming language. The data parameter is the hash of a public key. This style of output is known as a Pay-to-Public-Key-Hash (P2PKH):

>>> output0.script.is_p2pkh()
True
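To see where the opcode names come from, here is a rough sketch of a script disassembler. It only knows the four opcodes that appear in this script plus direct data pushes of 1 to 75 bytes; it is an illustration, not a replacement for the library's script parser, and the name parse_script is invented for this example.

```python
# Only the opcodes used by a P2PKH script are listed here.
OPCODES = {
    0x76: "OP_DUP",
    0xa9: "OP_HASH160",
    0x88: "OP_EQUALVERIFY",
    0xac: "OP_CHECKSIG",
}

def parse_script(raw_hex):
    """Turn raw script bytes into a human-readable string."""
    raw = bytes.fromhex(raw_hex)
    parts = []
    i = 0
    while i < len(raw):
        op = raw[i]
        i += 1
        if 1 <= op <= 75:
            # Byte values 1..75 mean "push the next `op` bytes as data".
            parts.append("0x" + raw[i:i + op].hex())
            i += op
        else:
            parts.append(OPCODES.get(op, "UNKNOWN_0x%02x" % op))
    return " ".join(parts)

script_hex = "76a91461cf5af7bb84348df3fd695672e53c7d5b3f3db988ac"
print(len(bytes.fromhex(script_hex)))  # 25, i.e. the 0x19 length byte
print(parse_script(script_hex))
```

The 0x14 byte after OP_HASH160 is the push that says "the next 20 bytes are data", which is why it does not show up as a named opcode.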

That’s all there is in output0. The second output, output1, is pretty much the same:

>>> output1 = tx.outputs[1]
## 30601c0c06000000
>>> output1.value
25972990000
## 19
## 76a914fd4ed114ef85d350d6d40ed3f6dc23743f8f99c488ac
>>> str(output1.script)
'OP_DUP OP_HASH160 0xfd4ed114ef85d350d6d40ed3f6dc23743f8f99c4 OP_EQUALVERIFY OP_CHECKSIG'

The final field of every transaction is the locktime. This specifies the earliest time (or block number) at which a transaction may be included in a block (also known as being confirmed). Holding a transaction back from confirmation for a while leaves a window in which the replacement feature described in the sequence number section can be used. In this transaction, locktime is set to zero, meaning the transaction may be added to any block.

## 00000000
>>> tx.lock_time
0
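As a small illustration of the "time or block number" distinction, the sketch below interprets a locktime value. It relies on the Bitcoin convention that values below 500,000,000 are block heights and larger values are Unix times; the function name describe_locktime is invented for this example.

```python
# Values below this threshold are block heights; others are Unix times.
LOCKTIME_THRESHOLD = 500000000

def describe_locktime(lock_time):
    """Explain in words what a locktime value means."""
    if lock_time == 0:
        return "no lock: may be included in any block"
    if lock_time < LOCKTIME_THRESHOLD:
        return "locked until block height %d" % lock_time
    return "locked until Unix time %d" % lock_time

print(describe_locktime(0))
print(describe_locktime(300000))
print(describe_locktime(1399703554))
```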

Now that you’ve seen a transaction, it’s time to turn our attention to blocks. Blocks consist of two parts: a header and a series of transactions. The transactions are like those you’ve just seen, placed one after another with no extra bytes between them.

Headers are exactly 80 bytes. Let’s import one:

>>> from two1.bitcoin import block
>>> import codecs
>>> from codecs import decode
>>> raw_header = decode('020000007ef055e1674d2e6551dba41cd214debbee34aeb544c7ec670000000000000000d3998963f80c5bab43fe8c26228e98d030edf4dcbe48a666f5c39e2d7a885c9102c86d536c890019593a470d', 'hex_codec')
>>> header, _ = block.BlockHeader.from_bytes(raw_header)

The first field in a header is the version number:

## 02000000
>>> header.version
2

There have been four versions of blocks so far, starting with 1 and currently on 4. In the future, the block version field will be used as a bitfield, so there won’t be discrete version numbers any more.

The second field is the SHA256d hash of the previous block’s header. This chains blocks together, creating the blockchain, so that it’s not possible to modify the headers of previous blocks without making the later blocks invalid.

## 7ef055e1674d2e6551dba41cd214debbee34aeb544c7ec670000000000000000
>>> codecs.encode(header.prev_block_hash.__bytes__(), 'hex')
b'7ef055e1674d2e6551dba41cd214debbee34aeb544c7ec670000000000000000'

The third field is another hash, the merkle root of all the transactions in this block. The merkle root is constructed by individually hashing each transaction in the block, then hashing pairs of the hashes together, then hashing pairs of those hashes together, etc, until only one hash remains---the merkle root.

This prevents anyone from changing any of the transactions in a previous block without changing the merkle root, which changes the block header, which invalidates the header of subsequent blocks. This guarantees that everyone downloads identical copies of the blockchain.

## d3998963f80c5bab43fe8c26228e98d030edf4dcbe48a666f5c39e2d7a885c91
>>> codecs.encode(header.merkle_root_hash.__bytes__(), 'hex')
b'd3998963f80c5bab43fe8c26228e98d030edf4dcbe48a666f5c39e2d7a885c91'
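The pairing procedure described above can be sketched in a few lines. This is a simplified illustration rather than the library's implementation; it follows Bitcoin's rule of duplicating the last hash when a level has an odd number of entries, and the three input hashes are placeholder data, not real transaction ids.

```python
from hashlib import sha256

def sha256d(data):
    """SHA256 applied twice, as Bitcoin does."""
    return sha256(sha256(data).digest()).digest()

def merkle_root(txids):
    """Compute a merkle root from 32-byte hashes in internal byte order."""
    if not txids:
        raise ValueError("a block contains at least one transaction")
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of hashes: pair the last one with itself.
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

# Placeholder "txids" for demonstration only.
a, b, c = sha256d(b"tx a"), sha256d(b"tx b"), sha256d(b"tx c")
root = merkle_root([a, b, c])
tampered = merkle_root([a, b, sha256d(b"tx c, modified")])
print(root.hex())
print(root != tampered)  # True: changing any transaction changes the root
```

Note that a block with a single transaction needs no hashing at all: the merkle root is simply that transaction's hash.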

The fourth field is the block header time. Blocks provide their time so that the proof of work difficulty can be adjusted to ensure that about one block is produced every 10 minutes on average. Miners don’t have to keep really accurate time, but each block must have a time greater than the median time of the previous 11 blocks and no more than two hours in the future according to the clocks of receiving nodes.

Time is stored as a Unix epoch time.

## 02c86d53
>>> header.time
1399703554
>>> import datetime
>>> datetime.datetime.fromtimestamp(header.time)
datetime.datetime(2014, 5, 10, 6, 32, 34)

(Your converted datetime might be slightly different due to timezone differences.)
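If you would rather get the same answer regardless of your machine's timezone, one option is to decode the raw field yourself and ask for UTC explicitly. This sketch uses only the standard library:

```python
from datetime import datetime, timezone
from struct import unpack

# The four raw time bytes from the header, little-endian.
raw_time = bytes.fromhex("02c86d53")
(timestamp,) = unpack("<L", raw_time)

# Passing tz=timezone.utc avoids any dependence on the local timezone.
block_time = datetime.fromtimestamp(timestamp, tz=timezone.utc)
print(timestamp, block_time.isoformat())
# 1399703554 2014-05-10T06:32:34+00:00
```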

The fifth field is nBits. In order to find a hash to demonstrate proof of work, you need to find a hash below a certain target value, which changes when blocks are being produced too fast or too slow. nBits compactly encodes the target value using the base-256 form of scientific notation.

## 6c890019
>>> header.bits
419465580

The sixth and final field is a Number Used Once (nonce). The work in proof of work is generating different hashes using slightly different versions of the same data; the nonce provides that variability. Miners typically try hashing with the nonce at zero; if that doesn’t provide a hash below the target value, they hash with the nonce at one; if that doesn’t work, they keep incrementing and re-hashing until they find a hash below the target value. Let’s look at the nonce for this block:

## 593a470d
>>> header.nonce
222771801

If we hash all 80 bytes of the header, we should get a hash with a bunch of zeros on one end, indicating a relatively low number (relatively because we’re dealing with 2^256 as the maximum):

>>> codecs.encode(header.hash.__bytes__(), 'hex')
b'5472ac8b1187bfcf91d6d218bbda1eb2405d7c55f1f8cc820000000000000000'

Remember that this number needs to be below the target value specified by nBits (and everything else in the header needs to be valid too). The library provides a way to verify that:

>>> header.valid
True
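For the curious, the proof of work part of that check can be reproduced without the library. The sketch below hashes the same 80 header bytes and compares the result with the target encoded in nBits. It checks proof of work only, not the other validity rules, and target_from_bits is a helper written for this example that ignores the rarely used sign bit of the compact format.

```python
from hashlib import sha256

# The same 80-byte header imported earlier, split by field for readability.
raw_header = bytes.fromhex(
    "02000000"
    "7ef055e1674d2e6551dba41cd214debbee34aeb544c7ec670000000000000000"
    "d3998963f80c5bab43fe8c26228e98d030edf4dcbe48a666f5c39e2d7a885c91"
    "02c86d53"
    "6c890019"
    "593a470d"
)

def sha256d(data):
    """SHA256 applied twice."""
    return sha256(sha256(data).digest()).digest()

def target_from_bits(bits):
    """Expand compact nBits into the full target (sign bit ignored)."""
    exponent = bits >> 24
    significand = bits & 0x00ffffff
    return significand * 256 ** (exponent - 3)

header_hash = sha256d(raw_header)
bits = int.from_bytes(raw_header[72:76], "little")

# The hash is compared as a 256-bit little-endian number.
hash_as_number = int.from_bytes(header_hash, "little")
meets_target = hash_as_number <= target_from_bits(bits)

print(header_hash.hex())   # should match the hash shown above
print(meets_target)
```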

You’ve now gone through all the fields of a block. Since the blockchain consists entirely of blocks, you now have the knowledge to parse the blockchain.

Exercise 3: The underlying concept behind mining: proof of work

What you’ll learn

Each bitcoin can be thought of as a digital good. Because digital goods can be perfectly copied, it's possible for evil Mallory to give the same bitcoin to Bob that he previously gave to Alice. This is called the double spend problem.

The Bitcoin system provides a solution to the double spend problem: each transaction on the Bitcoin blockchain (global ledger) is placed in order. If Mallory first gives his bitcoins to Alice---and that is recorded on the blockchain---he can't give those same bitcoins to Bob at a later time.
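A toy model makes the role of ordering clear. The sketch below is not how Bitcoin represents coins or transactions; it simply processes hypothetical (coin, recipient) pairs in order and rejects any attempt to spend a coin that an earlier entry already spent.

```python
def apply_in_order(transactions):
    """Toy ledger: accept each (coin_id, recipient) only if the coin is unspent."""
    spent = set()
    accepted = []
    rejected = []
    for coin_id, recipient in transactions:
        if coin_id in spent:
            # A later spend of the same coin is a double spend.
            rejected.append((coin_id, recipient))
            continue
        spent.add(coin_id)
        accepted.append((coin_id, recipient))
    return accepted, rejected

# Mallory tries to give the same coin to Alice and then to Bob.
accepted, rejected = apply_in_order([("coin-1", "Alice"), ("coin-1", "Bob")])
print(accepted)  # [('coin-1', 'Alice')]
print(rejected)  # [('coin-1', 'Bob')]
```

Which payment wins depends entirely on the order, so the hard part is getting everyone to agree on that order.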

Because Bitcoin is a decentralized system, we need a way for Bitcoin peers to trustlessly come to agreement (called consensus) about the order of transactions on the blockchain. The method used for that is called Proof Of Work (POW), and it was pioneered by a program called Hashcash. Here you’ll learn how POW works.

Hashcash and the concept of proof of work

Hashcash is a system that predates Bitcoin by about a decade. It was initially designed to reduce email spam by requiring senders to attach the equivalent of a penny worth of CPU time (electricity) to their emails in the form of POW. This was cheap for legitimate senders who sent a few dozen emails a day at most, but expensive for spammers who sent millions of emails per day. Unfortunately Hashcash never became widely used---probably as a result of it being invented during a time when email infrastructure was centralizing (e.g. webmail), and the centralized providers finding cheaper ways to authenticate data between themselves (e.g. DKIM).

In 2004, Hashcash was used by cryptographer Hal Finney to build Reusable Proof of Work (RPOW), a pre-Bitcoin cryptocurrency that may have heavily influenced Satoshi Nakamoto.

Hashcash works by repeatedly hashing the same data over and over with tiny variations until a hash is found with a certain number of leading zero bits. Let’s use hashcash to repeatedly hash some data until it finds a hash that has a sufficient number of leading zero bits---just like we do in Bitcoin.

## Install hashcash (it's a tiny program)
sudo apt-get install -y hashcash

## Use Hashcash to mine a token containing the string “test”
time hashcash -m -r test > mined.txt

Hashcash will start mining and, when it finishes, return control to your prompt and print three time fields. The user time field gives you an idea of how much CPU time hashcash used; we know the average time on your Bitcoin Computer will be about 2 seconds, but about 10% of users will complete in less than 1/10th of a second, and for a different 10%, it will have taken more than 4 seconds. This is because mining to find a hash with a sufficient number of leading zero bits is a random search. By producing this hash with a specific number of leading zeros, you have proven (on average) that you've expended a certain amount of computational resources.

[Figure: Hashcash CDF]

Now look at the output hashcash created. Notice that the string “test” that we provided is contained there, and that it’s surrounded by a bunch of other data.

## Display the data created by hashcash
cat mined.txt

That data should look similar to, but not identical to:

1:20:151001:test::xYzc5A1pnts9FhC3:00000000000002OQc

Most of the other data is irrelevant to us---it’s part of the anti-spam protection that Hashcash was designed to provide---but the last field is interesting; it’s the Number used Once (nonce). Bitcoin block headers also contain a nonce for the same reason: so that mining code can iterate through nonces in an attempt to produce a block header hash with the correct number of leading zeroes.
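If you want to pick the stamp apart programmatically, the sketch below splits it into the seven colon-separated fields of a version 1 Hashcash stamp (version, bits, date, resource, extension, random data, counter). The Stamp type and parse_stamp function are invented for this example, and the parser assumes the resource contains no colons.

```python
from collections import namedtuple

Stamp = namedtuple(
    "Stamp", "version bits date resource extension rand counter")

def parse_stamp(stamp):
    """Split a version 1 hashcash stamp into its named fields."""
    fields = stamp.strip().split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields")
    version, bits, date, resource, extension, rand, counter = fields
    return Stamp(int(version), int(bits), date, resource,
                 extension, rand, counter)

stamp = parse_stamp("1:20:151001:test::xYzc5A1pnts9FhC3:00000000000002OQc")
print(stamp.bits)      # 20 leading zero bits were requested
print(stamp.resource)  # test
print(stamp.counter)   # the field hashcash varies while mining
```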

## Show the SHA1 (not SHA256) hash of the mined data
cat mined.txt | tr -d "\n" | shasum

This shows the hash of the data Hashcash created. Note that it starts with five zeros. This is Proof Of Work (POW)---it’s proof that hashcash checked (on average) about 2^20 hashes (1,048,576 hashes). Now let’s try generating a hash with even more proof of work:

## Run hashcash again but at a higher “difficulty”
time hashcash -b 25 -m -r test2 > mined2.txt

For most of you, the command will take more than 22 seconds to run, so now is a good time for a short break. For about 10% of you, the command will take more than 5 minutes to run.

After hashcash finishes mining, display the hash the same way you did before:

## Show the hash of the mined data
echo -n $( cat mined2.txt ) | shasum

Note there is now an additional leading zero. By increasing the proof of work difficulty, the amount of time it took to run the command noticeably increased. For each additional bit of proof of work security, the average amount of CPU time doubles.
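To watch the doubling effect without waiting minutes, here is a toy proof of work search in Python. It is a sketch of the idea only: the "data:counter" string it hashes is a made-up format, not a real Hashcash stamp, and the difficulties are kept low so it finishes quickly.

```python
from hashlib import sha1

def leading_zero_bits(digest):
    """Count how many zero bits a digest starts with."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def mine(data, bits):
    """Try counters 0, 1, 2, ... until the SHA1 hash has enough zero bits."""
    counter = 0
    while True:
        candidate = ("%s:%d" % (data, counter)).encode("ascii")
        if leading_zero_bits(sha1(candidate).digest()) >= bits:
            return counter, candidate
        counter += 1

for bits in (8, 12, 16):
    counter, candidate = mine("test", bits)
    # On average about 2**bits attempts are needed, so each extra bit
    # doubles the expected work; individual runs vary a lot.
    print(bits, counter + 1, sha1(candidate).hexdigest())
```

Verifying the result takes a single hash no matter how long the search took, which is what makes proof of work cheap to check and expensive to produce.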

Now that you know about Proof Of Work (POW), you may want to explore Bitcoin's first block into which Bitcoin creator Satoshi Nakamoto put both a special message and an unusual amount of POW (see section below).

Exercise 4: How to mine a Bitcoin block

What you'll learn

We'll write a short Python3 script to re-mine Bitcoin's first block, the Genesis Block. In doing so, you'll learn about the six fields that make up a Bitcoin block header. This will complement the previous exercises which showed you how to parse the blockchain. The big difference here is that by mining you are writing to the blockchain (or trying to) rather than reading from it.

The Block Header

In previous sections, you got a look at the Genesis Block. Now let's learn more about the first part of that block, the block header.

Each Bitcoin block header is exactly 80 bytes long and is composed of the following fields.

Bytes  Name                        Description
4      Version                     Which set of block validation rules to follow
32     Previous block header hash  A SHA256d hash of the previous block’s header
32     Merkle root hash            A hash derived from hashes of all transactions included in this block
4      Time                        The approximate time the block was created
4      nBits                       The target threshold this block’s header hash must be less than or equal to
4      Nonce                       An arbitrary number miners change during mining
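The field sizes in the table map directly onto a struct format string. As a quick illustration (separate from the miner.py script built below), this sketch unpacks the block header that was parsed in the earlier exercise:

```python
from struct import calcsize, unpack

# "<" = little-endian, L = 4-byte unsigned integer, 32s = 32 raw bytes.
HEADER_FORMAT = "<L32s32sLLL"

raw_header = bytes.fromhex(
    "02000000"
    "7ef055e1674d2e6551dba41cd214debbee34aeb544c7ec670000000000000000"
    "d3998963f80c5bab43fe8c26228e98d030edf4dcbe48a666f5c39e2d7a885c91"
    "02c86d53"
    "6c890019"
    "593a470d"
)

version, prev_hash, merkle_root, time, bits, nonce = unpack(
    HEADER_FORMAT, raw_header)

print(calcsize(HEADER_FORMAT))  # 80
print(version, time, hex(bits), nonce)
print(prev_hash.hex())
print(merkle_root.hex())
```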

Let's start our script by importing some modules and defining the header fields. Use the nano text editor to open a file named miner.py:

nano miner.py

Paste in the following code:

## Objective: mine the Bitcoin genesis block

from struct import pack
from hashlib import sha256
from codecs import decode
from binascii import hexlify

Bitcoin uses the SHA256-double (sha256d) hash function. It also does some weird byte reversals, so let's add a couple of helper functions:

## Bitcoin uses the SHA256d hash function, which is the SHA256 function
## run twice (double).
def sha256d(data):
  return sha256(sha256(data).digest()).digest()

## We want to display our results as hex in RPC Byte Order, so
## we need to reverse the byte order
def internal2rpc(hash):
  return hexlify(hash[::-1])

The Version

The first field in the block header is the version field, which indicates what block validation rules should be followed for this block. Blocks started with version 1 (0x01000000) and are currently on version 4. Since we're mining the Genesis Block, we'll add the same version 1 it used to our code:

## The version number 1 as a little-endian (<) unsigned-long (L)
version = pack("<L", 0x01)

The Previous Block Hash

The block header always contains the hash of the previous block header, linking the new block to all previous blocks in the chain. In the unique case of the Genesis Block, there was no previous block, so Nakamoto filled this field with zero bits.

## The previous block header hash is 32 bytes of zeros
previous_header_hash = decode("0000000000000000000000000000000000000000000000000000000000000000", 'hex')

The merkle root

Bitcoin block headers include a merkle root, which is the final hash in a merkle tree that connects all of the transactions in a block to the block header using a cryptographic hash that can prove that none of the transactions have been modified.

[Figure: Merkle root]

The merkle tree is created by looking at all of the transactions in the block in the order they appear. The rules for creating a Bitcoin merkle root are mildly complex for most blocks, but for the genesis block they're easy---the merkle root is the same as the hash of the first (and only) transaction in the Genesis Block. Let's add that hash to our script:

## Merkle root (in this case, also a txid). We have to reverse it
## into Internal Byte Order
merkle_root = decode("4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b", 'hex')[::-1]

The Timestamp

Bitcoin block headers use Unix epoch time, often simply called Unix time or epoch time. This is the number of seconds elapsed since midnight 1 January 1970 UTC. As you might suspect, it’s easy to get that time on a Unix-like system such as that installed on the 21 Bitcoin Computer:

## Get the current Unix time at the command line
date +%s

You can also get the Unix time for an arbitrary date, such as the date Nakamoto used when mining the Genesis Block:

## Get the corresponding Unix time for the provided date
## at the command line
date +%s -d '3 Jan 2009 18:15:05 UTC'

Let's add that time to our mining script. Note that this is a line of code in miner.py, not a line of code to be executed at the command line!

## Date in Unix time format
date = pack("<L", 1231006505)
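If you prefer to stay in Python, the same conversion can be done with the datetime module. This is a side example to try in a Python shell, not something to paste into miner.py:

```python
from datetime import datetime, timezone
from struct import pack

# The Genesis Block time, stated explicitly in UTC.
genesis_time = datetime(2009, 1, 3, 18, 15, 5, tzinfo=timezone.utc)
unix_time = int(genesis_time.timestamp())

print(unix_time)                    # 1231006505
print(pack("<L", unix_time).hex())  # 29ab5f49 (the little-endian header bytes)
```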

nBits: encoding the target

In the proof of work section, you learned that Hashcash allowed you to increase your Proof Of Work (POW) security by increasing the number of leading zero bits, but that for each additional bit, you had to do twice as much work. Bitcoin also has a way of increasing the amount of POW security when miners are producing blocks too fast, but more fine-grained control was needed than simply waiting until miners were producing blocks twice as fast. So Nakamoto created the target number.

Bitcoin block header hashes are interpreted as numbers that must be less than the target value, which provides more granularity than whole bits do. To help keep track of the current target, a field in the block header stores the current 256-bit target compressed into just 32 bits (reducing granularity somewhat). To do this, the base-256 form of scientific notation is used.

[Figure: nbits]

The highest allowed target in Bitcoin (that is, the lowest difficulty) is called difficulty 1. In nBits, it’s 1d00ffff. The Genesis Block was officially mined at that difficulty, although the amount of proof of work in the Genesis Block is much higher than a typical difficulty-1 block.

Let's add the difficulty to our block:

## nBits is stored as a little-endian (<) unsigned-long (L)
nbits = pack("<L", 0x1d00ffff)

Let's also add some code to convert nBits into the target threshold so that we can determine when we've found a difficulty-1 block later:

## Convert current nbits into a big-endian string
nbits_calc = hexlify(nbits[::-1])

## The nbits calculation is base-256
base = 256

## The nbits exponent is the first byte of the nBits
exponent = int(nbits_calc[0:2], 16) - 3
## The nbits significand is the other three bytes
significand = int(nbits_calc[2:8], 16)

## Do the nbits calculation
target = significand * ( base ** exponent )
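As a side note (again, not part of miner.py), the same calculation lets you express any nBits value as a difficulty, which is simply the difficulty-1 target divided by the current target. The sketch below assumes an exponent of at least 3 and ignores the compact format's sign bit; the function names are invented for this example.

```python
def target_from_nbits(nbits):
    """Expand a compact nBits integer into the full target."""
    exponent = nbits >> 24            # first byte
    significand = nbits & 0x00ffffff  # remaining three bytes
    return significand * 256 ** (exponent - 3)

DIFFICULTY_1_TARGET = target_from_nbits(0x1d00ffff)

def difficulty(nbits):
    """How many times harder this target is than difficulty 1."""
    return DIFFICULTY_1_TARGET / target_from_nbits(nbits)

print(difficulty(0x1d00ffff))  # 1.0, the Genesis Block's nBits
print(difficulty(0x1900896c))  # roughly 8 billion, the header parsed earlier
```

A smaller target means fewer acceptable hashes, so difficulty rises as the target falls.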

The Nonce

When trying to create successful Proof Of Work (POW) for blocks, miners need to create many different hash results with minimal changes to the hashed data. Nakamoto gave them an easy way to do that by including a Number used Once (nonce) field in the block header. Mining software can change just the number in that field without affecting the other fields.

Typically miners start with a nonce of zero and iterate up to its maximum value, 0xffffffff. If none of those values work and it took them more than a second to search all possible nonces, they can simply update the time field to the new current time. If they were able to search all the nonces in less than a second, they can change one of the transactions in the block to get a different merkle root. After either change, they can start searching nonces again.

In our case, we don't want to spend all day searching nonces, so we're going to enter a pre-computed nonce that is very close to the nonce we need to generate the Genesis Block header hash. Add this to your mining script:

nonce = 0x7c2bac10

Mining the Genesis Block on a CPU

Now that you have all the pieces in place, let's use a simple loop to check header hashes until we find a block header hash with a sufficient amount of proof of work. Add the following code to the end of your script:

while nonce < 0x7c2bac1e:
  header = (
    version
    + previous_header_hash
    + merkle_root
    + date
    + nbits
    + pack("<L", nonce)
  )

  ## Get the header hash corresponding to the header
  header_hash = sha256d(header)

  ## If the header hash is less than the target, print the results and
  ## break the loop
  if int(hexlify(header_hash[::-1]), 16) < target:
    print("Nonce      Header Hash")
    print(nonce, internal2rpc(header_hash))
    break

  ## Increment the nonce
  nonce += 1

Save and exit the file using the instructions that nano prints on the bottom of the screen. The ^ character means press-and-hold the Ctrl key. Then run your code like this:

python3 miner.py

In case you copied and pasted anything wrong, you can also download a copy of miner.py like this, and then run it the same as above:

wget https://gist.githubusercontent.com/harding/3e0874746baea1f10fda/raw/miner.py

When the script finishes running after a second or two, it should print the header hash of the Genesis Block. Notice the leading zeroes demonstrating proof of work. Congratulations, you mined the Genesis Block using a CPU just like Satoshi Nakamoto did (although he had to run his code much longer because he didn't know what nonce to start with).
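If you want to confirm your result independently, the sketch below builds the Genesis Block header in one go and hashes it. The nonce 0x7c2bac1d and the expected hash are the well-known Genesis Block values rather than figures printed earlier in this exercise, so treat them as a cross-check against what your script reports.

```python
from hashlib import sha256
from struct import pack

GENESIS_NONCE = 0x7c2bac1d  # 2083236893, the nonce the loop should stop on

header = (
    pack("<L", 1)                      # version
    + bytes(32)                        # previous block hash: all zeros
    + bytes.fromhex(
        "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b"
    )[::-1]                            # merkle root, internal byte order
    + pack("<L", 1231006505)           # time
    + pack("<L", 0x1d00ffff)           # nBits
    + pack("<L", GENESIS_NONCE)        # nonce
)

header_hash = sha256(sha256(header).digest()).digest()
genesis_hash = header_hash[::-1].hex()  # reverse into RPC byte order
print(len(header))   # 80
print(genesis_hash)
```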

Re-mining the Bitcoin Genesis block is fun, but the 21 Bitcoin Computer is capable of mining new blocks at hardware-accelerated speeds, allowing you to earn a steady stream of satoshis (a denomination of bitcoins) so that you can use and build real Bitcoin applications. Learn how in the next section.

Exercise 5: How to rapidly mine bitcoin with a Bitcoin Computer

What You’ll Learn

The 21 Bitcoin Computer includes a fast and convenient way to get bitcoin for programming purposes: the 21 mine command, which uses a mining chip under the hood. Let’s see how to use it, and then we’ll explain how it works.

How to mine Bitcoin

Start by booting up your 21 Bitcoin Computer and SSHing in. We assume you have already run 21 update once. Then execute these commands one by one. Don’t paste them all in at once, but observe the output.

# Start up your Bitcoin mining chip. This will
# prompt you to create a wallet and 21.co account.
21 mine

# Look at your status - after 30-45 seconds, you should have 20,000
# satoshis (satoshis are the smallest indivisible units of one bitcoin,
# currently set at 100 million satoshis per bitcoin)
21 status

# Look at your log to see that you received the satoshis for booting
21 log

# Do a 21 flush to move your 21.co balance to the blockchain
# This will take a little while and occur in the background.
21 flush

# See the flush taking place in your 21 status. The 
# “Amount flushing from 21.co balance to Blockchain balance”
# should be 20,000 satoshis
21 status

# Now invoke 21 mine a second time with no arguments.
# If the mining chip is already running, you will get an advance of
# 20,000 on your future mining earnings!
# Note that it will take a little while to mine.
21 mine

# Do it again just for kicks!
21 mine

# See your status again. You may also see an increment due to bitcoin
# being mined in the background by the chip.
21 status

After you’ve done those commands, your status should look something like this:

twenty@bitcoin-computer-8tml:~$ 21 status
21.co Account
    Username        : dorian_satoshi

Mining
    Status           : 21 mining chip running (/run/minerd.pid)
    Hashrate         : ~50 GH/s (warming up)
    Mined (all time) : 55 Satoshis

Type 21 mine --dashboard to see a detailed view. Hit q to exit.

Balance
    Your spendable balance at 21.co [1]                       : 40055 Satoshis
    Your spendable balance on the Blockchain [2]              : 0 Satoshis
    Amount flushing from 21.co balance to Blockchain balance  : 20000 Satoshis

    [1]: Available for bittransfers (21.co/micropayments)
    [2]: Available for on-chain (21.co/micropayments)

    To see all wallet addresses, do 21 status --detail

How many API calls can you buy?
    Search Queries        : 0    (800 Satoshis per search)
    SMS Messages          : 0    (1735 Satoshis per SMS)

Use 21 buy to buy API calls for bitcoin from 21.co.
For help, do 21 buy --help.

This means you mined a total of 60,055 satoshis. Of these:

  • 20,000 came as a bonus for booting up the device and beginning the mining process, by running 21 mine the first time

  • 55 (in this example) came via background bitcoin mining over the last few seconds, started when you ran 21 mine the first time

  • 40,000 more came from your second and third invocations of 21 mine as an on demand advance against future mining proceeds

Moreover, in terms of where the bitcoin is:

  • 40,055 satoshis are in your 21.co buffer

  • 20,000 are being flushed to your blockchain balance from your 21.co balance

How did your bitcoin get to those spots?

  • You flushed the first 20,000 satoshi (the bootup bonus) from your 21.co balance to your blockchain balance when you did 21 flush

  • You got another 55 from background mining over the last few seconds and 20,000 from each of your two foreground mining invocations (your second and third invocations of 21 mine)

The basic idea is to allow you to mix background and foreground mining along with on- and off-chain balances to ensure that you always have enough bitcoin for programming and micropayment purposes.

Your 21.co balance and your blockchain balance

The reason there are two balances is that mining bitcoin is slow under the best of circumstances, and not fast enough to permit rapid acquisition of bitcoin for programming purposes. As such, we have implemented something we call "buffered pooled mining" to speed up the process. The short version is that we’ve extended the concept of pooled mining to buffer against several sources of randomness and delay:

  • You do not need to wait for a block to be mined. Instead, as soon as your chip connects to our pool we begin streaming you a pro-rata share of your mined Bitcoin.

  • You do not need to pay transaction fees on each of these small awards of Bitcoin. Instead, we buffer the balance for you at 21.co. You can use this to buy digital goods from other 21 developers, and you can also do 21 flush to move the balance to the blockchain at any time. You control the private keys.

  • You do not need to wait for 100 blocks before accessing your mined bitcoin.

  • Finally, you do not need to send N hashes to the server before getting N hashes worth of mined bitcoin. That is, by invoking 21 mine your 21 Bitcoin Computer receives bitcoin in advance of future mining at the expense of a small asymptotic slowdown in the rate of bitcoin streamed to your device.

The basic idea is that this is a new way of getting bitcoin: not by buying large quantities slowly for investment purposes on an exchange, but by mining tiny quantities on demand for programming purposes at the command line, rate-limited by a mining chip.

The net result is that you now have 40,000+ satoshis in your 21.co balance, with 20,000 more in the middle of flushing to the blockchain. You can get more satoshis by running 21 mine. If you run it too many times in a row you will be rate-limited by a difficult hashing problem sent to your chip, so don’t abuse it!

Step 5: Self Test

Now that you’ve gone through Step 3 (Introduction to Bitcoin) and Step 4 (Interactive Introduction to Bitcoin), please go through the self test here.