Blockchain for DBAs

In Data, Databases, DBA on October 30, 2017 at 09:25

“Instead of putting the taxi driver out of a job, blockchain puts Uber out of a job and lets the taxi driver work with the customer directly.” – Vitalik Buterin

A blockchain database consists of two kinds of records: transactions and blocks. Blocks contain the lists of the transactions that are hashed and encoded into a hash (Merkle) tree. The linked blocks form a chain as every block holds the hash pointer to the previous block.
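To make the hash (Merkle) tree a bit more concrete, here is a minimal Python sketch. It assumes Bitcoin-style conventions (double SHA-256, and duplicating the last hash when a level has an odd number of entries); the function names and the sample transactions are made up for illustration.

```python
import hashlib
from typing import List


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for transactions and blocks."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(tx_hashes: List[bytes]) -> bytes:
    """Fold a list of transaction hashes pairwise until one root hash is left."""
    if not tx_hashes:
        raise ValueError("a block needs at least one transaction")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical transactions, stored here as opaque byte strings.
txs = [b"alice pays bob 1", b"bob pays carol 2", b"carol pays dave 3"]
leaves = [sha256d(tx) for tx in txs]
root = merkle_root(leaves)
print(root.hex())
```

Changing any single transaction changes its leaf hash and therefore the root, which is why a block header only needs to store the root to commit to the whole transaction list.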

The blockchain can be stored in a flat file or in a database. For example, the Bitcoin Core client stores the blockchain metadata using LevelDB, Google’s open-source key-value store whose design was inspired by the storage layer of Bigtable.

The diagram above can be used to create the schema in PostgreSQL. “As far as what DBMS you should put it in”, says Ali Razeghi, “that’s up to your use case. If you want to analyze the transactions/wallet IDs to see some patterns or do BI work I would recommend a relational DB. If you want to setup a live ingest with multiple cryptocoins I would recommend something that doesn’t need the transaction log so a MongoDB solution would be good.”
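For the relational, BI-style use case in the quote, a small sketch may help. The schema below is purely hypothetical (it is not the schema from the diagram), the table and column names are assumptions, and it uses Python's built-in SQLite instead of PostgreSQL only so that the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Hypothetical, simplified schema: one row per block, one row per transaction.
conn.executescript("""
CREATE TABLE block (
    hash      TEXT PRIMARY KEY,
    prev_hash TEXT REFERENCES block(hash),  -- NULL only for the first block
    height    INTEGER NOT NULL UNIQUE,
    mined_at  INTEGER NOT NULL              -- seconds since the Unix epoch
);
CREATE TABLE tx (
    txid        TEXT PRIMARY KEY,
    block_hash  TEXT NOT NULL REFERENCES block(hash),
    wallet_from TEXT,                       -- NULL for newly created coins
    wallet_to   TEXT NOT NULL,
    amount      INTEGER NOT NULL
);
CREATE INDEX tx_wallet_to ON tx(wallet_to);
""")

# Made-up sample data.
conn.executemany(
    "INSERT INTO block VALUES (?, ?, ?, ?)",
    [("b0", None, 0, 1509355500), ("b1", "b0", 1, 1509356100)],
)
conn.executemany(
    "INSERT INTO tx VALUES (?, ?, ?, ?, ?)",
    [
        ("t1", "b0", None, "w_alice", 50),
        ("t2", "b1", "w_alice", "w_bob", 20),
        ("t3", "b1", "w_alice", "w_carol", 5),
    ],
)
conn.commit()

# A typical analysis query: total amount received per wallet.
received = conn.execute(
    "SELECT wallet_to, SUM(amount) FROM tx GROUP BY wallet_to ORDER BY wallet_to"
).fetchall()
print(received)
```

The self-referencing foreign key on prev_hash is the relational counterpart of the hash pointer that links each block to its parent.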

If you want to set up a MySQL database, here are 8 easy steps.

But what is the structure of the block, what does it look like?

The block has 4 fields:

1. Block Size: The size of the block in bytes
2. Block Header: Six fields in the block header
3. Transaction Counter: How many transactions follow
4. Transactions: The transactions recorded in this block

The block header has 6 fields:

1. Version: A version number to track software/protocol upgrades
2. Previous Block Hash: A reference to the hash of the previous (parent) block in the chain
3. Merkle Root: The hash at the root of the Merkle tree of this block’s transactions
4. Timestamp: The approximate creation time of this block (seconds since the Unix epoch)
5. Difficulty Target: The proof-of-work algorithm difficulty target for this block
6. Nonce: A counter used for the proof-of-work algorithm
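The six header fields above can be sketched in Python as follows. This assumes Bitcoin's wire format: an 80-byte header with little-endian integers, the two hashes stored in reversed byte order, the difficulty target stored in its compact 4-byte "bits" form, and the block hash computed as a double SHA-256 of the header. The function names are mine; the sample values are those of the Bitcoin genesis block.

```python
import hashlib
import struct


def serialize_header(version: int, prev_hash_hex: str, merkle_root_hex: str,
                     timestamp: int, bits: int, nonce: int) -> bytes:
    """Pack the six header fields into Bitcoin's 80-byte layout."""
    return (
        struct.pack("<i", version)               # 4 bytes, little-endian
        + bytes.fromhex(prev_hash_hex)[::-1]     # 32 bytes, byte order reversed
        + bytes.fromhex(merkle_root_hex)[::-1]   # 32 bytes, byte order reversed
        + struct.pack("<III", timestamp, bits, nonce)  # 3 x 4 bytes
    )


def block_hash(header: bytes) -> str:
    """Double SHA-256 of the header, shown in the usual reversed hex form."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()


# Field values of the Bitcoin genesis block.
genesis = serialize_header(
    version=1,
    prev_hash_hex="00" * 32,  # the first block has no parent
    merkle_root_hex="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,
    nonce=2083236893,
)
print(len(genesis))         # 80
print(block_hash(genesis))  # 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f
```

Mining amounts to varying the nonce (and, if needed, other fields) until the resulting hash falls below the difficulty target, which is why the genesis hash starts with a run of zeros.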

More details, for example on the block header hash and block height, can be found here.

But how about blockchain vs. relational database: Which is right for your application? As you can see, because the term “blockchain” is not clearly defined, you could argue that almost any IT project could be described as using a blockchain.

It is worth reading Guy Harrison’s article Sealing MongoDB documents on the blockchain. Here is a nice quote: “As a database administrator in the early 1990s, I remember the shock I felt when I realized that the contents of the database files were plain text; I’d just assumed they were encrypted and could only be modified by the database engine acting on behalf of a validated user.”

Blockchain technology is a very special kind of distributed database. Sebastien Meunier’s post concludes that, ironically, there is no consensus on the definition of what blockchain technology is.

I particularly like his last question: Is a private blockchain without a token really more efficient than a centralized system? And I would add: private blockchain, really?

But once more, what is blockchain? Rockford Lhotka gives a very good, DBA-friendly list of the defining characteristics of a blockchain:

1. A linked list where each node contains data
2. Immutable
– Each new node is cryptographically linked to the previous node
– The list and the data in each node are therefore immutable; tampering breaks the cryptography
3. Append-only
– New nodes can be added to the list, though existing nodes can’t be altered
4. Persistent
– Hence it is a data store – the list and nodes of data are persisted
5. Distributed
– Copies of the list exist on many physical devices/servers
– Failure of 1+ physical devices has no impact on the integrity of the data
– The physical devices form a type of networked cluster and work together
– New nodes are only appended to the list if some quorum of physical devices agree with the cryptography and validity of the node via consistent algorithms running on all devices.
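Points 1 to 3 of this list can be illustrated with a small Python sketch. It is only a toy under my own assumptions: the class and function names are invented, SHA-256 over a JSON payload stands in for "the cryptography", and the distributed part (replication and quorum agreement) is not modeled at all.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first node


@dataclass(frozen=True)
class Node:
    index: int
    data: str
    prev_hash: str
    hash: str


def node_hash(index: int, data: str, prev_hash: str) -> str:
    """Hash the node's content together with the previous node's hash."""
    payload = json.dumps([index, data, prev_hash]).encode()
    return hashlib.sha256(payload).hexdigest()


class Ledger:
    """Append-only list in which every node is hash-linked to its predecessor."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []

    def append(self, data: str) -> Node:
        index = len(self.nodes)
        prev_hash = self.nodes[-1].hash if self.nodes else GENESIS_PREV
        node = Node(index, data, prev_hash, node_hash(index, data, prev_hash))
        self.nodes.append(node)
        return node

    def is_valid(self) -> bool:
        """Recompute every hash and check every link from the start."""
        prev_hash = GENESIS_PREV
        for i, node in enumerate(self.nodes):
            if node.index != i or node.prev_hash != prev_hash:
                return False
            if node.hash != node_hash(node.index, node.data, node.prev_hash):
                return False
            prev_hash = node.hash
        return True


ledger = Ledger()
for entry in ["first", "second", "third"]:
    ledger.append(entry)
print(ledger.is_valid())  # True

# Simulate tampering: change the data of the middle node behind the ledger's back.
old = ledger.nodes[1]
ledger.nodes[1] = Node(old.index, "forged", old.prev_hash, old.hash)
print(ledger.is_valid())  # False
```

Even if the attacker also recomputes the forged node's own hash, the next node still points at the old hash, so the check fails one link later. In a real blockchain it is the quorum of independent copies that stops someone from simply rewriting the rest of the chain.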

Kevin Ford’s reply is a good one to conclude with: “Based on this description (above) it really sounds like your (Rockford Lhotka’s) earlier comparison to the hype around XML is spot on. It sounds like in and of itself it isn’t particularly anything except a low level technology until you structure it to meet a particular problem.”

The nature of blockchain technology makes it difficult to work with high transactional volumes.

But DBAs can have a look at (1) BigchainDB, a database with several blockchain characteristics added (high transaction rates, decentralization, immutability and native support for assets), and (2) Chainfrog, if they are interested in connecting legacy databases together. As far as I know, they currently support at least MySQL and SQL Server.

  2. Hi Julian, firstly, you’ve definitely captured the DBA angle on blockchain in this article. Blockchain isn’t going to replace databases, and as you rightly point out, it’s not even well-defined. It’s also interesting that the first thing you do in a blockchain network node when you join a blockchain is to extract the data into a database, precisely because they’re optimised to allow you to search through the data and find the information you’re actually interested in.

    Part of the reason that blockchain is somewhat odd is that the main components have been around for years (personally I think it’s these four components that distinguish a true blockchain). Linked lists were invented in 1955 by Newell, Shaw and Simon. Asymmetric key cryptography goes back to Diffie and Hellman in 1976, consensus protocols such as Byzantine fault tolerance are a 1982 invention, and although I couldn’t find a clear date for peer-to-peer networking, Napster was the first to popularize it in 1999.

    But it’s the combination of these four elements (first done in 2008) that results in a few interesting emergent properties that aren’t obvious until you think about the combination for a while. The fact that the combination solves the double-spend problem allows the creation of unique unforgeable digital assets (so far mainly applied to cryptocurrencies) – you might find of interest on that. And secondly, the transfer of data over low-trust boundaries should also be of interest to DBAs – see for my take on that.

    Thanks for providing an insight into the DBA perspective, and I look forward to your next post on blockchain. Hope I don’t have to wait too long!
