Exploding Costs of storing information on a Blockchain

Important note: This article uses approximations with regard to volatile data or values that are difficult to determine. Please see the notes for extended information on why and how we chose certain examples and calculated prices.

Storing information on a blockchain, in addition to the mere transaction data, could have significant benefits. By their very nature, blockchains offer highly secure, highly available, independent and censorship-free storage of information. Another huge benefit for special use cases is that information stored on a blockchain cannot be altered retroactively and is independent of any third party.

But prices and transaction costs on blockchains like Ethereum or Bitcoin have been exploding during the last couple of months. If you wanted to store information like our ISCC identifiers on one of the major blockchains, you would have to pay a significant premium. One reason for this is that, as blockchains and their networks grow, the resource consumption and the total costs of operating the networks increase. But the main reasons are probably the strongly increasing public interest in Bitcoin and similar currencies, along with new venturers and speculators entering the market during the past 12 months and driving up the demand and price for many cryptocurrencies.

Since the end of 2015, the average price for a Bitcoin or Ethereum transaction has increased steadily [1][2]:
Quarter    BTC          ETH
Q4 2015    0.062 USD    0.008 USD
Q1 2016    0.085 USD    0.028 USD
Q2 2016    0.129 USD    0.03 USD
Q3 2016    0.175 USD    0.034 USD
Q4 2016    0.241 USD    0.33 USD
Q1 2017    0.621 USD    0.064 USD
Q2 2017    2.443 USD    0.608 USD
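
As a rough check on the table, the overall increase can be computed from its first and last rows. This is only a minimal sketch: the dictionary and function names are our own illustration, and the values are simply copied from the table.

```python
# Average transaction prices in USD, copied from the first and last table rows.
AVG_TX_PRICE_USD = {
    "Q4 2015": {"BTC": 0.062, "ETH": 0.008},
    "Q2 2017": {"BTC": 2.443, "ETH": 0.608},
}


def growth_factor(chain: str, start: str = "Q4 2015", end: str = "Q2 2017") -> float:
    """Return how many times more expensive a transaction became."""
    return AVG_TX_PRICE_USD[end][chain] / AVG_TX_PRICE_USD[start][chain]


for chain in ("BTC", "ETH"):
    print(f"{chain}: factor {growth_factor(chain):.0f}")
```

Rounded to whole numbers, this gives a factor of 39 for Bitcoin and 76 for Ethereum.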
Average transaction costs increased by a factor of 39 for Bitcoin and 76 for Ethereum in the course of only 20 months. [3]

For simplicity, we assume that as few as 100 bytes of information are stored as a text comment with each transaction, so the transaction price represents the cost of storing 100 bytes of information.

Assessing how expensive these transactions are compared to other “data storage solutions” is not easy. It is not helpful to simply compare these costs with the cost of storing 100 bytes of information on a hard disk (currently roughly 0.000000003 USD), because a single HDD is never as secure, highly available and immutable as a blockchain. The closest alternative to the distributed nature of a blockchain would be a global-scale, distributed database network. There are several possibilities and existing products for such a database. In this article we specifically chose Google Cloud Spanner as an example. [4]

Cost of writing 100 bytes in Google Cloud Spanner [5]: 0.0002 USD.
Cost of reading 100 bytes 1000 times in Google Cloud Spanner: 0.000015 USD.

The cost of storing only a few bytes of information on the Ethereum or Bitcoin network is currently roughly 2000 to 8000 times higher than using a global-scale, internet-grade database. Additionally, an actual database is faster than a blockchain by multiple orders of magnitude.

It could be argued that comparing a public blockchain to a distributed database is necessarily like comparing apples and oranges. This objection is certainly correct: a blockchain and a distributed database are completely different things built for different purposes, but they are comparable with regard to storing arbitrary information in a structured way.

Additionally, it could be argued that a private / permissioned blockchain would drastically reduce the cost of storing information on a blockchain. This is true (in many cases), but a private blockchain is not suitable for creating an open ecosystem for content.
Furthermore, there are “storage blockchains” like Storj, Sia, Filecoin, MaidSafe and Ethereum Swarm, which “offer” lower costs, but they come with other constraints and design limitations that make them less favorable for a project like ours.

One could argue that the actual transaction costs on the commonly used blockchains are in fact much lower and that the prices users have to pay are so high only because speculative investors are boosting them. With this in mind, let’s take a look at the “real” costs of a Bitcoin transaction.

The current Bitcoin network hashrate is about 5 exahashes per second (as of 17 July 2017); the efficiency of the best publicly available mining hardware ranges between 0.05 and 0.098 W of power consumption per GH/s. The average daily number of Bitcoin transactions is roughly between 260,000 and 330,000. [6]

Result: the energy consumption of a single Bitcoin transaction is approximately 35 kWh (ranging from 26 kWh to 43 kWh in July 2017 on the Bitcoin network). Globally, the price per kWh ranges from 0.01 USD (production cost at a very cheap location) to 0.35 USD (retail price including all infrastructure costs, taxes, fees etc.). So even if we assume a price of 0.01 USD per kWh (without any follow-up costs), the minimum cost for a Bitcoin transaction is currently 0.35 USD. But this cost assumption completely ignores other important factors like environmental sustainability, geographic region, infrastructure costs, investments in mining hardware etc. Accordingly, the realistic “intrinsic” costs of a Bitcoin transaction are probably 5-20 times higher (1.80 USD – 7.00 USD).

What do we conclude from this? That Bitcoin and Ethereum are unsuitable for storing information and that they are expensive and slow? No, the point was to demonstrate the extra charge that (currently) has to be paid for the independence of any third party and the security of a blockchain like Bitcoin or Ethereum.

General conclusions and particular insights for the Content Blockchain Project

  • Currently, storing even small amounts of information on a blockchain like Bitcoin or Ethereum is indeed expensive in comparison to traditional solutions – probably much too expensive to store hundreds of millions of ISCCs and much too expensive to handle the content microtransactions of the Content Blockchain Project.
  • Blockchains are poor databases [regarding storage of large amounts of data]. Using a blockchain for a project should be a well-conceived decision.
The reason we still believe a public blockchain is the right environment for our project is very straightforward: If you want to build a truly open, secure, reliable and censorship-free content ecosystem, a public blockchain is currently the best way to do it. Our current goal is to find a (better) blockchain solution that is economically suitable for content microtransactions and ISCC registration. We are investigating the creation of a new blockchain specifically for ISCC and content transactions in order to make transactions affordable in the long run.
1 Storing information on a blockchain 
Some blockchains like Bitcoin don’t support the storage of additional information other than small amounts of transaction metadata. The Ethereum blockchain takes a completely different approach; a transaction or an Ethereum contract might be used to store information. Consequently, the cost of creating a new contract is quite high, but data operations within a contract would be much cheaper. Depending on the individual use case it might be necessary to create a new contract for each individual data transaction; alternatively, a single contract could process and store the results of multiple transactions. It is thus nearly impossible to determine the exact cost of storing arbitrary information on the Ethereum blockchain, as it depends on a number of factors like ether price, gas price, gas / resource consumption, the exchange used and the exact use case. Accordingly, we decided to use a very simple price indication for Bitcoin and Ethereum, the average transaction price.
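
To illustrate how strongly the result depends on these factors, here is a minimal sketch for the simplest case only: a plain Ethereum transaction carrying a 100-byte payload as transaction data, with no contract storage involved. The gas constants are the 2017-era protocol values (21,000 gas base fee, 68 gas per non-zero data byte, 4 gas per zero byte). The gas prices and the ether price are purely hypothetical inputs chosen for illustration, not data from this article, and the function name is our own.

```python
# Gas constants of the 2017-era Ethereum fee schedule.
BASE_TX_GAS = 21_000        # flat fee charged for every transaction
GAS_PER_NONZERO_BYTE = 68   # per non-zero byte of transaction data
GAS_PER_ZERO_BYTE = 4       # per zero byte of transaction data


def data_tx_cost_usd(payload: bytes, gas_price_gwei: float, ether_price_usd: float) -> float:
    """Estimate the USD cost of a plain transaction carrying `payload` as data."""
    nonzero = sum(1 for b in payload if b != 0)
    zero = len(payload) - nonzero
    gas = BASE_TX_GAS + nonzero * GAS_PER_NONZERO_BYTE + zero * GAS_PER_ZERO_BYTE
    # 1 gwei = 1e-9 ether
    return gas * gas_price_gwei * 1e-9 * ether_price_usd


payload = b"x" * 100            # 100 non-zero bytes
ETHER_PRICE_USD = 200.0         # hypothetical exchange rate
for gas_price_gwei in (5, 20, 50):  # hypothetical gas prices
    cost = data_tx_cost_usd(payload, gas_price_gwei, ETHER_PRICE_USD)
    print(f"{gas_price_gwei:>2} gwei: {cost:.4f} USD")
```

Even in this simplest case the result scales linearly with both the gas price and the ether price, which is why we fall back on the average transaction price as an indicator.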
2 Sources of prices
All pricing information for Bitcoin and Ethereum was gathered from https://bitinfocharts.com/. The data is probably not entirely accurate, because exchange rates and prices per transaction vary by up to several percent depending on region and exchange. Still, the data should be sufficient for the purpose of this article, i.e. to roughly illustrate the general price development for Bitcoin and Ethereum.
3 Bitcoin and Ethereum 
The reasons for focussing on Bitcoin and Ethereum in this article are straightforward. Factors like network size, age in the open field, distribution of users, user acceptance, hashrate, and developer community are essential with regard to network security and the probability that the particular blockchain will continue to exist during the next 24 months. In planning to use a blockchain for any serious project, these are the main factors to be considered when choosing a specific blockchain environment, and they are what make Bitcoin and Ethereum so popular.
4 Selection of an exemplary distributed database 
Running MySQL on two dedicated servers provides you with a distributed database. If you put them in two different datacenters, you have a very cheap globally distributed database. There are countless other ways to build a globally distributed database, as almost all major database systems support some kind of clustering or replication, some even between continents on a global scale.
We used Google Cloud Spanner as an example simply because it operates on a truly global scale, comprehensive pricing information is publicly available and its pricing is just in the middle between really cheap and extraordinarily expensive.

5 Calculation of Costs for Google Cloud Spanner in this Example
Three Cloud Spanner nodes (US, Europe, East Asia) = $0.90 * 3 * 720h = $1,944 per month
Costs of Write
0.0000001 GB (100 Bytes) * $0.30 per GB storage per month, ingress traffic free = $0.00000003
Retention time 120 months = 120 * $0.00000003 = $0.0000036 (storage price for 100 Bytes for 120 months)
Costs for 3 nodes / 10M writes per month = $1,944 / 10,000,000 = $0.0001944
Assumed Costs for storing 100 Bytes for 10 years on 3 Cloud Spanner nodes = $0.000198, rounded $0.0002

As a blockchain is generally a write-once, read-many use case, we assume a ratio of 1000 reads to 1 write for our calculation. Furthermore, we assume that all node costs and storage costs are included in the write price. Thus the costs for reads are basically the egress traffic costs. Google’s egress network traffic prices vary between $0.08 and $0.23 per GB depending on region and volume; for our calculation we use $0.15.
0.0000001 GB * $0.15 * 1000 times = $0.000015
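
The arithmetic of this note can be reproduced with a short script. It is a minimal sketch that only uses the prices and assumptions stated above; the constant and function names are our own.

```python
# Prices and assumptions as stated in this note.
NODE_PRICE_PER_HOUR = 0.90         # USD per Cloud Spanner node per hour
NODES = 3                          # US, Europe, East Asia
HOURS_PER_MONTH = 720
STORAGE_PRICE_PER_GB_MONTH = 0.30  # USD
EGRESS_PRICE_PER_GB = 0.15         # USD, assumed mid-range value
WRITES_PER_MONTH = 10_000_000      # assumed write volume
RETENTION_MONTHS = 120             # 10 years
READS_PER_WRITE = 1000
RECORD_GB = 100 / 1e9              # 100 bytes expressed in GB


def write_cost_usd() -> float:
    """Node share per write plus storage of one record for the retention time."""
    node_cost_per_month = NODE_PRICE_PER_HOUR * NODES * HOURS_PER_MONTH
    node_share = node_cost_per_month / WRITES_PER_MONTH
    storage = RECORD_GB * STORAGE_PRICE_PER_GB_MONTH * RETENTION_MONTHS
    return node_share + storage


def read_cost_usd() -> float:
    """Egress traffic cost for reading one record READS_PER_WRITE times."""
    return RECORD_GB * EGRESS_PRICE_PER_GB * READS_PER_WRITE


print(f"write: {write_cost_usd():.6f} USD")
print(f"reads: {read_cost_usd():.6f} USD")
```

The write cost is dominated by the node share, so it depends almost entirely on the assumed 10 million writes per month; a lower write volume raises the cost per write proportionally.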

6 Calculation of Energy Consumption 
It is very hard to quantify the current power consumption of the blockchain networks. Only a rough estimate can be made on the basis of the publicly available hashrates and the publicly available information about the efficiency of the mining equipment. It is likely that the worldwide power consumption of the Bitcoin network alone ranges between 350 MW and 600 MW.
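
The estimate can be sketched as follows. The inputs are the figures quoted in this article (5 EH/s hashrate, 0.05 to 0.098 W per GH/s, 260,000 to 330,000 transactions per day, 350 to 600 MW network power); the function names are our own. Note that the hashrate and best-case hardware efficiency alone only give a lower bound on network power, and the sketch does not attempt to reproduce how the 350 to 600 MW range was derived.

```python
HOURS_PER_DAY = 24


def network_power_mw(hashrate_ehs: float, watts_per_ghs: float) -> float:
    """Power draw in MW if the whole network ran on hardware of the given efficiency."""
    ghs = hashrate_ehs * 1e9           # 1 EH/s = 1e9 GH/s
    return ghs * watts_per_ghs / 1e6   # W -> MW


def kwh_per_transaction(power_mw: float, tx_per_day: int) -> float:
    """Energy used per transaction, given constant network power over one day."""
    daily_energy_kwh = power_mw * 1000 * HOURS_PER_DAY
    return daily_energy_kwh / tx_per_day


# Lower bound from the most efficient publicly available hardware.
for efficiency in (0.05, 0.098):
    print(f"{efficiency} W per GH/s: {network_power_mw(5, efficiency):.0f} MW")

# Energy per transaction for the estimated power range.
for power_mw in (350, 600):
    print(f"{power_mw} MW: {kwh_per_transaction(power_mw, 330_000):.1f} kWh per transaction")
```

With 350 to 600 MW and 330,000 transactions per day, this yields roughly 25 to 44 kWh per transaction, in line with the range quoted in the article. At 0.01 USD per kWh, 35 kWh corresponds to the 0.35 USD minimum cost mentioned there.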

Data source for the current Bitcoin hashrate: https://blockchain.info/charts/hash-rate 
Further information about calculation and efficiency of mining hardware 
https://www.theguardian.com/sustainable-business/2017/jul/13/could-a-blockchain-based-electricity-network-change-the-energy-market
https://motherboard.vice.com/en_us/article/ypkp3y/bitcoin-is-still-unsustainable
http://digiconomist.net/bitcoin-energy-consumption
http://digiconomist.net/ethereum-energy-consumption