Backblaze (you know, the guys that offer the same $5/month unlimited backup service as Mozy but that nobody’s ever heard of…) have made headlines this week by posting lots of details about their storage hardware. Their data center isn’t filled with the latest “cloud” products from EMC, NetApp and the like; instead, they rely on home-built servers.
I won’t go into the technical details; if you’re into that, head over to their blog. But what I do find interesting are the cost savings they claim. According to Backblaze, the price per petabyte of storage is close to an order of magnitude lower than any off-the-shelf solution; even relatively simple Dell MD1000 arrays, without any fancy storage tiering, single filesystem, etc., would cost them $826,000, roughly seven times the $117,000 they pay for their own servers.
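For the curious, the arithmetic behind those two figures is easy to check. Here is a minimal sketch in Python; the dictionary and function names are my own, and only the two dollar amounts come from Backblaze’s numbers quoted above.

```python
# Per-petabyte costs in USD, as quoted above from Backblaze's post.
# The labels and the helper below are illustrative names, not Backblaze's.
COST_PER_PB = {
    "backblaze_homebuilt": 117_000,
    "dell_md1000": 826_000,
}


def compare(baseline, alternative, costs=COST_PER_PB):
    """Return (cost ratio, absolute difference per petabyte)."""
    ratio = costs[alternative] / costs[baseline]
    difference = costs[alternative] - costs[baseline]
    return ratio, difference


ratio, difference = compare("backblaze_homebuilt", "dell_md1000")
print(f"Dell costs {ratio:.1f}x as much; that is ${difference:,} more per petabyte")
```

So against the cheapest off-the-shelf option the gap is about seven-fold rather than a full order of magnitude, which is still a difference of over $700,000 per petabyte.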
The key message to take home from this is that when you buy a large-scale storage system, you’re not just buying a bunch of disks. You’re buying very expensive software that allows you to perform advanced functions such as serving those disks up as iSCSI, FC or NFS targets, making snapshots, masking hardware failures, performing replication tasks and more. The hardware itself is a relatively small part of what you’re buying.
Backblaze makes the same point about their own setup: “Building a cloud includes not only deploying a large quantity of hardware, but, critically, deploying software to manage it. At Backblaze we have developed software that de-duplicates and chops data into blocks; encrypts and transfers it for backup; reassembles, decrypts, re-duplicates, and packages the data for recovery; and monitors and manages the entire cloud storage system. This process is proprietary technology that we have developed over the years.”
Does that sound familiar? It should, because it’s exactly what the largest storage users in this world are doing. Google has been building their own hardware for years. Amazon built S3 on top of commodity hardware. Mozy is providing their backup service using a homegrown system for spreading storage over multiple servers (at least, that was the case before EMC bought them…)
Which leaves me wondering who the big storage vendors are targeting with their new cloud offerings. For most companies that use over a petabyte of storage, that storage serves a single application, which in most cases is developed in-house. If building a middleware layer that sits between your application and the back-end storage will save you over a million dollars per petabyte, hiring a couple of extra developers might suddenly make a lot of sense. Some companies are even releasing some of the technology they develop as open source. That doesn’t mean there isn’t a market for large storage systems; but there is a limit on the size of the systems that are sold.