“Memory is the new disk,” Jim Gray liked to say.
The database pioneer died in 2007 after he was lost at sea, but like so much of the man, his words about memory and disk live on. Fred Holahan, the vice president of marketing at VoltDB, uses those words in plugging his company — an outfit offering a database that stores all data in memory rather than on disk — and, yes, they describe a larger movement across the database business and beyond.
In addition to VoltDB — the latest venture from another well-known database pioneer, Mike Stonebraker — in-memory databases are now available from the giants of the software game, including Oracle, IBM, and SAP. And there are many others making headway in the world of open source, including Redis and MemcacheDB. Just a few years ago, the processors used in the average server couldn’t handle enough memory to accommodate an entire database. But now they can, and this has sparked the beginnings of a revolution in the database business. If you store your data in memory rather than on hard disk, you can access it several times faster.
The revolution continues next month: A San Francisco startup known as Birst will take the wraps off a new in-memory database designed to speed what’s commonly called “business intelligence” or BI software — software that seeks to gain insight from the vast amounts of digital information collected by the modern business. And with this database, the company hopes to serve the average business — not just the massive corporation or a cutting-edge web shop.
Founded by an ex-Oracle man, Birst has long offered business intelligence software over the internet — i.e., you can use it without installing it on your own servers — and this software was originally designed to work in tandem with traditional on-disk databases from the likes of Oracle and SAP. Now, Birst hopes to streamline things even further by pairing its service with an in-memory database. “Things that took minutes are going to take seconds,” boasts Brad Peters, Birst’s CEO, who spent several years leading the data analytics group at Siebel, the software outfit that was acquired by Oracle in 2005.
Peters and company have yet to benchmark test their database, and it’s not yet available to the outside world. But unlike in years past, building this sort of database is now a practical proposition, and there’s certainly a need for it. MongooseMetrics — a phone-call-tracking company based in Ohio — uses Birst’s existing data analytics service, and according to Tom Cooper, the company’s information technology manager, it’s pushing to be one of the first outfits with access to the new in-memory database.
Mongoose lets businesses track phone calls generated by online ads. Using Birst’s data analytics service and a traditional on-disk database, it generates call-tracking reports for its customers about every eight hours, processing as many as 500 million records. But the amount of data facing each customer is growing, and in an age when “real-time” is so often the ideal, Mongoose is intent on significantly reducing the time between each report. “Today, we’re moving farther and farther away from real-time,” Cooper says. “We’re hoping the in-memory database can get us down to an hour or at least a couple of hours.”
With its online service, Birst will provide remote access to individual machines running its in-memory database, and yes, datasets will be restricted by the amount of memory available on each machine: about half a terabyte, or 500 gigabytes. But this is more than enough for Mongoose’s purposes, and Birst is also able to compress data before it’s moved into memory. The drawback is that if the machine goes down, you lose whatever is in memory, but Birst is designed to work in tandem with systems where the data is permanently stored on disk. In essence, it regularly exports data to disk, but if the system goes down between exports, you will indeed lose any data you’ve generated in the meantime.
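The tradeoff described above, where data lives in memory and is only periodically exported to disk, can be illustrated with a toy sketch. This is not Birst's implementation, which the article does not detail; the `SnapshotStore` class, its methods, and the JSON snapshot file are all hypothetical names chosen for illustration. The sketch only shows why anything written after the last export disappears when the machine goes down.

```python
import json
import os
import tempfile


class SnapshotStore:
    """Hypothetical toy store: data lives in memory, disk holds only snapshots."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        # On startup, recover whatever the last export wrote to disk.
        if os.path.exists(path):
            with open(path) as f:
                self.data = json.load(f)

    def put(self, key, value):
        # Writes touch memory only, which is what makes them fast.
        self.data[key] = value

    def export(self):
        # Write to a temporary file, then swap it in, so a crash
        # mid-export does not corrupt the previous snapshot.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.data, f)
        os.replace(tmp, self.path)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "snapshot.json")
    store = SnapshotStore(path)
    store.put("calls_monday", 120)
    store.export()                   # this record is now safe on disk
    store.put("calls_tuesday", 95)   # written after the last export
    del store                        # simulate the machine going down
    recovered = SnapshotStore(path)  # restart from the last snapshot
    return recovered.data


if __name__ == "__main__":
    print(demo())
```

In this sketch, the record written after the last export is absent once the store restarts, which mirrors the window of loss the article describes. Exporting more often narrows that window at the cost of more disk writes.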
Birst is akin to in-memory databases offered by Oracle and SAP, but the idea is to make it much easier to use — and cheaper. Peters bills his company as a kind of anti-Oracle. Unlike Oracle, Birst will offer its database as an online service. But it will also include it with a “virtual appliance” you can install on your own servers, and according to Peters, this will be a significantly less expensive option than the beefy analytics appliances offered by the likes of his former employer.
The new database is different from Mike Stonebraker’s VoltDB in that it’s designed for deeper analysis. VoltDB is meant to monitor data even closer to real-time, but it can’t slice and dice it to quite the same extent as Birst. And unlike open source “NoSQL” databases such as Redis, Birst’s database retains the structure of a traditional “relational” database, where data is stored in neat rows and columns. This means that Birst can provide the sort of analysis you can’t get from the NoSQL camp, but it’s not designed to handle as much data.
As Stonebraker has told us, the database business is evolving in many directions. Databases are now being designed for very specific tasks, and in many cases businesses are using several different databases to serve different needs. Birst’s database is just the latest example of this trend.
But it also highlights that trend Jim Gray spoke of so many years ago. If memory isn’t the new disk, it will be.