I found an interesting post on The Future of Software minisite run by the GigaOM network, best known for Om Malik and his GigaOM blog. The post is entitled “Data 2.0: How the Web disrupts our relational database world” and is written by Nitin Borwankar.
The post begins with:
The great online shift is creating massive amounts of data – whether it is videos on YouTube or social networking profiles on MySpace. And that data is stored in databases, making them the key component of the new web infrastructure. But managing that information isn’t easy
I think he nails the problem statement. The Web world is changing fast. And relational databases are having trouble keeping up.
The good news is that database management will be vastly different in the future. In fact, change has already begun; it just isn’t (cliché alert!) “evenly distributed” yet.
He then goes on to describe some leading examples of companies or problems that are pushing the relational database envelope.
- Yahoo’s creation of its own user management software based on BerkeleyDB
- Google’s MapReduce
- Amazon’s S3 (simple storage service) and SQS (simple queue service) which externalize operations normally done by a database.
- The general use of Lucene, Nutch, and Solr to do indexing of unstructured content, “something an old relational database cannot do well.”
- The graph-structured data problem (also known as the parts explosion problem), which is inherent in social networking and remains an Achilles’ heel for relational databases
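To make that last item concrete, here is a minimal sketch of a parts explosion in Python. The bill-of-materials rows and the `explode` function are hypothetical, invented purely for illustration; they are not taken from Borwankar's post or from any product. The point is that the data is a graph of arbitrary depth, so the traversal has to keep following parent-to-child links until it runs out. In SQL without recursive queries, each additional level typically means another self-join.

```python
from collections import defaultdict

# Hypothetical bill of materials stored as (parent, child, quantity) rows,
# the way it might sit in a single relational table.
BOM_ROWS = [
    ("bike", "wheel", 2),
    ("bike", "frame", 1),
    ("wheel", "spoke", 32),
    ("wheel", "rim", 1),
    ("frame", "bolt", 4),
    ("rim", "bolt", 2),
]


def explode(root, rows):
    """Return the total quantity of every component needed to build `root`."""
    # Index the rows by parent so each lookup is cheap.
    children = defaultdict(list)
    for parent, child, qty in rows:
        children[parent].append((child, qty))

    totals = defaultdict(int)
    # Each stack entry is (part, how many of that part are needed so far).
    stack = [(root, 1)]
    while stack:
        part, multiplier = stack.pop()
        for child, qty in children.get(part, []):
            needed = multiplier * qty
            totals[child] += needed
            # Keep descending: the depth is not known in advance.
            stack.append((child, needed))
    return dict(totals)


if __name__ == "__main__":
    print(explode("bike", BOM_ROWS))
```

A social graph has the same shape: replace "part contains part" with "person knows person" and the friends-of-friends question becomes the same open-ended traversal. This sketch assumes the data has no cycles, which a real social graph would not guarantee.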
So while I generally agree with his thesis, the examples cited are basically all technology companies that can write their own system-level software to bypass or accommodate the limitations of relational databases.
My question is: what about everybody else? What are they supposed to do?
My short answer is — perhaps not shockingly — MarkLogic. At MarkLogic, we call Data 2.0 “content.”
- We manage XML natively
- We manage graph-structured data easily
- We manage, search, store, and index text and XML natively
Some companies will always be able to write their own stuff to get around problems. But the reason MarkLogic exists is to provide a commercial DBMS that “the rest of us” can use when managing content and building web applications with it.
See this post on top-to-bottom XML for more.