Mike Stonebraker spoke today at SIGMOD (see Tweetstream), where, among other things, there was a 40-year anniversary celebration of the relational DBMS. In what I suspect is non-coincidental timing, Mike also published a post on the CACM site entitled The End of a DBMS Era (Might be Upon Us). An excerpt:
Moreover, the code line from all of the major vendors is quite elderly, in all cases dating from the 1980s. Hence, the major vendors sell software that is a quarter century old, and has been extended and morphed to meet today’s needs. In my opinion, these legacy systems are at the end of their useful life. They deserve to be sent to the “home for tired software.”
His key argument is all about performance: in any given use-case, Stonebraker thinks RDBMSs can be beaten by about a factor of 50.
- In data warehousing he says a column store wins by 50x
- In OLTP he says a memory-resident DBMS wins by 50x
- For scientific data, he says a DBMS specialized for the job can win by 50x
- For RDF, he says column stores do a reasonable job and is confident that specialized RDF triple stores will do better, i.e., 50x or more. (I’d add that at MarkLogic we think we do a reasonable job as well.)
- For text, he points out that no major search engine uses a relational database, so RDBMSs didn’t even qualify for consideration.
- For XML, he cites a private report I sent him a while back done for one of our customers comparing MarkLogic performance to a relational DBMS. When on “our turf,” we usually win by no less than 10x and sometimes 100x or more. Sometimes, queries are not even processable in an RDBMS and/or need to be hand-optimized and hand-joined between a DBMS and a search engine.
He reduces the ways special-purpose DBMS vendors get their advantage to three cases:
- A non-relational data model
- A different implementation of tables
- A different implementation of transactions
We’re in the first category, using XML as our data model instead of a table. It’s a great post. Check it out and check out the cited references as well.
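To make the second category ("a different implementation of tables") a little more concrete, here is a toy sketch of the same table held row-wise and column-wise. It is only an illustration of the layout idea behind column stores, not how any real product is built. The table, the column names, and the helper functions are all made up for this example, and it says nothing about the 50x figure.

```python
# Toy illustration only: the same logical table in two physical layouts.
# All names and values here are hypothetical.

# Row layout: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 75.5},
    {"order_id": 3, "region": "east", "amount": 42.25},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_amount_row_layout(records):
    """Sum one field, but walk every full record to get at it."""
    return sum(record["amount"] for record in records)


def total_amount_column_layout(columns):
    """Sum one field by reading only that column's values."""
    return sum(columns["amount"])


columns = to_columns(rows)
row_total = total_amount_row_layout(rows)
column_total = total_amount_column_layout(columns)
```

Both functions return the same answer. The difference is what has to be read: a warehouse-style aggregate over one column touches a single contiguous list in the column layout, while the row layout drags every other field along with it.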
It's interesting to consider memory-resident systems a separate class of RDBMS, but the normalized relational model still rules the OLTP world in terms of efficient transaction locking.
The other aspect I would throw into this is the prevalence of caching layers, such as memcached, that are driving a lot of high-volume applications. By adding a caching layer to the RDBMS, you end up getting in-memory database performance on the OLTP side, but also get to utilize the galaxy of tools and techniques based on the relational model.
Where the MarkLogic model really shines for me is in querying semi-structured documents and in flexible schemas. It really offers a strong framework for combining text and fielded queries, more like a search engine than a database.
On Mark Logic, thanks for putting our power alley so clearly and tersely.