Deconstructing Databases

Here’s a post in which O’Reilly’s Dale Dougherty writes up a EuroOSCON talk by Greg Stein, of Google Code, about building a bug-tracking system.

In describing the new bug tracking system, he said, that while he liked many existing bug systems, he realized there was an opportunity to redesign a new, much simpler bug tracking system for Google Code. The key he said was understanding that they had great full-text search tools available. That made them think differently about how to collect and organize the information in the bug “database.”

He believed that existing systems spent too much time deciding how to structure data entry and presenting a detailed form for users to fill out. They also then lock down the display of the information. He decided to keep structured data entry to a minimum and rely on text entry. A lot happens with labels/tags/keywords, for instance, to assign priority. The new bug submission form consisted of a text area with a few questions already inside it.

Traditional applications have been built on a conventional database view, one that requires everything to be decomposed into “square tables.” When you do this, you invariably end up creating lots of fine-grained fields into which information must be placed (e.g., bug number, short description, full description, assigned-to, fixed-in, related-to, error-number, severity, impact).
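To make the contrast concrete, here is a minimal sketch in Python of the alternative: a bug kept as free text plus a few labels, found by a naive search rather than by querying dedicated columns. The `LooseBug` class, the `search` function, and the label names are all hypothetical; this is neither Google's nor anyone else's actual implementation, and a real system would use a proper full-text index rather than substring matching.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class LooseBug:
    """A bug stored as free text plus a handful of labels (hypothetical shape)."""
    text: str
    labels: Set[str] = field(default_factory=set)


def search(bugs: List[LooseBug], query: str) -> List[LooseBug]:
    """Return bugs whose text or labels contain every query term.

    Matching is case-insensitive substring matching, a stand-in for a
    real full-text engine.
    """
    terms = query.lower().split()
    results = []
    for bug in bugs:
        haystack = (bug.text + " " + " ".join(sorted(bug.labels))).lower()
        if all(term in haystack for term in terms):
            results.append(bug)
    return results


# Priority lives in a label instead of a dedicated "severity" column.
bugs = [
    LooseBug("Crash on startup when the config file is missing.", {"priority-high"}),
    LooseBug("Typo in the login page footer.", {"priority-low"}),
]
```

Calling `search(bugs, "crash priority-high")` finds the first bug without any field-by-field query; the label does the job a structured priority field would otherwise do.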

As it turns out, Mark Logic built its own bug-tracking system, bugtrack, on a special-purpose DBMS platform with rich full-text search capabilities (i.e., MarkLogic Server). Given that underlying platform, we did something similar to Google. Per Ian Small, our VP of products:

Yes, we did close to the same thing [as Google described]. We minimized the structured data (I think there’s less than 5 required fields to enter a bug) and [instead] provided 5 big loosely structured buckets into which to dump full-text information. A few more structured fields get populated through the workflow process (e.g., bug assignment, bug scheduling, bug status)
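As a rough illustration of the shape Ian describes, the Python sketch below models a bug with a couple of required fields, a set of loose text buckets, and workflow fields that are filled in later. Everything here is assumed for illustration: the bucket names, the field names, and the functions are hypothetical, and none of it is bugtrack's real schema or MarkLogic Server's API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical bucket names; the quote only says there are five loose buckets.
BUCKETS = ("summary", "steps", "expected", "actual", "notes")


@dataclass
class Bug:
    # Required at entry time (hypothetical choice of fields).
    title: str
    reporter: str
    # Loosely structured full-text buckets, all optional.
    buckets: Dict[str, str] = field(default_factory=dict)
    # Populated later by the workflow, not by the person filing the bug.
    assignee: Optional[str] = None
    status: str = "new"


def file_bug(title: str, reporter: str, **buckets: str) -> Bug:
    """Create a bug, rejecting any bucket name outside BUCKETS."""
    unknown = set(buckets) - set(BUCKETS)
    if unknown:
        raise ValueError(f"unknown buckets: {sorted(unknown)}")
    return Bug(title=title, reporter=reporter, buckets=dict(buckets))


def assign(bug: Bug, assignee: str) -> Bug:
    """Workflow step: record the assignee and update the status."""
    bug.assignee = assignee
    bug.status = "assigned"
    return bug


def matches(bug: Bug, query: str) -> bool:
    """Naive full-text match over the title and every bucket."""
    text = " ".join([bug.title, *bug.buckets.values()]).lower()
    return all(term in text for term in query.lower().split())
```

The person filing the bug supplies only a title, a reporter, and whatever text they have; `assign` stands in for the workflow steps that add structure afterward, and `matches` stands in for the platform's full-text search.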

It’s interesting to see how application development itself changes when you change the underlying platform assumption. In my mind, you end up with more powerful, more flexible applications as a result. See the comments on Dale’s post for more discussion of this topic.
