Category Archives: MarkLogic

The Pillorying of MarkLogic: Why Selling Disruptive Technology To the Government is Hard and Risky

There’s a well-established school of thought that high-tech startups should focus on a few vertical markets early in their development.  The question is whether government should be one of them.

The government seems to think so.  They run a handful of programs to encourage startups to focus on government.  Heck, the CIA even has a venture arm right on Sand Hill Road, In-Q-Tel, whose mission is to find startups who are not focused on the Intelligence Community (IC) and to help them find initial customers (and provide them with a dash of venture capital) to encourage them to do so.

When I ran MarkLogic between mid-2004 and 2010, we made the strategic decision to focus on government as one of our two key verticals.  While it was then, and still is, rather contrarian to do so, we nevertheless decided to focus on government for several reasons.

  • The technology fit was very strong.  There are many places in government, including the IC, where they have a bona fide need for a hybrid database / search engine, such as MarkLogic.
  • Many people in government were tired of the Oracle-led oligopoly in the RDBMS market and were seeking alternatives.  (Think:  I’m tired of writing Oracle $40M checks.)  While this was true in other markets, it was particularly true in government because their problems were compounded by lack of good technical fit — i.e., they were paying an oligopolist a premium price for technology that was not, in the end, terribly well suited to what they were doing.
  • Unlike other markets (e.g., Finance, Web 2.0) where companies could afford the high-caliber talent able to use the then-new open source NoSQL alternatives, government — with the exception of the IC — was not swimming in such talent.  Ergo, government really needed a well-supported enterprise NoSQL system usable by a more typical engineer.

The choice had always made me nervous for a number of reasons:

  • Government deals were big, so it could lead to feast-or-famine revenue performance unless you were able to figure out how to smooth out the inherent volatility.
  • Government deals ran through systems integrators (SIs), which could greatly complicate the sales cycle.
  • Government was its own tribe, with its own language, and its own idiosyncrasies (e.g., security clearances).  While bad from the perspective of commercial expansion, these things also served as entry barriers that, once conquered, should provide a competitive advantage.

The only thing I hadn’t really anticipated was the politics.

It had never occurred to me, for example, that in a $630M project (where MarkLogic might get maybe $5M to $10M) someone would try to blame the failure of what appears to be one of the worst-managed projects in recent history on a component that’s getting, say, 1% of the fees.

It makes no sense.  But now, for the second time, the New York Times has written an article about the fiasco where MarkLogic is not only one of very few vendors even mentioned but somehow implicated in the failures because it is different.

Let me start with a few of my own observations from the sidelines.  (Note that I, to my knowledge, was never involved with the project during my time at MarkLogic.)

From the cheap seats the problems seem simple:

  • Unattainable timelines.  You don’t build a site “just like Amazon” using government contractors in a matter of quarters.  Amazon has been built over the course of more than a decade.
  • No Beta program.  It’s incomprehensible to me that such a site would go directly from testing into production without quarters of Beta.  (Remember, not so long ago, that Google ran Betas for years?)
  • No general oversight.  It seems that there was no one playing the general contractor role.  Imagine if you built a house with plumbers, carpenters, and electricians not coordinated by a strong central resource.
  • Insufficient testing.  The absent Beta program aside, it seems the testing phase lasted only weeks, that certain basic functionality was not tested, and that it’s not even clear whether there was a code freeze before testing.
  • Late changes.  Supporting the idea that there was no code freeze are claims that the functional spec was changing weeks before the launch.

Sadly, these are not rare problems on a project of this scale.  This kind of stuff happens all the time, and each of these problems is a hallmark of a “train wreck” software development project.

To me, guessing from a distance, it seems pretty obvious what happened.

  • Someone who didn’t understand how hard it was to build ordered up a website of very high complexity with totally unrealistic timeframes.
  • A bunch of integrators (and vendors) who wanted their share of the $630M put in bids, probably convincing themselves, each for its own part of the system, that if things went very well they could maybe make the deadlines or, if not, maybe cut some scope.  (Remember you don’t win a $50M bid by saying “the project is crazy and the timeframe unrealistic.”)
  • Everybody probably did their best but knew deep down that the project was failing.
  • Everyone was afraid to admit that the project was failing because nobody likes to deliver bad news, and it seems that there was no one central coordinator whose job it was to do so.

Poof.  It happens all the time.  It’s why the world has generally moved away from big-bang projects and towards agile methodologies.

While sad, this kind of story happens.  The question is how the New York Times ends up writing two articles where the failure is somehow blamed on MarkLogic.  Why is MarkLogic even mentioned?  This is the story of a project run amok, not the story of a technology component failure.

Politics and Technology

The trick with selling disruptive technology to the government is that you encounter two types of people.

  • Those who look objectively at requirements and try to figure out which technology can best do the job.  Happily, our government contains many of these types of people.
  • Those who look at their own skill sets and view any disruptive technology as a threat.

I met many Oracle-DBA-lifers during my time working with the government.  And I’m OK with their personal decision to stop learning, not refresh their skills, not stay current on technology, and to want to ride a deep expertise in the Oracle DBMS into a comfortable retirement.  I get it.  It’s not a choice I’d make, but I can understand.

What I cannot understand, however, is when someone takes a personal decision and tries to use it as a reason to not use a new technology.  Think:  I don’t know MarkLogic, it is new, ergo it is a threat to my personal career plan, and ergo I am opposed to using MarkLogic, prima facie, because it’s not aligned with my personal interests.  That’s not OK.

To give you an idea of how warped this perspective can get (and while this may be urban myth), I recall hearing a story that one time a Federal contractor called a whistle-blower line to report the use of MarkLogic on a system instead of Oracle.  All I could think of was Charlton Heston at the end of Soylent Green saying, “I’ve seen it happening … it’s XML … they’re making it out of XML.”

The trouble is that these folks exist and they won’t let go.  The result:  when a $630M poorly managed project gets in trouble, they instantly raise and re-raise decisions made about technology with the argument that “it’s non-standard.”

Oracle was non-standard in 1983.  Thirty years later it’s too standard (i.e., part of an oligopoly) and not adapted to the new technical challenges at hand.  Yet because some bright group of people wanted to try something new to meet a new challenge, something that probably cost a fraction of what Oracle would have charged, the naysayers and Oracle lifers will challenge it endlessly, saying it’s “different.”

Yes, it is different.  And that, as far as I can tell, was the point.  And if you think that looking at 1% of the costs is the right way to diagnose a struggling $630M project, I’d beg to differ.  Follow the money.


FYI, in researching this post, I found this just-released progress report.

The 20th Century Called. It Wants Its Relational Database Back.

I saw this piece of creative the other day for a tradeshow ad and loved it.  Remember, Ted Codd invented the relational database in 1970 with his paper “A Relational Model of Data for Large Shared Data Banks.”  This PDF of the classic looks about as old as the ad.  (Do PDFs age?)  Enjoy!

My Slides from the MarkLogic 2010 Digital Publishing Summit

Just a quick post to share my slides from this year’s standing-room-only 2010 Digital Publishing Summit at the Plaza Hotel.

Thank you to everyone for attending!

Six Thoughts on The NoSQL Movement

We are in the middle of one of our periodic analyst tours at MarkLogic, where we meet about 50 top software industry analysts focused on areas like enterprise search, enterprise content management, and database management systems.  The NoSQL movement is one of four key topics we are covering, and while I’d expected some lively discussions about it, most of the time we have found ourselves educating people about NoSQL.

In this post, I’ll share the six key points we’re making about NoSQL on the tour.

Our first point is that NoSQL systems come in many flavors and it’s not just about key/value stores.  These flavors include:

  • Key/value stores (e.g., HBase on Hadoop)
  • Document databases (e.g., MarkLogic, CouchDB)
  • Graph databases (e.g., AllegroGraph)
  • Distributed caching systems (e.g., Memcached)

Our second point is that NoSQL is part of a broader trend in database systems:  specialization.  The jack-of-all-trades relational database (e.g., Oracle, DB2) works reasonably well for a broad range of applications — but it is a master of none.  For any specific application, you can design a specialized DBMS that will outperform Oracle by 10 to 1000 times.  Specialization represents, in aggregate, the biggest threat to the big-three DBMS oligopolists.  Examples of specialized DBMSs include:

  • Streambase, Skyler:  real-time stream processing
  • MarkLogic:  semi-structured data
  • Vertica, Greenplum:  mid-range data warehousing
  • Aster:  large-scale (aka “big data”) analytic data warehousing
  • VoltDB:  high volume transaction processing
  • MATLAB:  scientific data management

Our third point is that NoSQL is largely orthogonal to specialization.  There are specialized NoSQL databases (e.g., MarkLogic) and there are specialized SQL databases (e.g., Aster, VoltDB).  The only case where I think there are zero examples is general-purpose NoSQL systems.  While I’m sure many of the NoSQL crowd would argue that their systems can do everything, is anyone *really* going to run general ledger or opportunity management on Hadoop?   I don’t think so.

Our fourth point is that NoSQL isn’t about open source.  The software-wants-to-be-free crowd wants to build open source into the definition of NoSQL and I believe that is both incorrect and a mistake.  It’s incorrect because systems like MarkLogic (which uses an XML data model and XQuery) are indisputably NoSQL.  And it’s a mistake because technology movements should be about technology, not business models.  (The open source NoSQL gang can solve its problem simply by affiliating with both the NoSQL technology movement and the open source business model movements.)

Our fifth point, which matters to me as CEO of a company that’s invested a lot of energy in supporting standards, is that, rather ironically, most open source NoSQL systems have proprietary interfaces.  People shouldn’t confuse “can access the source code” with “can write applications that call standard interfaces” and, ergo, can swap components easily.   If you take offense at the word proprietary, that’s fine.  You can call them unique instead.  But the point is that an application written on Cassandra cannot practically be moved to Couch, regardless of whether you can access the source code of both Couch and Cassandra.
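
To make the portability point concrete, here is a minimal sketch in Python.  The two store classes and their method names are hypothetical stand-ins I made up for illustration; they are not the real Cassandra or CouchDB client APIs.  The only thing the sketch tries to show is that application code written against one store’s unique interface does not run unchanged against another’s, no matter how open the source is.

```python
# Hypothetical stand-ins: neither class mirrors a real client library.
class ColumnStoreA:
    """Toy store addressed by (row key, column name)."""

    def __init__(self):
        self._rows = {}

    def insert(self, row_key, column, value):
        self._rows.setdefault(row_key, {})[column] = value

    def get_column(self, row_key, column):
        return self._rows[row_key][column]


class DocumentStoreB:
    """Toy store addressed by document id, holding whole documents."""

    def __init__(self):
        self._docs = {}

    def save_doc(self, doc_id, doc):
        self._docs[doc_id] = dict(doc)

    def open_doc(self, doc_id):
        return dict(self._docs[doc_id])


def record_author(store, book_id, author):
    """Application code written directly against ColumnStoreA's interface."""
    store.insert(book_id, "author", author)
    return store.get_column(book_id, "author")


def runs_on(store):
    """Report whether record_author works unchanged on the given store."""
    try:
        record_author(store, "book-1", "Codd")
        return True
    except AttributeError:
        # The store does not offer the methods the application calls.
        return False


print(runs_on(ColumnStoreA()))    # the store the application was written for
print(runs_on(DocumentStoreB()))  # a store with a different interface
```

Moving the toy application to the second store means rewriting `record_author` (or writing an adapter layer), which is the cost a standard interface is meant to avoid.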

Our sixth point is that we think MarkLogic provides a best-of-both-worlds option between open source NoSQL systems and traditional DBMSs.  Like open source NoSQL systems, MarkLogic provides shared-nothing clustering on inexpensive hardware, superior support for unstructured data, document-orientation, and high performance.  But like traditional databases, MarkLogic speaks a high-level query language, implements industry standards, and is commercial-grade, supported software.  This means that customers can scale applications on inexpensive computers and storage, avoid the pains of normalization and joins, run fast systems that normal database programmers can implement, and feel safe that their applications are built on a standard query language (XQuery) that is supported by scores of vendors.

Slides from Mark Logic Digital Publishing Summit

I’m at the Mark Logic Digital Publishing Summit at The Plaza Hotel in New York. While I’m not sure what the “official” means will be for sharing presentation slides, based on a few requests at lunch I’ve uploaded my slides and David Worlock’s slides to SlideShare and embedded them here.

Great event, over 550 registered, almost ran out of chairs at lunch. Thanks to everyone for coming!

My slides:

David’s slides: