I’ve never liked what I call “the next” positioning. Examples:
- Dan Quayle trying to position as the next Jack Kennedy until Lloyd Bentsen put an abrupt end to the feeble effort.
- The object DBMS vendors in the early 1990s positioning as the next Oracle. (Gartner analyst and database guru Donald Feinberg once told me that if he had a quarter for every startup that announced it was the next Oracle, he’d have a lot of quarters.)
- Numerous singers trying to be the next Elvis. There won’t be another Elvis. The original one still lives today in Vegas and works as a pit boss. (There is, however, an alternative Elvis, Johnny Hallyday, very much the French Elvis who lived.)
- Lots of SaaS vendors trying to position as the next Salesforce. No one has really succeeded, somewhat amazingly. SaaS today strikes me as a one-success category. Go read the NetSuite S-1 if you’re feeling differently.
In fact, in a Murphy’s Law sort of way, positioning as “the next something” seems an almost certain guarantee that you won’t be. Thus, it should come as no surprise that no one consulted me before deciding the positioning of Powerset, a much-hyped natural language search engine, in their recent launch.
If you don’t believe that they’re positioning on this angle, then see these stories:
- Search Startup Ready to Challenge Google
- Powerset Natural Language Search Engine to Dethrone Google?
- Powerset: The Search Engine That Would Outdo Google?
- Will Powerset Pull a Google?
- Is Google Search Vulnerable in 2007?
In fact, if you run the Google search “Powerset Google” you come up with 1.4M results.
Don’t get me wrong. There are many things about Powerset that I like. I like the argument that discarding stopwords can cause a major loss of meaning. I like their characterization of search as “keywordese.” I loved the grunting pigeons. I like their blog and loved their attempt to parse Miss Teen South Carolina.
But I have two problems with Powerset. First, they shouldn’t have done a “next Google” positioning, which is bound to disappoint. Second, they have failed to learn a key lesson from the business intelligence (BI) market: it’s not about natural language search, it’s about database query.
All search vendors seem obsessed with a quest to “figure out what you mean” based on either a few grunted keywords or a short phrase. In the early days of business intelligence, people went on that quest, too. I recall the DataTalker from Berkeley-based Natural Language, Inc. The idea was you could ask seemingly innocent database queries like “who sold the most new products in New York on Tuesdays?”
The problem was:
- What do you mean by sold? Market value or net to us? Before or after allowance for doubtful accounts?
- What do you mean by Tuesday? Do you literally mean Tuesday or do you mean the second day of the work week?
- What do you mean by New York? Do you mean the city or the state? If you mean the city, do you mean Manhattan or the five boroughs?
- What do you mean by new products? Launched within the last 6 weeks or 6 months?
There are reasons why Business Objects went on to become a $1.5B company while NLI was sold to Microsoft for a pittance:
- Natural language is notoriously imprecise
- Devising a simple-to-use interface for specifying precisely what you want seems infinitely superior to all sorts of advanced technology that guesses
- Similarly, creating a semantic layer that defines precise answers to all the “what do you mean” questions seems infinitely superior to more guessing
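To make the semantic-layer point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `make_layer` and `top_seller` names, the row fields, and the sample data are invented for illustration, and it is not how Business Objects or any other product implements a semantic layer. It pins each ambiguous term from the question above to one agreed definition, then shows that changing a single definition (what counts as “new”) changes the answer.

```python
from collections import defaultdict
from datetime import date, timedelta


def make_layer(new_product_days):
    """Hypothetical semantic layer: one agreed definition per business term."""
    return {
        # "sold" means net to us, not market value (an assumed choice)
        "sold": lambda row, today: row["net_amount"],
        # "new product" means launched within the given number of days
        "new product": lambda row, today: (
            today - row["launch_date"] <= timedelta(days=new_product_days)
        ),
        # "New York" means the state, not the city
        "New York": lambda row, today: row["state"] == "NY",
        # "Tuesday" means the literal weekday (Monday is 0 in Python)
        "Tuesday": lambda row, today: row["sale_date"].weekday() == 1,
    }


def top_seller(rows, layer, today):
    """Answer 'who sold the most new products in New York on Tuesdays?'"""
    filters = ("new product", "New York", "Tuesday")
    totals = defaultdict(float)
    for row in rows:
        if all(layer[term](row, today) for term in filters):
            totals[row["rep"]] += layer["sold"](row, today)
    return max(totals, key=totals.get) if totals else None


# Invented sample data; September 4 and 11, 2007 were Tuesdays.
TODAY = date(2007, 9, 30)
ROWS = [
    {"rep": "Ann", "state": "NY", "sale_date": date(2007, 9, 4),
     "launch_date": date(2007, 6, 1), "net_amount": 300.0},
    {"rep": "Bob", "state": "NY", "sale_date": date(2007, 9, 11),
     "launch_date": date(2007, 9, 1), "net_amount": 120.0},
    {"rep": "Bob", "state": "NY", "sale_date": date(2007, 9, 4),
     "launch_date": date(2006, 1, 1), "net_amount": 500.0},
    {"rep": "Cy", "state": "NJ", "sale_date": date(2007, 9, 4),
     "launch_date": date(2007, 9, 1), "net_amount": 900.0},
    {"rep": "Ann", "state": "NY", "sale_date": date(2007, 9, 5),
     "launch_date": date(2007, 9, 1), "net_amount": 1000.0},
]

six_months = top_seller(ROWS, make_layer(180), TODAY)
six_weeks = top_seller(ROWS, make_layer(42), TODAY)
print(six_months, six_weeks)
```

With “new” defined as roughly six months the sketch picks one rep, and with six weeks it picks another. That is the kind of question a natural language front end would have to guess at and a semantic layer settles once.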
As technologists, we are drawn to interesting questions and whizzy technology that imputes meaning from language. (Heck, I like it, too.) And if you like such technology you can use it in conjunction with Mark Logic (just use it to enrich XML content and add tags).
But the lesson to me is that database-style query beats either grunted keyword or short-phrase natural language search when building enterprise systems.
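As a rough illustration of the difference, the sketch below runs a keyword match and a structure-aware query over the same two tagged documents, using Python’s standard `xml.etree.ElementTree`. The element names (`article`, `dateline`, `city`, `body`), the sample text, and both helper functions are assumptions made up for this example; a real XML content server would express the second query in its own query language rather than in Python.

```python
import xml.etree.ElementTree as ET

# Invented sample content, tagged so that location has an explicit meaning.
DOCS = """
<articles>
  <article id="a1">
    <dateline><city>New York</city></dateline>
    <body>Regulators met on Tuesday to review drug approvals.</body>
  </article>
  <article id="a2">
    <dateline><city>Boston</city></dateline>
    <body>The firm moved its trading desk from New York last year.</body>
  </article>
</articles>
""".strip()

root = ET.fromstring(DOCS)


def keyword_search(root, phrase):
    """Grunted-keyword style: match the phrase anywhere in the article text."""
    needle = phrase.lower()
    return [
        article.get("id")
        for article in root.iter("article")
        if needle in "".join(article.itertext()).lower()
    ]


def dateline_query(root, city):
    """Database style: match only articles whose dateline city is the value."""
    return [
        article.get("id")
        for article in root.findall("article")
        if article.findtext("dateline/city") == city
    ]


keyword_hits = keyword_search(root, "New York")
query_hits = dateline_query(root, "New York")
print(keyword_hits, query_hits)
```

The keyword match returns both articles because the phrase appears somewhere in each. The query returns only the article actually filed from New York, because the tags say what the phrase means in that position.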
I won’t predict whether natural language search will beat keyword search on the broad Internet. But I do believe that for enterprise content systems in publishing, government / defense / intelligence, pharma / life sciences, and financial services, what’s needed is database-style queries on content, not the ContentTalker.