I’ve never liked what I call “the next” positioning. Examples:
- Dan Quayle trying to position as the next Jack Kennedy until Lloyd Bentsen put an abrupt end to the feeble effort.
- The object DBMS vendors in the early 1990s positioning as the next Oracle. (Gartner analyst and database guru Donald Feinberg once told me that if he had a quarter for every startup that announced it was the next Oracle, he’d have a lot of quarters.)
- Numerous singers trying to be the next Elvis. There won’t be another Elvis. The original one still lives today in Vegas and works as a pit boss. (There is, however, an alternative Elvis, Johnny Hallyday, very much the French Elvis who lived.)
- Lots of SaaS vendors trying to position as the next Salesforce. No one has really succeeded, somewhat amazingly. SaaS today strikes me as a one-success category. Go read the NetSuite S-1 if you feel differently.
In fact, in a Murphy’s Law sort of way, positioning as “the next something” seems an almost certain guarantee that you won’t be. Thus, it should come as no surprise that no one consulted me before deciding the positioning of Powerset, a much-hyped natural language search engine, in their recent launch.
If you don’t believe that they’re positioning on this angle, then see these stories:
- Search Startup Ready to Challenge Google
- Powerset Natural Language Search Engine to Dethrone Google?
- Powerset: The Search Engine That Would Outdo Google?
- Will Powerset Pull a Google?
- Is Google Search Vulnerable in 2007?
In fact, if you run the Google search “Powerset Google” you come up with 1.4M results.
Don’t get me wrong. There are many things about Powerset that I like. I like the argument that discarding stopwords can cause a major loss of meaning. I like their characterization of search as “keywordese.” I loved the grunting pigeons. I like their blog and loved their attempt to parse Miss Teen South Carolina.
But I have two problems with Powerset. First, they shouldn’t have done a “next Google” positioning, which is bound to disappoint. Second, they have failed to learn a key lesson from the business intelligence (BI) market: it’s not about natural language search; it’s about database query.
All search vendors seem obsessed with a quest to “figure out what you mean” based on either a few grunted keywords or a short phrase. In the early days of business intelligence, people went on that quest, too. I recall the DataTalker from Berkeley-based Natural Language, Inc. The idea was that you could ask seemingly innocent database queries like “who sold the most new products in New York on Tuesdays?”
The problem was:
- What do you mean by sold? Market value or net to us? Before or after allowance for doubtful accounts?
- What do you mean by Tuesday? Do you literally mean Tuesday or do you mean the second day of the work week?
- What do you mean by New York? Do you mean the city or the state? If you mean the city, do you mean Manhattan or the five boroughs?
- What do you mean by new products? Launched within the last 6 weeks or 6 months?
There are reasons why Business Objects went on to become a $1.5B company while NLI was sold to Microsoft for a pittance:
- Natural language is notoriously imprecise
- Devising a simple-to-use interface for specifying precisely what you want seems infinitely superior to all sorts of advanced technology that guesses
- Similarly, creating a semantic layer that defines precise answers to all the “what do you mean” questions seems infinitely superior to more guessing
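To make the semantic-layer point concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the `sales` table, its columns, the term names, and the particular definitions chosen. It is not how Business Objects or any other product implements a semantic layer. The only point is that each “what do you mean” question gets answered once, explicitly, by whoever defines the term, and an undefined term is rejected rather than guessed at.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """One business term pinned to a single, explicit definition."""
    sql: str      # SQL fragment the term expands to (hypothetical columns)
    meaning: str  # the decision that was made, in plain words


# Hypothetical semantic layer: names, columns, and choices are assumptions.
SEMANTIC_LAYER = {
    "sold": Term("SUM(net_revenue)", "net to us, after allowances"),
    "tuesday": Term("day_of_week = 'TUE'", "literally Tuesday"),
    "new_york": Term("state = 'NY'", "the state, not the city"),
    "new_product": Term("days_since_launch <= 180",
                        "launched within the last 180 days"),
}


def build_query(measure: str, filters: list[str],
                group_by: str = "rep_name") -> str:
    """Expand predefined terms into one SQL string; never guess at meaning."""
    unknown = [t for t in [measure, *filters] if t not in SEMANTIC_LAYER]
    if unknown:
        # An undefined term is an error, not an invitation to interpret.
        raise KeyError(f"undefined business terms: {unknown}")
    conditions = " AND ".join(SEMANTIC_LAYER[f].sql for f in filters)
    where = f" WHERE {conditions}" if filters else ""
    return (
        f"SELECT {group_by}, {SEMANTIC_LAYER[measure].sql} AS total "
        f"FROM sales{where} "
        f"GROUP BY {group_by} ORDER BY total DESC LIMIT 1"
    )


if __name__ == "__main__":
    # "Who sold the most new products in New York on Tuesdays?"
    print(build_query("sold", ["new_product", "new_york", "tuesday"]))
```

Asking about “manhattan” here fails loudly with a `KeyError` until someone defines the term, which is the opposite of what a guessing engine does.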
As technologists, we are drawn to interesting questions and whizzy technology that imputes meaning from language. (Heck, I like it, too.) And if you like such technology you can use it in conjunction with Mark Logic (just use it to enrich XML content and add tags).
But the lesson to me is that database-style query beats either grunted keyword or short-phrase natural language search when building enterprise systems.
I won’t predict whether natural language search will beat keyword search on the broad Internet. But I do believe that for enterprise content systems in publishing, government / defense / intelligence, pharma / life sciences, and financial, what’s needed is database-style queries on content, not the ContentTalker.
P.S. I failed to mention that AltSearchEngines.com is also a great place to read articles by Nitin Karandikar of Software Abstractions fame.
Charles Knight
Hi Mark, you are mistaken in thinking that Powerset applies NLP to the search keywords. Actually, that’s not what they do. What they do is apply NLP to the documents to derive a search semantic index. They are not trying to derive semantics from the user’s keywords.
Having said that, I do agree with all your other points: that comparison with Google was unwarranted and will backfire. Google today doesn’t compete on technology anymore. They compete on brand power. They also have strong technology for now, massive scalability and great advertiser relationships.
I also believe that enterprises need much more than a Google-style search engine, since enterprises most often need exact results for specific queries. XML content servers may be the way to go. IBM has something called UIMA (Unstructured Information Management Architecture) for enterprise semantic search. However, it looks at content as a search problem and doesn’t go as far as doing structured XQuery-based search as XML content servers attempt to do.
-Radha
Hey Mark, I’d be tempted to quote Jessica Rabbit and claim “we’re not bad… we’re just drawn that way…” but the truth is that it’s hard to get the press off the idea that anybody who is coming to innovate in search is out to kill Google. Powerset is really trying to do something qualitatively different that breaks with the status quo and demonstrates brand new capabilities in search. If you have any suggestions about how to get the press to not go on wildly with the Google-killer story, let us know.
Lorenzo Thione, Powerset Founder & Product Architect
Hi Mark, give http://geo.anaphoric.com/ a spin and see if you still hold to your opinion about NL phrasal search of databases. Regards.
geo.anaphoric.com … right. I asked “longest lake west of rocky mountains”, got bitched at about “rocky” [so rocky mountains was not recognized as a denotation], and then got the puzzling error message “Sorry there is no record of how long interstates are.” Don’t change your mind because of Anaphoric, Mark Logic.