Earlier this week a new Internet search engine with an oh-so-hip name, Cuil (pronounced “cool”), launched to great hype about ex-Googlers taking on their former employer with a 121,617,892,992 page index (that’s 121B if you’re not good at counting digits), supposedly making Cuil “the world’s biggest search engine.” Excerpt from their info page:
Rather than rely on superficial popularity metrics, Cuil searches for and ranks pages based on their content and relevance. When we find a page with your keywords, we stay on that page and analyze the rest of its content, its concepts, their inter-relationships and the page’s coherency.
PageRank a “superficial popularity metric”? There’s a clear roundhouse thrown at Google if I’ve ever seen one. I’m sorry, but didn’t PageRank crush keyword frequency in Internet search because the latter was too easily gamed by spammers? At first blush they sound like they’re going back to the past (which I doubt), but they’re not clearly backing their arguments, either. I’m fine with throwing punches at Google, but the punches better connect. This explanation in their FAQ doesn’t connect either:
So we started from scratch—with a fresh approach, an entirely new architecture and breakthrough algorithms […] our approach is to focus on the content of a page and then present a set of results that has both depth and breadth […]
So Cuil searches the Web for pages with your keywords and then we analyze the rest of the text on those pages. This tells us that the same word has several different meanings in different contexts. Are you looking for jaguar the cat, the car or the operating system?
We sort out all those different contexts so that you don’t have to waste time rephrasing your query when you get the wrong result.
The early PR seems to suggest that Cuil has been busted for over-reaching in their claims. See this Wall Street Journal blog as just one example. For more fun, read the comments, which take no prisoners. Edited excerpts:
The last one reminded me of a comment I heard from a disgruntled Autonomy user the other day that went something like: “I guess I’m not smart enough to understand why the Bayesian relevancy algorithms failed to get the right result; all I know is they didn’t.”
When will search vendors stop peddling PhDs and algorithms, seemingly all the while ignoring results? Particularly in the world of Internet search where any clown (e.g., me) can go to a site, enter a few queries (almost always including a vanity search) and get an opinion of whether “it works” in seconds?
This whole episode reminds me of the Powerset launch, where I was also critical, arguing that calling yourself the next Google was almost a guarantee that you wouldn’t be. I nailed that prediction, but never had time to rant about it, so I’ll do so here.
After all the next-Google hype, when Powerset finally launched they could not search the entire Internet (as one might have reasonably expected) or even the entire English language Internet (as one could have very reasonably conceded given all the natural language processing) but instead a rather small content set called Wikipedia.
You raise $20M in total capital, you call yourself the next Google, you generate more press than Miley Cyrus, and when you launch you can only search Wikipedia? Are you kidding me? By the way, have you *ever* met anyone who’s complained that they can’t find information on Wikipedia!?
Fortunately (for them) Microsoft bought the company a few months later for an estimated $100M, certainly yielding a nice return on the $12.5M in VC invested and a nice windfall for the founders. Not a bad outcome, mind you. But, the next Google? Pluh-ease. See here for WebGuild’s take.
Anyway, I tried Cuil today and, like many others, was disappointed. I liked the Spartan search screen. I was mildly disappointed with the results of my vanity search; I prefer Google’s result because, among other reasons, I beat the realtor in Colorado. I liked Cuil’s multi-column presentation. I also liked the categories to refine searches, though I found them hard to find. At first blush, I liked the eye-candy, too, which reminded me of a toned-down version of SearchMe.
In fact, the whole thing reminded me of a weak version of SearchMe. Now I can’t remember if SearchMe has its own index or whether it’s adding value above an underlying Google search, but frankly, I don’t care. As a user of the site, all I care about is the user experience and the results. Results-wise, I like SearchMe better. User experience-wise, I like SearchMe much, much better.
In fact, my single biggest complaint about Cuil is the eye-candy. While SearchMe renders a very cool iTunes-like rolodex of each returned webpage, Cuil renders a bit of seemingly random eye-candy, presumably using a whizzy algorithm to find the “best” image that, not to put too fine a point on it, doesn’t work.
For example, when you run the query “Mark Logic” on Cuil, the eye-candy includes:
- Two reversed company logos
- Four regular company logos
- An image from a documentation newsletter
- A Wipro logo (one of our partners)
- An image from one of our smaller marketing programs (asking if you’re missing the DITA bus?)
- A photo of Stephen Buxton, our director of product management
- A photo of Jason Hunter, principal technologist and creator of MarkMail
So the collection of eye-candy is both rather boring (i.e., repetitive) and random. Simply put, if you ran the query and did a quick skim of the results, you’d think that Stephen or Jason ran Mark Logic.
More interestingly, it’s not at all obvious how they’re assembling the eye-candy. When you visit the pages associated with the displayed images, the images aren’t there. Hence, my speculation that they have a whizzy algorithm that finds eye-candy on or near the referenced page, but that nevertheless, uh, doesn’t work.
Hence the four reasons I like SearchMe better:
- The SearchMe UI is unquestionably cooler than Cuil’s
- I find the SearchMe results better than Cuil’s
- Both SearchMe and Cuil have categories for search refinement
- And SearchMe doesn’t use “magic” in assembling the eye candy so it just works better
Note that I’m generally distrustful of magic (see Uh Oh, It’s Magic) and greatly prefer SearchMe’s straightforward approach of just rendering the referenced page as opposed to trying to whiz-up some potentially relevant JPEG.