Monthly Archives: October 2006

The “Content Is Ancillary” Test

I’m going to start applying a test to new prospective customers outside Mark Logic’s core publishing and government markets. In the middle of a conversation with a senior business person I’ll stop and say something like:

“Well, you know that many leading publishers trust their content to MarkLogic. And for them, that’s a major commitment because content *is* their business. So handling your problem, where content is really ancillary to your primary business of [xyz], should be no problem for us.”

I stumbled onto this test more than a year ago, during a corporate visit by JetBlue. I uttered the seemingly innocuous statement: “obviously, flight manuals are ancillary to your primary business of flying airplanes,” only to have my head immediately bitten off, chewed up, and spit out by their then-head of flight operations.

“We fly documents, not planes,” he said. “Without the documents, the planes don’t fly!”

The same thing happened to me the other day when meeting with an information technology VP at a major financial institution. “Well, obviously, the documents we’re discussing are ancillary to your core business of doing transactions,” I said, at which point my tête was again threatened by a similar response.

“Well, it may seem to you that these documents are ancillary, but without these manuals, policies and procedures in place, then there are no transactions to worry about!”

It was, as Yogi Berra so famously said, “déjà vu all over again.”

Having learned from this accidental discovery, henceforth I’m going to make this provocation deliberate. Because if your content is sufficiently important (dare I say “mission critical”) to your operations that your business basically stops when there is a problem with it, then you should probably be looking for a server to hold it that is:

  • Built for content, not data (i.e., not a relational DBMS)
  • High performance
  • Scalable to terabytes and beyond
  • Highly available
  • Used by the companies where content is their business

For many people, stuffing content into square tables might be good enough. But for those who pass the “content is ancillary” provocation, I’d say that you should be looking at special-purpose, content-optimized systems like MarkLogic and not general-purpose DBMSs, like Oracle, DB2, or SQL Server.

Stock Options: It’s A Darn Shame

It’s a darn shame what’s happened to stock options.

When I first started in Silicon Valley in the 1980s, stock options were seen as a real innovation, a creative way to align employees with shareholders, a way to help employees participate in the value they were helping to create. Ever since then, and even now, I have remained a believer in stock options.

Then a number of things went wrong: greed, the Internet bubble, stock option expensing, and now the backdating scandals that are making headlines every day.

To me, greed is the #1 culprit. There’s nothing wrong with stock options, per se. There is something wrong with boards who enable executives to make annual packages in the 10s or 100s of millions of dollars. While I know this sounds a bit like the “guns don’t kill people, people kill people” argument, stock options aren’t the problem; corporate governance is.

Then the Internet bubble exposed the “rising tide lifts all ships” problem with stock options. You could be an average CEO, doing an average job, running your average company, and if Wall Street 5x’ed the stocks in your category, then you made a fortune. That people don’t like this exposes a great irony. On one hand, as shareholders we just want the shares to go up (e.g., “I don’t want excuses, I don’t want to hear about enablers like product quality and customer satisfaction, I want results”). On the other, when the shares go up simply because everyone else’s did too, we don’t like it.

Even if you feel it’s irrational to solve it, there is a simple solution to the rising tide problem: link the option strike price to a stock market index or a market basket of competitors.
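To make the indexing idea concrete, here is a rough sketch in Python. It assumes the simplest possible rule, that the strike price moves by the same factor as the chosen index since the grant date. The function names and the numbers are made up for illustration; real indexed-option plans can be structured quite differently.

```python
def indexed_strike(strike_at_grant, index_at_grant, index_now):
    """Scale the original strike by the index's move since the grant date (assumed rule)."""
    return strike_at_grant * index_now / index_at_grant


def option_gain(share_price, strike):
    """Per-share gain on exercise; an underwater option is simply worth nothing."""
    return max(0.0, share_price - strike)


# Hypothetical numbers: a $10 strike at grant, and the category index then goes up 5x.
strike = 10.0
index_at_grant, index_now = 100.0, 500.0

for share_price in (50.0, 70.0):
    plain = option_gain(share_price, strike)
    indexed = option_gain(
        share_price, indexed_strike(strike, index_at_grant, index_now)
    )
    print(f"share price ${share_price:.0f}: plain gain ${plain:.0f}, indexed gain ${indexed:.0f}")
```

With these numbers, a CEO whose stock merely rode the 5x tide to $50 gains nothing on the indexed option, while one who beat the category and reached $70 is paid only for the $20 of outperformance.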

But that’s not what happened. Because CEOs were greedy — and boards let them be — the absolute numbers got out of control. So rather than fix the actual problem with options, the country decided to throw the stock-options baby out with the rising-tide bathwater: thus was born stock option expensing.

I think stock option expensing is a bad idea. Why?

  • I majored in math at Berkeley and I can barely understand the Black-Scholes model, so I can’t imagine why anyone would want “costs” calculated using it in a P&L statement.
  • A P&L statement should include real, known revenues and costs or simple allocations thereof (e.g., it makes sense to depreciate the cost of a factory that lasts 30 years over its 30-year life). I don’t like including probabilistic costs in a P&L and think it’s inaccurate to do so.
  • Expensing options has the ironic consequence of making stock options less democratic. If you were trying to solve the executive greed problem, you didn’t solve it. The typical corporate response to option expensing has been to continue granting options (and/or restricted stock) to top management and to stop granting options to rank-and-file employees. So the “fix” actually made the concentration problem worse, not better.
  • Eliminating rank-and-file stock options eliminates one of the few ways regular people can have economic class mobility. That leaves winning the lottery, successful stock and real estate speculation, and founding a company as the primary paths to the rags-to-riches part of the American dream.

In fact, the only thing I like about stock option expensing is that it gives a competitive advantage to startups, like Mark Logic. Because startup strike prices tend to be low, the total value of the grants (strike price x shares) tends to be small, thus the total expense tends to be small. (What’s more, few startups are measured on GAAP net income anyway.)
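For the curious, here is a small Python sketch of why a low strike price means a small expense. It uses the textbook Black-Scholes formula for a European call on a non-dividend-paying stock, and it assumes options granted at-the-money (share price equal to strike). The term, interest rate, volatility, and share count are invented inputs, and real option expensing involves more judgment than this, so treat it as an illustration rather than an accounting method.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(price, strike, years, rate, volatility):
    """Textbook Black-Scholes value of a European call (no dividends)."""
    d1 = (math.log(price / strike) + (rate + 0.5 * volatility ** 2) * years) / (
        volatility * math.sqrt(years)
    )
    d2 = d1 - volatility * math.sqrt(years)
    return price * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


# Assumed, made-up inputs: 4-year term, 5% rate, 50% volatility, at-the-money grants.
SHARES = 10_000
startup = SHARES * black_scholes_call(0.50, 0.50, 4.0, 0.05, 0.50)
public_co = SHARES * black_scholes_call(50.0, 50.0, 4.0, 0.05, 0.50)

print(f"startup grant expense:        ${startup:,.0f}")
print(f"public company grant expense: ${public_co:,.0f}")
print(f"ratio: {public_co / startup:.0f}x")
```

For at-the-money grants with otherwise identical inputs, the modeled value scales in proportion to the strike price, so the same number of options at a $0.50 strike costs about one hundredth of what it costs at a $50 strike.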

If stock options didn’t have enough problems, now we have the backdating scandals. These seem to fall into two groups:

  • Sloppiness: companies with poor administration would fail to legally approve options on the date they intended, and thus would either knowingly or accidentally backdate them. (Accidental backdating could happen, for example, when an option was granted by unanimous board consent, which is effective not on the date the consent is sent for signature, but on the date the last consent is signed.)
  • Fraud/greed: some companies apparently set option dates back in time to build more profit into their grants. There is nothing illegal about pre-vesting shares in stock option grants (e.g., you can start out 2/48ths vested) and there is nothing illegal about granting in-the-money options. It is illegal, however, to backdate options before their actual grant dates and to pretend that in-the-money grants (which have immediate tax consequences) were not.

These things are all bad. But let’s not blame stock options themselves. Let’s blame corporate governance, greed, and sloppiness. Stock options are good. What many companies have done with them is bad. Let’s separate the two.

Sequoia Hits Again: YouTube Sells for $1.65B

If you somehow haven’t heard, Google today announced that it will acquire YouTube in a stock transaction worth $1.65B. The official press release is here.

The New York Times has an excellent article on the transaction. I’ve pulled a few excerpts related to YouTube (and Mark Logic) investor Sequoia Capital below.

Still in their 20’s and working from a garage, the two [YouTube founders] secured $3.5 million in capital from Sequoia Capital, the same firm that helped finance Google when it was still a fledgling company, as well as other Silicon Valley stars like Apple, Cisco, Oracle and Yahoo …

Sequoia’s stake in YouTube has been estimated at approximately 30 percent, which means it would be worth about $495 million based on the acquisition price …

Experts say Sequoia’s go-it-alone investment in YouTube represents the kind of aggressive move for which Sequoia is known. A more typical, and safer, approach would have been to bring in other investors, …

Per their corporate boilerplate, YouTube was founded in February 2005, making the company about 20 months old. This means that YouTube has created more than $80M per month in value for its owners during its brief existence. I’m not positive, but I’d have to bet that this is a record rate of equity value creation.

I must say that I’m pleased to have Sequoia as the lead investor in Mark Logic and that they continue to impress me not only with their day-to-day support of our company (e.g., introductions, advice, board participation) but also with their vision (e.g., brainstorming, thought leadership) and also their execution. On execution, three miracle transactions come to mind:

  • A tremendously successful IPO with Google in the middle of an IPO drought (2004)
  • A $1.65B exit with YouTube in the middle of a general exit drought (sauf software consolidation which is mostly a big-eat-big phenomenon today)
  • Doing the same thing with PayPal for ~$1.5B in the bleakest part of the Internet bust (2002)

Visit Sequoia here to learn more. I particularly like the left column of this page, where they explain what they think makes for sustainable companies.

Update (10/12/06): Here’s a link to a Mercury News article that profiles other major Sequoia success stories.

Fingerprinting Content

I recently ran across a cool (sister, Sequoia-backed) company, called Port Authority Technologies, that sells information leak protection solutions. They sell an appliance that sits next to your firewall and looks, at a network level, for content that’s getting sent outside that shouldn’t be. What kind of content might that be? Contracts, marketing plans, draft financial press releases, source code, customer databases, and such.

On discovering them, I first wondered: where were they when HP needed them? (If you’ve not yet heard, it seems that Patricia Dunn and three other people involved in the HP affair are going to be indicted.) It seems clear that monitoring outbound network traffic is a much better way of protecting information than calling phone companies under false pretexts. Then, always in touch with my inner marketer, I dreamed of the PR they could be generating on the back of the HP scandal.

Then I connected the dots on a series of companies using what I call “content fingerprinting” technology for interesting applications. Content fingerprinting is about scanning content, recognizing and remembering it at a deep level, and then re-recognizing it when it’s about to head out the door, even if it has been transformed in some way: broken into network packets for transmission, zipped, deliberately fragmented into pieces, run through a series of global substitutions to mask it, or even (I think) encrypted. See here for more.
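I don’t know how any of these vendors actually implement fingerprinting, but a toy version of the general idea can be sketched with a well-known technique called shingling: hash every overlapping run of k words in the protected document, then check how many of an outbound message’s runs match. The Python below is only that, a sketch. The function names, the choice of k=5, and the sample text are all assumptions, and it survives only case changes, whitespace changes, and fragmentation, not zipping, substitution, or encryption.

```python
import hashlib
import re


def shingles(text, k=5):
    """Return the set of hashes of every overlapping k-word window in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())  # normalize case and punctuation
    return {
        hashlib.sha256(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()
        for i in range(len(words) - k + 1)
    }


def leak_score(protected_fingerprint, outbound_text, k=5):
    """Fraction of the outbound text's shingles that also occur in the protected document."""
    outbound = shingles(outbound_text, k)
    if not outbound:
        return 0.0  # too short to judge
    return len(outbound & protected_fingerprint) / len(outbound)


# Made-up example content.
contract = (
    "The purchaser agrees to pay the seller the sum of ten million dollars "
    "upon closing of the transaction"
)
fingerprint = shingles(contract)  # remember only the hashes, not the text itself

fragment = "AGREES to pay the seller   the sum of ten million dollars"
unrelated = "Lunch is at noon today in the main cafeteria as usual"

print(leak_score(fingerprint, fragment))   # 1.0: every window matches
print(leak_score(fingerprint, unrelated))  # 0.0: nothing matches
```

Note that the appliance in this sketch never has to store the sensitive text, only its hashes, and that a fragment is caught even though it is a small piece of the original with different capitalization and spacing.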

It seems the most popular applications of this technology are intellectual property protection, leak prevention, and compliance. Companies focused on these applications include:

For more background on these types of applications, check out this article.

I know another company, using similar fingerprinting technology, for a very different application. Palamida effectively sells an open-source detector to software companies. So instead of using the technology to first crawl sensitive corporate documents and then sniff for them on the way out, Palamida themselves go crawl every open source repository they can find. Then, they call up the VP of Engineering — which is particularly fun during a proposed acquisition — and say: “are you sure that none of your engineers have ever incorporated a bit of open source code that they shouldn’t have … and put it in your product, and potentially per the GPL, therefore turned your entire product into open source?”

Content fingerprinting technology also seems to be a great way for publishers to crawl the Internet and look for scraped, stolen, or otherwise misappropriated content. I’m sure publishers do some of this today, but my guess is they are using more basic methods, which means they could be missing a lot.

Speaking of publishers whose content has been misappropriated or scraped for someone else’s profit, look at what these guys did with my soccer post. That’s this post, scraped, with some irrelevant tags added, with ads alongside that are generating revenue for someone else. I’m not sure if it’s illegal, but it’s certainly not my intent to have someone else scrape my posts and effectively sell them.

Deconstructing Databases

Here’s a post where O’Reilly’s Dale Dougherty writes about a talk at EuroOSCON by Greg Stein, of Google Code, where Stein talks about building a bug-tracking system.

In describing the new bug tracking system, he said, that while he liked many existing bug systems, he realized there was an opportunity to redesign a new, much simpler bug tracking system for Google Code. The key he said was understanding that they had great full-text search tools available. That made them think differently about how to collect and organize the information in the bug “database.”

He believed that existing systems spent too much time deciding how to structure data entry and presenting a detailed form for users to fill out. They also then lock down the display of the information. He decided to keep structured data entry to a minimum and rely on text entry. A lot happens with labels/tags/keywords, for instance, to assign priority. The new bug submission form consisted of a text area with a few questions already inside it.

Traditional applications have been built on a traditional database view. That view requires that everything be decomposed into “square tables.” When you do this you invariably end up making lots of fine-grained fields into which information should be placed (e.g., bug number, short description, full description, assigned-to, fixed-in, related-to, error-number, severity, impact, etc.)

As it turns out, Mark Logic built its own bug tracking system, bugtrack, based on a special-purpose DBMS platform that has rich full-text searching capabilities (i.e., MarkLogic Server). And, given that assumption for the underlying platform, we did something similar to Google. Per Ian Small, our VP of products:

Yes, we did close to the same thing [as Google described]. We minimized the structured data (I think there’s less than 5 required fields to enter a bug) and [instead] provided 5 big loosely structured buckets into which to dump full-text information. A few more structured fields get populated through the workflow process (e.g., bug assignment, bug scheduling, bug status)
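To give a feel for the shape of such a design, here is a minimal Python sketch: a bug record with a handful of structured fields, free-form labels, and loosely structured text buckets, plus a naive search across all of it. This is not how bugtrack or Google’s tracker is implemented and it does not use MarkLogic Server. The field names, bucket names, and the substring-matching search are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Bug:
    bug_id: int
    status: str = "new"                         # one of the few structured fields
    labels: set = field(default_factory=set)    # e.g., priority carried as a tag
    text: dict = field(default_factory=dict)    # bucket name -> free text


def search(bugs, query):
    """Return ids of bugs whose text buckets or labels contain every query term."""
    terms = query.lower().split()
    hits = []
    for bug in bugs:
        haystack = " ".join(list(bug.text.values()) + sorted(bug.labels)).lower()
        if all(term in haystack for term in terms):
            hits.append(bug.bug_id)
    return hits


# Made-up sample data.
bugs = [
    Bug(1, labels={"priority-high"}, text={
        "description": "Server crash when loading a large document",
        "steps": "Load a 2 GB file and run a search",
    }),
    Bug(2, labels={"priority-low"}, text={
        "description": "Typo in the admin page footer",
    }),
]

print(search(bugs, "crash priority-high"))  # [1]
print(search(bugs, "typo"))                 # [2]
```

The point of the sketch is what is missing: there is no severity column, no impact column, no related-to table. Anything that would have been a fine-grained field is either a label or simply text that search can find.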

It’s interesting to see how application development itself changes when you change the underlying platform assumption. In my mind, you end up with more powerful, more flexible applications in so doing. See the comments on Dale’s post for more discussion of this topic.