
The Publishing [R]evolution

I’m posting the slides that Darin McBeath from Elsevier presented at the XML Holland conference a few months back. I’m sorry about the delay, but I wanted to be sure it was OK with Darin and the process got stuck on my back burner.

In addition to an all-around great speech, Darin introduced two concepts that I liked a lot.

  • Fewer moving parts
  • Find the ringtones

“Fewer moving parts” was Darin’s metaphor for simplicity in building pure XML-based systems (with XML content and XQuery as the query / programming language). It’s always hard to argue the business benefits of simplicity without doing detailed costing analysis. I thought it was creative of Darin to use this metaphor to drive the point home. We know jet engines are safer than piston engines because they are simpler and have fewer parts. The same could be said of Nokia vs. Motorola phones. Fewer parts works. When you build content applications on XML content with XQuery and an XML content server, you have fewer moving parts. No Java layer. No relational mappings. See this post, The Virtues of Top-to-Bottom XML, for more.

“Find the ringtones” was another cool Darin idea. As you probably know, ringtones are a multi-billion dollar business. The amazing thing about ringtones is that you can charge $3.00 for fifteen seconds of a song which in its three-minute entirety would sell for $0.99. Less really can be more. Darin’s challenge to publishers was to “find the ringtones” in their content. Where, in different sections of the publishing business, can you deliver higher value and increased revenue — by offering less? That’s a cool question. And in an increasingly information-overloaded world, a smart one.

In the better late than never department, here are Darin’s slides.

Microsoft 8-K Filed in Latest XBRL

See this IDG News Service story that reports on Microsoft filing its latest Form 8-K using the latest and greatest version of XBRL (eXtensible Business Reporting Language), an emerging XML-based standard to define and exchange business and financial performance information.

Excerpts:

Microsoft said it’s the first company to submit data using a new XBRL taxonomy released on Wednesday that allows the description of data according to U.S. Generally Accepted Accounting Principles (GAAP). The taxonomy defines, for example, what tags should be used to label data such as “net profit.”

The advantage of XBRL is that it is machine-readable, and computers can use the tags to pull out comparable data from different companies from their filings.

Microsoft is one of about three dozen companies participating in a one-year pilot program to submit reports in XBRL, according to the SEC. The SEC has run a voluntary XBRL filing program since 2005.
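
To make the machine-readable point in the excerpt concrete, here is a minimal sketch of pulling the same tagged value out of two filings. Everything in it is an assumption for illustration: the namespace, the element names (`NetProfit`, `Revenue`), the company labels, and the figures are made up and are not taken from the actual US GAAP taxonomy or from any real filing.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names, for illustration only.
# They are NOT the real US GAAP taxonomy tags.
NS = {"ex": "http://example.com/hypothetical-taxonomy"}

# Two made-up filings that happen to use the same tag for net profit.
FILING_A = """
<xbrl xmlns:ex="http://example.com/hypothetical-taxonomy">
  <ex:Revenue contextRef="period1" unitRef="USD">5000</ex:Revenue>
  <ex:NetProfit contextRef="period1" unitRef="USD">1200</ex:NetProfit>
</xbrl>
""".strip()

FILING_B = """
<xbrl xmlns:ex="http://example.com/hypothetical-taxonomy">
  <ex:NetProfit contextRef="period1" unitRef="USD">800</ex:NetProfit>
  <ex:Revenue contextRef="period1" unitRef="USD">4000</ex:Revenue>
</xbrl>
""".strip()


def net_profit(filing_xml):
    """Return the value carried by the (hypothetical) NetProfit tag."""
    root = ET.fromstring(filing_xml)
    element = root.find("ex:NetProfit", NS)
    return int(element.text)


# Because both filings share one tag, the values line up without any
# knowledge of where the number sits in each document.
comparison = {
    "Company A": net_profit(FILING_A),
    "Company B": net_profit(FILING_B),
}
print(comparison)
```

The lookup depends on the shared tag rather than on document layout, which is the property a common taxonomy is meant to provide.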

If you’ve never seen an example of XBRL, go here to see the complete filing in an XML marked-up text file. Go here to see the XBRL instance file itself.

When I look at the files, I get two impressions:

  • Wow, think of the powerful queries you’ll be able to do with all this mark-up. XML really database-izes content.
  • Wow, is XML verbose! I am so glad we are experts in compression at MarkLogic. When dealing with XML, you need to be.
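
The verbosity point can be sketched with a general-purpose compressor. This uses Python's standard `zlib` module on a made-up, highly repetitive fragment; it says nothing about how MarkLogic itself compresses content, and repeating one identical record exaggerates the savings compared with a real filing.

```python
import zlib

# A made-up tagged record (hypothetical element name), repeated to mimic
# the repetitive markup of a large filing.
record = '<ex:NetProfit contextRef="period1" unitRef="USD">1200</ex:NetProfit>\n'
document = ("<xbrl>\n" + record * 1000 + "</xbrl>\n").encode("utf-8")

# Compress, then decompress to confirm nothing was lost.
compressed = zlib.compress(document)
restored = zlib.decompress(compressed)

ratio = len(compressed) / len(document)
print(f"original: {len(document)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"ratio: {ratio:.3f}")
```

Repeated tag names and attribute names are exactly the kind of redundancy that dictionary-based compression handles well, which is why verbose markup tends to shrink substantially.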

Celebrating XML Independence

Today, I’d like to highlight a (4th of July holiday) post on Matt Turner’s Discovering XQuery blog. Matt’s post refers to this article, entitled XQuery: The Server Language, on XML.com, written by Kurt Cagle.

I’d read Kurt’s article when it was posted on June 6 and had meant to blog on it, but didn’t get around to it (or frankly, much blogging at all) during the busy month of June. Nevertheless, here are a few chunky morsels from Kurt’s article:

As an XML developer, one of the problems that I come across almost invariably within these [server-side scripting] languages is the fact that they are shaped by people who view XML as something of an afterthought, a small subset of the overall language that’s intended to satisfy those strange people who think in angle brackets.

He then shows an example (that warmed Matt Turner’s heart) of how often people have to create HTML by composing strings in-line. More morsels:

The original intent of the developers of XQuery was to use it, not surprisingly, as an XML-oriented query language. XQuery is not itself XML based (nor for that matter is XPath), but all of its operations are designed to work with XML documents or XML databases to provide a way of filtering or manipulating that XML to produce some form of output, most typically as XML or HTML.

Intriguingly, as a filter on XML, XQuery has seen only limited success. Part of this has to do with the fact that a significant number of the databases currently in use are SQL based, not XML based, so the benefits to be gained by using an XML query filter are offset by the need to convert relational data into XML in the first place.

While I’d agree with Kurt thus far on the market adoption of XQuery and the hassle introduced by having to map XML to an RDBMS (see this post on Top-to-Bottom XML Apps), we at Mark Logic like to think of ourselves as the exception to the slow XQuery adoption rule. While XQuery is not a huge wind at our back, we have been able to grow the company eight-fold since I joined in 3Q04 and that growth is most definitely helped by the de-risking that comes with XQuery by virtue of it being both an industry standard and an eventual, inexorable replacement for SQL.

(If green is the new black, then XQuery is the new SQL, and SQL the new COBOL.)

Kurt concludes his article with:

This article serves as a very basic introduction to XQuery as a server language. I will be addressing this topic in more detail in subsequent articles in this series, examining some of the more sophisticated capabilities and the gotchas inherent in working with XQuery and eXist, and showing what explosive power you can release when you combine eXist or other rest based XQuery engines with XForms and Ajax.

My prediction is that REST based XML databases like eXist will seriously challenge the existing raft of server languages, from ASP to Ruby, within the next couple of years. Right now, it’s something of a closed secret among a few developers, but the power, sophistication and ease of use inherent in working with the XML as if it were a natural part of the server landscape can only be understood by trying it.

I couldn’t agree more with the bolded statement and we all look forward to seeing the subsequent articles in the series.

Web Applications: The Virtues of Top-to-Bottom XML

I think that most people now correctly perceive our product, MarkLogic Server, as an XML content server, a special-purpose DBMS designed specifically for handling XML marked-up content. That’s the good news.

The better news is that many of these same people are figuring out what that means when it comes to developing web applications – specifically, that you can use an XML content server to build web applications using XML top-to-bottom. No Java required. No relational tables required. No application server required. (And no expense for all those supporting products.)

Don’t get me wrong. Many customers choose to use MarkLogic as the XML repository and query system in their architecture, building their applications in Java, using an application server, and making calls out to MarkLogic to process XML queries. Lots of people use the product in that way. That’s fine.

But, people soon realize, when you have a DBMS and query language (XQuery) that directly outputs XML (e.g., xHTML) which can be directly rendered by a browser, and when that “query” language is really a misnamed and underpositioned programming language easily capable of developing entire applications, you can say:

“Wait a minute. My content’s in XML. My browser speaks XML. Why not build my whole app top-to-bottom in XML and XQuery?”

Good question. And the answer is you can. And in many cases, you probably should. What are the advantages of doing so?

  • Use of a high-level, standard, powerful programming language, XQuery. High-level and powerful translate to greater development and maintenance productivity. Standard translates to risk reduction and freedom of choice. (Aside: While XQuery is not a big-hype, overnight-success type of technology like Ajax, XQuery continues to march along with a certain inevitability. In my mind, there is no question that XQuery will be the database programming language of the future – it is superior to SQL, it is more general than SQL and ergo applicable to a broader class of problems, and all major DBMS vendors are already committed to it. The question is not whether XQuery will become mainstream, but when.)
  • Elimination of three impedance mismatches: Java/XML, XML/relational, and Java/relational. Java is object-oriented, XML is hierarchical, and relational databases are tabular. The mapping between these three different data models generates a lot of zero-value-added work in developing an application. When you’re XML top-to-bottom, poof, that work’s all gone.
  • Elimination of tiers. I had lunch a while back with a top engineer at Oracle who told me that he believed the limiting factor on database application performance was becoming scheduling. That is, hardware and databases were becoming so fast that scheduling work across tiers was becoming the limiting factor in performance. His suggested solution? Eliminate tiers. Well, top-to-bottom XML does exactly that.
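
The XML/relational mismatch in the second bullet can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular product works: the article document, the `section` and `para` table layout, and all names are hypothetical, and Python's standard `xml.etree.ElementTree` and `sqlite3` modules stand in for an XML query engine and an RDBMS. It covers only one of the three mismatches.

```python
import sqlite3
import xml.etree.ElementTree as ET

# A small, made-up hierarchical document.
ARTICLE = """
<article id="a1">
  <title>Fewer Moving Parts</title>
  <section>
    <heading>Why</heading>
    <para>Simplicity pays.</para>
    <para>Fewer parts break less.</para>
  </section>
</article>
""".strip()

root = ET.fromstring(ARTICLE)

# Route 1: ask the XML directly with a path expression.
direct = [para.text for para in root.findall("./section/para")]

# Route 2: shred the hierarchy into tables, then join it back together.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE section (id INTEGER PRIMARY KEY, article_id TEXT, heading TEXT)"
)
db.execute("CREATE TABLE para (section_id INTEGER, position INTEGER, body TEXT)")

for section_id, section in enumerate(root.findall("section"), start=1):
    db.execute(
        "INSERT INTO section VALUES (?, ?, ?)",
        (section_id, root.get("id"), section.findtext("heading")),
    )
    # Document order is implicit in XML but must be stored explicitly here.
    for position, para in enumerate(section.findall("para")):
        db.execute(
            "INSERT INTO para VALUES (?, ?, ?)",
            (section_id, position, para.text),
        )

rows = db.execute(
    "SELECT para.body FROM para "
    "JOIN section ON section.id = para.section_id "
    "WHERE section.article_id = ? "
    "ORDER BY para.section_id, para.position",
    ("a1",),
).fetchall()
via_tables = [body for (body,) in rows]
db.close()

print(direct)
print(via_tables)
```

Both routes return the same paragraphs. The second needs a schema, a loading loop, explicit ordering columns, and a join to recover what the path expression reads straight off the document, and that extra code is the mapping work the bullet refers to.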

Boulevard of Broken XML Dreams

“I walk a lonely road, the only road that I have ever known,
I don’t know where it goes, but it’s only me and I walk alone.”

- Green Day, Boulevard of Broken Dreams

I’ve been traveling a bit of late, talking to customers, and a theme that I mentioned in this post, Pushing 1994 Hot Buttons, keeps coming up. The theme is that while the XML grand vision has been around for nearly ten years, the tools to fully implement it are quite new. As a result, I often find myself describing our target market as broken-hearted visionaries.

What was that vision again?

“A central XML repository for all content. Presentation- and product-neutral. Content repurposed seamlessly into multiple products. The same content delivered painlessly to multiple channels, like print, web pages, RSS feeds, PDAs, and sometimes even voice. The ability to find small bits of content and re-assemble them to make new products — recombinant content, if you will. Easy transformation and integration of content from multiple sources. Custom publishing systems. Print on demand. Single-source publishing.”

You can hear the excitement building as they describe the vision.

“Sure, well, we wanted to do that.” The vision deflates. “We thought we could do that. We hoped we could do that. But we couldn’t. I suppose we have to live in the real world.”

Broken dreams.

Here’s the thing. In the past few years, the technology has caught up with the vision. Now, you can implement these systems; we do it at Mark Logic every day. Don’t let what was not possible yesterday using a franken-combination of databases and search engines limit what you try to do today.

Simply put: XML + XQuery + XML content servers (like MarkLogic) now deliver on the XML grand vision. Dust off your inner visionary and see!

Pushing 1994 Hot Buttons

I was on a sales call in the UK recently, where we had a telling interchange with a prospective customer (PC). It went something like this.

PC: “So tell me what problems other publishers are solving with MarkLogic?”

ML: We described how our customers repurpose content, build custom publishing systems, integrate content, deliver content through multiple channels, and perform powerful search and discovery.

PC: “Well, you’ve pushed all my hot buttons …”

ML: “Super”

PC: “… as of 1994.”

ML: “Ugh.”

PC: “But, then again, I suppose you can actually do these things now.”

ML: “Indeed.”

Is there anything wrong with this interchange? In my mind, no. So often in technology, vision gets way ahead of the ability to implement it. Smalltalk had a virtual machine for a decade; it took Java to make it ubiquitous. AI and data mining technologies have existed for years; it took decades for data warehouses of clean data to make them useful.

Yes, the XML vision has been around for 10 years. But only now can customers actually do many of the things with their XML content that they envisioned years ago when they set out on their XML journey.

I often say that we sell to disenchanted visionaries: people who set out on an XML vision years ago, people who made a big investment creating XML markup only to find that the tools (e.g., ECM, RDBMS) weren’t there to deliver on their vision. No wide receiver ran out under the Hail Mary pass.

Then along comes MarkLogic and we do precisely that. It’s one reason our User Conferences are so incredibly positive in tone. Put in a phrase: we let you do what you thought you could do when you set out to use XML in the first place.

(Don’t worry, this won’t replace “Unlock Content” as our official tagline any time soon.)