2007-03-08

XPath as a first-class citizen

The DOM standard packs great power, one of them that of XPath, for slicing and dicing DOM trees (whether originally from XML or HTML input). Compared to the very good integration of similar javascript language features -- E4X in recent javascript implementations, or the RegExp object, with us since Javascript's earliest days, the DOM XPath interface is a very sad story. I will refrain from telling it here, and primarily focus on how it should be redone, to better serve script authoring. My scope in this post is the web browser, so as not to confuse things with, say, ECMAScript, which is a language used in many non-DOM-centric environments too.

First, we might recognize that an XPath expression essentially is to a DOM Node tree what a RegExp is to a string. It's a coarse but valid comparison, and one that should guide how we use XPath in web programming. From here, let's thus copy good design and sketch the outlines of an XPath class that is as good and versatile for the DOM as RegExp is for strings.

Javascript RegExps are first class objects, with methods tied to them that take a data parameter to operate on. XPaths should be too. Instantiation, using the class constructor, thus looks like:
var re = new RegExp("pattern"[, "flags"]);
var xp = new XPath("pattern"[, "flags"]);
The respective patterns already have their grammars defined and are thus left without further commentary here. RegExp flags are limited to combinations of:
"g"
global match
"i"
case insensitive
"m"
multiline match
The XPath flags would map to their XPathResultType counterparts (found on the XPathResult object) for specifying what properties of the resulting node set you are interested in (if you write docs for that horrible API, please copy this enlightening table):
Nodes wantedBehaviourUnOrderedOrdered
MultipleIteratorUNORDERED_NODE_ITERATOR_TYPE=4ORDERED_NODE_ITERATOR_TYPE=5
MultipleSnapshotUNORDERED_NODE_SNAPSHOT_TYPE=6ORDERED_NODE_SNAPSHOT_TYPE=7
SingleSnapshotANY_UNORDERED_NODE_TYPE=8FIRST_ORDERED_NODE_TYPE=9

that are reducible to permutations of whether you want a single or multiple items, want results sorted in document order or don't care, and if you want a snapshot, or something that just lets you iterate through the matches until you perform your first modification to a match. There are really ten options in all, but NUMBER_TYPE=1, STRING_TYPE=2 and BOOLEAN_TYPE=3 were necessitated by a design flaw we shall not repeat, and ANY_TYPE=0 is the automatic pick between one of those or UNORDERED_NODE_ITERATOR_TYPE=4. Let's have none of that.

Copying some more good design, let's make those options a learnable set of three one-letter flags, over the hostile set of ten types, averaging 31.6 characters worth of typing each (or a single digit, completely devoid of semantic memorability). When desigining flags, we get to pick a default flag-less mode and an override. In RegExp the least expensive case is the default, bells and whistles invokable by flag override; we might, for instance, heed the same criteria (or, probably better still, deciding on what constitutes the most useful default behaviour, and naming the flags after the opposite behaviour instead):
"m"
Multiple nodes
"o"
Ordered nodes
"s"
Snapshot nodes
I briefly mentioned a design error we shouldn't repeat. The DOM document.evaluate, apart from having a long name on its own, and further drowning you in mandatory arguments and 30-to-40 character type names, does not yield results you can use right away as part of a javascript expression. Instead it hands you some ravioli, in the form of an XPathResult object, which you may pry the actual results off, by jumping through a few hoops. This is criminally offensively bad interface design, in innumerable ways. Again, let's not go there.

It might be time we decided on calling conventions, so we have some context to anchor up what the results returned are with. Our XPath object (which, contrary to a result set, makes lots of sense sticking into an object, to keep around for doing additional queries with later on without parsing the path again) has an exec() method, as does RegExp, and it takes zero to two arguments.

xp.exec( contextNode, nsResolver );

The first argument is a context node, from which the expression will resolve. If undefined or null, we resolve against document.documentElement. The context node may be anything accepted as a context node by present day document.evaluate, or an E4X object.

The second argument is, if provided as a function, a namespace resolver (of type XPathNSResolver, just as with the DOM API). If we instead provide an object, do a lookup for namespace prefixes from it by indexing out the value from it, as with an associative array. In the interest of collateral damage control, should the user have cluttered up Object.prototype, we might be best off to only pick up namespaces from it whose nsResolver.hasOwnProperty(nsprefix) yields true.

The return value from this method is similar in intended spirit to XPathResult.ANY_TYPE, but without the ravioli. XPaths yielding number, string or boolean output returns a number, string or boolean. And the rest, which return node sets, return a proper javascript Array of the nodes matched. Or, if for some reason an enhanced object be needed, one which inherits from Array, so that all (native or prototype enhanced) Array methods; shift, splice, push and friends, work on this object, too.

Finally, RegExps enjoy a nice, terse, literal syntax. I would argue that XPaths should, as well. My best proposal (as most of US-ASCII is already allocated) is to piggy-back onto the RegExp literal syntax, but mandate the flag "x" to signify being an XPath expression. Further on, as / is a too common character in XPath to even for a moment consider having to quote it, make the contents of the /.../ containment a '-encased string, following all the common string quoting conventions.

var links_xp = /'.//a[@href]'/xmos;
var posts_xp = /'//div[@class="post"]'/xmos;


for instance, for slicing up a local document.links variant for some part of your DOM tree, and for picking up the root nodes of all posts on this blog page respectively. And it probably already shows that the better default is to make those flags the default behaviour, and name the inverse set instead. Perhaps these?
"s"
Single node only
"u"
Unordered nodes
"i"
Iterable nodes

When requesting a single node, you get either the node that matched, or null. Illegal combinations of flags and XPath expressions ideally yield compile time errors. Being able to instantiate new XPath objects off old ones, given new flags, would be another welcome feature. There probably are additional improvements and clarifications to make. Good ideas shape the future.
blog comments powered by Disqus