
Xparse-J Update 1.1


Xparse-J grew out of the need for the smallest possible XML parser to plug into the RSSViewerApplet. While the parser seems to work well in the parsing phase, accessing the parsed content afterwards is not as comfortable as it could be.

The XML document is parsed into a tree of Nodes which can be navigated using the standard elementAt(int position) method of JSArray. Since this navigation is fairly awkward, Node provides a find() method to locate nodes at a certain position in the XML tree.

Until now, Node.find() was not very generic: it only worked properly for the scenario needed in the RSSViewerApplet, which was to extract an RSS channel's title, description and items. As more people started using Xparse-J for other projects, this shortcoming became increasingly obvious.

While a full DOM and XPath implementation is beyond the scope (and size) of a small XML parser, a limited version of XML path specifications should be provided for more conveniently locating nodes in the XML document tree.

So far, Node.find() took two arguments: a path string and a single int occurrence, as in Node.find("/item/title", 2).

The problem here is that an occurrence parameter would need to be added to every element of the path in order to be generic and unambiguous: Does Node.find("/item/title", 2) denote the first title of the second item, or the second title of the first item? In the case of RSS the answer is obvious, as there should be only one title per item, but in arbitrary XML documents this is not so clear.

The desired functionality is something equivalent to the XPath expression "/item[x]/title[y]", where x and y specify the respective occurrence for each element of the path expression. While implementing this syntax is certainly feasible (what isn't in software?), the parsing code for these expressions would once again add unwelcome weight to the parser.

The compromise was to change the occurrence parameter from int to int[], that is, from a single integer to an array of integers with one entry per path element.

This disambiguates the aforementioned example by making it possible to distinguish Node.find("/item/title", {1, 2}) from Node.find("/item/title", {2, 1}). The former tries to find the second title element in the first item (which should not exist in RSS) whereas the latter correctly finds the title of the second item.
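The lookup described above might be sketched roughly as follows. This is not the Xparse-J source: SketchNode, its fields, and the add() helper are hypothetical stand-ins, and the 1-based counting is an assumption taken from the examples in the text. Note also that in actual Java source the array argument has to be written as new int[] {2, 1}; the {2, 1} form is shorthand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Xparse-J's Node; names and fields are assumptions.
class SketchNode {
    final String name;
    final String text;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name, String text) {
        this.name = name;
        this.text = text;
    }

    // Appends a child and returns this node so calls can be chained.
    SketchNode add(SketchNode child) {
        children.add(child);
        return this;
    }

    // Walks the path one element at a time. occurrences[i] is the 1-based
    // occurrence wanted for the i-th path element (assumed convention).
    // Returns null if any step cannot be matched or an occurrence is missing.
    SketchNode find(String path, int[] occurrences) {
        SketchNode current = this;
        int level = 0;
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the empty piece produced by a leading "/"
            }
            if (level >= occurrences.length) {
                return null;
            }
            int wanted = occurrences[level++];
            int seen = 0;
            SketchNode match = null;
            for (SketchNode child : current.children) {
                if (child.name.equals(part) && ++seen == wanted) {
                    match = child;
                    break;
                }
            }
            if (match == null) {
                return null;
            }
            current = match;
        }
        return current;
    }
}

class FindSketchDemo {
    public static void main(String[] args) {
        SketchNode channel = new SketchNode("channel", "");
        channel.add(new SketchNode("item", "").add(new SketchNode("title", "First")));
        channel.add(new SketchNode("item", "").add(new SketchNode("title", "Second")));

        // Title of the second item.
        SketchNode second = channel.find("/item/title", new int[] {2, 1});
        System.out.println(second.text);

        // Second title of the first item, which does not exist here.
        SketchNode missing = channel.find("/item/title", new int[] {1, 2});
        System.out.println(missing == null);
    }
}
```

In this sketch a path element without a corresponding array entry simply fails the lookup; defaulting to the first occurrence would be an equally plausible choice.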

Note that Node.find("/item/title", {2, 1}) is equivalent to the XPath-style expression "/item[2]/title[1]".

Keeping the occurrences separate from the path string simply saves the extra parsing code that would otherwise be needed to pull them apart again.
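To give a feel for the parsing code this avoids, here is a rough sketch of what separating a bracketed expression into a plain path and an occurrence array could look like. IndexedPath is a hypothetical helper, not part of Xparse-J, and treating a step without brackets as the first occurrence is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Xparse-J: splits "/item[2]/title[1]"
// into the plain path "/item/title" and the occurrences {2, 1}.
class IndexedPath {
    final String path;
    final int[] occurrences;

    IndexedPath(String path, int[] occurrences) {
        this.path = path;
        this.occurrences = occurrences;
    }

    static IndexedPath parse(String expression) {
        StringBuilder plain = new StringBuilder();
        List<Integer> found = new ArrayList<>();
        for (String step : expression.split("/")) {
            if (step.isEmpty()) {
                continue; // skip the empty piece produced by a leading "/"
            }
            int open = step.indexOf('[');
            int close = step.indexOf(']');
            if (open < 0) {
                // Assumption: a step without brackets means the first occurrence.
                plain.append('/').append(step);
                found.add(1);
            } else {
                if (close < open) {
                    throw new IllegalArgumentException("Malformed step: " + step);
                }
                plain.append('/').append(step, 0, open);
                found.add(Integer.parseInt(step.substring(open + 1, close)));
            }
        }
        int[] occurrences = new int[found.size()];
        for (int i = 0; i < occurrences.length; i++) {
            occurrences[i] = found.get(i);
        }
        return new IndexedPath(plain.toString(), occurrences);
    }

    public static void main(String[] args) {
        IndexedPath parsed = IndexedPath.parse("/item[2]/title[1]");
        System.out.println(parsed.path);
        System.out.println(java.util.Arrays.toString(parsed.occurrences));
    }
}
```

Even this small helper needs bracket scanning, number parsing and error handling, none of which the parser has to carry when the caller passes the int[] directly.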

Let's look at how the code was changed.

Produced by Michael Claßen

Created: Aug 01, 2001
Revised: Aug 01, 2001