WebRef Update: Featured Article: X(ML) Marks the Spot | WebReference

X(ML) Marks the Spot

Never before has an Internet technology received such rave reviews ahead of thorough implementation and testing as the eXtensible Markup Language (XML). XML is definitely the latest buzzword in the developer community. However, you must crawl before you can walk. The idea of creating an XML-based e-commerce solution is exciting, but what about a simple XML page that contains standard Web media elements such as graphics or Flash files?

XML promises to give us intelligent document structures, object-oriented document manipulation, synchronized media and a whole lot more. But what is XML exactly, and why has it created such a stir? This article is for those developers who are looking for a hands-on explanation of XML basics.

What is XML?

The eXtensible Markup Language is a simplified subset of the Standard Generalized Markup Language (SGML), while HTML is a single markup language defined using SGML. SGML is a very powerful technology that can be viewed as the parent of many markup languages, including HTML and XML. With XML, it is possible to create new languages such as the Wireless Markup Language (WML) of the Wireless Application Protocol (WAP), which makes communication and transactions between a mobile phone and a Web server possible.

Of all the aspects of XML, the following is probably the most important: the standards are still settling. XML 1.0 itself became an official W3C Recommendation in February 1998, but companion technologies such as the Extensible Stylesheet Language (XSL) have spent much longer in draft. Many of the XML features in Explorer 5.0 are based on W3C working drafts rather than final specifications, so details may still change. Netscape has probably made the wise decision to wait with releasing its XML-compliant version 5 browser until the official specs have been determined.

Enough background information. Let's get into the real deal. The big difference between XML and HTML is the following: an HTML document mixes three different ingredients in one file. The first is the text itself (e.g. "Welcome to my homepage"). The second is the document structure, such as tables and line breaks. The third is the visual markup, such as bold text, italic text, graphics and other visual elements.

An XML document, however, can actually consist of two or three different pages. Because seeing is believing, I've included a short example below.

1. The first page is the actual XML information you wish to display. In first generation XML sites, this information will probably be text contained in the page called "whatever.xml". This page doesn't have any structure such as a table or visual markup (bold, italic or color).

Whatever.xml looks like:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="whatever.xsl"?>
<people>
  <friend>
    <name>Lee</name>
    <address>25 Malvern street</address>
    <telephone>123 456 789</telephone>
  </friend>
  <friend>
    <name>Susanna</name>
    <address>11 Durban road</address>
    <telephone>987 654 231</telephone>
  </friend>
</people>
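
To see that whatever.xml carries nothing but data, it helps to read it with a parser. What follows is only a minimal sketch, assuming Python and its standard xml.etree.ElementTree module, neither of which is part of the original article. The helper name read_friends is made up for illustration, and the stylesheet line is left out because only the data matters here.

```python
import xml.etree.ElementTree as ET

# The same data as whatever.xml, without the stylesheet instruction.
WHATEVER_XML = """<?xml version="1.0"?>
<people>
  <friend>
    <name>Lee</name>
    <address>25 Malvern street</address>
    <telephone>123 456 789</telephone>
  </friend>
  <friend>
    <name>Susanna</name>
    <address>11 Durban road</address>
    <telephone>987 654 231</telephone>
  </friend>
</people>"""


def read_friends(xml_text):
    """Return one dict per <friend>, keyed by child element name (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return [{child.tag: child.text for child in friend}
            for friend in root.findall("friend")]


friends = read_friends(WHATEVER_XML)
for friend in friends:
    print(friend["name"], "|", friend["address"], "|", friend["telephone"])
```

Nothing in the parsed result says how a name or an address should look on screen. That decision is left entirely to the stylesheet.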

2. The second page is written in the Extensible Stylesheet Language (whatever.xsl). This page contains HTML plus XSL "tags" that take the data out of whatever.xml and place it into that HTML. The XSL document holds the markup, such as <body>, <table> and <font>.

Whatever.xsl looks like:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    <HTML>
      <head><title>XML Developer</title></head>
      <body>
        <table border="1" cellpadding="3" cellspacing="3">
          <xsl:for-each select="people/friend">
            <tr>
              <td><b>Name:</b><br/></td>
              <td><xsl:value-of select="name"/></td>
            </tr>
            <tr>
              <td>Address:<br/></td>
              <td><i><xsl:value-of select="address"/></i></td>
            </tr>
            <tr>
              <td>Telephone:<br/></td>
              <td><i><xsl:value-of select="telephone"/></i><br/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
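
What the stylesheet does can be imitated by hand: loop over every friend, pull out three values, and drop them into table rows. The sketch below does exactly that in Python. It is an illustration of the idea only, not an XSLT processor (Python's standard library does not include one), and the function name render_table is an assumption made for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

PEOPLE_XML = """<people>
  <friend>
    <name>Lee</name>
    <address>25 Malvern street</address>
    <telephone>123 456 789</telephone>
  </friend>
  <friend>
    <name>Susanna</name>
    <address>11 Durban road</address>
    <telephone>987 654 231</telephone>
  </friend>
</people>"""


def render_table(xml_text):
    """Build an HTML table like the one whatever.xsl describes (hand-written, not XSLT)."""
    root = ET.fromstring(xml_text)  # root is the <people> element
    rows = []
    # Plays the role of <xsl:for-each select="people/friend">
    for friend in root.findall("friend"):
        # Each findtext call plays the role of <xsl:value-of select="..."/>
        name = escape(friend.findtext("name", default=""))
        address = escape(friend.findtext("address", default=""))
        telephone = escape(friend.findtext("telephone", default=""))
        rows.append(f"<tr><td><b>Name:</b></td><td>{name}</td></tr>")
        rows.append(f"<tr><td>Address:</td><td><i>{address}</i></td></tr>")
        rows.append(f"<tr><td>Telephone:</td><td><i>{telephone}</i></td></tr>")
    return '<table border="1">\n' + "\n".join(rows) + "\n</table>"


html_table = render_table(PEOPLE_XML)
print(html_table)
```

The data file never changes in this process. Swapping in a different stylesheet, or a different render function here, would give the same two friends a completely different look.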

3. The third page is the Document Type Definition (DTD). The good news is that a DTD is not always necessary, especially in a simple XML document. The bad news is that a DTD is pretty darn difficult. It declares things such as the elements, attributes and data types a document may use. For more information on DTDs, take a look at:


The XML version of a line break is <br/> instead of the HTML <br>. Herein lies the secret to getting around the most common and frustrating markup language bugs [or features, depending on your point of view -eds.], which go by the name of validity or "well-formed code." In the good old Internet days, developers were very meticulous when it came to their coding. If you opened a <font> tag, you closed it with </font>. When browsers got smarter, coders became lazier. As HTML evolved, people also decided that it wasn't necessary to quote attribute values in their code. So what was once <font color="white"> became <font color=white>. And then XML hit the scene.
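
The gap between forgiving HTML and strict XML is easy to try out with any XML parser. As a rough sketch, assuming Python's standard xml.etree.ElementTree module (not something the original article uses), the hypothetical helper is_well_formed below simply reports whether the parser accepts a snippet.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


samples = [
    "<p>line one<br/>line two</p>",     # empty element closed the XML way
    "<p>line one<br>line two</p>",      # HTML-style <br> is never closed
    '<font color="white">text</font>',  # quoted attribute value
    "<font color=white>text</font>",    # unquoted attribute value
    '<font color="white">text',         # missing </font>
]

for markup in samples:
    print(is_well_formed(markup), markup)
```

Only the first and third snippets pass. A browser of the day would happily display all five as HTML, which is exactly the laziness that XML no longer tolerates.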

Next: XML Needs Clean Code

This article originally appeared in the December 2, 1999 edition of the WebReference Update Newsletter.


Written by Leroyson Figueira and

Revised: May 16, 2000

URL: http://webreference.com/new/xmlintro.html