Converting DTDs to XML Schemas (3/3) - exploring XML | WebReference

Converting DTDs to XML Schemas

Manual work

We immediately run into a bug here: both the channel and item definitions contain elements named ? and +, rather than having these symbols translated into minOccurs and maxOccurs constraints. The bug is provoked in part by the unusual DTD definitions of these elements. The channel definition, for example, should rather read:

<!ELEMENT channel (title, description, link, language, item+, 
 rating?, image?, textinput?, copyright?, pubDate?, lastBuildDate?, 
 docs?, managingEditor?, webMaster?, skipHours?, skipDays?)>

This stricter element definition would be more accurately converted into:

<element name="channel">
 <element ref="title" minOccurs="1" maxOccurs="1"/>
 <element ref="description" minOccurs="1" maxOccurs="1"/>
 <element ref="link" minOccurs="1" maxOccurs="1"/>
 <element ref="language" minOccurs="1" maxOccurs="1"/>
 <element ref="item" minOccurs="1" maxOccurs="unbounded"/>
 <element ref="rating" minOccurs="0" maxOccurs="1"/>
 <element ref="image" minOccurs="0" maxOccurs="1"/>
 <element ref="textinput" minOccurs="0" maxOccurs="1"/>
 <element ref="copyright" minOccurs="0" maxOccurs="1"/>
 <element ref="pubDate" minOccurs="0" maxOccurs="1"/>
 <element ref="lastBuildDate" minOccurs="0" maxOccurs="1"/>
 <element ref="docs" minOccurs="0" maxOccurs="1"/>
 <element ref="managingEditor" minOccurs="0" maxOccurs="1"/>
 <element ref="webMaster" minOccurs="0" maxOccurs="1"/>
 <element ref="skipHours" minOccurs="0" maxOccurs="1"/>
 <element ref="skipDays" minOccurs="0" maxOccurs="1"/>
</element>
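
The mapping from DTD occurrence indicators to occurrence constraints is mechanical, which is why one would expect a converter to get it right. The following is a minimal Python sketch of that mapping, not the code of the conversion tool discussed in this series. The function name `convert_sequence` and the `OCCURRENCE` table are made up for illustration, the sketch only handles a flat, comma-separated sequence (no nested groups or choices), and it assumes the `unbounded` token of the final XML Schema Recommendation for `+` and `*`.

```python
import re

# DTD occurrence indicator -> (minOccurs, maxOccurs).
# "unbounded" is assumed here, following the final XML Schema Recommendation.
OCCURRENCE = {
    "": ("1", "1"),
    "?": ("0", "1"),
    "+": ("1", "unbounded"),
    "*": ("0", "unbounded"),
}


def convert_sequence(content_model):
    """Turn a flat DTD sequence such as '(title, item+, rating?)' into
    a list of schema element references with occurrence constraints."""
    model = content_model.strip()
    if not (model.startswith("(") and model.endswith(")")):
        raise ValueError("expected a parenthesized sequence")
    lines = []
    for particle in model[1:-1].split(","):
        # An element name optionally followed by one of ?, + or *.
        match = re.fullmatch(r"\s*([A-Za-z_][\w.\-]*)([?+*]?)\s*", particle)
        if match is None:
            raise ValueError(f"unsupported particle: {particle!r}")
        name, indicator = match.groups()
        min_occurs, max_occurs = OCCURRENCE[indicator]
        lines.append(
            f'<element ref="{name}" minOccurs="{min_occurs}" '
            f'maxOccurs="{max_occurs}"/>'
        )
    return lines


if __name__ == "__main__":
    for line in convert_sequence("(title, item+, rating?)"):
        print(line)
```

Anything the sketch does not understand, such as a nested group or a choice, raises a `ValueError` instead of being silently turned into an element named after the indicator.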

Increased Precision

In the last article we emphasized that XML Schema allows increased precision in specifying constraints on elements and attributes. The RSS version attribute needs to be "0.91". In XML Schema we can express this with:

<element name="rss">
 <complexType content="elementOnly">
  <attribute name="version" type="string" use="fixed" value="0.91"/>
 </complexType>
</element>

One of the conventions of Netscape's former RSS syndication system was that a channel could contain at most 15 item elements. This limit can be expressed as:

<element ref="item" minOccurs="1" maxOccurs="15"/>

More constraints on other elements of the RSS definition can be enforced in similar ways.
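
To make concrete what the two constraints demand of an instance document, here is a small Python sketch that checks them by hand with the standard library's `xml.etree.ElementTree`. It is not schema validation and covers nothing beyond these two rules. The function name `check_rss` and the constant `MAX_ITEMS` are illustrative assumptions, and an omitted version attribute is treated as carrying the fixed value.

```python
import xml.etree.ElementTree as ET

MAX_ITEMS = 15  # per-channel item limit quoted in the article


def check_rss(xml_text):
    """Return a list of constraint violations; an empty list means none."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "rss":
        problems.append(f"root element is {root.tag!r}, expected 'rss'")
        return problems
    # A fixed attribute may be omitted; if present it must equal "0.91".
    if root.get("version", "0.91") != "0.91":
        problems.append('rss version attribute must be "0.91"')
    channel = root.find("channel")
    if channel is None:
        problems.append("missing channel element")
        return problems
    item_count = len(channel.findall("item"))
    if not 1 <= item_count <= MAX_ITEMS:
        problems.append(
            f"channel has {item_count} item elements, "
            f"expected 1 to {MAX_ITEMS}"
        )
    return problems
```

A schema-aware validator would report the same violations without any hand-written code, which is the practical benefit of stating the constraints in the schema.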

Conclusion

The conversion tool cannot work magic if the underlying DTD is already fairly lax. The RSS 0.91 definition is not very restrictive, so a lot of manual work is required, including working around bugs. Nevertheless, the W3C site lists successful conversions of P3P, SMIL, XHTML, and MathML, so automatic conversion seems feasible for better-behaved DTDs.

Produced by Michael Claßen

URL: http://www.webreference.com/xml/column35/3.html
Created: Jul 18, 2001
Revised: Jul 18, 2001