dtddoc step 4: Generating HTML (2/2) - exploring XML
The PHP solution
The DTD parser in PHP is derived from the one in Java, so porting dtddoc is fairly
straightforward. File operations are much more C-style in PHP, and PHP's array iteration
constructs result in more compact and readable code. The availability of built-in
sort functions saves us from having to sort the index manually, as in Java. Also worth
mentioning is that PHP passes data structures by value, not by reference. Java passes its
arguments by value as well, except that Hashtables are objects: a method receives a
reference to the caller's object, so its changes are visible to the caller. Mind your
ampersands (&) in function definitions where you intend to modify the incoming data.
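The Java side of that comparison can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not taken from the dtddoc sources; the sketch only shows that a primitive argument is copied, while a Hashtable argument still refers to the caller's table.

```java
import java.util.Hashtable;

// Hypothetical demo class, not part of dtddoc.
class PassingDemo {

    // The callee works on its own copy of the int;
    // the caller's variable is left unchanged.
    static void increment(int count) {
        count = count + 1;
    }

    // The callee receives a copy of the reference, which still points
    // at the caller's Hashtable, so the new entry is visible outside.
    static void addEntity(Hashtable<String, String> entities) {
        entities.put("copy", "&#169;");
    }

    public static void main(String[] args) {
        int count = 0;
        increment(count);
        System.out.println(count);           // prints 0

        Hashtable<String, String> entities = new Hashtable<>();
        addEntity(entities);
        System.out.println(entities.size()); // prints 1
    }
}
```

In PHP the second case needs an ampersand in the function definition; without it the function would modify a private copy of the array.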
Here is the PHP source code.
The Perl package
The Perl solution parses the DTD slightly differently because of the callback
approach to reading comments. Other than that, the PHP constructs can be translated almost
automatically to Perl constructs, carefully replacing $ with @ or
% for the respective data structures.
View the source code.
Differences in implementations
If you tried all three solutions, you will have noticed slight differences that mirror each language's strengths or weaknesses:
- The list of entities and notations is in random order in Java. No sorting functionality is provided
as part of the standard JDK. If you think the new Java Collections do, think again:
TreeMap gives you a sorted map, but if you want items to be sorted case-insensitively, all items that differ only in capitalization are treated as identical and therefore cannot be stored in the same map. I did not feel like slapping on another third-party library for sorting, but might do so for version 2. Or implement bubble sort myself. Luckily most other programming languages have left that stone age already...
- The PHP solution does not print out cardinality in the content models, that is, the
+ symbols. [The problem has been solved by alert reader Adam Trachtenberg: The
writer is not always passed by reference in the DTD parser code. This has been fixed in CVS and the download ZIP file.]
- The Perl version prints neither the values of entities nor any notations, as I could not find any functions in the SGML::DTD API for retrieving these things. I might have to implement an SGML::EntMan entity manager for version 2.
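The TreeMap behaviour described in the Java item above can be reproduced in a few lines. The sketch below is an illustration only, not dtddoc code, and its class and method names are hypothetical. It also shows one possible way around the collision using only java.util classes: leave the map alone and sort a separate list of names with a case-insensitive comparator.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class, not part of dtddoc.
class EntitySortDemo {

    // A TreeMap with a case-insensitive comparator treats "aacute" and
    // "Aacute" as the same key, so the second put() replaces the first value.
    static Map<String, String> caseInsensitiveMap(String[][] pairs) {
        Map<String, String> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String[] pair : pairs) {
            map.put(pair[0], pair[1]);
        }
        return map;
    }

    // Possible alternative: copy the names into a list and sort the list.
    // Names that differ only in capitalization all survive.
    static List<String> sortedNames(Collection<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted, String.CASE_INSENSITIVE_ORDER);
        return sorted;
    }

    public static void main(String[] args) {
        String[][] pairs = { {"aacute", "&#225;"}, {"Aacute", "&#193;"} };
        System.out.println(caseInsensitiveMap(pairs).size()); // prints 1

        List<String> names = new ArrayList<>();
        names.add("nbsp");
        names.add("aacute");
        names.add("Aacute");
        names.add("amp");
        System.out.println(sortedNames(names)); // prints [aacute, Aacute, amp, nbsp]
    }
}
```

The entity values would stay in the existing Hashtable under this approach; the sorted list only decides the order in which they are looked up and printed.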
We successfully implemented a first version of a DTD documentation tool in three programming languages. More languages might be considered should a DTD parser be available for those. A first documentation effort was completed with RSS version 0.91 and more will follow. I hope this effort will help you to better understand and more quickly identify DTDs that are relevant to your XML or SGML work!
Produced by Michael Claßen
Created: Nov 11, 2002
Revised: Nov 11, 2002