The Flesh and the Soul of Information. Abstractions for the Web
Most advances in the field of document abstractions and "separation ideology" have so far been connected with SGML. This is also true of one of the widest and most diverse document exchange media, the World Wide Web. Although SGML has become the core of many successful document management environments, it is no exaggeration to say that if there is a single document medium that most needs a high-level abstraction language, it is the Web.
Despite the obvious need, however, the first attempt to graft SGML ideology onto the Internet tree was mostly a failure. In its early days, HTML offered purely structural markup, i.e. only one half of the equation; moreover, the structure into which you were supposed to box your documents was very limited and not extensible. Add to this the scarcity of attempts to popularize the ideology of separation among the Web audience, and it comes as no surprise that both Web users and browser manufacturers came to regard the language as a fancy plain-text equivalent of some annoyingly poor and old-fashioned word processor format.
Naturally, the development path taken by HTML since then has been
consistent with this level of understanding. The result that we're
all using nowadays is a peculiar mix of "old" structure tags and "new"
presentation extensions, pioneered by browser companies and belatedly
acknowledged by the W3C. Thus the most basic
The dire need for a more consistent markup solution led to the introduction of XML, a W3C-standardized subset of SGML, simplified yet retaining most of its power and flexibility. This "HTML of the future," as it is sometimes called, has already received wide industry support, although it has yet to be adopted by the mass user audience (refer to another HTML Unleashed chapter for more information). This development is exciting in more than one way.
Although "named styles" in word processors have to some extent accustomed document creators to the advantages of storing presentation parameters separately, this hasn't yet become a subconscious imperative: as they say, word processors allow separation but do not enforce it. Thus, with the adoption of XML, millions of Web authors may, for the first time ever, enter the world of true structural flexibility and presentation independence. This may lead to a boost in the average Web document quality, but it is also likely to pose serious adaptation problems for many users.
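To make the idea of presentation independence more tangible, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of any standard: the element names (`article`, `title`, `para`), the two style tables, and the `render` function are all hypothetical. It shows one structurally marked-up document being rendered in two different ways without the document itself being touched.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names describe structure only, not appearance.
DOCUMENT = """<article>
  <title>Abstractions for the Web</title>
  <para>Structure lives in the markup.</para>
  <para>Presentation lives elsewhere.</para>
</article>"""

# Two hypothetical "stylesheets": each maps an element name to a rendering rule.
PLAIN_TEXT_STYLE = {
    "title": lambda text: text.upper(),
    "para": lambda text: "  " + text,
}
HTML_STYLE = {
    "title": lambda text: "<h1>" + text + "</h1>",
    "para": lambda text: "<p>" + text + "</p>",
}


def render(xml_source, style):
    """Render each child of the root element with the rule the style gives its tag."""
    root = ET.fromstring(xml_source)
    lines = []
    for element in root:
        # Elements the style does not mention pass through unchanged.
        rule = style.get(element.tag, lambda text: text)
        lines.append(rule(element.text or ""))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(DOCUMENT, PLAIN_TEXT_STYLE))
    print(render(DOCUMENT, HTML_STYLE))
```

Changing the look of every paragraph means editing one entry in a style table; the document stays as it is. A real stylesheet language is of course far richer than a dictionary of functions, so this should be read only as an illustration of the principle.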
SGML's separation ideology works perfectly for well-defined types of
documents whose DTDs were custom developed by trained
professionals. However, when the paradigm of structural markup
hits the truly world-wide Web with its much more diversified and even
erratic documents, this deceptively simple scheme must be rendered
more concrete. What needs to be explained to everyone is not only
why XML is different
In my opinion, the problem of XML's creators at the moment is that they're
pushing ahead with the structural side of the technology while the
presentational machinery is lagging behind (although, fortunately, the
lag is much smaller than in the case of CSS, which followed HTML only
after several years). While XML 1.0 is already a W3C Recommendation, the
Extensible Stylesheet Language (XSL) at the time of this writing is not
even a Draft, but only a submitted Proposal. The
unwelcome result that may ensue is not, of course, that XML will be
corrupted with built-in visual extensions (as was the case with HTML),
but that the first wave of enthusiastic users will face an additional
difficulty in embracing the new paradigm of document creation
Revised: Apr. 19, 1998