The History of XSL
Like most of the XML family of standards, XSLT was developed by the World Wide Web Consortium (W3C), a coalition of companies orchestrated by Tim Berners-Lee, the inventor of the web. There is an interesting page on the history of XSL, and styling proposals generally, at http://www.w3.org/Style/History/.
HTML was originally conceived by Berners-Lee as a set of tags to mark the logical structure of a document: headings, paragraphs, links, quotes, code sections, and the like. Soon people wanted more control over how the document looked: they wanted to achieve the same control over the appearance of the delivered publication as they had with printing and paper. So HTML acquired more and more tags and attributes to control presentation: fonts, margins, tables, colors, and all the rest that followed. As the language evolved, the documents being published became more and more browser-dependent, and it was seen that the original goals of simplicity and universality were starting to slip away.
The remedy was widely seen as separation of content from presentation. This was not a new concept; it had been well developed through the 1980s in the development of Standard Generalized Markup Language (SGML), whose architecture in turn was influenced by the elaborate (and never implemented) work done in the ISO Open Document Architecture (ODA) standards.
Just as XML was derived as a greatly simplified subset of SGML, so XSLT has its origins in an SGML-based standard called DSSSL (Document Style Semantics and Specification Language). DSSSL (I pronounce it Dissel) was developed primarily to fill the need for a standard device-independent language to define the output rendition of SGML documents, particularly for high-quality typographical presentation. SGML was around for a long time before DSSSL appeared in the early 1990s, but until then the output side had been handled using proprietary and often extremely expensive tools, geared towards driving equally expensive phototypesetters, so that the technology was only really taken up by the big publishing houses.
C. M. Sperberg-McQueen and Robert F. Goldstein presented an influential paper at the WWW '94 conference in Chicago under the title A Manifesto for Adding SGML Intelligence to the World-Wide Web. You can find it at: http://www.ncsa.uiuc.edu/SDG/
The authors presented a set of requirements for a stylesheet language, which is as good a statement as any of the aims that the XSL designers were trying to meet. As with other proposals from around that time, the concept of a separate transformation language had not yet appeared, and a great deal of the paper is devoted to the rendition capabilities of the language. There are many formative ideas, however, including the concept of fallback processing to cope with situations where particular features are not available in the current environment.
It is worth quoting some extracts from the paper here:
Ideally, the style sheet language should be declarative, not procedural, and should allow style sheets to exploit the structure of SGML documents to the fullest. Styles must be able to vary with the structural location of the element: paragraphs within notes may be formatted differently from paragraphs in the main text. Styles must be able to vary with the attribute values of the element in question: a quotation of type "display" may need to be formatted differently from a quotation of type "inline". They may even need to vary with the attribute values of other elements: items in numbered lists will look different from items in bulleted lists.
At the same time, the language has to be reasonably easy to interpret in a procedural way: implementing the style sheet language should not become the major challenge in implementing a Web client.
The semantics should be additive: It should be possible for users to create new style sheets by adding new specifications to some existing (possibly standard) style sheet. This should not require copying the entire base style sheet; instead, the user should be able to store locally just the user's own changes to the standard style sheet, and they should be added in at browse time. This is particularly important to support local modifications of standard DTDs.
Syntactically, the style sheet language must be very simple, preferably trivial to parse. One obvious possibility: formulate the style sheet language as an SGML DTD, so that each style sheet will be an SGML document. Since the browser already knows how to parse SGML, no extra effort will be needed.
We recommend strongly that a subset of DSSSL be used to formulate style sheets for use on the World Wide Web; with the completion of the standards work on DSSSL, there is no reason for any community to invent their own style-sheet language from scratch. The full DSSSL standard may well be too demanding to implement in its entirety, but even if that proves true, it provides only an argument for defining a subset of DSSSL that must be supported, not an argument for rolling our own. Unlike home-brew specifications, a subset of a standard comes with an automatically predefined growth path. We expect to work on the formulation of a usable, implementable subset of DSSSL for use in WWW style sheets, and invite all interested parties to join in the effort.
In late 1995, a W3C-sponsored workshop on stylesheet languages was held in Paris. In view of the subsequent role of James Clark as editor of the XSLT Recommendation, it is interesting to read the notes of his contribution on the goals of DSSSL, which can be found at http://www.w3.org/Style/951106_Workshop/report1.html#clark.
What follows are a few selected paragraphs from these notes:
DSSSL contains both a transformation language and a formatting language. Originally the transformation was needed to make certain kinds of styles possible (such as tables of contents). The query language now takes care of that, but the transformation language survives because it is useful in its own right.
Both simple and complex designs should be possible, and the styles should be suitable for batch formatting as well as interactive applications. Existing systems should be able to support DSSSL with only minimal changes (a DSSSL parser is obviously needed).
The language is strictly declarative, which is achieved by adopting a functional subset of Scheme. Interactive style sheet editors must be possible.
A DSSSL style sheet very precisely describes a function from SGML to a flow object tree. It allows partial style sheets to be combined ('cascaded' as in CSS): some rule may override some other rule, based on implicit and explicit priorities, but there is no blending between conflicting styles.
James Clark closed his talk with the remark:
Creating a good, extensible style language is hard!
One suspects that the effort of editing the XSLT Recommendation didn't cause him to change his mind.
The First XSL Proposal
Following these early discussions, the W3C set up a formal activity to create a stylesheet language proposal. The remit for this group specified that it should be based on DSSSL.
The first formal proposal for XSL, dated 21 August 1997, came out of this activity. It can be found at http://www.w3.org/TR/NOTE-XSL.html.
There are eleven authors listed. They include five from Microsoft, three from Inso Corporation, plus Paul Grosso of ArborText, James Clark (who works for himself), and Henry Thompson of the University of Edinburgh.
The section describing the purpose of the language is worth reading:
XSL is a stylesheet language designed for the Web community. It provides functionality beyond CSS (e.g. element reordering). We expect that CSS will be used to display simply-structured XML documents and XSL will be used where more powerful formatting capabilities are required or for formatting highly structured information such as XML structured data or XML documents that contain structured data.
Web authors create content at three different levels of sophistication:
markup: relies solely on a declarative syntax
script: additionally uses code "snippets" for more complex behaviors
program: uses a full programming language
The powerful capabilities provided by XSL allow:
formatting of source elements based on ancestry/descendency, position, and uniqueness
the creation of formatting constructs including generated text and graphics
the definition of reusable formatting macros
writing-direction independent stylesheets
extensible set of formatting objects
The authors then explained carefully why they had felt it necessary to diverge from DSSSL, and described why a separate language from CSS (Cascading Style Sheets) was thought necessary.
They then stated some design principles:
XSL should be straightforwardly usable over the Internet.
XSL should be expressed in XML syntax.
XSL should provide a declarative language to do all common formatting tasks.
XSL should provide an "escape" into a scripting language to accommodate more sophisticated formatting tasks and to allow for extensibility and completeness.
XSL will be a subset of DSSSL with the proposed amendment. (As XSL was no longer a subset of DSSSL, they cannily proposed amending DSSSL so it would become a superset of XSL).
A mechanical mapping of a CSS stylesheet into an XSL stylesheet should be possible.
XSL should be informed by user experience with the FOSI stylesheet language.
The number of optional features in XSL should be kept to a minimum.
XSL stylesheets should be human-legible and reasonably clear.
The XSL design should be prepared quickly.
XSL stylesheets shall be easy to create.
Terseness in XSL markup is of minimal importance.
As a requirements statement, this doesn't rank among the best. It doesn't read like the kind of list you get when you talk to users and find out what they need. It's much more the kind of list designers write when they know what they want to produce, including a few political concessions to the people who might raise objections. But if you want to understand why XSLT became the language it did, this list is certainly evidence of the thinking.
The language described in this first proposal contains many of the key concepts of XSLT as it finally emerged, but the syntax is virtually unrecognizable. It was already clear that the language should be based on templates that handled nodes in the source document matching a defined pattern, and that the language should be free of side-effects, to allow "progressive rendering and handling of large documents". I'll explore the significance of this requirement in more detail on page 34, and discuss its implications for the way stylesheets are designed in Chapter 8. The basic idea is that if a stylesheet is expressed as a collection of completely independent operations, each of which has no external effect other than generating part of the output from its input (for example, it cannot update global variables), then it becomes possible to regenerate any part of the output independently if that particular part of the input changes. Whether the XSLT language actually achieves this objective is still an open question.
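To make the side-effect-free idea concrete, here is a minimal sketch written in Python rather than in XSL syntax. It is an illustration only, not how any XSL processor is implemented: the rule functions, the TEMPLATES table, and the sample document are all hypothetical names invented for the example, and text is not escaped. Each rule is a pure function of a source element and its already-rendered content, so one part of the output can be regenerated without touching the rest.

```python
import xml.etree.ElementTree as ET

# Hypothetical template rules: each is a pure function from a source
# element and its rendered content to a piece of output text.
def render_doc(elem, body):
    return f"<html><body>{body}</body></html>"

def render_note(elem, body):
    return f"<div class='note'>{body}</div>"

def render_para(elem, body):
    return f"<p>{body}</p>"

# Assumed rule table: element name -> rule (stands in for pattern matching).
TEMPLATES = {"doc": render_doc, "note": render_note, "para": render_para}

def apply_templates(elem):
    """Render elem from elem and its descendants only; no shared state."""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(apply_templates(child))
        parts.append(child.tail or "")
    body = "".join(parts)
    rule = TEMPLATES.get(elem.tag)
    # Fallback: an element with no matching rule contributes its content only.
    return rule(elem, body) if rule else body

source = ET.fromstring(
    "<doc><para>One</para><note><para>Two</para></note></doc>"
)
full = apply_templates(source)

# Change one part of the input and re-render only that part.
note = source.find("note")
old_fragment = apply_templates(note)
note.find("para").text = "Changed"
new_fragment = apply_templates(note)

# Splicing the new fragment into the old output matches a complete re-run.
patched = full.replace(old_fragment, new_fragment)
print(patched == apply_templates(source))
```

Because no rule reads or writes anything outside its own arguments, the fragment for the note element depends only on that element, which is what makes the splice safe. A rule that updated a global counter would break this property.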
Microsoft shipped their first technology preview four months after this proposal appeared, in January 1998.
To enable W3C to make an assessment of the proposal, Norman Walsh produced a requirements summary, which was published in May 1998. It is available at http://www.w3.org/TR/WD-XSLReq.
The bulk of his paper is given over to a long list of the typographical features that the language should support, following the tradition both before and since that the formatting side of the language gets a lot more column inches than the transformation side. But as XSLT fans that need not worry us: the success of standards has always been inversely proportional to their length.
What Walsh has to say on the transformation aspects of the language is particularly terse, and although he clearly had reasons for thinking these features were necessary, it's a shame that he doesn't tell us why he put these in and left others, such as sorting, grouping, and totaling, out:
Ancestors, children, siblings, attributes, content, disjunctions, negation, enumerations, computed select based upon arbitrary query expressions.
Arithmetic Expressions: arithmetic, simple boolean comparisons, boolean logic, substrings, string concatenation.
Data Types: Scalar types, units of measure, Flow Objects, XML Objects
Side effects: No global side effects.
Standard Procedures: The expression language should have a set of procedures that are built in to the XSL language. These are still to be identified.
User Defined Functions: For reuse. Parameterized, but not recursive.
Following this activity, the first Working Draft of XSL (not to be confused with the Proposal) was published on 18 August 1998, and the language started to take shape, gradually converging on the final form it took in the 16 November 1999 Recommendation through a series of Working Drafts, each of which made radical changes, but kept the original design principles intact.
So let's look now at the essential characteristics of XSLT as a language.
Created: Jan. 05, 2001
Revised: Jan. 05, 2001