XML: Whines and Battles Q & A | WebReference



There are several lessons we can draw from the HTML experience, but one of them is that the average Web page creator doesn’t give a rat’s end-tag about SGML or XML technicalities and cannot understand why the whole thing had to be made so ‘difficult’ when wordprocessing and desktop publishing (DTP) are so ‘easy.’ The XML FAQ lists Frequently-Asked Questions about XML, but some of the questions which are starting to come up in an XML context (just as they did about HTML) relate to markup in general, and apply just as much to some other systems as to HTML or XML. Many of you will know these almost by heart, but the point is that the questions have not disappeared as HTML has matured: they’ve returned for another round.

Why do I have to bother with rules and special names?
Because XML is still new, and the software which hides the internal rules and naming is still being written. All text-handling systems use rules of one kind or another, from the simplest of editors to the biggest DTP packages, and many of them let you define your own rules for different classes (types) of documents, and give names to them so you can refer to them — just like XML.
Why doesn’t it look like [my wordprocessor]?
Browsers don’t have the same features or set of capabilities as your wordprocessor. Wordprocessors are still mostly aimed at creating documents for printing. Creating information for online use is a different animal: the medium is different and it means using different tools.
Why can’t any of my readers see the fonts I put in?
For the same reason that they can’t use the paper you’re writing on: the fonts are not included in your file, so they’re not sent to the reader. Most modern computers come with Times and Helvetica (Arial), but for anything else you’d have to send them a copy of the font. That would probably be illegal, as fonts are a commodity, and copyrighted in most countries: you have to buy them (or download them in the case of free ones), and everyone has their own personal selection. PDF (Acrobat) and PostScript let you send fonts, but they’re not heavily used on the Web because the file size is often very large, making downloads slow.
What’s a stylesheet?
A list of typographic styles used for a particular class of document. It specifies things like ‘Section headings: italic font, ½ high, centered,’ using a special language purposely designed for it. Stylesheets let you change styles more easily: you only need to change the definition once, rather than change every section heading in your document one by one.
How do I convert it to [some other format]?
Many wordprocessors can import a variety of formats: try this way first and then pick ‘Save As...’ For better control, use one of the many conversion packages on the market. But remember: if you haven’t put the necessary information into your document to let it be converted successfully, you probably won’t get what you’re expecting out of it. Computers are not good at guessing your intentions.
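
To make the stylesheet answer above concrete, here is a minimal sketch in Python. It is not a real stylesheet language: the element names (`report`, `heading`, `para`), the `STYLESHEET` dictionary and the `render` function are all invented for illustration. The only point it demonstrates is that a style is defined once per kind of element, so changing that one definition restyles every heading at once.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are invented for this sketch.
DOC = """<report>
  <heading>Introduction</heading>
  <para>XML separates content from appearance.</para>
  <heading>Conclusion</heading>
  <para>Styles live in one place.</para>
</report>"""

# A toy "stylesheet": one style definition per element name (an assumption,
# not the syntax of any real stylesheet language).
STYLESHEET = {
    "heading": {"font": "italic", "align": "center"},
    "para": {"font": "roman", "align": "left"},
}


def render(xml_text, stylesheet):
    """Return one (style description, text) pair per styled element."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        style = stylesheet.get(element.tag)
        if style is None:
            continue  # no style defined for this element (here, the root)
        description = f"{style['font']}/{style['align']}"
        lines.append((description, element.text.strip()))
    return lines


before = render(DOC, STYLESHEET)

# Change the heading definition once; every heading picks it up.
bold_sheet = dict(STYLESHEET, heading={"font": "bold", "align": "left"})
after = render(DOC, bold_sheet)

for description, text in after:
    print(f"[{description}] {text}")
```

The document itself is never edited between the two renderings; only the single `heading` entry in the stylesheet changes.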

This inquiry has resulted in a lot of rather disappointed people wandering around the Internet wanting to know ‘where XML is.’ In particular, some articles and comments have given the impression that XML is complete, ready to run, and is something you can download from your favorite server and install right away.

How it was

Those with long memories will recall exactly the same thing happening to HTML: imaginatively inaccurate articles about using pointy brackets. In many cases, the authors then hadn’t even known a specification existed, let alone actually read it.

The poor business user went off to implement pages armed with very partial information, and was puzzled when it didn’t work right. Browsers, anxious to make it look right even if it wasn’t, felt compelled to support every conceivable missing bracket or quote, and to implement more and more features regardless of whether or not pages using them actually contained any information.

This worked for a few years, but most users are much better informed now, and can see that real business information on the Web means much more than some smart logos and a snappy Javanese menu.

There’s a risk that those who fund a company’s internal efforts in new Web developments are going to pull the plug unless they can ‘see it’ fairly soon. The situation hasn’t been helped by the publication of some other articles which have been more misleading than helpful. This is unfortunate to the extent that less cautious users and developers may get the entirely wrong idea about XML (see ‘How it was’).

However, the biggest question I get about XML relates to this whole business of different classes or ‘types’ of documents. This has nothing directly to do with the styles of a document (one document type could be output in a gazillion different styles), but has everything to do with what ‘a document’ is made up of (headings, paragraphs, lists, images, etc). It’s a FAQ, but it’s worth revisiting to see what’s changed since the question was asked about HTML.
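
As a rough illustration of what ‘a document is made up of’ means, the Python sketch below treats a document type as nothing more than a table of which elements may appear inside which. The `MEMO_TYPE` table, its element names and the `structure_errors` function are hypothetical, and this is far weaker than a real DTD (it checks containment only, not order or how many times an element occurs). Note that nothing in it mentions fonts or layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical document type: which child elements each element may contain.
MEMO_TYPE = {
    "memo": {"heading", "para", "list"},
    "heading": set(),
    "para": set(),
    "list": {"item"},
    "item": set(),
}


def structure_errors(xml_text, doc_type):
    """Return messages for elements that the document type does not allow."""
    root = ET.fromstring(xml_text)
    if root.tag not in doc_type:
        return [f"unknown root element <{root.tag}>"]
    errors = []
    stack = [root]
    while stack:
        parent = stack.pop()
        allowed = doc_type.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                errors.append(f"<{child.tag}> not allowed inside <{parent.tag}>")
            else:
                stack.append(child)  # descend only into permitted elements
    return errors


good = "<memo><heading>Hi</heading><list><item>One</item></list></memo>"
bad = "<memo><heading>Hi</heading><item>Stray</item></memo>"

print(structure_errors(good, MEMO_TYPE))
print(structure_errors(bad, MEMO_TYPE))
```

The same `memo` that passes this check could then be output in any number of styles, which is why document type and style are separate questions.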



Created: May 11, 1998
Revised: May 14, 1998

URL: http://www.webreference.com/authoring/languages/xml/questions/qanda.html