Separate namespaces. However, there are situations where adding manual formatting hints to your XML source cannot be avoided. This may happen not only with images, although they are a frequent source of problems. It is advisable to use a separate namespace for all hints that pertain to the same output format (e.g., HTML):
<page xmlns:forhtml="http://www.kirsanov.com/formatting-hints-html">
  <p forhtml:column-break="true">
    ...
    <image src="solid wood table" forhtml:align="right"/>
    ...
  </p>
</page>
Here, a hint is added to the p element specifying that this paragraph must start a new column in a multicolumn layout (assuming the stylesheet cannot figure this out automatically). Another hint floats an image within that paragraph to the right margin.
Now, if you want to render the same XML source into a different format, such as PDF, the new stylesheet will have no problems ignoring anything from the “for HTML” namespace. It is also very easy to strip all HTML formatting hints to produce a purely semantic version of the source. You can store several sets of formatting hints in the same source documents, each in its own namespace, and have the stylesheet select the set corresponding to the current output format (such as “HTML with columns,” “HTML without columns,” “printable HTML,” “PDF,” etc.).
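Stripping the hints is a mechanical operation. In the book's setting a stylesheet would do it; the following is only a minimal Python sketch of the same idea, using the standard library's ElementTree. The function name `strip_hints` is an assumption made for illustration, and the sketch handles hint attributes only, not hint elements.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the example above.
FORHTML_NS = "http://www.kirsanov.com/formatting-hints-html"


def strip_hints(root, ns_uri):
    """Delete, in place, every attribute that belongs to the namespace ns_uri."""
    prefix = "{" + ns_uri + "}"  # ElementTree stores qualified names as {uri}local
    for elem in root.iter():
        # Collect the names first so the dict is not modified while iterating.
        for name in [n for n in elem.attrib if n.startswith(prefix)]:
            del elem.attrib[name]
    return root


SOURCE = """\
<page xmlns:forhtml="http://www.kirsanov.com/formatting-hints-html">
  <p forhtml:column-break="true">
    <image src="solid wood table" forhtml:align="right"/>
  </p>
</page>"""

semantic = strip_hints(ET.fromstring(SOURCE), FORHTML_NS)
print(ET.tostring(semantic, encoding="unicode"))
```

Because the filter looks only at the namespace, hints for other output formats (each in its own namespace) would pass through untouched.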
HTML documents often use the height and width attributes
in img elements as spatial hints to speed up rendering of the page
in a browser. You don’t need to supply these values in XML; a stylesheet
can find out the dimensions of all referenced images itself (
Besides the location (full or abbreviated) and possibly formatting hints, an image element may contain various other information.
Textual descriptions. The XHTML specification requires that each image be provided with a piece of text describing what the image is. Traditionally, the alt attribute of an img element has been used for short descriptions, but in HTML 4.01 and XHTML the longdesc (“long description”) attribute was added to complement alt. Normally, an image description should contain:
nothing (empty string) for purely decorative images (such as components of frames, backgrounds, and separators);
the text visible on the image for images that display text (thus, the alt of a graphic button must contain exactly the button’s label and nothing else);
a short description of the image’s role or content for meaningful images (e.g., John's photo).
It’s only in the last of the above cases that the image description
may need to be supplied in the XML source, preferably in the content of an image
element (
Captions. Often, a standalone image must be accompanied by a visible descriptive piece of text (as opposed to alt descriptions that are normally not shown by graphic browsers). This may be a caption, a photo credit, a copyright notice, or anything else that semantically belongs to this image.
Since this content may need further inline markup, it is better to store it
in children of your image element rather than in attributes (
<photo src="sight">
  <caption>A rare sight.</caption>
  <credit>Dmitry Kirsanov</credit>
</photo>
Upon encountering a photo element, the stylesheet would expect to
format its caption child element as a photo caption and the credit
child element, if present, as credit (e.g., separately from the caption, in
a smaller font size, and with “Photo by” prepended to the
A simple linked image can be created by adding linking attributes (
The quick-and-dirty approach. It is natural to reuse the generic link element type for specifying multiple links inside an imagemap, by placing link elements in the image and adding coordinate attributes to define the linked area:
<image src="chart 3">
<link link="address1" shape="rect" x1="0" y1="0" x2="100" y2="20"/>
<link link="address2" shape="circle" x="50" y="50" radius="5"/>
</image>
In HTML, all coordinates for an imagemap area are crammed into one comma-separated attribute value string. You don’t need to reproduce that in your XML; instead, you can specify one value per attribute and use descriptive attribute names. It’s a good idea to use your schema to check that the set of coordinate attributes in each link element corresponds to the value of shape.
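The check and the conversion to HTML's single coordinate string can be sketched as follows. This is a Python stand-in rather than a schema or a stylesheet, and it covers only the two shapes used in the example above; the names `SHAPE_ATTRS` and `html_coords` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Coordinate attributes required by each shape, listed in the order in which
# HTML expects them inside the coords attribute of an area element.
SHAPE_ATTRS = {
    "rect": ("x1", "y1", "x2", "y2"),
    "circle": ("x", "y", "radius"),
}
ALL_COORD_ATTRS = {name for names in SHAPE_ATTRS.values() for name in names}


def html_coords(link):
    """Validate a link element's coordinate attributes and join them for HTML."""
    shape = link.get("shape")
    if shape not in SHAPE_ATTRS:
        raise ValueError(f"unsupported shape: {shape!r}")
    expected = SHAPE_ATTRS[shape]
    present = ALL_COORD_ATTRS.intersection(link.attrib)
    if present != set(expected):
        raise ValueError(f"{shape} needs exactly {expected}, got {sorted(present)}")
    return ",".join(link.get(name) for name in expected)


IMAGE = """\
<image src="chart 3">
  <link link="address1" shape="rect" x1="0" y1="0" x2="100" y2="20"/>
  <link link="address2" shape="circle" x="50" y="50" radius="5"/>
</image>"""

for link in ET.fromstring(IMAGE).iter("link"):
    print(link.get("link"), link.get("shape"), html_coords(link))
```

A link whose coordinate attributes do not match its shape (for example, a circle carrying x1 and y1) raises ValueError instead of producing a malformed area.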
The thoroughly semantic approach. The syntax shown above may work for an occasional imagemap, but it is still not semantic enough and needs to be improved if you routinely use imagemaps (or other interactive objects). Namely, do the pixel values in the link attributes really belong in the source? Probably not, as they are closely bound to the image’s “presentation” and tell us nothing about its “content.” A better approach is to use each link element to associate the identifier of an image area with a link address, for example:
<image src="chart 3">
<link link="address1" area="block1"/>
<link link="address2" area="central-blob"/>
</image>
The correspondence between the area identifiers (block1 and central-blob in this example) and the actual pixel coordinates may be stored in the site’s master document. If, however, you want an imagemap to be truly orthogonal to everything else on the site and easily portable to other sites, consider creating a separate XML document for each imagemap storing its active areas and their identifiers.
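As an illustration of the separate-document variant, the sketch below joins the semantic links with area geometry kept elsewhere. The format of the imagemap document (an imagemap element with area children carrying id, shape, and coordinate attributes) is purely hypothetical, as is the function name `resolve_links`; Python stands in for the stylesheet.

```python
import xml.etree.ElementTree as ET

# Semantic source of the page: links refer to areas by identifier only.
IMAGE = """\
<image src="chart 3">
  <link link="address1" area="block1"/>
  <link link="address2" area="central-blob"/>
</image>"""

# Hypothetical per-imagemap document holding the pixel geometry.
IMAGEMAP = """\
<imagemap>
  <area id="block1" shape="rect" x1="0" y1="0" x2="100" y2="20"/>
  <area id="central-blob" shape="circle" x="50" y="50" radius="5"/>
</imagemap>"""


def resolve_links(image, imagemap):
    """Pair each link address with the geometry of the area it names."""
    areas = {area.get("id"): area for area in imagemap.iter("area")}
    resolved = []
    for link in image.iter("link"):
        area = areas[link.get("area")]  # KeyError signals an unknown identifier
        geometry = {k: v for k, v in area.attrib.items() if k != "id"}
        resolved.append((link.get("link"), geometry))
    return resolved


for address, geometry in resolve_links(ET.fromstring(IMAGE), ET.fromstring(IMAGEMAP)):
    print(address, geometry)
```

With this split, redrawing the chart means editing only the imagemap document; the page source stays the same.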
Accessibility. Interactive objects such as Java applets and Flash movies may also incorporate multiple links (one example is an animated Flash menu). Even though you don’t have to specify these links in the HTML code embedding the object, it still makes sense to list them in the XML source of a page so that the stylesheet can construct an alternative access mechanism for those users who cannot (or don’t want to) use this interactive object.
Tables are perhaps the most abused feature of HTML, with the vast
majority of tables on web pages being used for layout purposes, not
for presenting inherently tabular data. If (like
most web designers) you are going to use HTML tables for web page
layout, you cannot reflect that in the semantic XML source of a page
in any way. It’s only the stylesheet that needs to be concerned with
layout table
Sometimes, however, you may have some genuinely tabular data that you want to format into some sort of a table on a web page. Still, this does not mean that you have to think in terms of rows and columns when creating a semantic source for such a table.
If you have something you can name, do it. For example, consider a sales data table listing sales figures for several products across several years. The XML way of marking up this data would be to forget that you’re working on a table and simply list all available data in an appropriately constructed element tree:
<sales-table>
  <product>
    <name>Foobar</name>
    <sold><year>1999</year><number>123</number></sold>
    <sold><year>2000</year><number>140</number></sold>
    <sold><year>2001</year><number>142</number></sold>
  </product>
  <product>
    <name>Barfoo</name>
    <sold><year>1998</year><number>89</number></sold>
    <sold><year>1999</year><number>14</number></sold>
  </product>
</sales-table>
This approach frees you from worrying about column alignment, sort order, or empty cells — just dump all your data and you’re done. All the rest will be performed automatically by the stylesheet: It can filter out a subset of the provided data, group values in rows and columns, sort them, and fill in “N/A” for missing values. Thus, the above example might come out as follows:
          1998   1999   2000   2001
Barfoo      89     14    N/A    N/A
Foobar     N/A    123    140    142
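The grouping, sorting, and gap filling described above can be sketched in a few lines. The book assigns this work to the stylesheet; the Python below is only a stand-in that shows the logic, and the function name `pivot_sales` is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

SALES = """\
<sales-table>
  <product>
    <name>Foobar</name>
    <sold><year>1999</year><number>123</number></sold>
    <sold><year>2000</year><number>140</number></sold>
    <sold><year>2001</year><number>142</number></sold>
  </product>
  <product>
    <name>Barfoo</name>
    <sold><year>1998</year><number>89</number></sold>
    <sold><year>1999</year><number>14</number></sold>
  </product>
</sales-table>"""


def pivot_sales(root, missing="N/A"):
    """Return a list of rows: a header of years, then one row per product."""
    figures = {}   # product name -> {year: number}
    years = set()
    for product in root.iter("product"):
        row = figures.setdefault(product.findtext("name"), {})
        for sold in product.iter("sold"):
            year = sold.findtext("year")
            row[year] = sold.findtext("number")
            years.add(year)
    columns = sorted(years)  # string sort; adequate for four-digit years
    table = [[""] + columns]
    for name in sorted(figures):
        table.append([name] + [figures[name].get(y, missing) for y in columns])
    return table


for row in pivot_sales(ET.fromstring(SALES)):
    print("".join(cell.rjust(8) for cell in row))
```

The source never lists the years or the product order explicitly; both are derived from the data, which is what lets the source stay free of rows and columns.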
Tables from triplets. In some cases, such a data-centric approach may also make your source significantly more compact than the table rendition. Thus, a sparse table with mostly empty cells can be represented in the source by triplets consisting of a row name, a column name, and the corresponding value at their intersection. Since such a source does not contain separate lists of all columns and rows, the stylesheet will compile them from the triplets.
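A minimal sketch of compiling a table from triplets follows. The markup is hypothetical: the element name cell, the attribute names row and col, and the sample data are all assumptions made for illustration, and Python again stands in for the stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical triplet source: each cell names its row, its column, and a value.
TRIPLETS = """\
<sparse-table>
  <cell row="Mon" col="Room A">Alice</cell>
  <cell row="Wed" col="Room C">Bob</cell>
  <cell row="Mon" col="Room C">Carol</cell>
</sparse-table>"""


def table_from_triplets(root, missing=""):
    """Compile row and column lists from the triplets, then fill the grid."""
    cells = {}
    rows, cols = [], []  # kept in order of first appearance
    for cell in root.iter("cell"):
        r, c = cell.get("row"), cell.get("col")
        if r not in rows:
            rows.append(r)
        if c not in cols:
            cols.append(c)
        cells[(r, c)] = cell.text or ""
    table = [[""] + cols]
    for r in rows:
        table.append([r] + [cells.get((r, c), missing) for c in cols])
    return table


for row in table_from_triplets(ET.fromstring(TRIPLETS)):
    print(row)
```

Rows and columns come out in order of first appearance here; a real stylesheet might sort them or take the order from elsewhere.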
Is it worth it? Granted, for an occasional table or two, this may be too much work: You’ll have to program your stylesheet to recognize various element types and perform various operations (such as normalizing dates) that may be necessary for your tabular data. For simple isolated tables, you may be better off more or less directly reproducing in XML the structure of the target HTML table. However, if you have a lot of simple tables (or a few complex ones) with similar data, or if your tables are updated often, the benefits of the semantic data-centric approach may easily outweigh the simplicity of the straightforward HTML imitation.
Also, the tabular data on your web site is likely to be coming from some external source, such as a database or a spreadsheet. When you write the code to update your tables automatically, it is usually much easier to first transform the external data into a semantic XML tree and then let the stylesheet do table layout.
Interactive elements in HTML are grouped into forms. Simple forms
such as site search or email newsletter subscription are often used on many
pages of the site, and your XML does not need to detail the structure of these
forms. Instead, in your source you can treat such a form as an indivisible entity —
for example, as a special type of orthogonal block (
Sometimes, even this is not required. For example, if all pages on your site contain a search field in the page footer, you don’t need to mention it in the XML source at all. Your stylesheet will simply add this form to every page it produces, just as it adds all other page components that remain the same from page to page.
What if you need to build something more complex, such as a shipping address input form or a survey form? In these cases you’ll need to create an appropriate element type for each variety of the form’s input controls (such as text fields, radio buttons, and drop-down lists) as well as for any higher level semantic constructs within the form. This work can be made much easier by reusing some of the existing form vocabularies.
Existing vocabularies. An obvious choice for the existing vocabulary from which you could borrow form-related markup is XHTML, especially if it is your target vocabulary. The Forms module,13 available starting from XHTML 1.1, may be a good first approximation. It covers many widget types and allows for proper logical structuring of your form.
However, in many cases the XHTML form markup may be too presentation-oriented to be useful for your semantic XML, or simply too awkward. This is mostly due to the historic baggage of older HTML versions. Modern HTML and XHTML had to pile their logical markup provisions on top of the old (limited and inflexible) form components.
For instance, in XHTML you have to write
<label for="firstname">First name:</label> <input type="text" id="firstname"/>
instead of the more natural
<textfield id="firstname">
  <label>First name:</label>
</textfield>
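Going from the nested form to the XHTML pair is straightforward, so the more natural source markup costs little. The following Python sketch shows the mapping a stylesheet would perform; the function name `textfield_to_xhtml` is an assumption made for illustration, and only the id attribute and the label child are handled.

```python
import xml.etree.ElementTree as ET


def textfield_to_xhtml(textfield):
    """Turn a semantic textfield element into an XHTML label and input pair."""
    field_id = textfield.get("id")
    # The nested label becomes a sibling element tied to the input by "for".
    label = ET.Element("label", {"for": field_id})
    label.text = textfield.findtext("label")
    control = ET.Element("input", {"type": "text", "id": field_id})
    return [label, control]


SOURCE = '<textfield id="firstname"><label>First name:</label></textfield>'

for element in textfield_to_xhtml(ET.fromstring(SOURCE)):
    print(ET.tostring(element, encoding="unicode"))
```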
HTML 4.0 had to define a separate
label element that is linked to its input
by a for attribute simply because it had to stay
compatible with older HTML versions that did not allow any children
in
In your source definition, you are free from these concerns and can therefore
mark up your forms in a more logical and readable way. It is also important
that your own markup may be better integrated with other parts of the system;
for example, you could use an abbreviation (
Another existing vocabulary worth looking at is XForms,14
recently developed by the W3C (see
Formatting hints. Form presentation is a difficult task. Even with full manual control, it’s not always easy to lay out a form so that it looks perfect and remains usable for any data that may be filled into it. Even more difficult is to automate form layout, enabling the stylesheet to consistently build good-looking form pages from the semantic description of the forms’ structure. To add insult to injury, different browsers on different platforms often render form controls in wildly different ways.
The key is keeping the layout simple and flexible. Don’t strive for precise placement or alignment of controls, as this is impossible to achieve given the vastly different font and screen sizes in browsers. (Also, do not tie the position of other parts of the page to the size or placement of a form — this often results in a broken page layout.) Take advantage of the form structure described in the source by separating groups of form controls into independent layout blocks.
All that said, adding formatting hints (
Created: March 27, 2003
Revised: May 24, 2004
URL: http://webreference.com/programming/xsltweb2/1