
XSLT 2.0 Web Development: Elements of a Web Site. Pt. 2.

3.10.3. Schematron schema

The schema in Example 3.3 is used to validate both the master document and page documents of our Foobar site. This makes sense because these document types have a lot in common. Still, for readability the schema is broken into three patterns: One tests the master document, another tests page documents, and the last one tests constructs that occur in both document types (this includes links, images, and text markup).

Languages. The lang-check abstract rule verifies that its context element contains exactly as many translation children as there are languages defined in the languages element. The rule can then be reused for any element that provides information in each of the site's languages. A separate rule with context="translation" additionally checks that each lang attribute corresponds to one of the defined languages and that each language version is provided only once.
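To make the lang-check rule concrete, here is a sketch of a master document fragment that it would accept and reject. The language codes, labels, ids, and file names are assumptions made for illustration, and the other required children of 'site' are left out.

```xml
<!-- Hypothetical fragment of a master document. -->
<site>
  <languages>
    <lang>en</lang>
    <lang>de</lang>
  </languages>
  <menu>
    <item>
      <!-- Passes lang-check: one translation per defined language. -->
      <label>
        <translation lang="en">Products</translation>
        <translation lang="de">Produkte</translation>
      </label>
      <page id="products" src="products.xml"/>
    </item>
    <item>
      <!-- Fails lang-check: one translation, but two languages are defined.
           Adding an empty translation with lang="de" would fix the count. -->
      <label>
        <translation lang="en">Contact</translation>
      </label>
      <page id="contact" src="contact.xml"/>
    </item>
  </menu>
</site>
```

A third translation with lang="en" in the first label would trip both checks: the count would no longer match the number of languages, and the rule for translation would report the duplicated lang value.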

Element presence. In this schema, many element-presence checks are lumped together for simplicity (e.g., all children of an environment are checked in one assert). It does not have to be this way; if you want your schema to be really helpful, you can write a separate check with its own diagnostic message for each element type, explaining its role and the possible consequences of its absence from the source.
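As a rough sketch of that more helpful style, the single assert for the children of an environment could be split into one assert per element. The pattern name and the wording of the messages, including the guessed role of each element, are assumptions and not part of the original schema; only three of the five children are shown.

```xml
<schema xmlns="http://www.ascc.net/xml/schematron">
  <!-- Hypothetical pattern, kept separate from "master" so that the
       'environment' rule already defined there still fires. -->
  <pattern name="environment-children">
    <rule context="environment">
      <!-- The role descriptions in these messages are assumptions. -->
      <assert test="src-path">
        No 'src-path' found in this 'environment'. Without it, the
        stylesheet may be unable to locate the source documents.
      </assert>
      <assert test="out-path">
        No 'out-path' found in this 'environment'. Without it, the
        stylesheet may not know where to write the generated pages.
      </assert>
      <assert test="os">
        No 'os' found in this 'environment'. Without it, pathnames may
        be built with the wrong conventions for this system.
      </assert>
    </rule>
  </pattern>
</schema>
```

The separate pattern matters: within one pattern an element is handled by a single rule, so a second rule with context="environment" placed in the same pattern as the first would never be applied. The alternative is to replace the lumped assert inside the existing rule.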

Context-sensitive checks. Note that there are two different page element types: One is used in the master document, and the other is the root element type in a page document. The same applies to blocks. The schema, however, has no trouble telling these element types apart by their context (compare the rule contexts menu//page and /page, or blocks/block and page//block).
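For comparison with the page elements of the master document, a page document might look like the sketch below. All content, the keywords, and the idref value are made up for illustration; the comments name the rule contexts from the schema that would match each element.

```xml
<!-- Hypothetical page document. -->
<page keywords="foobar, products, widgets">  <!-- context="/page" -->
  <title>Products</title>
  <block idref="news"/>                      <!-- context="page//block" -->
  <block>                                    <!-- context="page//block" -->
    <section>                                <!-- context="section" -->
      <head>Widgets</head>
      <p>Foobar widgets come in three sizes.</p>
    </section>
  </block>
</page>
```

Because page is the root element here, it is matched by the rule for /page and not by the rule for menu//page, so it is not required to carry the 'id' and 'src' attributes that a page in the master document needs.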

Reporting unknowns. One function of a schema is to catch unknown element type names (most often the result of typos). In Schematron, this can be implemented with a dummy rule that has no tests and lists all defined element types as possible contexts. It is followed by a rule with context="*" that signals an error whenever it is activated. The technique works because, within a pattern, each element is matched by at most one rule, the first whose context fits; an element that was not matched by the dummy rule (or by an earlier rule in the same pattern) is caught by the last rule and reported as unrecognized.
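Reduced to its essentials, the technique might look like this minimal sketch. The four-element vocabulary and the pattern name are assumptions chosen to keep the example short.

```xml
<schema xmlns="http://www.ascc.net/xml/schematron">
  <pattern name="unknown-elements">
    <!-- Dummy rule: it has no tests and only claims the known element
         types, so that the rule below never sees them. -->
    <rule context="page | title | block | p"/>
    <!-- Every element not claimed above ends up here. -->
    <rule context="*">
      <report test="true()">
        Unrecognized element: '<name/>'.
      </report>
    </rule>
  </pattern>
</schema>
```

With this schema, a document containing a misspelled 'tilte' element would produce a report for it, while 'page', 'block', and 'p' would pass silently. The order of the two rules is essential: if the rule with context="*" came first, it would claim every element and report all of them.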

It’s only a beginning. This example schema demonstrates only the basic, most critical checks. Your own schema may be significantly larger and more detailed than this, although it will likely use mostly the same techniques. Consider this schema a phrasebook with common expressions for typical situations. Several advanced tricks for validating complex constraints are discussed in Chapter 5 (5.1.3).


Example 3.3. schema.sch: A Schematron schema for validating page documents and the master document.

<schema xmlns="http://www.ascc.net/xml/schematron">

<!-- Checks for the master document: -->
<pattern name="master">

<rule context="site">
  <report test="count(//environment) = 1">
    Only one 'environment' found; you will need to create more if you 
    want to build the site in a different environment.
  </report> 
  <report test="count(//environment) = 0">
    No 'environment' elements found; the stylesheet will be unable to  
    figure out pathnames.
  </report>
  <assert test="languages and menu and html-title and page-footer 
                          and blocks">
    One of the required elements not found inside 'site'.
  </assert>
</rule>

<rule context="page-footer">
  <assert test="copyright and language-switch
                          and contact-webmaster">
    One of the required elements not found inside 'page-footer'.
  </assert>
</rule>

<rule context="environment">
  <assert test="src-path and out-path 
                and target-path and img-path and os">
    One of the required elements not found inside 'environment'.
  </assert>
  <assert test="@id">
    An 'environment' must have an 'id' attribute.
  </assert>
  <assert test="count(//environment/@id[. = current()/@id]) = 1"> 
    The 'id' attribute value of an 'environment' must be unique.
  </assert>
</rule>


<rule context="src-path | img-path | out-path | target-path">
  <report test="*">
    The '<name/>' element cannot have children.
  </report>
  <report test="(normalize-space(.) = '') and not(name() = 'target-path')">
    The '<name/>' element cannot be empty.
  </report>
</rule>

<rule context="languages">
  <assert test="count(lang) = count(*)">
    The 'languages' element can only have 'lang' children.
  </assert>
  <assert test="count(lang) &gt; 0">
    The 'languages' element must have at least one 'lang' child.
  </assert>
</rule>

<rule context="languages/lang">
  <assert test="count(//languages/lang[. = current()]) = 1">
    Each language must be specified only once.
  </assert>
</rule>

<rule context="menu">
  <assert test="count(item) = count(*)">
    The 'menu' element cannot contain elements other than 'item'.
  </assert>
</rule>

<rule context="item">
  <assert test="label" diagnostics="label-element">
    A 'label' element is missing.
  </assert>
  <report test="count(label) &gt; 1" diagnostics="label-element">
    There is an extra 'label' element.
  </report>
  <assert test="page">
    At least one 'page' element should be specified within an 'item'.
  </assert>
</rule>

<rule context="menu//page">
  <assert test="@src">
    Each 'page' must have an 'src' attribute.
  </assert>
  <assert test="@id">
    Each 'page' must have a unique 'id' attribute.
  </assert>
  <assert test="count(//page/@id[. = current()/@id]) = 1">
    The 'id' attribute value of a 'page' must be unique.
  </assert>
</rule>

<!-- Abstract rule to check 'translation' children: -->
<rule abstract="true" id="lang-check">
  <assert test="count(translation) = count(//languages/lang)">
    The number of 'translation' children in '<name/>' must correspond
    to the number of defined languages. If this element does not exist
    in one of the languages, use an empty 'translation' element.
  </assert>
  <assert test="count(translation) = count(*)">
    There must be no child elements here other than 'translation'.
  </assert>
</rule>

<!-- Applying the abstract rule to all bilingual elements: -->
<rule context="label | html-title | copyright | language-switch
               | contact-webmaster | button">
  <extends rule="lang-check"/>
</rule>

<rule context="translation">
  <assert test="@lang">
    Each 'translation' must have a 'lang' attribute.
  </assert>
  <assert test="@lang = //languages/lang/text()">
    The value of the 'lang' attribute must correspond to one of the
    defined languages.
  </assert>
  <report test="@lang = preceding-sibling::translation/@lang">
    There is another 'translation' element under this parent with the
    same value of the 'lang' attribute.
  </report>
</rule>

<rule context="blocks">
  <report test="*[not(self::block or self::block-process)]">
    A 'blocks' element must only contain one or more 'block' or
    'block-process' elements.
  </report>
</rule>

<rule context="blocks/block">
  <assert test="@id and @src">
    A 'block' defined in the master document must have both 'id' and
    'src' attributes.
  </assert>
  <assert test="count(//blocks/block/@id[. = current()/@id]) = 1">
    The 'id' attribute value of a 'block' must be unique.
  </assert>
</rule>

</pattern>

<!-- Checks for page documents: -->
<pattern name="page">

<rule context="/page">
  <assert test="@keywords">
    Please consider adding a list of keywords to the page. Use a
    'keywords' attribute for that.
  </assert>
  <assert test="title">
    Each 'page' must have a 'title'.
  </assert>
  <assert test="count(title) &lt; 2">
    A 'page' may have only one 'title'.
  </assert>
  <assert test="block">
    Each 'page' must have at least one 'block'.
  </assert>
</rule>

<rule context="page//block">
  <assert test="@idref or *">
    A block must have either an 'idref' attribute (referring to an
    orthogonal block) or children.
  </assert>
  <report test="@idref and *">
    A block cannot have both an 'idref' attribute and children.
  </report>
  <report test="count(p | section) &lt; count(*)">
    A block can only have 'p' or 'section' children.
  </report>
</rule>

<rule context="section">
  <assert test="head">
    A section must have a 'head'.
  </assert>
  <assert test="p">
    A section must have at least one 'p' (paragraph).
  </assert>
  <assert test="normalize-space(text()) = ''">
    A section cannot contain text. Use a 'p' element to include a
    paragraph of text.
  </assert>
</rule>

</pattern>

<!-- Rules common for master and page documents: -->
<pattern name="common">

<rule context="int | link[@linktype='internal']">
  <assert test="@link">
    An internal link must use a 'link' attribute to specify the page
    being linked.
  </assert>
</rule>

<rule context="p">
  <report test="(normalize-space(text()) = '') and not(*)">
    A paragraph cannot be empty. If you want to increase vertical
    spacing here, modify the stylesheet.
  </report>
</rule>

<!-- Dummy rule listing all defined element types: -->
<rule context="
  block | block-process | blocks | button | buttons |
  contact-webmaster | copyright | environment | em | ext | head |
  html-title | img-path | int | item | label | lang |
  language-switch | languages | link | mailto | menu | os |
  out-path | p | page | page-footer | site | section | src-path |
  subhead | target-path | title | translation"/>

<!-- Report error if an element was not matched by the above: -->
<rule context="*">
  <report test="true()">
    Unrecognized element: '<name/>'.
  </report>
</rule>

</pattern>

<diagnostics>
  <diagnostic id="label-element">
    Every 'item' element must contain exactly one 'label' element
    specifying the corresponding top menu label.
  </diagnostic>
</diagnostics>

</schema>



Created: March 27, 2003
Revised: May 24, 2004

URL: http://webreference.com/programming/xsltweb2/1