Professional XML Databases - Chapter 2, part 4 (page 3)
In this chapter, we've seen some guidelines for the creation of XML structures to hold data from existing relational databases. We've seen that this isn't an exact science, and that many of the decisions we will make while creating XML structures will entirely depend on the kinds of information we wish to represent in our documents.
If there's one point in particular we should take away from this chapter, it's that we should represent relationships in our XML documents with containment as much as possible. XML is designed around the concept of containment: the DOM and XSLT treat XML documents as trees, while SAX and SAX-based parsers treat them as a sequence of branch begin and end events and leaf events. The more pointing relationships we use, the more complicated the navigation of our documents will be, and the more of a performance hit our processor will take, especially if we are using SAX or a SAX-based parser.
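To make the difference concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The element and attribute names (SalesData, Invoice, LineItem, Customer, CustomerIDREF) are hypothetical and chosen only for illustration. It shows that contained children are reached by walking down the tree, while a pointing relationship needs a separate lookup step.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: LineItem is contained in Invoice, while the
# customer is reached through a pointing (IDREF-style) attribute.
DOC = """
<SalesData>
  <Invoice InvoiceID="INV1" CustomerIDREF="CUST2">
    <LineItem Quantity="3"/>
    <LineItem Quantity="5"/>
  </Invoice>
  <Customer CustomerID="CUST1" Name="Alpha Ltd"/>
  <Customer CustomerID="CUST2" Name="Beta Inc"/>
</SalesData>
"""

root = ET.fromstring(DOC)
invoice = root.find("Invoice")

# Containment: the children are found directly beneath the parent.
quantities = [int(item.get("Quantity")) for item in invoice.findall("LineItem")]

# Pointing: the target may be anywhere in the document, so we first
# index every Customer by its ID and then resolve the reference.
customers_by_id = {c.get("CustomerID"): c for c in root.iter("Customer")}
customer_name = customers_by_id[invoice.get("CustomerIDREF")].get("Name")
```

With a tree API the extra index is a modest cost. With an event-based parser the referenced element may not have been seen yet when the reference is encountered, which is why pointing relationships weigh more heavily there.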
We must bear in mind as we create these structures that there are usually many XML structures that may be used to represent the same relational database data. The techniques described in this chapter should allow us to optimize our documents for rapid processing and minimum document size. Using the techniques discussed in this chapter, and the next, we should be able to easily move information between our relational database and XML documents.
Here are the eleven rules we have defined for the development of XML structures from relational database structures:
Rule 1: Choose the Data to Include.
Based on the business requirement the XML document will be fulfilling, decide which tables and columns from our relational database will need to be included in our documents.
Rule 2: Create a Root Element.
Create a root element for the document. Add the root element to our DTD, and declare any attributes of that element that are required to hold additional semantic information (such as routing information). Root element names should describe their content.
Rule 3: Model the Content Tables.
Create an element in the DTD for each content table we have chosen to model. Declare these elements as EMPTY for now.
Rule 4: Modeling Non-Foreign-Key Columns.
Create an attribute for each column we have chosen to include in our XML document (except foreign key columns). These attributes should appear in the !ATTLIST declaration of the element corresponding to the table in which they appear. Declare each of these attributes as CDATA, and declare it as #IMPLIED if the original column allowed NULLs, or #REQUIRED if it did not.
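As a minimal sketch of this rule, the following Python function builds an !ATTLIST declaration from a list of column names and their nullability. The function name attlist_for_table and the Invoice columns are hypothetical. We assume the column metadata has already been read from the database by some other means.

```python
def attlist_for_table(element_name, columns):
    """Build an <!ATTLIST ...> declaration from (column_name, nullable) pairs."""
    lines = [f"<!ATTLIST {element_name}"]
    for column_name, nullable in columns:
        # Nullable columns become optional attributes, others mandatory.
        default = "#IMPLIED" if nullable else "#REQUIRED"
        lines.append(f"  {column_name} CDATA {default}")
    return "\n".join(lines) + ">"


# Hypothetical Invoice table: invoiceDate is NOT NULL, shipDate allows NULLs.
invoice_attlist = attlist_for_table(
    "Invoice",
    [("invoiceDate", False), ("shipDate", True)],
)
print(invoice_attlist)
# <!ATTLIST Invoice
#   invoiceDate CDATA #REQUIRED
#   shipDate CDATA #IMPLIED>
```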
Rule 5: Add ID Attributes to the Elements.
Add an ID attribute to each of the elements we have created in our XML structure (with the exception of the root element). Use the element name followed by ID for the name of the new attribute, watching as always for name collisions. Declare the attribute as type ID, and #REQUIRED.
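The naming convention and the collision check can be sketched as follows. The helper id_attribute_decl and the Customer attributes are hypothetical. Raising an error on a collision is only one possible policy, since the rule itself just asks us to watch for collisions.

```python
def id_attribute_decl(element_name, existing_attributes):
    """Return an ATTLIST declaring <element_name>ID as a required ID attribute."""
    id_name = f"{element_name}ID"
    if id_name in existing_attributes:
        # Assumption: we stop and let a person pick a different name.
        raise ValueError(f"name collision: {id_name} already exists on {element_name}")
    # A DTD may hold several ATTLIST declarations for one element,
    # so this can sit alongside the declaration for the column attributes.
    return f"<!ATTLIST {element_name} {id_name} ID #REQUIRED>"


customer_id_decl = id_attribute_decl("Customer", {"Name", "Address"})
```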
Rule 6: Representing Lookup Tables.
For each foreign key that we have chosen to include in our XML structures that references a lookup table:
- Create an attribute on the element representing the table in which the foreign key is found.
- Give the attribute the same name as the table referenced by the foreign key, and make it #REQUIRED if the foreign key does not allow NULLs or #IMPLIED otherwise.
- Make the attribute of the enumerated list type. The allowable values should be some human-readable form of the description column for all rows in the lookup table.
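The three steps above can be sketched in Python as follows. The ShipMethod lookup table, its descriptions, and both function names are hypothetical. Because the values of a DTD enumeration must be name tokens (no spaces), the sketch assumes that squeezing each description into a single capitalized token is an acceptable human-readable form. Real data may need a hand-written mapping instead.

```python
import re


def to_nmtoken(description):
    """Collapse a lookup description into one token, e.g. 'a b' -> 'AB'."""
    words = re.findall(r"[A-Za-z0-9]+", description)
    return "".join(word[:1].upper() + word[1:] for word in words)


def lookup_attribute_decl(element_name, lookup_table, descriptions, nullable):
    """Declare an enumerated attribute named after the lookup table."""
    values = " | ".join(to_nmtoken(d) for d in descriptions)
    default = "#IMPLIED" if nullable else "#REQUIRED"
    return f"<!ATTLIST {element_name} {lookup_table} ({values}) {default}>"


# Hypothetical lookup rows for a non-nullable foreign key on Invoice.
ship_method_decl = lookup_attribute_decl(
    "Invoice", "ShipMethod", ["US Postal Service", "FedEx", "UPS"], nullable=False
)
```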
Rule 7: Adding Element Content to Root Elements.
Add a child element or elements to the allowable content of the root element for each table that models the type of information we want to represent in our document.
Rule 8: Adding Relationships through Containment.
For each relationship we have defined, if the relationship is one-to-one or one-to-many in the direction it is being navigated, and no other relationship leads to the child within the selected subset, then add the child element as element content of the parent element with the appropriate cardinality.
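One way to express "the appropriate cardinality" is to map the minimum and maximum occurrence of the child onto the DTD occurrence suffixes. The sketch below does this in Python. The function names and the Invoice and LineItem elements are hypothetical, and a minimum greater than one is approximated by +, because a DTD cannot state an exact count.

```python
def cardinality_suffix(min_occurs, unbounded):
    """Map a relationship's cardinality onto a DTD occurrence suffix."""
    if unbounded:
        return "*" if min_occurs == 0 else "+"
    return "?" if min_occurs == 0 else ""


def element_decl(parent, children):
    """children is a list of (child_name, min_occurs, unbounded) tuples."""
    if not children:
        # No contained children: the element stays EMPTY.
        return f"<!ELEMENT {parent} EMPTY>"
    parts = [name + cardinality_suffix(lo, unbounded) for name, lo, unbounded in children]
    return f"<!ELEMENT {parent} ({', '.join(parts)})>"


# Hypothetical: every Invoice has one or more LineItem rows.
invoice_decl = element_decl("Invoice", [("LineItem", 1, True)])
line_item_decl = element_decl("LineItem", [])
```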
Rule 9: Adding Relationships using IDREF/IDREFS.
Identify each relationship that is many-to-one in the direction we have defined it, or whose child is the child in more than one relationship we have defined. For each of these relationships, add an IDREF or IDREFS attribute to the element on the parent side of the relationship, which points to the ID of the element on the child side of the relationship.
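The decision between containment and pointing can be sketched as a small classification step. The relationship list below is hypothetical, as are the function name and the convention of naming the pointing attribute after the child element followed by IDREF. Declaring every pointer as #REQUIRED is also an assumption made only to keep the sketch short.

```python
from collections import Counter


def classify_relationships(relationships):
    """Split (parent, child, kind) relationships into containment and pointing.

    kind is 'one-to-one', 'one-to-many' or 'many-to-one', read in the
    direction the relationship is navigated (parent to child).
    """
    times_as_child = Counter(child for _, child, _ in relationships)
    contained, pointed = [], []
    for parent, child, kind in relationships:
        if kind in ("one-to-one", "one-to-many") and times_as_child[child] == 1:
            contained.append((parent, child))
        else:
            # Many-to-one, or the child is reached by several relationships.
            pointed.append((parent, child))
    return contained, pointed


relationships = [
    ("Invoice", "LineItem", "one-to-many"),
    ("Invoice", "Customer", "many-to-one"),
    ("LineItem", "Part", "many-to-one"),
]
contained, pointed = classify_relationships(relationships)

# Assumed naming convention and default for the pointing attributes.
idref_decls = [
    f"<!ATTLIST {parent} {child}IDREF IDREF #REQUIRED>" for parent, child in pointed
]
```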
Rule 10: Add Missing Elements.
For any element that is only pointed to in the structure created so far, add that element as allowable element content of the root element. Set the cardinality suffix of the element being added to *.
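A minimal sketch of this step, assuming we have already recorded which elements are contained and which are only pointed to. The function name and the sample element names are hypothetical.

```python
def add_missing_elements(root_content, contained_children, pointed_children):
    """Return the root content model with pointed-only elements added as name*."""
    already_present = {item.rstrip("*+?") for item in root_content}
    result = list(root_content)
    for child in pointed_children:
        if child not in contained_children and child not in already_present:
            result.append(child + "*")
            already_present.add(child)
    return result


# Hypothetical: Customer and Part are reached only through IDREF attributes.
root_content = add_missing_elements(
    ["Invoice+"], {"LineItem"}, ["Customer", "Part", "Customer"]
)
root_decl = f"<!ELEMENT SalesData ({', '.join(root_content)})>"
```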
Rule 11: Remove Unwanted ID Attributes.
Remove ID attributes that are not referenced by IDREF or IDREFS attributes elsewhere in the XML structures.
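This final cleanup can be sketched as a simple filter over the ID attributes created under Rule 5. The dictionary contents and the function name are hypothetical.

```python
def prune_id_attributes(id_attributes, idref_targets):
    """Keep only the ID attributes whose element is the target of an IDREF(S)."""
    return {
        element: attribute
        for element, attribute in id_attributes.items()
        if element in idref_targets
    }


id_attributes = {
    "Invoice": "InvoiceID",
    "LineItem": "LineItemID",
    "Customer": "CustomerID",
    "Part": "PartID",
}
# Hypothetical: only Customer and Part are pointed to from elsewhere.
kept_ids = prune_id_attributes(id_attributes, {"Customer", "Part"})
```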
Created: June 14, 2001
Revised: June 14, 2001