Professional XML Databases - Chapter 2, part 3
Professional XML Databases
Handling Foreign Keys
In relational database structures, the only way to show a relationship between data kept in different tables is via a foreign key. As we saw in the previous chapter, there are two ways to show this relationship in XML. We can create hierarchical structures, which allow us to use containment to show relationships between data (where related information is nested inside a parent element). Alternatively, if we want to keep the XML structures separate - like the tables of a database - we can use an ID to point to a corresponding structure that has an IDREF attribute.
Each way has its benefits and drawbacks. Pointing is more flexible than containment, but pointing relationships may only typically be navigated in one direction by the processor, and tend to be slower than navigating parent-child relationships.
The next thing we need to decide is whether to use containment or pointing to represent the relationships between our tables. In addition, we need to add the enumerated attributes that correspond to the lookup tables we are using. Let's see how to do that first.
Add Enumerated Attributes for Lookup Tables
If we have a foreign key in a table that points to a lookup table, we need to add an enumerated attribute to the element representing the table in which that foreign key appears.
Before we can do so with our example transformation, we need to identify the nature of the relationships between the tables we have selected to include in our XML structures. For each relationship, we need to identify:
- Whether it's a lookup or content relationship
- Whether it is a content relationship, and if so the direction in which it will be navigated
This is important in larger structures because some relationships can be navigated in more
than one direction, depending on how the relationships are arrived at. As a general rule,
relationships should be navigated in the same direction that a program would most often
navigate them. For example, in our case we're much more likely to navigate from
LineItem than we are to navigate from
We need to determine the direction we'll be navigating between our elements because it determines where our ID-IDREF relationships should be. Remember that these are effectively unidirectional - it's relatively easy to traverse a structure from an IDREF value to an ID value, but not the other way around. Deciding how we will normally be traversing our structures helps us to determine how we should structure them. If we need to navigate between two elements in either direction, we may need to add an IDREF in each element pointing back to the other element in the relationship. However, this will increase document creation time and size.
Since we want our document to support invoices and monthly summary information, we conclude that we want our relationships to provide a way to navigate from invoices to associated information. For example, we want to be able to go from an invoice, to its line items, to the part associated with each line item; or from an invoice, to the customer who ordered it. Under other circumstances, we might order our relationships differently - for example, if we wanted a customer-centric XML document.
After assessing all the relationships in our structure, we can conclude that the navigation between the tables looks something like this:
We have a foreign key called
ShipMethodType pointing to a table called
ShipMethod, therefore we need to add an enumerated value for
ShipMethod to the
<Invoice> element. Let's assume that in
our database, the
ShipMethod table contains the following values:
|1||US Postal Service|
The enumerated attribute should take the name of the lookup table, and should be declared as #REQUIRED if the foreign key does not allow NULLS, or #IMPLIED otherwise. The determination of the allowable values for the enumeration is a subjective process, and will depend on other design constraints (such as size). The values allowed should typically be human-readable versions of the description for each record.
So, creating an attribute with allowable enumerated values for the three possible lookup
values, and adding the attribute to the
<Invoice> element, gives us
... <!ELEMENT Invoice EMPTY> <!ATTLIST Invoice InvoiceID ID #REQUIRED InvoiceNumber CDATA #REQUIRED TrackingNumber CDATA #REQUIRED OrderDate CDATA #REQUIRED ShipDate CDATA #REQUIRED> ShipMethod (USPS | FedEx | UPS) #REQUIRED> <!ELEMENT Customer EMPTY> ...
Rule 6: Representing Lookup Tables.
For each foreign key that we have chosen to include in our XML structures that references a lookup table:
- Create an attribute on the element representing the table in which the foreign key is found.
- Give the attribute the same name as the table referenced by the foreign key, and make it #REQUIRED if the foreign key does not allow NULLS or #IMPLIED otherwise.
- Make the attribute of the enumerated list type. The allowable values should be some human-readable form of the description column for all rows in the lookup table.
Created: June 07, 2001
Revised: June 07, 2001