Professional XML Databases - Chapter 2, part 3 (page 2)
Add Element Content to the Root Element
When we created the root element for the DTD, and added child elements for tables, we did not define the content models for the elements in the DTD - we said we would cover that when looking at relationships, and here we are.
The next rule, therefore, is to add the content model for the root element to the DTD. We should add element content that is appropriate for the type of information we are trying to communicate in our documents.
For our example, we decided that the primary concepts we want to convey are related to Invoice and MonthlyTotal. When we add the elements representing those concepts as allowable element content for the root element, we get the following:
<!ELEMENT SalesData (Invoice*, MonthlyTotal*)>
<!ATTLIST SalesData
   Status (NewVersion | UpdatedVersion | CourtesyCopy) #REQUIRED>
<!ELEMENT Invoice EMPTY>
...
Rule 7: Adding Element Content to Root Elements.
Add a child element or elements to the allowable content of the root element for each table that models the type of information we want to represent in our document.
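As a rough illustration of Rule 7, the root declaration can be assembled mechanically from the list of tables we chose to expose. The helper name `root_content_model` is hypothetical, and the sketch assumes every child may appear zero or more times, as in the SalesData example.

```python
def root_content_model(root, tables):
    """Build the <!ELEMENT ...> declaration for a root element.

    Assumes each table element may occur zero or more times (`*`),
    which matches the SalesData declaration in the text.
    """
    if not tables:
        # A root with no children cannot have an empty group, so fall back to EMPTY.
        return f"<!ELEMENT {root} EMPTY>"
    children = ", ".join(f"{name}*" for name in tables)
    return f"<!ELEMENT {root} ({children})>"


print(root_content_model("SalesData", ["Invoice", "MonthlyTotal"]))
# prints: <!ELEMENT SalesData (Invoice*, MonthlyTotal*)>
```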
Walk the Relationships
The next rule is a little tricky. We need to walk the relationships between the table (or tables) to add element content or ID-IDREF(S) pairs as appropriate. This process is similar to the process we would use when walking a tree data structure - we navigate each of the relationships, then each of the relationships from the children of the previous relationships, and so on, until all the relationships contained in the subset of tables we have chosen to include in our XML document have been traversed. Relationships that lead outside of the subset of tables we're representing do not need to be traversed.
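The traversal described above can be sketched as a breadth-first walk over a list of relationships. This is only an illustration: the relationship list is inferred from the example DTD, the `SalesRep` table is a made-up table standing in for something outside the subset, and `walk_relationships` is a hypothetical helper name.

```python
from collections import deque

# (parent_table, child_table) pairs, already oriented in the direction we
# chose to navigate them. Inferred from the example DTD, except SalesRep,
# which is a hypothetical table outside the chosen subset.
RELATIONSHIPS = [
    ("Invoice", "LineItem"),
    ("Invoice", "Customer"),
    ("LineItem", "Part"),
    ("MonthlyTotal", "MonthlyCustomerTotal"),
    ("MonthlyTotal", "MonthlyPartTotal"),
    ("MonthlyCustomerTotal", "Customer"),
    ("MonthlyPartTotal", "Part"),
    ("Customer", "SalesRep"),
]

SUBSET = {
    "Invoice", "LineItem", "Customer", "Part",
    "MonthlyTotal", "MonthlyCustomerTotal", "MonthlyPartTotal",
}


def walk_relationships(start_tables, relationships, subset):
    """Return the relationships reached from start_tables, in visit order.

    Relationships whose child lies outside `subset` are skipped, and each
    table is expanded only once even if several relationships lead to it.
    """
    visited_edges = []
    seen_tables = set(start_tables)
    queue = deque(start_tables)
    while queue:
        table = queue.popleft()
        for parent, child in relationships:
            if parent != table or child not in subset:
                continue
            visited_edges.append((parent, child))
            if child not in seen_tables:
                seen_tables.add(child)
                queue.append(child)
    return visited_edges


edges = walk_relationships(["Invoice", "MonthlyTotal"], RELATIONSHIPS, SUBSET)
for parent, child in edges:
    print(f"{parent} -> {child}")
```

Starting from the children of the root element, every relationship inside the subset is visited, while the one leading to `SalesRep` is never followed.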
Again, when we have a choice of directions in which a relationship may be followed, we do so in the direction that makes the most business sense - for example, we'll probably need to go from Invoice to LineItem pretty frequently, but much less often from LineItem to Invoice. On the other hand, a relationship such as the one between Customer and Invoice may need to be walked in either direction equally often, making it necessary to define the relationship in both directions. We need to determine the direction in which the relationships will be walked, in order to determine whether pointing or containment should be used to represent the relationships.
One-to-One or One-to-Many Relationships
First consider the case where a relationship is one-to-one or one-to-many in the direction we selected for the traversal, and it is the only relationship in the selected subset of tables that has this table as its destination. In that case, we should represent the relationship by adding the child element as element content of the parent element.
Assign the multiplicity of the element according to the following table:
| If the relationship is | Set the multiplicity to |
For our example DTD, adding the containment relationships gives us the following:
<!ELEMENT SalesData (Invoice*, MonthlyTotal*)>
<!ATTLIST SalesData
   Status (NewVersion | UpdatedVersion | CourtesyCopy) #REQUIRED>
<!ELEMENT Invoice (LineItem*)>
<!ATTLIST Invoice
   InvoiceID ID #REQUIRED
   InvoiceNumber CDATA #REQUIRED
   TrackingNumber CDATA #REQUIRED
   OrderDate CDATA #REQUIRED
   ShipDate CDATA #REQUIRED
   ShipMethod (USPS | FedEx | UPS) #REQUIRED>
...
<!ELEMENT MonthlyTotal (MonthlyCustomerTotal*, MonthlyPartTotal*)>
<!ATTLIST MonthlyTotal
   MonthlyTotalID ID #REQUIRED
   Month CDATA #REQUIRED
   Year CDATA #REQUIRED
   VolumeShipped CDATA #REQUIRED
   PriceShipped CDATA #REQUIRED>
...
Rule 8: Adding Relationships through Containment.
For each relationship we have defined, if the relationship is one-to-one or one-to-many in the direction it is being navigated, and no other relationship leads to the child within the selected subset, then add the child element as element content of the parent element with the appropriate cardinality.
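A minimal sketch of how Rule 8 might be applied when generating declarations. The `MULTIPLICITY` mapping is an assumption: `*` for one-to-many matches the example DTD, while `?` for one-to-one is simply the usual DTD occurrence indicator for an optional single child and may differ from what a particular design needs. The helper name is hypothetical.

```python
# Assumed mapping from relationship type to DTD occurrence indicator.
# `*` for one-to-many matches the example DTD; `?` for one-to-one is an
# assumption (use no suffix instead if the child is mandatory).
MULTIPLICITY = {"one-to-one": "?", "one-to-many": "*"}


def containment_content_model(parent, contained):
    """Build an <!ELEMENT ...> declaration from contained children.

    `contained` is a list of (child_name, relationship_type) pairs for the
    relationships that Rule 8 says should be modeled through containment.
    """
    if not contained:
        return f"<!ELEMENT {parent} EMPTY>"
    parts = [f"{child}{MULTIPLICITY[kind]}" for child, kind in contained]
    return f"<!ELEMENT {parent} ({', '.join(parts)})>"


print(containment_content_model("Invoice", [("LineItem", "one-to-many")]))
# prints: <!ELEMENT Invoice (LineItem*)>
```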
Many-to-One or Multiple Parent Relationships
If the relationship is many-to-one, or the child has more than one parent, then we need to use pointing to describe the relationship. This is done by adding an IDREF or IDREFS attribute to the element on the parent side of the relationship. The IDREF should point to the ID of the child element. If the relationship is one-to-many, and the child has more than one parent, we should use an IDREFS attribute instead.
Note that if we have defined a relationship to be navigable in either direction, for the purposes of this analysis it really counts as two different relationships.
Note that these rules emphasize the use of containment over pointing whenever it is possible. Because of the inherent performance penalties when using the DOM and SAX with pointing relationships, containment is almost always the preferred solution. If we have a situation that requires pointing, however, and its presence in our structures is causing too much slowdown in our processing, we may want to consider changing the relationship to a containment relationship, and repeating the information pointed to wherever it would have appeared before.
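To make the cost difference concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. The instance fragment is hypothetical: its attribute values, the `CustomerID` and `PartID` attribute names, and the placement of `Customer` and `Part` under the root are all assumptions, and the fragment is not validated against the DTD. Contained children can be reached directly from their parent, whereas a pointer has to be resolved through a lookup that we build ourselves.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance fragment; not validated against the DTD.
DOC = """
<SalesData Status="NewVersion">
  <Invoice InvoiceID="I1" CustomerIDREF="C1">
    <LineItem LineItemID="L1" Quantity="2" Price="9.99" PartIDREF="P1"/>
  </Invoice>
  <Customer CustomerID="C1" Name="Example Customer"/>
  <Part PartID="P1" Name="Example Part"/>
</SalesData>
""".strip()

root = ET.fromstring(DOC)

# Containment: the children are reached directly from the parent element.
invoice = root.find("Invoice")
line_items = invoice.findall("LineItem")

# Pointing: ElementTree does not read the DTD, so it does not know which
# attributes are IDs. We scan the whole document once to build an index.
ID_ATTRS = {"Customer": "CustomerID", "Part": "PartID"}  # assumed names
id_index = {}
for elem in root.iter():
    attr = ID_ATTRS.get(elem.tag)
    if attr and attr in elem.attrib:
        id_index[elem.attrib[attr]] = elem

# Only now can the pointers be followed.
customer = id_index[invoice.get("CustomerIDREF")]
part = id_index[line_items[0].get("PartIDREF")]
print(customer.get("Name"), "/", part.get("Name"))
```

With a streaming parser the situation is harder still, because the element being pointed to may not have been seen yet when the pointer is encountered.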
Applying this rule to our example and adding IDREF/IDREFS attributes, we arrive at the following:
<!ELEMENT SalesData (Invoice*, MonthlyTotal*)>
<!ATTLIST SalesData
   Status (NewVersion | UpdatedVersion | CourtesyCopy) #REQUIRED>
<!ELEMENT Invoice (LineItem*)>
<!ATTLIST Invoice
   InvoiceID ID #REQUIRED
   InvoiceNumber CDATA #REQUIRED
   TrackingNumber CDATA #REQUIRED
   OrderDate CDATA #REQUIRED
   ShipDate CDATA #REQUIRED
   ShipMethod (USPS | FedEx | UPS) #REQUIRED
   CustomerIDREF IDREF #REQUIRED>
<!ELEMENT Customer EMPTY>
...
<!ELEMENT MonthlyCustomerTotal EMPTY>
<!ATTLIST MonthlyCustomerTotal
   MonthlyCustomerTotalID ID #REQUIRED
   VolumeShipped CDATA #REQUIRED
   PriceShipped CDATA #REQUIRED
   CustomerIDREF IDREF #REQUIRED>
<!ELEMENT MonthlyPartTotal EMPTY>
<!ATTLIST MonthlyPartTotal
   MonthlyPartTotalID ID #REQUIRED
   VolumeShipped CDATA #REQUIRED
   PriceShipped CDATA #REQUIRED
   PartIDREF IDREF #REQUIRED>
<!ELEMENT LineItem EMPTY>
<!ATTLIST LineItem
   LineItemID ID #REQUIRED
   Quantity CDATA #REQUIRED
   Price CDATA #REQUIRED
   PartIDREF IDREF #REQUIRED>
Rule 9: Adding Relationships using IDREF/IDREFS.
Identify each relationship that is many-to-one in the direction we have defined it, or whose child is the child in more than one relationship we have defined. For each of these relationships, add an IDREF or IDREFS attribute to the element on the parent side of the relationship, which points to the ID of the element on the child side of the relationship.
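Rules 8 and 9 together amount to a small decision procedure, which might be sketched as follows. The function name `classify` and the relationship types assigned to each pair are assumptions chosen to be consistent with the example DTD; they are not taken from the original schema.

```python
from collections import Counter


def classify(relationships):
    """Decide how each relationship should be represented in the DTD.

    `relationships` is a list of (parent, child, kind) tuples, where kind is
    "one-to-one", "one-to-many" or "many-to-one" in the chosen direction.
    Returns a dict mapping (parent, child) to "containment", "IDREF" or
    "IDREFS".
    """
    # Count how many relationships lead to each child within the subset.
    parents_per_child = Counter(child for _, child, _ in relationships)
    result = {}
    for parent, child, kind in relationships:
        shared_child = parents_per_child[child] > 1
        if kind == "many-to-one":
            result[(parent, child)] = "IDREF"
        elif shared_child:
            # Pointing is required; one-to-many needs several references.
            result[(parent, child)] = "IDREFS" if kind == "one-to-many" else "IDREF"
        else:
            result[(parent, child)] = "containment"
    return result


# Relationship kinds below are assumed for illustration.
decisions = classify([
    ("Invoice", "LineItem", "one-to-many"),
    ("Invoice", "Customer", "many-to-one"),
    ("LineItem", "Part", "many-to-one"),
    ("MonthlyTotal", "MonthlyCustomerTotal", "one-to-many"),
    ("MonthlyTotal", "MonthlyPartTotal", "one-to-many"),
    ("MonthlyCustomerTotal", "Customer", "many-to-one"),
    ("MonthlyPartTotal", "Part", "many-to-one"),
])
for (parent, child), how in decisions.items():
    print(f"{parent} -> {child}: {how}")
```

Under these assumptions, the contained pairs correspond to the element content in the example DTD and the pointed pairs correspond to its CustomerIDREF and PartIDREF attributes.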
We're getting close to our final result, but there are still a couple of things we need to do to finalize the structure. We'll see how this is done in the next couple of sections.
Created: June 07, 2001
Revised: June 07, 2001