Professional XML Databases - Chapter 2, part 3 (page 2) | WebReference

Add Element Content to the Root Element

When we created the root element for the DTD, and added child elements for tables, we did not define the content models for the elements in the DTD - we said we would cover that when looking at relationships, and here we are.

The next rule, therefore, is to add the content model for the root element to the DTD. We should add element content that is appropriate for the type of information we are trying to communicate in our documents.

For our example, we decided that the primary concepts we want to convey are related to the Invoice and MonthlyTotal. When we add the elements representing those concepts as allowable element content for the root element, we get the following:

<!ELEMENT SalesData (Invoice*, MonthlyTotal*)>
<!ATTLIST SalesData 
  Status (NewVersion | UpdatedVersion | CourtesyCopy) #REQUIRED>
<!ELEMENT Invoice EMPTY>
...

Rule 7: Adding Element Content to Root elements.

Add a child element or elements to the allowable content of the root element for each table that models the type of information we want to represent in our document.
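
As an illustration, a document that satisfies this content model might look like the sketch below. The attribute names are the ones used in this chapter's attribute lists; every attribute value is invented for the example. Because the content model is a sequence, all Invoice elements must appear before any MonthlyTotal elements.

```xml
<SalesData Status="NewVersion">
  <!-- Zero or more Invoice elements come first. -->
  <Invoice InvoiceID="INV1" InvoiceNumber="1001" TrackingNumber="TRK0001"
           OrderDate="2000-10-01" ShipDate="2000-10-04" ShipMethod="FedEx"/>
  <Invoice InvoiceID="INV2" InvoiceNumber="1002" TrackingNumber="TRK0002"
           OrderDate="2000-10-02" ShipDate="2000-10-05" ShipMethod="UPS"/>
  <!-- Zero or more MonthlyTotal elements follow. -->
  <MonthlyTotal MonthlyTotalID="MT1" Month="10" Year="2000"
                VolumeShipped="15" PriceShipped="1.95"/>
</SalesData>
```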

Walk the Relationships

The next rule is a little tricky. We need to walk the relationships between the table (or tables) to add element content or ID-IDREF(S) pairs as appropriate. This process is similar to the process we would use when walking a tree data structure – we navigate each of the relationships, then each of the relationships from the children of the previous relationships, and so on, until all the relationships contained in the subset of tables we have chosen to include in our XML document have been traversed. Relationships that lead outside of the subset of tables we're representing do not need to be traversed.

Again, when we have a choice of directions in which a relationship may be followed, we do so in the direction that makes the most business sense - for example, we'll probably need to go from Invoice to LineItem pretty frequently, but much less often from LineItem to Invoice. On the other hand, a relationship such as the one between Customer and Invoice may need to be walked equally often in either direction, making it necessary to define the relationship in both directions. We need to determine the direction in which the relationships will be walked, in order to determine whether pointing or containment should be used to represent the relationships.

One-to-One or One-to-Many Relationships

First consider the case where a relationship is one-to-one or one-to-many in the direction we selected for the relationship traversal, and the relationship is the only one in the selected subset of tables where this table is the destination of a relationship traversal. Then we should represent the relationship by adding the child element as element content of the parent element.

Assign the multiplicity of the element according to the following table:

If the relationship is    Set the multiplicity to
One-to-one                ?
One-to-many               *

For our example DTD, adding the containment relationships gives us the following:

<!ELEMENT SalesData (Invoice*, MonthlyTotal*)>
<!ATTLIST SalesData 
  Status (NewVersion | UpdatedVersion | CourtesyCopy) #REQUIRED>
<!ELEMENT Invoice (LineItem*)>
<!ATTLIST Invoice
   InvoiceID ID #REQUIRED
   InvoiceNumber CDATA #REQUIRED
   TrackingNumber CDATA #REQUIRED
   OrderDate CDATA #REQUIRED
   ShipDate CDATA #REQUIRED
   ShipMethod (USPS | FedEx | UPS) #REQUIRED>
...
<!ELEMENT MonthlyTotal (MonthlyCustomerTotal*, MonthlyPartTotal*)>
<!ATTLIST MonthlyTotal
   MonthlyTotalID ID #REQUIRED
   Month CDATA #REQUIRED
   Year CDATA #REQUIRED
   VolumeShipped CDATA #REQUIRED
   PriceShipped CDATA #REQUIRED>
...

Rule 8: Adding Relationships through Containment.

For each relationship we have defined, if the relationship is one-to-one or one-to-many in the direction it is being navigated, and no other relationship leads to the child within the selected subset, then add the child element as element content of the parent element with the appropriate cardinality.
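
To make the multiplicity concrete, here is a sketch of an instance document that uses containment. It assumes LineItem carries the LineItemID, Quantity, and Price attributes that appear in its attribute list in this chapter; all values are made up. The `*` multiplicity means an Invoice may hold any number of LineItem children, including none.

```xml
<SalesData Status="UpdatedVersion">
  <!-- One-to-many by containment: the line items sit inside their invoice. -->
  <Invoice InvoiceID="INV1" InvoiceNumber="1001" TrackingNumber="TRK0001"
           OrderDate="2000-10-01" ShipDate="2000-10-04" ShipMethod="FedEx">
    <LineItem LineItemID="LI1" Quantity="12" Price="0.10"/>
    <LineItem LineItemID="LI2" Quantity="3" Price="0.25"/>
  </Invoice>
  <!-- An invoice with no line items is also allowed by (LineItem*). -->
  <Invoice InvoiceID="INV2" InvoiceNumber="1002" TrackingNumber="TRK0002"
           OrderDate="2000-10-02" ShipDate="2000-10-05" ShipMethod="USPS"/>
</SalesData>
```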

Many-to-One or Multiple Parent Relationships

If the relationship is many-to-one, or the child has more than one parent, then we need to use pointing to describe the relationship. This is done by adding an IDREF or IDREFS attribute to the element on the parent side of the relationship. The IDREF should point to the ID of the child element. If the relationship is one-to-many, and the child has more than one parent, we should use an IDREFS attribute instead.

Note that if we have defined a relationship to be navigable in either direction, for the purposes of this analysis it really counts as two different relationships.
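
The example DTD in this section only ever needs IDREF, so the following self-contained sketch shows what an IDREFS attribute could look like if the Customer/Invoice relationship were defined in both directions. It is hypothetical and not part of the chapter's DTD: the cut-down content models, the CustomerID attribute, and the InvoiceIDREFS attribute are all assumptions made for the illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sketch: a reduced DTD, not the chapter's actual structure. -->
<!DOCTYPE SalesData [
  <!ELEMENT SalesData (Invoice*, Customer*)>
  <!ELEMENT Invoice EMPTY>
  <!-- Many-to-one (Invoice to Customer): a single IDREF. -->
  <!ATTLIST Invoice
     InvoiceID ID #REQUIRED
     CustomerIDREF IDREF #REQUIRED>
  <!ELEMENT Customer EMPTY>
  <!-- One-to-many (Customer to Invoice) by pointing: an IDREFS list. -->
  <!ATTLIST Customer
     CustomerID ID #REQUIRED
     InvoiceIDREFS IDREFS #IMPLIED>
]>
<SalesData>
  <Invoice InvoiceID="INV1" CustomerIDREF="CUST1"/>
  <Invoice InvoiceID="INV2" CustomerIDREF="CUST1"/>
  <!-- An IDREFS value is a whitespace-separated list of IDs. -->
  <Customer CustomerID="CUST1" InvoiceIDREFS="INV1 INV2"/>
</SalesData>
```

Here the two directions are recorded separately, which matches the point that a relationship navigable in either direction counts as two relationships for this analysis.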

These rules favor containment over pointing wherever possible. Pointing relationships carry a performance penalty when processed with the DOM or SAX, because each reference has to be resolved by finding the element that carries the matching ID, so containment is almost always the preferred solution. If we have a situation that requires pointing, however, and its presence in our structures is causing too much slowdown in our processing, we may want to consider changing the relationship to a containment relationship, and repeating the information pointed to wherever it would have appeared before.

Applying this rule to our example and adding IDREF/IDREFS attributes, we arrive at the following:

<!ELEMENT SalesData (Invoice*, MonthlyTotal*)>
<!ATTLIST SalesData 
  Status (NewVersion | UpdatedVersion | CourtesyCopy) #REQUIRED>
<!ELEMENT Invoice (LineItem*)>
<!ATTLIST Invoice
   InvoiceID ID #REQUIRED
   InvoiceNumber CDATA #REQUIRED
   TrackingNumber CDATA #REQUIRED
   OrderDate CDATA #REQUIRED
   ShipDate CDATA #REQUIRED
   ShipMethod (USPS | FedEx | UPS) #REQUIRED
   CustomerIDREF IDREF #REQUIRED>
<!ELEMENT Customer EMPTY>
...
<!ELEMENT MonthlyCustomerTotal EMPTY>
<!ATTLIST MonthlyCustomerTotal
   MonthlyCustomerTotalID ID #REQUIRED
   VolumeShipped CDATA #REQUIRED
   PriceShipped CDATA #REQUIRED
   CustomerIDREF IDREF #REQUIRED>
<!ELEMENT MonthlyPartTotal EMPTY>
<!ATTLIST MonthlyPartTotal
   MonthlyPartTotalID ID #REQUIRED
   VolumeShipped CDATA #REQUIRED
   PriceShipped CDATA #REQUIRED
   PartIDREF IDREF #REQUIRED>
<!ELEMENT LineItem EMPTY>
<!ATTLIST LineItem
   LineItemID ID #REQUIRED
   Quantity CDATA #REQUIRED
   Price CDATA #REQUIRED
   PartIDREF IDREF #REQUIRED>

Rule 9: Adding Relationships using IDREF/IDREFS.

Identify each relationship that is many-to-one in the direction we have defined it, or whose child is the child in more than one relationship we have defined. For each of these relationships, add an IDREF or IDREFS attribute to the element on the parent side of the relationship, which points to the ID of the element on the child side of the relationship.
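
As a rough illustration of pointing in an instance document, the fragment below uses the IDREF attributes declared above. All values are invented, and it assumes that elements with the IDs CUST1 and PART1 exist elsewhere in the same document; a validating parser would reject the document otherwise. Where the Customer and Part elements themselves are placed has not been settled at this point in the chapter.

```xml
<!-- Fragment only: the targets of CUST1 and PART1 are assumed to exist. -->
<SalesData Status="NewVersion">
  <Invoice InvoiceID="INV1" InvoiceNumber="1001" TrackingNumber="TRK0001"
           OrderDate="2000-10-01" ShipDate="2000-10-04" ShipMethod="FedEx"
           CustomerIDREF="CUST1">
    <!-- The line item points to its part instead of containing it. -->
    <LineItem LineItemID="LI1" Quantity="12" Price="0.10" PartIDREF="PART1"/>
  </Invoice>
  <MonthlyTotal MonthlyTotalID="MT1" Month="10" Year="2000"
                VolumeShipped="12" PriceShipped="1.20">
    <MonthlyCustomerTotal MonthlyCustomerTotalID="MCT1" VolumeShipped="12"
                          PriceShipped="1.20" CustomerIDREF="CUST1"/>
    <!-- The same part is referenced from a second parent. -->
    <MonthlyPartTotal MonthlyPartTotalID="MPT1" VolumeShipped="12"
                      PriceShipped="1.20" PartIDREF="PART1"/>
  </MonthlyTotal>
</SalesData>
```

The part PART1 is reached from both LineItem and MonthlyPartTotal, which is the multiple-parent case that rules out containment.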

We're getting close to our final result, but there are still a couple of things we need to do to finalize the structure. We'll see how this is done in the next couple of sections.

Created: June 07, 2001
Revised: June 07, 2001

URL: http://webreference.com/authoring/languages/xml/databases/chap2/3/2.html