Professional XML Databases | 5 | WebReference

Professional XML Databases

Model the Tables

Having defined our root element, the next step is to model the tables that we've chosen to include in our XML document. As we saw in the last chapter, tables map directly to elements in XML.

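Loosely speaking, each of these tables should be one of two kinds: a content table, which holds the records we actually want to represent (invoices, customers, parts, and so on), or a lookup table, which holds a list of permitted values used to classify rows in other tables (ShipMethod in our example).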

There is another type of table - a relating table - whose sole purpose is to express a many-to-many relationship between two other tables. For our purposes, we shall model a table like this as a content table.

At this stage we will only be modeling content tables. Lookup tables will actually be modeled as enumerated attributes later in the process.

For each content table that we've chosen to include from our relational database, we will need to create an element in our DTD. Applying this rule to our example, we'll add the <Invoice>, <Customer>, <Part>, <MonthlyTotal>, and other elements to our DTD:

<!ELEMENT SalesData EMPTY>
<!ATTLIST SalesData 
  Status (NewVersion | UpdatedVersion | CourtesyCopy) #REQUIRED>
<!ELEMENT Invoice EMPTY>
<!ELEMENT Customer EMPTY>
<!ELEMENT Part EMPTY>
<!ELEMENT MonthlyTotal EMPTY>
<!ELEMENT MonthlyCustomerTotal EMPTY>
<!ELEMENT MonthlyPartTotal EMPTY>
<!ELEMENT LineItem EMPTY>

For the moment, we will just add the element declarations to the DTD. We'll come back and make sure they appear in the appropriate element content models (including that of the root element) when we model the relationships between the tables.

Note that we didn't model the ShipMethod table, because it's a lookup table. We'll handle this table in Rule 6.

Rule 3: Model the Content Tables.

Create an element in the DTD for each content table we have chosen to model. Declare these elements as EMPTY for now.

Model the Nonforeign Key Columns

Using this rule, we'll create attributes on the elements we have already defined to hold the column values from our database. In a DTD, these attributes should appear in the !ATTLIST declaration of the element corresponding to the table in which the column appears.

If a column is a foreign key joining to another table, don't include it in this rule - we'll handle foreign key columns later in the process, when we model the relationships between the elements we have created.

Declare each attribute created this way as having the type CDATA. If the column is defined in your database as not allowing NULL values, then make the corresponding attribute #REQUIRED; otherwise, make the corresponding attribute #IMPLIED.

We have four choices for an attribute's default declaration. #FIXED means the DTD provides the value, and the document may not specify a different one. #REQUIRED means the attribute must appear in the document. #IMPLIED means that it may or may not appear in the document. Finally, a default value given with none of these keywords means that the processor must substitute that value for the attribute if it is not provided in the document. #IMPLIED is the only declaration that lets an attribute have no value at all, which is why we use it for columns that allow NULLs.
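The four choices can be contrasted in a single attribute list. This is only a sketch: the Shipment element and its attributes are hypothetical and are not part of our SalesData example.

```xml
<!-- Hypothetical element, used only to contrast the four default declarations -->
<!ELEMENT Shipment EMPTY>
<!ATTLIST Shipment
   TrackingNumber CDATA #REQUIRED
   Notes          CDATA #IMPLIED
   Currency       CDATA #FIXED "USD"
   Priority       CDATA "Standard">
```

Given these declarations, an instance such as <Shipment TrackingNumber="T-0001"/> is valid. A validating processor reports Currency as "USD" and Priority as "Standard", while Notes is simply absent. Writing Currency="EUR" in the document would be a validity error, because a #FIXED attribute may only carry the value given in the DTD.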

If we choose to store table column values as the content of elements rather than attributes, we can take the same approach: create an element for each data point, and add it to the content model of the element for the table in which the column appears. Use no suffix if the column does not allow NULLs, or the optional suffix (?) if it does. Be aware that with this approach we'll need to be on the lookout for name collisions between same-named columns in different tables, because an element name can be declared only once in a DTD. This is not an issue when using attributes, since attribute names are scoped to the element that declares them.
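As a rough sketch of the element-based alternative, here is how the Customer and Part tables from our example might look. Two things are assumptions made purely for illustration: that PostalCode allows NULLs (so that the ? suffix has something to show), and the renamed CustomerName and PartName elements mentioned in the comment.

```xml
<!-- Element-content sketch of the Customer table.
     Assumption for illustration only: PostalCode allows NULLs. -->
<!ELEMENT Customer (Name, Address, City, State, PostalCode?)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Address (#PCDATA)>
<!ELEMENT City (#PCDATA)>
<!ELEMENT State (#PCDATA)>
<!ELEMENT PostalCode (#PCDATA)>

<!-- Part also has a Name column. An element type may be declared only once,
     so Part must reuse the Name declaration above, or the two columns must
     be renamed (for example, CustomerName and PartName). -->
<!ELEMENT Part (PartNumber, Name, Color, Size)>
<!ELEMENT PartNumber (#PCDATA)>
<!ELEMENT Color (#PCDATA)>
<!ELEMENT Size (#PCDATA)>
```

Sharing one Name declaration works here only because both columns hold plain text. If the two needed different content models, renaming would be the only option.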

To summarize:

Does the column allow NULLs?    Elements            Attributes
Allows NULLs                    Use the ? suffix    Declare as #IMPLIED
Doesn't allow NULLs             Use no suffix       Declare as #REQUIRED

For our example, remember that we want to keep all the nonforeign key columns, with the exception of the system-generated primary keys:

<!ELEMENT SalesData EMPTY>
<!ATTLIST SalesData 
  Status (NewVersion | UpdatedVersion | CourtesyCopy) #REQUIRED>
<!ELEMENT Invoice EMPTY>
<!ATTLIST Invoice
   InvoiceNumber CDATA #REQUIRED
   TrackingNumber CDATA #REQUIRED
   OrderDate CDATA #REQUIRED
   ShipDate CDATA #REQUIRED>
<!ELEMENT Customer EMPTY>
<!ATTLIST Customer
   Name CDATA #REQUIRED
   Address CDATA #REQUIRED
   City CDATA #REQUIRED
   State CDATA #REQUIRED
   PostalCode CDATA #REQUIRED>
<!ELEMENT Part EMPTY>
<!ATTLIST Part 
   PartNumber CDATA #REQUIRED
   Name CDATA #REQUIRED
   Color CDATA #REQUIRED
   Size CDATA #REQUIRED>
<!ELEMENT MonthlyTotal EMPTY>
<!ATTLIST MonthlyTotal
   Month CDATA #REQUIRED
   Year CDATA #REQUIRED
   VolumeShipped CDATA #REQUIRED
   PriceShipped CDATA #REQUIRED>
<!ELEMENT MonthlyCustomerTotal EMPTY>
<!ATTLIST MonthlyCustomerTotal
   VolumeShipped CDATA #REQUIRED
   PriceShipped CDATA #REQUIRED>
<!ELEMENT MonthlyPartTotal EMPTY>
<!ATTLIST MonthlyPartTotal
   VolumeShipped CDATA #REQUIRED
   PriceShipped CDATA #REQUIRED>
<!ELEMENT LineItem EMPTY>
<!ATTLIST LineItem
   Quantity CDATA #REQUIRED
   Price CDATA #REQUIRED>

Note that we left Month and Year off the <MonthlyCustomerTotal> and <MonthlyPartTotal> elements, since these values will be dictated by the <MonthlyTotal> element with which they are associated.
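To give a feel for what these declarations permit, each element below would validate against the attribute lists above. The values are invented for illustration. Because every element is still declared EMPTY, the elements cannot yet be nested inside <SalesData> or inside each other, so they are shown side by side and do not form a complete document.

```xml
<!-- Illustrative values only; none of this data comes from the source database. -->
<Invoice InvoiceNumber="1001" TrackingNumber="T-0001"
         OrderDate="2000-12-01" ShipDate="2000-12-04"/>
<Customer Name="Example Corp" Address="1 Main Street" City="Anytown"
          State="VA" PostalCode="00000"/>
<Part PartNumber="P-100" Name="Widget" Color="Blue" Size="Small"/>
<LineItem Quantity="12" Price="0.10"/>
<MonthlyTotal Month="12" Year="2000" VolumeShipped="12" PriceShipped="1.20"/>
<MonthlyPartTotal VolumeShipped="12" PriceShipped="1.20"/>
```

Since every attribute is declared as CDATA, the DTD places no constraint on the format of dates or numbers. Any such conventions have to be agreed upon outside the DTD.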

Rule 4: Model the Nonforeign Key Columns.

Create an attribute for each column we have chosen to include in our XML document (except foreign key columns). These attributes should appear in the !ATTLIST declaration of the element corresponding to the table in which they appear. Declare each of these attributes as CDATA, and declare it as #IMPLIED or #REQUIRED depending on whether the original column allowed nulls or not.

Created: May 09, 2001
Revised: May 09, 2001

URL: http://webreference.com/authoring/languages/xml/databases/chap2/2/