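Model the Tables

Having defined our root element, the next step is to model the tables that we've chosen to include in our XML document. As we saw in the last chapter, tables map directly to elements in XML. Loosely speaking, these tables should be either content tables, which hold the data records themselves (invoices or customers, for example), or lookup tables, which hold the lists of values used to classify the rows of other tables (shipping methods, for example).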
There is another type of table - a relating table - whose sole purpose is to express a many-to-many relationship between two other tables. For our purposes, we shall model a table like this as a content table.

At this stage we will only be modeling content tables. Lookup tables will actually be modeled as enumerated attributes later in the process. For each content table that we've chosen to include from our relational database, we will need to create an element in our DTD. Applying this rule to our example, we'll add the <Invoice>, <Customer>, <Part>, <MonthlyTotal>, and other elements to our DTD.
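As a minimal sketch of this step, the element declarations might look like the following. Only the element names mentioned in the text are shown, and the EMPTY content model is an assumed placeholder that stands in until attributes and relationships are modeled in the later steps.

```dtd
<!-- One element per content table. EMPTY is a placeholder assumption;
     the content models are filled in when relationships are modeled. -->
<!ELEMENT Invoice EMPTY>
<!ELEMENT Customer EMPTY>
<!ELEMENT Part EMPTY>
<!ELEMENT MonthlyTotal EMPTY>
<!-- The remaining content tables would be declared the same way.
     No element is declared for ShipMethod: it is a lookup table. -->
```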
For the moment, we will just add the element definitions to the DTD. We'll come back to ensure that they are reflected in the necessary element content models (including that of the root element) when we model the relationships between the tables. Note that we didn't model the ShipMethod table, because it's a lookup table. We'll handle this table in Rule 6.
Model the Nonforeign Key Columns

Using this rule, we'll create attributes on the elements we have already defined to hold the column values from our database. In a DTD, these attributes should appear in the !ATTLIST declaration of the element corresponding to the table in which the column appears. If a column is a foreign key joining to another table, don't include it in this rule - we'll handle foreign key columns later in the process, when we model the relationships between the elements we have created.

Declare each attribute created this way as having the type CDATA. If the column is defined in your database as not allowing NULL values, then make the corresponding attribute #REQUIRED; otherwise, make the corresponding attribute #IMPLIED.

A DTD gives us four choices for an attribute default. #FIXED means that the DTD provides the value. #REQUIRED means that the attribute must appear in the document. #IMPLIED means that it may or may not appear in the document. Finally, a default value given without any of these keywords means that the processor must substitute that value for the attribute if it is not provided in the document. #IMPLIED is therefore the only choice that lets an attribute legitimately have no value at all, which is why we use it for columns that allow NULLs.

If we choose to store table column values as the content of elements, rather than attributes, we can take the same approach - create an element for each data point, and add it to the content model of the element for the table in which the column appears. Use no suffix if the column does not allow NULLs, or the optional suffix (?) if it does. Be aware that if we take this approach, we'll need to be on the lookout for name collisions between columns that have the same name in different tables, because an element name can be declared only once in a DTD. This is not an issue when using attributes, since each attribute is declared for a specific element.

To summarize: create a CDATA attribute for each nonforeign key column we've chosen to include, declare it in the !ATTLIST of the element for its table, and make it #REQUIRED or #IMPLIED according to whether the column disallows or allows NULLs.
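The two approaches can be sketched side by side. The column names and their nullability here (Name, Address, and Phone for the customer table; Name, Color, and Size for the part table) are hypothetical, chosen only to illustrate the rule, and the two elements use different approaches purely for comparison.

```dtd
<!-- Attribute approach. Assumed columns: Name and Address are NOT NULL,
     Phone allows NULLs. -->
<!ELEMENT Customer EMPTY>
<!ATTLIST Customer
  Name    CDATA #REQUIRED
  Address CDATA #REQUIRED
  Phone   CDATA #IMPLIED>

<!-- Element approach. Assumed columns: Name is NOT NULL,
     Color and Size allow NULLs, so they take the ? suffix. -->
<!ELEMENT Part (PartName, Color?, Size?)>
<!ELEMENT PartName (#PCDATA)>
<!ELEMENT Color (#PCDATA)>
<!ELEMENT Size (#PCDATA)>
```

The child element is called PartName rather than Name in this sketch because a second table with a Name column would otherwise need the same element name. The Name attribute on Customer needs no such prefix.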
For our example, remember that we want to keep all the nonforeign key columns, with the exception of the system-generated primary keys.
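A sketch of how this might look for the monthly total elements follows. The VolumeShipped attribute and the choice of #REQUIRED throughout are assumptions made for illustration; the actual columns depend on the database being modeled. No primary key attributes appear, and Month and Year are declared on <MonthlyTotal> only.

```dtd
<!-- Month and Year are carried by MonthlyTotal. -->
<!ELEMENT MonthlyTotal EMPTY>
<!ATTLIST MonthlyTotal
  Month CDATA #REQUIRED
  Year  CDATA #REQUIRED>

<!-- VolumeShipped is a hypothetical nonforeign key column.
     Month and Year are not repeated here. -->
<!ELEMENT MonthlyPartTotal EMPTY>
<!ATTLIST MonthlyPartTotal
  VolumeShipped CDATA #REQUIRED>

<!ELEMENT MonthlySummaryTotal EMPTY>
<!ATTLIST MonthlySummaryTotal
  VolumeShipped CDATA #REQUIRED>
```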
Note that we left off Month and Year on the <MonthlyPartTotal> and <MonthlySummaryTotal> structures, since these will be dictated by the <MonthlyTotal> element associated with these elements.
Created: May 09, 2001
Revised: May 09, 2001