RSS and Atom in Action: Newsfeed Formats | Page 3
4.2.1 The elements of RSS 1.0
As we did with RSS 0.91, let's take a look at a summary diagram of RSS 1.0. Figure 4.2 shows the XML elements that make up RSS 1.0, the containment hierarchy, and the optional and required elements. We use the same notation we used in figure 4.1, along with a new notation: required XML attributes are marked with an "at" sign (@).
By comparing the RSS 0.91 and RSS 1.0 diagrams, you can see that the formats are significantly different. Here are the key differences:
- A typical RSS 1.0 newsfeed is longer and more complex, but it does not include as much metadata as the equivalent RSS 0.91 newsfeed.
- RSS 1.0 is more complex, but only because it is more flexible and extensible.
- The root element is <rdf:RDF> rather than <rss>.
- News items exist as children of the document's root element and not as children of the <channel> element, as they do in RSS 0.91.
- News items must be declared inside the <channel> as RDF resources.
- <textinput> elements must be declared inside the <rdf:RDF> element as RDF resources if they are to be included inside the <channel>.
- Many metadata elements, such as <webMaster>, are missing from the format. These can be added as needed by using RSS 1.0 modules, which are described in the next section.
As you might imagine, these differences were significant enough to break all existing RSS parsers, but RSS 1.0 was released years ago and RSS parsers have been updated to handle both formats. Later in this chapter, we will show you how to write a parser that can handle RSS 0.91 and RSS 1.0 newsfeeds.
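To make the structural differences listed above concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The feed is made up for illustration (the example.com URLs and the summarize function are assumptions, not part of any specification), and this is not the parser developed later in the chapter. It shows the <rdf:RDF> root, the <item> elements that sit beside <channel> rather than inside it, and the <rdf:Seq> inside <channel> that declares each item as an RDF resource.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by RDF and by RSS 1.0 itself.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# A made-up, minimal RSS 1.0 newsfeed; all URLs are placeholders.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.com/index.rdf">
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>An illustrative newsfeed</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.com/item1"/>
        <rdf:li rdf:resource="http://example.com/item2"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.com/item1">
    <title>First item</title>
    <link>http://example.com/item1</link>
  </item>
  <item rdf:about="http://example.com/item2">
    <title>Second item</title>
    <link>http://example.com/item2</link>
  </item>
</rdf:RDF>
"""


def summarize(feed_xml):
    """Return the root tag, the item URIs declared in <channel>, and item titles."""
    root = ET.fromstring(feed_xml)
    channel = root.find("rss:channel", NS)
    # The channel only *declares* its items, as references inside <items><rdf:Seq>.
    # Some feeds write the attribute as plain "resource", so accept both forms.
    declared = [
        li.get(RDF_RESOURCE) or li.get("resource")
        for li in channel.findall("rss:items/rdf:Seq/rdf:li", NS)
    ]
    # The <item> elements themselves are children of the root, not of <channel>.
    titles = {
        item.get(RDF_ABOUT): item.findtext("rss:title", namespaces=NS)
        for item in root.findall("rss:item", NS)
    }
    return root.tag, declared, titles


if __name__ == "__main__":
    root_tag, declared, titles = summarize(FEED)
    print(root_tag)
    for uri in declared:
        print(uri, "->", titles.get(uri))
```

Matching each declared URI against the rdf:about attribute of a root-level <item> is what ties the two halves of the document together. An RSS 0.91 parser that looks for items inside <channel> would find none here.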
4.2.2 Extending RSS 1.0 with modules
Unlike RSS 0.91, RSS 1.0 is extensible in the same ways that XML is extensible. The RDF Site Summary 1.0 Modules specification defines how this is done. The specification allows producers of RSS 1.0 newsfeeds to add new XML elements to their newsfeeds as long as those elements are defined in their own XML namespaces. The RSS 1.0 Modules specification defines three standard modules:
- Dublin Core (http://purl.org/rss/1.0/modules/dc/): This module defines basic metadata elements, such as title, date, description, creator, and language, which are useful at the newsfeed or item level. Dublin Core elements such as <dc:creator> are often used in both RSS 1.0 and 2.0 newsfeeds.
- Content (http://purl.org/rss/1.0/modules/content/): This module defines elements for web site content and defines content formats. The Content module's <content:encoded> element is often used in RSS 1.0 and 2.0 newsfeeds to allow both a short summary (in <description>) and full content (in <content:encoded>).
- Syndication (http://purl.org/rss/1.0/modules/syndication/): The Syndication module defines elements to tell newsreaders how often to poll the newsfeed for updates. These are the RSS 1.0 equivalents of RSS 0.91 polling hints such as the <skipDays> element we presented earlier.
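As a rough illustration of how a newsreader might consume the Content and Syndication modules, the Python sketch below reads <sy:updatePeriod> and <sy:updateFrequency> from the channel and pairs each item's <description> with its <content:encoded>. The feed, the function names, and the month and year lengths are assumptions made for the example. The fallback values (a daily period and a frequency of 1) follow the defaults given in the Syndication module.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "content": "http://purl.org/rss/1.0/modules/content/",
    "sy": "http://purl.org/rss/1.0/modules/syndication/",
}

# Approximate period lengths in seconds; month and year lengths are simplifications.
PERIOD_SECONDS = {
    "hourly": 3600,
    "daily": 86400,
    "weekly": 7 * 86400,
    "monthly": 30 * 86400,
    "yearly": 365 * 86400,
}

# A made-up feed that uses both modules; all URLs are placeholders.
MODULE_FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:content="http://purl.org/rss/1.0/modules/content/"
         xmlns:sy="http://purl.org/rss/1.0/modules/syndication/">
  <channel rdf:about="http://example.com/index.rdf">
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>An illustrative newsfeed</description>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>2</sy:updateFrequency>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.com/item1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.com/item1">
    <title>First item</title>
    <link>http://example.com/item1</link>
    <description>A short summary.</description>
    <content:encoded><![CDATA[<p>The <b>full</b> content.</p>]]></content:encoded>
  </item>
</rdf:RDF>
"""


def poll_interval_seconds(root):
    """Seconds between polls: the period length divided by updates per period."""
    channel = root.find("rss:channel", NS)
    period = channel.findtext("sy:updatePeriod", default="daily", namespaces=NS)
    frequency = channel.findtext("sy:updateFrequency", default="1", namespaces=NS)
    return PERIOD_SECONDS[period.strip()] / int(frequency)


def summaries_and_bodies(root):
    """Return (summary, full content) pairs, one per item."""
    return [
        (
            item.findtext("rss:description", default="", namespaces=NS),
            item.findtext("content:encoded", default="", namespaces=NS),
        )
        for item in root.findall("rss:item", NS)
    ]


if __name__ == "__main__":
    root = ET.fromstring(MODULE_FEED)
    print(poll_interval_seconds(root))
    print(summaries_and_bodies(root))
```

With an hourly period and a frequency of 2, the sketch suggests polling every 1800 seconds. The full content is wrapped in a CDATA section in the feed so that its HTML markup does not have to be escaped.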
Using a module is easy. For example, if you want to add a <date> element to your RSS 1.0 newsfeed, something that is not in the RSS 1.0 specification, you can use the <date> element defined by the Dublin Core Metadata Initiative to do just that. Simply declare the Dublin Core XML namespace in the standard way, as a namespace declaration attribute on the <rdf:RDF> element, and then add the new element to your newsfeed as <dc:date>. Note that Dublin Core dates use the ISO 8601 date format.
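The following Python sketch embeds a made-up newsfeed fragment of this kind, one possible shape for such a feed rather than a listing from this chapter. The xmlns:dc declaration on <rdf:RDF> uses the Dublin Core element namespace, http://purl.org/dc/elements/1.1/, and each <dc:date> value is an ISO 8601 timestamp. The function name item_dates and the example.com URLs are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A made-up fragment: note xmlns:dc on the root and <dc:date> in channel and item.
DC_FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://example.com/index.rdf">
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>An illustrative newsfeed</description>
    <dc:date>2004-08-19T11:54:37-08:00</dc:date>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.com/item1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.com/item1">
    <title>First item</title>
    <link>http://example.com/item1</link>
    <dc:date>2004-08-18T09:30:00-08:00</dc:date>
  </item>
</rdf:RDF>
"""


def item_dates(feed_xml):
    """Map each item title to its <dc:date>, parsed as a timezone-aware datetime."""
    root = ET.fromstring(feed_xml)
    dates = {}
    for item in root.findall("rss:item", NS):
        text = item.findtext("dc:date", namespaces=NS)
        if text:
            # datetime.fromisoformat accepts this full date-time-offset form;
            # other ISO 8601 variants may need a more lenient parser.
            title = item.findtext("rss:title", namespaces=NS)
            dates[title] = datetime.fromisoformat(text.strip())
    return dates


if __name__ == "__main__":
    for title, when in item_dates(DC_FEED).items():
        print(title, when.isoformat())
```

Because the element lives in its own namespace, an RSS 1.0 parser that does not know about Dublin Core can simply skip <dc:date>, while one that does can look it up by namespace URI regardless of the prefix the feed happens to use.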
In addition to the three standard modules, there are a number of user-contributed modules. Here are some of the more interesting ones:
- mod_annotation: This module defines an element that can be used to annotate newsfeed items.
- mod_audio: This module defines a series of <audio> elements that can be used to add metadata such as song name, artist, and album to audio tracks referenced by items in a newsfeed.
- mod_link: This module defines a <link> element, modeled on the HTML element of the same name, that can be used to provide links to resources referenced in a newsfeed.
- mod_taxonomy: This module defines a series of <taxo> elements that can be used to add tags or category information to items in a newsfeed. For example, the popular del.icio.us bookmark-sharing site uses this module in its newsfeeds to represent bookmark tags.
You can learn more about these modules' specifications and many others at http://web.resource.org/rss/1.0/modules.
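As one example of reading a user-contributed module, the Python sketch below pulls tags out of an item that uses mod_taxonomy. It assumes the tags are listed as <rdf:li> references inside <taxo:topics> and <rdf:Bag>, and that each tag name is the last path segment of the referenced URI. Both points are assumptions for illustration and should be checked against the module specification and the feed in question. The feed and the function name item_tags are made up.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "taxo": "http://purl.org/rss/1.0/modules/taxonomy/",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# A made-up bookmark feed; the tag URIs are placeholders.
TAXO_FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/">
  <channel rdf:about="http://example.com/bookmarks.rdf">
    <title>Example Bookmarks</title>
    <link>http://example.com/</link>
    <description>An illustrative bookmark feed</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.com/bookmark1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.com/bookmark1">
    <title>A bookmarked page</title>
    <link>http://example.com/bookmark1</link>
    <taxo:topics>
      <rdf:Bag>
        <rdf:li rdf:resource="http://example.com/tag/rss"/>
        <rdf:li rdf:resource="http://example.com/tag/atom"/>
      </rdf:Bag>
    </taxo:topics>
  </item>
</rdf:RDF>
"""


def item_tags(feed_xml):
    """Map each item's URI to the tag names referenced in its <taxo:topics>."""
    root = ET.fromstring(feed_xml)
    tags = {}
    for item in root.findall("rss:item", NS):
        # Accept both rdf:resource and a plain resource attribute on <rdf:li>.
        uris = [
            li.get(RDF_RESOURCE) or li.get("resource")
            for li in item.findall("taxo:topics/rdf:Bag/rdf:li", NS)
        ]
        # Assumption: the tag name is the last path segment of the topic URI.
        tags[item.get(RDF_ABOUT)] = [
            uri.rstrip("/").rsplit("/", 1)[-1] for uri in uris if uri
        ]
    return tags


if __name__ == "__main__":
    print(item_tags(TAXO_FEED))
```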