Using RSS News Feeds
The XML::RSS Module
Now that you've had a chance to glance at two RSS examples, it's time to introduce the XML::RSS module. XML::RSS is a subclass of XML::Parser, a Perl module maintained by Clark Cooper that uses James Clark's Expat C library. XML::RSS was developed to simplify the task of manipulating and parsing RSS files. A deep understanding of XML is not a prerequisite for using XML::RSS, since the XML details are hidden inside the class interface.
While XML::RSS is capable of creating RSS files, we will be
focusing on parsing existing RSS files in this column. You can read
more about the capabilities of XML::Parser in the module's
documentation or by typing:
Well, let's look at the code, shall we? Lines 16-17 load the XML::RSS and LWP::Simple modules. We've already talked about XML::RSS in brief, but what does LWP::Simple do? Good question! The answer is simple (pun intended): it's a procedural interface for interacting with a Web server. It's also the little cousin of LWP::UserAgent, a fuller object-oriented interface. We'll be using one of the library's subroutines later in the code to fetch an RSS file from the Web.
In lines 20-21 we initialize two variables that we're going to use later.
Next we create a new instance of the XML::RSS class and assign the
reference to the $rss variable.
Now we must determine whether the command-line parameter the user
entered is an HTTP URL or a file on the local file system
(lines 34-46). On line 34, we use a regular expression to look for
the characters that begin an HTTP URL. If the command-line argument
starts with these characters, we can safely assume that the user
intends to retrieve an RSS file from a Web server.
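A minimal sketch of that kind of check follows. The exact pattern used in rss2html.pl is not reproduced here, so the `http://` prefix test, the helper name `looks_like_url`, and the sample arguments are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the argument begins with "http://".
# The pattern is an assumption, not the one from the original script.
sub looks_like_url {
    my ($arg) = @_;
    return $arg =~ m{^http://}i ? 1 : 0;
}

# Classify two sample command-line arguments.
for my $arg ('http://www.example.com/news.rss', 'news.rss') {
    printf "%-35s %s\n", $arg, looks_like_url($arg) ? 'URL' : 'local file';
}
```

Anchoring the pattern with `^` matters here: a local file name that merely contains the prefix somewhere in the middle should still be treated as a file.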
On line 35 we pass the argument to the get() function, which is a
part of LWP::Simple, and assign the results to the $content variable.
On line 36 we check whether $content is empty. If it is, there was an
error retrieving the RSS file. If the RSS file was downloaded
successfully, $rss->parse($content) is called, which parses the RSS
file and stores the results in the object's internal structure
(line 38).
If the command-line argument does not begin with those characters,
we assume the argument is a file instead of a URL on lines 41-46.
The first thing we do is assign the value of the argument to the
$file variable and test for the existence of the file (lines 42-43).
Then we call parsefile() (line 45), which parses the RSS file and
stores the results in the object's internal structure. The
parsefile() method parses a file, whereas the parse() method parses
the string that's passed to it.
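The two branches described above can be sketched roughly as follows. This is not the listing from rss2html.pl: the helper name `load_rss`, the URL pattern, the use of die for both failure cases, and the error messages are assumptions. The calls themselves (get() from LWP::Simple, and the new, parse, and parsefile methods of XML::RSS) are the ones the text names.

```perl
use strict;
use warnings;
use XML::RSS;
use LWP::Simple qw(get);

# Hypothetical helper: fill an XML::RSS object from a URL or a local file.
sub load_rss {
    my ($rss, $arg) = @_;
    if ($arg =~ m{^http://}i) {            # assumed URL test
        my $content = get($arg);           # get() returns undef on failure
        die "Could not retrieve $arg\n" unless $content;
        $rss->parse($content);             # parse() takes the XML as a string
    }
    else {
        my $file = $arg;
        die "File \"$file\" does not exist\n" unless -e $file;
        $rss->parsefile($file);            # parsefile() takes a path on disk
    }
    return $rss;
}

# Only runs when an argument is supplied on the command line.
my $arg = shift @ARGV;
if (defined $arg) {
    my $rss = load_rss(XML::RSS->new, $arg);
    print "Loaded ", scalar @{ $rss->{items} }, " items\n";
}
```

Saved under a name of your choosing, the sketch would be run with either a feed URL or a local file name as its single argument.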
Lastly, we call the
print_html subroutine on
line 49, which converts
the RSS object into nicely formatted HTML.
As you examine this subroutine, you will begin to understand
the internal structure of the XML::RSS object. The critical portion
of the subroutine is contained on
lines 76-79. In this
foreach loop, we iterate over each of the RSS items.
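As a rough illustration of such a loop: XML::RSS keeps the parsed items as an array reference of hashes under the object's 'items' key, each with fields such as 'title' and 'link'. The sketch below uses a hand-built hash as a stand-in for a parsed object so that it runs on its own. The subroutine name `items_to_html` and the list markup are assumptions, not the output format of print_html.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parsed XML::RSS object: the module keeps
# items as an array reference of hashes under the 'items' key.
my $rss = {
    items => [
        { title => 'First story',  link => 'http://www.example.com/1' },
        { title => 'Second story', link => 'http://www.example.com/2' },
    ],
};

# Build one HTML list entry per RSS item.
sub items_to_html {
    my ($rss) = @_;
    my $html = "<ul>\n";
    foreach my $item (@{ $rss->{items} }) {
        $html .= qq(<li><a href="$item->{link}">$item->{title}</a></li>\n);
    }
    return $html . "</ul>\n";
}

print items_to_html($rss);
```

The same subroutine should work unchanged on a real object after parse() or parsefile() has been called. A production script would also escape characters such as & and < in titles before writing them into HTML.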
Next, let's take a look at rss2html.pl in action.
Produced by Jonathan
Created: September 1, 1999
Revised: September 1, 1999