
Using RSS News Feeds

The XML::RSS Module

Now that you've had a chance to glance at two RSS examples, it's time to introduce the XML::RSS module. XML::RSS is a subclass of XML::Parser, a Perl module maintained by Clark Cooper that utilizes James Clark's Expat C library. XML::RSS was developed to simplify the task of manipulating and parsing RSS files. A deep understanding of XML is not a prerequisite for using XML::RSS, since the XML details are hidden inside the class interface.

While XML::RSS is capable of creating RSS files, we will be focusing on parsing existing RSS files in this column. You can read more about the capabilities of XML::RSS in the module's documentation or by typing:
perldoc XML::RSS
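
To give a feel for how little XML you need to touch, here is a minimal sketch that parses a small RSS document held in a string and reads a few values back out. The feed content, channel title, and example.com URLs are made up for illustration; only new(), parse(), and the channel and items entries of the object are taken from the module.

```perl
use strict;
use warnings;
use XML::RSS;

# A small RSS 0.91 document, invented for this example.
my $sample = <<'END_RSS';
<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example Channel</title>
    <link>http://www.example.com/</link>
    <description>A made-up feed</description>
    <language>en-us</language>
    <item>
      <title>First headline</title>
      <link>http://www.example.com/first.html</link>
    </item>
  </channel>
</rss>
END_RSS

# Create the parser object and hand it the XML as a string.
my $rss = XML::RSS->new;
$rss->parse($sample);

# Channel data is stored in a hash, items in an array of hashes.
print "Channel: $rss->{'channel'}{'title'}\n";
print "Items:   ", scalar @{ $rss->{'items'} }, "\n";
print "First:   $rss->{'items'}[0]{'title'}\n";
```

Nothing in the script deals with tags or attributes directly; the object exposes the feed as ordinary Perl data.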

The Code

Well, let's look at the code, shall we? Lines 16-17 load the XML::RSS and LWP::Simple modules. We've already talked about XML::RSS in brief, but what does LWP::Simple do? Good question! The answer is simple (pun intended). It's a procedural interface for interacting with a Web server. It's also the little cousin of LWP::UserAgent, a fuller object-oriented interface. We'll be using one of the library's subroutines later in the code to fetch an RSS file from the Web.

In lines 20-21 we initialize two variables that we're going to use later.

Line 25 starts the main code body. The first thing we do is verify that the user typed exactly one command-line parameter. This parameter is then assigned to the $arg variable in line 28.
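
The argument check can be sketched roughly as follows. This is not the code from rss2html.pl: the single_argument helper, the usage message, and the file names are assumptions made for the example, and the helper is called with literal lists so the sketch runs on its own (a real script would pass @ARGV).

```perl
use strict;
use warnings;

# Hypothetical helper: return the one argument, or die with a usage message.
sub single_argument {
    my @args = @_;
    die "Usage: rss2html.pl (<RSS file> | <URL>)\n" unless @args == 1;
    return $args[0];
}

# Exactly one argument: accepted and returned.
my $arg = single_argument('news.rdf');
print "argument: $arg\n";

# Two arguments: the helper dies, which eval catches here.
my $ok = eval { single_argument('a.rdf', 'b.rdf'); 1 };
print "two arguments rejected\n" unless $ok;
```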

Next we create a new instance of the XML::RSS class and assign the reference to the $rss variable on line 31.

Now we must determine whether the command-line parameter the user entered is an HTTP URL or a file on the local file system (lines 34-46). On line 34, we use a regular expression to look for the characters http:.

If the command-line argument starts with these characters, we can safely assume that the user intends to retrieve an RSS file from a Web server. On line 35 we pass the argument to the get() function, which is a part of LWP::Simple, and assign the results to the $content variable. On line 36 we call die() if $content is empty. If this happens, it means there was an error retrieving the RSS file. If the RSS file was downloaded successfully, we call $rss->parse($content) on line 38, which parses the RSS file and stores the results in the object's internal structure.

If the command-line argument does not contain the http: characters, lines 41-46 treat it as a file instead of a URL. The first thing we do is assign the value of $arg to the $file variable and test for the existence of the file (lines 42-43).

Then we call $rss->parsefile($file) (line 45), which parses the RSS file and stores the results in the object's internal structure. The parsefile() method parses a file, whereas the parse() method parses the string that's passed to it.
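
The two branches can be pulled together in a short sketch. The load_rss helper, its error messages, and the exact pattern are assumptions for illustration (the pattern here is anchored at the start of the argument and ignores case); only get(), parse(), and parsefile() come from the modules themselves.

```perl
use strict;
use warnings;
use XML::RSS;
use LWP::Simple qw(get);

# Hypothetical helper: fill $rss from a URL or a local file, depending on $arg.
sub load_rss {
    my ($rss, $arg) = @_;

    if ($arg =~ /^http:/i) {
        # get() returns undef when the request fails.
        my $content = get($arg);
        die "Could not retrieve $arg\n" unless $content;
        $rss->parse($content);       # parse() takes the XML as a string
    }
    else {
        die "File \"$arg\" does not exist\n" unless -e $arg;
        $rss->parsefile($arg);       # parsefile() takes a file name
    }
    return $rss;
}

# Only runs when the script is given exactly one argument.
if (@ARGV == 1) {
    my $rss = load_rss(XML::RSS->new, $ARGV[0]);
    print "Parsed ", scalar @{ $rss->{'items'} }, " item(s)\n";
}
```

Either way the caller ends up with the same populated object, so the rest of the script doesn't need to know where the feed came from.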

Lastly, we call the print_html subroutine on line 49, which converts the RSS object into nicely formatted HTML.

print_html

As you examine this subroutine, you will begin to understand the internal structure of the XML::RSS object. The critical portion of the subroutine is contained on lines 76-79. In this foreach loop, we iterate over each of the RSS items.
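
Here is a rough sketch of that kind of loop. It is not the print_html subroutine from rss2html.pl: items_to_html and escape_html are hypothetical names, the markup is invented, and the sketch returns a string instead of printing. To keep it runnable without a feed, it works on a hand-built hash with the same channel and items layout that a parsed XML::RSS object exposes.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text and attribute values.
sub escape_html {
    my ($text) = @_;
    $text = '' unless defined $text;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical stand-in for print_html: builds the HTML and returns it.
sub items_to_html {
    my ($rss) = @_;
    my $html = '<h1>' . escape_html($rss->{'channel'}{'title'}) . "</h1>\n<ul>\n";

    # Each item is a hash reference with keys such as title and link.
    foreach my $item (@{ $rss->{'items'} }) {
        $html .= sprintf qq{<li><a href="%s">%s</a></li>\n},
            escape_html($item->{'link'}), escape_html($item->{'title'});
    }
    return $html . "</ul>\n";
}

# Hand-built data in the same shape as a parsed feed (values are made up).
my $feed = {
    channel => { title => 'Example Channel' },
    items   => [
        { title => 'First headline',  link => 'http://www.example.com/first.html' },
        { title => 'Second headline', link => 'http://www.example.com/second.html' },
    ],
};
print items_to_html($feed);
```

Because the subroutine only reads the channel and items entries, it should accept a parsed XML::RSS object in place of the hand-built hash.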

Next, let's take a look at rss2html.pl in action.



Produced by Jonathan Eisenzopf and



Created: September 1, 1999
Revised: September 1, 1999

URL: http://www.webreference.com/perl/tutorial/8/