WebReference.com - The WebReference/internet.com Multi-Feed RSS Sidebar Tab (5/6) | WebReference


The Multi-Feed RSS Sidebar Tab

The Perl, continued

The &showOne function displays the HTML for a specific, user-selected feed. It does so with the help of two Perl modules that do the bulk of the heavy lifting: XML::RSS, for RSS file processing, and HTML::Template, for variable replacement in HTML-like files. To run feedtab.cgi you'll need both modules on your system, along with CGI (which you probably already have) and XML::Parser (which XML::RSS relies on). If you don't have them, nab 'em from CPAN or have your server hosts install them for you.

We said earlier that we would provide more information about RSS files for those who weren't familiar with them. That time has now come. RSS is an XML-based file format designed primarily for distributing headline links and short descriptions throughout the Internet. Long-time readers of WebReference are no doubt familiar with RSS, as we provide our own information, as well as information from several of our internet.com sister sites, via RSS files. Our former Perl columnist, Jonathan Eisenzopf, is the original author of the XML::RSS module, and discusses it and several of his RSS-related tools in his archived Mother of Perl articles (see especially tutorials 8, rss1, and 22). Many other helpful RSS resources can be found on our RSS feeds listing page.

In a nutshell, XML::RSS parses an RSS file and turns its contents into data structures that can be accessed directly within our Perl code. XML::RSS also allows you to modify and save new RSS files, but we won't be using that functionality in feedtab. In feedtab.cgi, the main XML::RSS-related code is as follows:

   # initialize the RSS object
   my $rssinfo = XML::RSS->new;
   # retrieve RSS data
   $rssinfo->parsefile($home.$selectedFeed);
   # save valid headlines in rssitems
   my @rssitems=();
   foreach my $rssHeadline (@{$rssinfo->{'items'}}) {
      my $ilink =$rssHeadline->{'link'};
      my $ititle=$rssHeadline->{'title'};
      next unless (($ilink) && ($ititle));
      push(@rssitems,$rssHeadline);
   }

This code is as simple as it looks: the RSS file is read (using parsefile), and then each headline within it is examined and stored in the @rssitems array if both its link and its title are present (it's possible, though very unlikely, for one or the other to be blank). Later in the code, we'll access other specific pieces of the RSS file, such as the feed title and search form parameters, if they exist.
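The filtering step can also be tried without any feed or module at all. The following is only a sketch: the @items list is hand-built sample data standing in for @{$rssinfo->{'items'}}, and has_text is a hypothetical helper that is not part of feedtab.cgi. It tests for a defined, non-empty string rather than plain truth, because Perl treats the string '0' as false, so a simple truth test would also drop a headline whose title was literally "0".

```perl
use strict;
use warnings;

# Hand-built stand-in for @{ $rssinfo->{'items'} } (sample data, not a real feed).
my @items = (
    { title => 'Perl & XML', link => 'http://www.example.com/a' },
    { title => '',           link => 'http://www.example.com/b' },  # blank title
    { title => 'No link here' },                                    # missing link
    { title => '0',          link => 'http://www.example.com/c' },  # false-looking title
);

# has_text is a hypothetical helper: true for any defined, non-empty string.
sub has_text {
    my ($value) = @_;
    return defined($value) && length($value) > 0;
}

# Keep only the headlines that have both a link and a title.
my @rssitems = grep { has_text($_->{link}) && has_text($_->{title}) } @items;

print scalar(@rssitems), " of ", scalar(@items), " headlines kept\n";
```

Whether the stricter test is worth it depends on your feeds; for ordinary headlines the two approaches keep the same items.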

The other key module used in feedtab.cgi is HTML::Template, which allows us to provide a basic separation between our HTML and the Perl code that generates it. Authored by Sam Tregar, the HTML::Template module is light and quick, having been built and continually improved primarily for displaying HTML pages from CGI scripts. An active mailing list keeps up with all of the HTML::Template happenings at htmltmpl@lists.vm.com (send a blank message to htmltmpl-subscribe@lists.vm.com to join).

HTML::Template uses the concept of a template to create HTML displays. The template itself is a mixture of specific variables and constructs (that the Perl code uses) and standard HTML code that will be sent to the client's browser. A template can contain specific variables that the Perl code will supply, looping and if...else branching for logical decisions to be determined by the Perl code at runtime, and include statements for grabbing other files. As an example, let's consider our earlier HTML markup, specifically, the JavaScript arrays that contain our data for display:

t=new Array('Book Excerpt: Professional Java Web Services','Google SVG Search, Part II','Site Review: Alexa Web Search ','Update: Perl Links and Resources','Book Excerpt: Perl & XML','JScript .NET, Part III: Classes and Namespaces','Book Review: Constructing Accessible Web Sites','Update: CGI links and Resources','DOM-Based Conditional JavaScript Loading','Google SVG Search, Part I','Update: PHP Links and Resources','JScript .NET, Part II: Major Features','Book Excerpt: XML for ASP.NET Developers','No JavaScript? No Service','Spam I Am');
l=new Array('http://www.webreference.com/programming/java/webservices/','http://www.webreference.com/xml/column56/','http://www.webreference.com/new/020509.html','http://www.webreference.com/programming/perl/','http://www.webreference.com/programming/perl/perlxml/','http://www.webreference.com/js/column109/index.html','http://www.webreference.com/new/020502.html','http://www.webreference.com/programming/cgi.html','http://www.webreference.com/programming/javascript/domloader/','http://www.webreference.com/xml/column55/','http://www.webreference.com/programming/php/','http://www.webreference.com/js/column108/index.html','http://www.webreference.com/authoring/languages/xml/aspnet/','http://www.webreference.com/new/020418.html','http://www.webreference.com/outlook/license/gallery.html');

In HTML::Template speak, they would look like this:

t=new Array(<TMPL_LOOP NAME=JS_TEXT_LOOP>'<TMPL_VAR ESCAPE=1 NAME=TEXT>'<TMPL_UNLESS __LAST__>,</TMPL_UNLESS></TMPL_LOOP>);
l=new Array(<TMPL_LOOP NAME=JS_LINK_LOOP>'<TMPL_VAR NAME=URL>'<TMPL_UNLESS __LAST__>,</TMPL_UNLESS></TMPL_LOOP>);

Each component to be recognized by HTML::Template is enclosed in a <TMPL...> tag; a variable is represented by TMPL_VAR, a loop by TMPL_LOOP (and ended with </TMPL_LOOP>) and logic branching with TMPL_UNLESS or TMPL_IF. (Don't worry; these tags are only for HTML::Template's use and do not actually appear in the resulting HTML sent to the client.) For each iteration of a loop, the enclosed variables are passed to the template (and replaced) by sending an array of hash references to the designated loop name. For example, in feedtab, we create our array of hashes for the JS_TEXT_LOOP, JS_LINK_LOOP, and the later PAGE_LINK_LOOP as follows:

   # Format headlines for insertion in template. 
   my @JSArr = ();
   my @pageArr = ();
   # start at the user's selected starting point
   my $feedCount=$start;
   for(my $i=0;$i<@rssitems;$i++) {
      my $rssHeadline=$rssitems[$feedCount];
      my %arrRow=();
         # just in case: escape apostrophes; as 
         # they will confuse JavaScript
         my $JSTitle = $rssHeadline->{'title'};
            $JSTitle =~ s/'/\\'/gs;
         $arrRow{'TEXT'}=$JSTitle;
         $arrRow{'URL'}=$rssHeadline->{'link'};
      if (@JSArr<$maxInArrays) {
         push(@JSArr,\%arrRow);
      }
      my %arrRow2 = ();
         $arrRow2{'TEXT'} = $rssHeadline->{'title'};
         $arrRow2{'URL'}  = $rssHeadline->{'link'};
      if (@pageArr<$displayFeed->{'totlinks'}) {
         push(@pageArr,\%arrRow2);
      }
      $feedCount++;
      $feedCount=0 if ($feedCount>=@rssitems);
   }

(Quick note: $start is passed in from the user as part of the URL, and allows the user to display any headline in the feed as the "starting" headline of the page; in this way we allow non-JavaScript users to also cycle through the headlines, albeit at a much, much slower pace.)
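The wrap-around walk through the headlines can be isolated into a few lines. This is a sketch under stated assumptions rather than the feedtab code: rotate_from is a hypothetical helper, it uses a modulus instead of resetting a counter, and it assumes that an out-of-range or non-numeric start value should simply fall back to the first headline (worth considering, since $start arrives from the URL).

```perl
use strict;
use warnings;

# rotate_from is a hypothetical helper, not part of feedtab.cgi.
# It returns up to $max elements of @$list, beginning at index $start
# and wrapping around to the front of the list when it runs off the end.
sub rotate_from {
    my ($list, $start, $max) = @_;
    my $count = @$list;
    return () unless $count;
    # Assumption: an out-of-range or non-numeric start falls back to 0.
    $start = 0 unless defined $start && $start =~ /^\d+$/ && $start < $count;
    $max = $count if !defined $max || $max > $count;
    return map { $list->[ ($start + $_) % $count ] } 0 .. $max - 1;
}

my @headlines = ('A', 'B', 'C', 'D', 'E');
print join(',', rotate_from(\@headlines, 3, 4)), "\n";   # D,E,A,B
print join(',', rotate_from(\@headlines, 99, 2)), "\n";  # A,B
```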

Having built the array of hashes complete with TEXT and URL fields, we can then apply them to the template like so:

   # Initialize template object. Set to allow passing of parameters
   # that may not exist within the template, and allow the __FIRST__,
   # __LAST__, __INNER__, and __ODD__ variables within loops.
   my $template=HTML::Template->new(filename          => $displayFeed->{'template_one'},
                                    loop_context_vars => 1,
                                    die_on_bad_params => 0);
   # insert variables into main HTML template
   $template->param(JS_LINK_LOOP   => \@JSArr,
                    JS_TEXT_LOOP   => \@JSArr,
                    PAGE_LINK_LOOP => \@pageArr,
                    DARK_COLOR     => $displayFeed->{'dark'},
                    LIGHT_COLOR    => $displayFeed->{'light'},
                    BG_COLOR       => $displayFeed->{'bg'},
                    LINK_COLOR     => $displayFeed->{'li'},
                    ACTIVE_LINK    => $displayFeed->{'al'},
                    VISIT_LINK     => $displayFeed->{'vl'},
                    HOVER_BACK     => $displayFeed->{'hbg'},
                    TOTLINKS       => $displayFeed->{'totlinks'},
                    PREV_LINK      => $prevLink,
                    NEXT_LINK      => $nextLink,
                    FEED_TITLE     => ((defined($displayFeed->{'shortTitle'}))?$displayFeed->{'shortTitle'}:$rssinfo->channel('title')),
                    FEED_LINK      => $rssinfo->channel('link'),
                    DO_FORM        => $rssinfo->textinput('name'),
                    QNAME          => $rssinfo->textinput('name'),
                    QLINK          => $rssinfo->textinput('link'),
                    QTITLE         => $rssinfo->textinput('title'),
                    THIS_LINK      => $thisURL,
                    THIS_URL       => ($thisURL."?feed=$selectedFeed"));
   return $template->output();

In addition to the previously mentioned loops, we also apply many other variables to the template, which you may therefore choose to use (or ignore) in your own HTML code. And of course the best part of the templates is that you can modify your HTML code and produce a completely different look for your feeds without changing a byte of your Perl code.
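To experiment with the loop mechanics in isolation, a cut-down version can be run from the command line. The sketch below assumes HTML::Template is installed, supplies the template as an inline string through the module's scalarref option (feedtab reads its template from a file), and uses made-up headline data.

```perl
use strict;
use warnings;
use HTML::Template;

# Inline template text; feedtab reads its template from a file instead.
my $text = q{t=new Array(<TMPL_LOOP NAME=JS_TEXT_LOOP>'<TMPL_VAR NAME=TEXT>'<TMPL_UNLESS __LAST__>,</TMPL_UNLESS></TMPL_LOOP>);};

my $template = HTML::Template->new(
    scalarref         => \$text,
    loop_context_vars => 1,   # enables __LAST__ inside the loop
    die_on_bad_params => 0,   # ignore parameters the template does not use
);

# One hash reference per loop iteration (sample headlines, not a real feed).
my @rows = (
    { TEXT => 'Spam I Am',         URL => 'http://www.example.com/1' },
    { TEXT => 'Google SVG Search', URL => 'http://www.example.com/2' },
);
$template->param(JS_TEXT_LOOP => \@rows);

my $output = $template->output();
print "$output\n";
```

Each row also carries a URL key that this particular template never references; with die_on_bad_params set to 0 that is harmless, which is why the same array can feed both the text loop and the link loop.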

We use only a subset of the HTML::Template capabilities in our feedtab.cgi code. Other features you may want to investigate include the __FIRST__, __INNER__, and __ODD__ booleans within loops (for alternately colored table rows, for example) and the ability to automatically associate template variables with a Perl object (provided the Perl object has a param function, like CGI).

There are a couple of other miscellaneous things to point out about the Perl code before we wrap up this project. Note that the default templates are provided in the %defHash structure, which means they can be overridden for any particular feed (so each feed can potentially have its own unique display template). Also note that within the templates we always use the ESCAPE parameter for displayed variables, for example:

<TMPL_VAR ESCAPE=1 NAME="FEED_TITLE">

XML::RSS tends to automatically unescape HTML entities as it reads them, so telling HTML::Template to escape them again adds a small amount of safety to our displays. With ESCAPE=1, embedded angle brackets and the like in the RSS text are written into the resulting HTML page as entities (i.e. &lt; and &gt;) and therefore display properly. Without the ESCAPE attribute, you run the risk of sending angle brackets to the browser literally, and therefore hosing your HTML layout. Since RSS files should not contain literal angle brackets anyway, and XML::RSS automatically unescapes them, always setting ESCAPE=1 in our templates should be the safest move.
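The difference is easy to see side by side. This sketch assumes HTML::Template is installed and again uses an inline template string; the feed title is an invented value chosen to contain an ampersand and angle brackets.

```perl
use strict;
use warnings;
use HTML::Template;

# The same variable is written twice: once as-is and once with ESCAPE=1.
my $text = q{plain: <TMPL_VAR NAME=FEED_TITLE>
escaped: <TMPL_VAR ESCAPE=1 NAME=FEED_TITLE>};

my $template = HTML::Template->new(scalarref => \$text);

# Invented sample value, as it might look after XML::RSS has unescaped it.
$template->param(FEED_TITLE => 'Perl & XML <beta>');

my $output = $template->output();
print "$output\n";
```

The first line of output carries the raw characters through to the page, while the second replaces them with entities that a browser will render as text.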

Finally, earlier versions of the popular CGI module (I believe from around version 2.6 or so and earlier) might have problems reading our HTML-friendly parameter format, in which multiple parameters are passed to feedtab separated by semicolons instead of ampersands. As the script stands now, this should only affect non-JavaScript browsers, so it's not likely to be prevalent. You will recognize the problem if you attempt to scroll through the headlines of a specific feed using the next/previous buttons and instead continuously get dumped back to the main feed listing. If that happens, you can either update your version of the CGI module or change the two link-building lines to use an ampersand, as follows:

$prevLink = $thisURL."?feed=$selectedFeed&start=".$prevLink;
$nextLink = $thisURL."?feed=$selectedFeed&start=".$nextLink;
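To see why the separator matters, here is a deliberately small sketch of a query-string parser that accepts either form. It is an illustration only: parse_query is a hypothetical function, not the CGI module's implementation, percent-decoding is left out for brevity, and webref.rdf is a made-up feed file name.

```perl
use strict;
use warnings;

# parse_query is a hypothetical stand-in for what a CGI library does.
# Percent-decoding is omitted to keep the sketch short.
sub parse_query {
    my ($query) = @_;
    my %param;
    # Accept both ';' and '&' as pair separators.
    for my $pair (split /[;&]/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $param{$name} = defined $value ? $value : '';
    }
    return %param;
}

my %semi = parse_query('feed=webref.rdf;start=3');
my %amp  = parse_query('feed=webref.rdf&start=3');
print "semicolon form: feed=$semi{feed} start=$semi{start}\n";
print "ampersand form: feed=$amp{feed} start=$amp{start}\n";
```

A parser that split only on ampersands would treat the whole semicolon-separated string as the value of feed and never see start, which matches the symptom described above.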

That pretty much wraps up the code. Now let's see if we can wrap up this article.



Created: May 17, 2002
Revised: May 17, 2002

URL: http://webreference.com/scripts/sidebar/5.html