In our previous release of Simple Comments, we introduced CAPTCHA-based identification--the ability to distinguish between bots and humans by requiring the entry of graphically displayed characters. If you're unfamiliar with the Simple Comments CAPTCHA system or captchas in general, you might wish to review the v.930 release notes now.
Our initial implementation of captchas for Simple Comments was based on the system deployed over at Captchas.net, which is free, is simple to deploy, and includes audio captchas for the visually impaired. With v.950, we also support the use of reCAPTCHAs.

A sample reCAPTCHA entry box.
reCAPTCHAs are captchas with a twist: instead of supplying any old random graphical characters or words to be typed into the submission form, the reCAPTCHA form displays words that have been scanned from books but could not be properly recognized by OCR technology--i.e., words that the computer itself cannot make out. By typing in the correct characters for these mis-scanned words, your visitors are not only validating themselves to your forms, but are also helping to digitize older texts. Have a look at this page of the reCAPTCHA site for further details on the project and the captcha mechanism.
To use reCAPTCHAs in Simple Comments, you must first (as is the case with
Captchas.net) register for an account at the reCAPTCHA site mentioned above. As
a result of that registration, you'll be provided with several pieces of
information to incorporate into your config.xml file, including your own
unique public and private keys (which can be used only on your Web site).
To enable reCAPTCHAs in Simple Comments, set the captcha_enabled parameter
to 1, and set the captcha_system parameter--new in v.950--to recaptcha.
The previous captcha system (from Captchas.net) is still available in
v.950; to use it, simply set captcha_system to captchas.net.
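As a rough illustration of how these two parameters interact, the sketch below models them as a plain Perl hash and picks a captcha backend accordingly. The hash layout and the captcha_backend helper are hypothetical and are not taken from the Simple Comments source; falling back to captchas.net when captcha_system is missing is likewise only an assumption.

```perl
use strict;
use warnings;

# Hypothetical in-memory view of the two captcha settings from config.xml.
my %config = (
    captcha_enabled => 1,
    captcha_system  => 'recaptcha',    # or 'captchas.net'
);

# Return the name of the captcha backend to use, or nothing (undef in
# scalar context) when captchas are switched off.
sub captcha_backend {
    my (%cfg) = @_;
    return unless $cfg{captcha_enabled};

    # Assumption: an unset captcha_system falls back to the older system.
    my $system = lc( $cfg{captcha_system} // 'captchas.net' );
    die "Unknown captcha_system '$system'\n"
        unless $system eq 'recaptcha' || $system eq 'captchas.net';
    return $system;
}

my $backend = captcha_backend(%config);
print defined $backend ? "Using $backend\n" : "Captchas disabled\n";
```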
A clear danger with any CGI script that allows users to submit information to be written to your hard drive is the potential for misuse; i.e., bad guys can bombard your site with submissions in an attempt to fill your disk and/or slow your site's response times. This is also a concern with the new release of Simple Comments, since visitor profile registrations are written to disk in a one-file-per-profile architecture. Even if the visitors never bother confirming their profile registrations (and it's doubtful that Mr. or Mrs. Bad Guy would bother to confirm), they can still generate thousands of files on your drive--by submitting bogus visitor profiles--in a relatively short period of time.
To combat this and other DoS-related possibilities in Simple Comments, I've added a submission threshold feature. In brief, the submission threshold prevents any submitter from supplying comments or visitor registrations faster than a predetermined rate, which can be set by the administrator. In its default deployment, Simple Comments restricts submitters to one comment or profile every 30 seconds. Such a limitation is a minor inconvenience for legitimate users (most won't even notice the feature), while providing an effective deterrent for all but the most determined of Bad Guys. You can adjust the 30-second interval to anything you like, or, if you're confident your system is Bad Guy-proof via some other means, you can disable the feature entirely by setting the threshold to zero.
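The threshold idea can be sketched in a few lines of Perl. This is only a minimal illustration under stated assumptions, not the Simple Comments implementation: the submission_allowed helper, the $threshold variable, and keying submitters by IP address are all hypothetical, and a real CGI script would need to persist the timestamps between requests rather than keep them in a hash.

```perl
use strict;
use warnings;

# Hypothetical threshold in seconds; 0 disables the check.
my $threshold = 30;

# Time of the last accepted submission per submitter (keyed here by IP
# address, which is an assumption made for this sketch).
my %last_seen;

# Return 1 if the submission is allowed (and record its time), else 0.
sub submission_allowed {
    my ( $submitter, $now ) = @_;
    $now = time unless defined $now;
    return 1 if $threshold <= 0;

    my $previous = $last_seen{$submitter};
    return 0 if defined $previous && $now - $previous < $threshold;

    $last_seen{$submitter} = $now;
    return 1;
}

# Three submissions from the same address at t = 1000, 1010 and 1030 seconds.
for my $t ( 1000, 1010, 1030 ) {
    my $verdict = submission_allowed( '192.0.2.1', $t ) ? 'accepted' : 'rejected';
    print "t=$t: $verdict\n";
}
# The first and third are accepted; the second arrives only 10 seconds
# after an accepted submission and is rejected.
```

Note that a rejected attempt does not reset the clock in this sketch; whether it should is a design choice left to the administrator.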
Other enhancements, adjustments, and outright fixes finding their way into this version of Simple Comments include:
The new no_html parameter can now be
used to block all
forms of HTML code in submitted comments, including the basic HTML
tags (b, i, pre, and links) that would otherwise be allowed. If you
prefer not to allow any HTML in your comment submissions,
set this parameter to 1. When set to 1, all HTML tags will
be displayed literally within the comment instead of being interpreted
as HTML; i.e., instead of getting boldface type you will
get <b>boldface type</b>.
Fixed a typo in the unauthorized_user.tmpl template that was
preventing the form from being displayed at all in Internet Explorer
when Simple Comments itself was handling authentication to the administration
script. (Deployments using Apache-based authentication
were immune to this one, since unauthorized_user.tmpl
is not used in that scenario.) The typo was a second opening <title>
tag where the closing tag belonged; i.e.:
<title>Unauthorized Access<title>
instead of
<title>Unauthorized Access</title>
Surprisingly, this typo was enough to keep the entire page from rendering in Internet Explorer--i.e., the HTML was delivered to the browser but nothing was actually displayed! Thankfully the new templates have this bug fixed--but keep that typo in mind when working on your own Internet Explorer-based projects!
Slightly strengthened administrator password storage by "salting" the passwords with the administrator's login ID.
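To illustrate the no_html item from the list above: displaying tags literally comes down to escaping the characters that give HTML its meaning. The following is a minimal sketch of that idea, not the code Simple Comments actually uses; escape_html is a hypothetical helper name.

```perl
use strict;
use warnings;

# Replace the characters that HTML treats specially with their entities,
# so that submitted tags are shown as text instead of being interpreted.
sub escape_html {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
    );
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

print escape_html('<b>boldface type</b>'), "\n";
# prints "&lt;b&gt;boldface type&lt;/b&gt;"
```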
The password salting change deserves a bit more detail. Previously, we simply hashed the
admin passwords and stored them directly within the users.xml file,
which provided good protection--but it could have been better. Salting the hashed
passwords--i.e., combining each password with a value unique to that
password entry and user before hashing--can strengthen the overall password file
if the file ever gets into the hands of crackers.
To break a hashed password, a cracker typically takes a dictionary of common
words, phrases, or character combinations, hashes each of those entries in the
same manner that the program using the hashes did, and then compares the results
to the actual hashes in the user list (such as the password encodings stored
in the users.xml file). If all the passwords in the file are unsalted (or salted
with the same value), the cracker can hash the dictionary once and quickly compare
the results against every entry in the file. If each entry has its own salt, the
cracker must re-hash the dictionary for each password checked. While this might seem like
a minor deterrent, forcing the cracker to re-hash the dictionary between password
checks can slow them down tremendously--perhaps even enough to warrant moving
on to another malicious project and abandoning your password file altogether.
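The difference can be made concrete with a toy dictionary attack. The sketch below is illustrative only: the user names, passwords, and the salted_hash helper are made up, and SHA-256 (via the core Digest::SHA module) is an assumption, since nothing here depends on which hash function Simple Comments actually uses.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical helper: hash a password together with a per-user salt
# (here the login ID).
sub salted_hash {
    my ( $login, $password ) = @_;
    return sha256_hex( $login . ':' . $password );
}

# Two made-up users who happen to share a password.
my %unsalted = map { $_ => sha256_hex('sunshine') } qw(alice bob);
my %salted   = map { $_ => salted_hash( $_, 'sunshine' ) } qw(alice bob);

my @dictionary = qw(password letmein sunshine);

# Unsalted: hash the dictionary once and reuse the table for every user.
my %table           = map { sha256_hex($_) => $_ } @dictionary;
my $unsalted_hashes = scalar @dictionary;
for my $user ( sort keys %unsalted ) {
    my $hit = $table{ $unsalted{$user} };
    print "unsalted $user: $hit\n" if defined $hit;
}

# Salted per user: the dictionary must be re-hashed for each entry.
my $salted_hashes = 0;
for my $user ( sort keys %salted ) {
    for my $word (@dictionary) {
        $salted_hashes++;
        if ( salted_hash( $user, $word ) eq $salted{$user} ) {
            print "salted $user: $word\n";
            last;
        }
    }
}

print "hash operations: unsalted=$unsalted_hashes, salted=$salted_hashes\n";
# prints "hash operations: unsalted=3, salted=6"
```

With two users the extra work is trivial, but the salted cost grows with the number of entries multiplied by the dictionary size, while the unsalted cost stays at one pass over the dictionary. The unsalted hashes for the two users are also identical, which tells an attacker they share a password.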
In Simple Comments we've now chosen to salt the stored passwords with the
user's ID, fulfilling our requirements that each password is salted with a
separate value (one that both we and the administrator supplying the password know,
and that can be linked to the user for subsequent authentication
checks), and therefore forcing the cracker to re-hash their dictionary on each
password cracking attempt. You might be tempted to salt the password with the
password itself before storing it--but that would be a serious mistake.
Doing so offers no deterrence to the would-be cracker
(who can just hash their whole dictionary once, using each dictionary word
as its own salt), and with some
functions--such as Perl's crypt function on most systems--the
salt itself is stored literally as part of the encoded string that is
returned. If your crypt-based script salts the password with the password
itself, you'll be giving away two literal characters of the password to any would-be attacker! For example:
my $password = "Testing";
# Don't do this
my $encoded = crypt($password, $password);
print $encoded, "\n";
# Above prints "TeakRFRsTNc6E"
# Note that the first two characters are
# the first two characters of the password
# Instead do something like this
$encoded = crypt($password,
    join('', ('A'..'Z', 'a'..'z')[rand 52, rand 52]));
print $encoded, "\n";
# Above uses a random salt; the salt itself appears
# as the first two characters of the encoded string
# for subsequent verifications
And while we're on the topic, remember that crypt typically only
uses the first 8 characters of a string (this is the behavior of the
traditional DES-based implementation), meaning that password1
and password2 will create identical encodings:
print crypt('password1', 'DR'), "\n";
print crypt('password2', 'DR'), "\n";
# Both statements above print "DRSA07ZWOpRt6"
and can therefore be used interchangeably (i.e., a person could authenticate
with 'password1', 'password2', or even 'password-smashword', for that matter). For
more crypt-related details, have a look at the
perldoc entry on your system
(perldoc -f crypt) or visit
perldoc.perl.org.
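Because the salt is carried in the encoded string, a later verification can pass the stored string itself back to crypt as the salt and compare the result, which is the approach the perldoc entry describes. The sketch below assumes a system whose crypt accepts a traditional two-character salt; password_matches is a hypothetical helper name, and rand is not a cryptographically strong source of salts.

```perl
use strict;
use warnings;

# Check a candidate password against a stored crypt() string by reusing the
# stored string as the salt; crypt() reads the salt from its leading characters.
sub password_matches {
    my ( $candidate, $stored ) = @_;
    return 0 unless defined $stored;
    my $encoded = crypt( $candidate, $stored );
    return ( defined $encoded && $encoded eq $stored ) ? 1 : 0;
}

# Build a random two-character salt from the traditional crypt alphabet.
my @chars  = ( 'A' .. 'Z', 'a' .. 'z', '0' .. '9', '.', '/' );
my $salt   = join '', map { $chars[ rand @chars ] } 1 .. 2;
my $stored = crypt( 'Testing', $salt );

print( password_matches( 'Testing', $stored ) ? "match\n" : "no match\n" );
print( password_matches( 'Guess',   $stored ) ? "match\n" : "no match\n" );
```

On a system with a working traditional crypt, the first check should report a match and the second should not. Some platforms disable the old DES scheme, in which case crypt may return undef or an error marker and both checks will fail.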
What's next for Simple Comments? On my radar screen are multiple potential enhancements inspired by the new visitor registration capabilities, including support for OpenID and, hopefully, single sign-on (where perhaps a single comment script can serve as the login point for multiple independent sites, or multiple sections of a single site). Also, the comment display and submission templates--including the new login process--could use some updating, perhaps with AJAX-based controls so the user doesn't have to flip from page to page to log in or post new comments (i.e., the entire comment posting process could conceivably be performed within the page, instead of requiring multiple script hits).
More importantly, what do you think the next key feature of Simple Comments should be? Drop us a note with your ideas and we'll be sure to consider them for possible inclusion in future Simple Comments releases!
Created: November 12, 2007
Revised: November 12, 2007
URL: http://webreference.com/programming/perl/comments/v.950/2.html