Perl Subroutine Primer [cont'd]
Returning Values
Returning a value from a subroutine follows the same rule as passing parameters: all the values are returned as a simple list of scalar values. It's your job to assign those returned values to an array, hash, multiple variables, etc. as per your needs. Here's how we might return some formatted names:
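A minimal sketch of such a subroutine follows. The name format_names and the sample data are hypothetical, chosen only for illustration; the point is that the arguments are copied out of @_ first, and the result comes back as a flat list that the caller assigns however it likes.

```perl
use strict;
use warnings;

# format_names is a hypothetical helper: it takes "first last" strings
# and returns each one as "Last, First".
sub format_names {
    my @names = @_;    # work on a copy; the elements of @_ are never touched
    my @formatted;
    for my $name (@names) {
        my ($first, $last) = split ' ', $name, 2;
        push @formatted, ucfirst(lc $last) . ', ' . ucfirst(lc $first);
    }
    return @formatted;    # handed back as a simple list of scalars
}

my @raw = ('john SMITH', 'jane doe');

my @pretty         = format_names(@raw);    # collect every returned value
my ($first_pretty) = format_names(@raw);    # keep only the first one

print "$_\n" for @pretty;
```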
Note that in the above example we can't accidentally wipe out the main program's
version of the variables passed to the subroutine, since we don't access the elements
of the @_ array directly.
In Perl, all subroutines return
something; by default, they return the value of the last expression evaluated in the
subroutine (or an empty list if the subroutine has no statements). You can use the
return statement to explicitly define what the return value should be,
and to return from the subroutine early with a specific value if need be. For
example:
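The three cases can be sketched as follows. The subroutine names here are hypothetical, as is the 'N/A' placeholder value.

```perl
use strict;
use warnings;

# Implicit return: the value of the last expression evaluated.
sub add_implicit {
    my ($x, $y) = @_;
    $x + $y;
}

# Explicit return, including an early exit with a specific value.
sub safe_divide {
    my ($numerator, $denominator) = @_;
    return 'N/A' if $denominator == 0;    # leave early
    return $numerator / $denominator;
}

# A subroutine with no statements returns an empty list.
sub nothing { }

my $sum      = add_implicit(2, 3);
my $quotient = safe_divide(10, 4);
my $guarded  = safe_divide(1, 0);
my @empty    = nothing();

print "$sum $quotient $guarded ", scalar(@empty), "\n";
```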
Since all subroutines return something in Perl, there is no distinction between "subroutines" and "functions," as there may be in other languages you are familiar with. There is only this one subroutine construct (though you are welcome to call them functions, if you like).
Prototypes
We stated earlier that all subroutines receive as their input a list of scalar values to operate on, and that if you try to pass an array or a hash it will be converted into a simple list of values to be read one at a time by the subroutine. How, then, could we create a subroutine that looks and works like Perl's own built-in subroutines, which appear to be able to accept named arrays and hashes? For example, when we execute the following:
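A minimal example of such a call, with made-up values:

```perl
use strict;
use warnings;

my @my_array = ('a', 'b');

# push receives the named array itself, followed by the values to add.
push @my_array, 'c', 'd';

print "@my_array\n";
```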
Perl does not convert @my_array to its individual values to be
passed to the push subroutine; instead, it operates on @my_array
directly, adding the two supplied values to it, just as if we actually called it
this way:
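In spirit, that is a call that hands over the array by reference rather than by value. The sketch below illustrates the idea with a hypothetical user-defined push_by_ref that takes an explicit array reference. It is an illustration of the concept only, not the way the built-in push is invoked.

```perl
use strict;
use warnings;

# push_by_ref is a hypothetical stand-in for push: the first argument
# is a reference to the array, so the caller's array is modified directly.
sub push_by_ref {
    my ($array_ref, @values) = @_;
    push @{$array_ref}, @values;
    return scalar @{$array_ref};    # like push, report the new length
}

my @my_array = ('a', 'b');
my $new_length = push_by_ref(\@my_array, 'c', 'd');

print "@my_array ($new_length elements)\n";
```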
Perl provides a mechanism that allows us to define our own subroutines in the same way. When we specify prototypes for our subroutines, the perl interpreter checks every call to the subroutine at compile time to make sure the right types of parameters are supplied; if they are not, the compiler complains and the script isn't executed. In addition, we can force Perl to impose a certain context on the values sent to the subroutine. For example, forcing scalar context on a named array supplied as the first parameter passes the length of that array to the subroutine instead of its individual values. Prototypes are specified for a subroutine as part of its definition, i.e.:
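The general shape is sub NAME (PROTOTYPE) { BODY }. A minimal sketch, assuming a hypothetical element_count subroutine with a ($) prototype, shows both effects described above: the array is evaluated in scalar context, and a call with the wrong number of arguments is rejected at compile time (a string eval is used here only so the rejection can be observed without stopping the script).

```perl
use strict;
use warnings;

# The prototype ($) asks for exactly one argument, in scalar context.
sub element_count ($) {
    my ($count) = @_;
    return $count;
}

my @colors = qw(red green blue);

# @colors is evaluated in scalar context, so its length is passed.
my $n = element_count(@colors);

# Two arguments violate the prototype; the code fails to compile.
my $compiled = eval q{ element_count(1, 2); 1 };
my $error    = $compiled ? '' : $@;

print "count: $n\n";
print "rejected: $error";
```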
A couple of simple prototype examples:
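Here is a sketch of two such definitions. Both subroutine names are hypothetical: the first takes two scalars, and the second takes a named array followed by a single scalar.

```perl
use strict;
use warnings;

# First example: exactly two arguments, each in scalar context.
sub full_name ($$) {
    my ($first, $last) = @_;
    return "$first $last";
}

# Second example: a named array (note the backslash), then one scalar.
sub first_n (\@$) {
    my ($array_ref, $n) = @_;
    return @{$array_ref}[0 .. $n - 1];
}

my @letters = qw(a b c d);

my $name = full_name('Larry', 'Wall');
my @head = first_n(@letters, 2);    # @letters arrives as an array reference

print "$name\n";
print "@head\n";
```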
In the second example above, note that the presence of the backslash as applied
to the @ character forces the interpreter to ensure that the first
parameter passed to the subroutine is a named array (with an @ sign) and
that that array will be followed by a single argument forced into scalar context.
When this subroutine is actually executed, Perl will automatically pass the first
argument as an array reference, so that you can access it within your
subroutine as an array (instead of a list of values). For example, let's create an
add_a_bar subroutine that takes both an
array and the index of the element that should be adjusted:
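One way add_a_bar might look is sketched below. The body (appending the string "bar" to the chosen element) is an assumption based on the subroutine's name, and the sample data is made up.

```perl
use strict;
use warnings;

# The (\@$) prototype: a named array, then a scalar index.
sub add_a_bar (\@$) {
    my ($array_ref, $index) = @_;
    $array_ref->[$index] .= 'bar';    # modifies the caller's array in place
    return;
}

my @words = ('foo', 'baz', 'qux');

# Called like a built-in: no backslash is needed at the call site.
add_a_bar(@words, 1);

print "@words\n";
```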
In all the examples above, we called the subroutine without using an
ampersand, because with the ampersand, prototype processing would be suppressed
(recall our earlier section on the four typical ways to call subroutines). Also
note that calling the subroutine as a method, or by dereferencing it via
$subroutine->(), would also defeat the prototype processing. Keep
this possibility in mind when you define your subroutine prototypes.
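A small sketch makes the difference visible. It assumes a hypothetical item_count subroutine with a ($) prototype and calls it three ways.

```perl
use strict;
use warnings;

sub item_count ($) {
    my ($count) = @_;
    return $count;
}

my @list = qw(a b c);

# Prototype honored: @list is evaluated in scalar context.
my $with_prototype = item_count(@list);

# Ampersand call: prototype ignored, @list is flattened into @_.
my $with_ampersand = &item_count(@list);

# Call through a code reference: prototype ignored as well.
my $code_ref      = \&item_count;
my $via_reference = $code_ref->(@list);

print "$with_prototype $with_ampersand $via_reference\n";
```

In the last two calls the subroutine simply sees the first element of the flattened list.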
There is more to cover concerning prototypes. For example, you can force
a subroutine to accept an anonymous subroutine as an argument, and you can use
square brackets to accept a set of potentially valid argument types in a single
variable slot. But we're wandering a bit out of the scope of this article. If
you would like further information and/or clarification on prototype usage I
recommend browsing the perlsub documentation (perldoc
perlsub). For now, just remember that prototypes allow you to define
subroutines that look and work much like Perl's own built-in subroutines.
Private Variables and Nested Subroutines
Named subroutines are visible to the entire package that they are compiled into, even if they're declared within another subroutine. For example, the following is allowed in Perl:
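A minimal sketch of that arrangement (the return strings are made up):

```perl
use strict;
use warnings;

sub outer_subroutine {
    # Declared inside outer_subroutine, yet installed in the package.
    sub inner_subroutine {
        return 'inner called';
    }
    return 'outer called, then ' . inner_subroutine();
}

# inner_subroutine can be called directly from anywhere in the package.
my $direct = inner_subroutine();
my $nested = outer_subroutine();

print "$direct\n$nested\n";
```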
However, private variables, declared with my, are scoped
lexically, i.e., they are visible only within the innermost enclosing
block within which they are defined:
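The sketch below uses hypothetical variable names. A string eval is used only so that the compile-time failure of an out-of-scope reference can be observed without stopping the script.

```perl
use strict;
use warnings;

sub scope_demo {
    my $outer = 'whole subroutine';
    my @seen;
    if (1) {
        my $inner = 'this block only';
        push @seen, $outer, $inner;    # both are visible here
    }
    # $inner no longer exists at this point; only $outer does.
    push @seen, $outer;
    return @seen;
}

my @seen = scope_demo();

# Referring to a my variable outside its block fails under strict.
my $compiled = eval q{ { my $hidden = 1; } my $copy = $hidden; 1 };
my $error    = $compiled ? '' : $@;

print join(', ', @seen), "\n";
print "rejected: $error";
```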
So what happens when a named, nested subroutine attempts to access a private variable that is within its lexical scope? Probably not what you might expect:
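A sketch of the situation follows. The values 10 and 20 match the discussion of this example; collecting the results in @seen rather than printing them inside the subroutine is an assumption made to keep the sketch easy to check. Warnings are deliberately left off here so that the behavior shows up silently.

```perl
use strict;    # warnings intentionally not enabled in this sketch

my @seen;

sub outer_subroutine {
    my $foo = 10;
    inner_subroutine();    # record what the inner subroutine sees
    $foo = 20;

    # A named subroutine nested inside another named subroutine.
    sub inner_subroutine {
        push @seen, $foo;
    }
    return;
}

outer_subroutine() for 1 .. 3;

print "@seen\n";    # 10 20 20, not 10 10 10
```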
To understand what is happening in the above code, you must remember that the
inner_subroutine is scoped to the package; it's available to the
entire package (and any other package, for that matter, so long as it is qualified).
But the my variable $foo is not; it's freshly recreated
and reinitialized each time outer_subroutine is called and is not available
outside outer_subroutine. Thus the Perl
designers had a dilemma: when outer_subroutine is called repeatedly
(as above, or as would
happen naturally in a recursive subroutine, for example), which instance of $foo
should be seen and used by inner_subroutine? To keep the mechanics simple,
the designers only allow the inner_subroutine to see the first
instance of $foo; that is, the one that sprang into existence when
outer_subroutine was first called. Since that instance of $foo
is set to 20 just before outer_subroutine exits for the first time,
that's the value that inner_subroutine will see for the rest of the
script. If you run your scripts with warnings enabled, Perl will
try to warn you about this potential problem:
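The warning in question reads Variable "$foo" will not stay shared, followed by the file and line number. The sketch below captures it in a __WARN__ handler so it can be inspected. The subroutine names are hypothetical, and the string eval is there only so the handler is installed before the nested subroutine is compiled.

```perl
use strict;
use warnings;

my @warnings;
{
    # Collect warnings instead of letting them go to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    eval q{
        use warnings;
        sub outer_demo {
            my $foo = 10;
            sub inner_demo { return $foo }
            return inner_demo();
        }
        1;
    } or die $@;
}

print @warnings;
```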
Note that if you think you're safe from this behavior simply because you never
use (or intend to use) named, nested subroutines, think again. If you are using, or
plan to use, your scripts on a mod_perl-enabled Web server, you might encounter this problem even if you don't explicitly use named nested subroutines.
Why? Because mod_perl scripts, when run with Apache::Registry
(ModPerl::Registry in mod_perl2), are executed as if they
are wrapped in an outer subroutine call, i.e., something (but not exactly) like this:
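The sketch below only simulates the effect; it is not the code that Apache::Registry actually generates. It assumes a script whose top level sets $foo to 10, calls show_foo, and then sets $foo to 20. Here show_foo records the value in @displayed where the real script would print it, and two calls to handler stand in for two requests.

```perl
use strict;    # warnings left off, as in the silent failure case

my @displayed;

sub handler {
    # ---- the original script's top-level code starts here ----
    my $foo = 10;
    show_foo();
    $foo = 20;

    sub show_foo {
        push @displayed, $foo;    # the real script would print $foo
    }
    # ---- the original script's top-level code ends here ----
    return;
}

handler();    # first request
handler();    # second request, same server process

print "@displayed\n";
```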
In the above example, it's easy to see how show_foo has become
a nested named subroutine, even though in its original state it was not intended
to be that way. If this script were executed as a Registry script through a
mod_perl Web server, $foo would initially be displayed as
10 (what you would expect); but the second and subsequent calls to the script (as
long as the Web server remained running and the Perl script was unchanged) would
display 20, i.e., the last known value of $foo when the handler
routine was first executed (that is, when the script was first called).
There are several simple fixes to the above problem. Basically, you need to match the scope of the variable with the scope of the subroutine: either the variable needs to be a package or global variable (to match the scope of the nested subroutine), or the subroutine needs to be defined anonymously (to match the scope of the private variable). This is one of the reasons seasoned programmers often recommend that your subroutines be defined in a self-contained manner -- i.e., any variables that are needed are passed into the subroutine, as opposed to relying on "global" variables defined outside the subroutine. More on this "sticky parameters" problem and potential solutions can be found in this article by Stas Bekman: The Perl You Need To Know - Part 2
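The three approaches can be sketched side by side. All names below are hypothetical, and each variant is called twice to show that the second call no longer sees a stale value.

```perl
use strict;
use warnings;

my (@package_seen, @anon_seen, @param_seen);

# Fix 1: make the variable a package variable.
our $shared_foo;
sub outer_package {
    $shared_foo = 10;
    show_package();
    $shared_foo = 20;
    sub show_package { push @package_seen, $shared_foo }
    return;
}

# Fix 2: make the inner subroutine anonymous, so a fresh closure
# over the current $foo is created on every call.
sub outer_anonymous {
    my $foo = 10;
    my $show = sub { push @anon_seen, $foo };
    $show->();
    $foo = 20;
    return;
}

# Fix 3: keep the subroutine self-contained and pass the value in.
sub show_param { my ($foo) = @_; push @param_seen, $foo; return }
sub outer_param {
    my $foo = 10;
    show_param($foo);
    $foo = 20;
    return;
}

for (1 .. 2) { outer_package(); outer_anonymous(); outer_param() }

print "@package_seen | @anon_seen | @param_seen\n";
```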
For Further Study
If you've not used subroutines before, or even if you have used them but never delved into their inner workings, the quick taste presented in this article should give you plenty of ideas to play with in your future projects. Still, we've left out some facets and capabilities of subroutines that you may wish to continue to pursue on your own. In no particular order, here are some additional subroutine tidbits that may interest you:
- Subroutines as Methods
- Subroutines can be called as a method of a Perl object, in which case special parameter-passing rules apply: the object reference itself is passed as the first parameter to the subroutine. This is why Perl subroutines that are designed to be used as methods often begin by shifting that first parameter off @_ into a variable such as $self (my $self = shift;). When called as a class method, the subroutine instead automatically receives the name of the class as its first argument.
- More Prototypes
- We touched on the use of prototypes earlier in the article, but left out many features. For example, when you prototype a subroutine as requiring an anonymous subroutine as the first argument, you don't need to use the sub keyword when you pass the anonymous subroutine (just as with sort, for example). Other prototype possibilities include using a semicolon to separate mandatory from optional arguments, and using an asterisk to indicate that a bareword, constant, scalar expression, typeglob, or a reference to a typeglob should be passed.
- Symbolic References
- In addition to the typical subroutine calling methods we outlined on the previous page, you can also call a subroutine indirectly through a symbolic reference, i.e., by using a string that holds the subroutine's name in place of a real code reference. Note that Perl won't let you get away with that if use strict or the more specific use strict "refs" is in force.
- Overriding Built-In Subroutines
- You can override Perl's built-in subroutines with the use subs pragma, which allows you to predeclare and then redefine a subroutine with your own code. This is dangerous, of course, so be sure you understand the ramifications of doing it before you plunge in. And even then, be sure to wear knee pads.
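The first three tidbits can be sketched briefly. The Counter class, apply_to_each, and say_hello are all hypothetical names invented for this illustration; overriding built-ins is left out on purpose.

```perl
use strict;
use warnings;

# --- Subroutines as methods ---
package Counter;

sub new {
    my $class = shift;    # class method: first argument is the class name
    return bless { count => 0 }, $class;
}

sub increment {
    my $self = shift;     # object method: first argument is the object
    return ++$self->{count};
}

package main;

my $counter = Counter->new;
$counter->increment;
my $count = $counter->increment;

# --- Prototype taking an anonymous subroutine first ---
sub apply_to_each (&@) {
    my ($code, @items) = @_;
    return map { $code->($_) } @items;
}

# No sub keyword and no comma after the block.
my @doubled = apply_to_each { $_[0] * 2 } 1, 2, 3;

# --- Symbolic reference ---
sub say_hello { return 'hello' }

my $sub_name = 'say_hello';
my $greeting = do {
    no strict 'refs';     # symbolic references are refused under strict refs
    &$sub_name();
};

print "$count | @doubled | $greeting\n";
```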
Conclusion
Defining and using subroutines in Perl is a simple and straightforward process, but one that also has several possibilities that beginning Perl coders often miss. It's my hope that this brief introduction has provided you with the basics necessary to begin using subroutines effectively in your own scripts.
Original: February 14, 2006