Perl Subroutine Primer / Page 2 | WebReference


Perl Subroutine Primer [cont'd]

Returning Values

Returning a value from a subroutine follows the same rule as passing parameters: all the values are returned as a simple list of scalar values. It's your job to assign those returned values to an array, hash, multiple variables, etc. as per your needs. Here's how we might return some formatted names:
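A minimal sketch of such a subroutine follows. The name format_names and the two output layouts are illustrative assumptions; the point is that the arguments are copied out of @_ into private variables and the results come back as a flat list.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a first and a last name and returns
# two formatted strings as a flat list.
sub format_names {
    my ($first, $last) = @_;     # copy the arguments out of @_
    $first = ucfirst lc $first;
    $last  = ucfirst lc $last;
    return ("$first $last", "$last, $first");
}

my ($given, $family) = ('jOHN', 'sMITH');

# Assign the returned list to separate scalars...
my ($full, $sorted) = format_names($given, $family);

# ...or collect the same list in an array.
my @formats = format_names($given, $family);

print "$full\n";      # John Smith
print "$sorted\n";    # Smith, John
print "$given\n";     # jOHN (the caller's variable is unchanged)
```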

Note that in the above example we can't accidentally wipe out the main program's versions of the variables passed to the subroutine. The elements of @_ are aliases for the caller's variables, and we copy them into private variables instead of modifying them directly.

In Perl, all subroutines return something; by default, they return the value of the last expression evaluated in the subroutine (or an empty list if the subroutine has no statements). You can use the return statement to state the return value explicitly, and to leave the subroutine early with a specific value if need be. For example:
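The two subroutines below are hypothetical illustrations: square relies on the implicit return of its last expression, while grade uses explicit return statements, including an early exit for input it considers invalid.

```perl
use strict;
use warnings;

# Implicit return: the value of the last expression evaluated.
sub square {
    my ($n) = @_;
    $n * $n;
}

# Explicit returns, with an early exit for out-of-range input.
sub grade {
    my ($score) = @_;
    return 'invalid' if $score < 0 || $score > 100;   # leave early
    return 'pass'    if $score >= 50;
    return 'fail';
}

my $squared = square(4);     # 16
my $high    = grade(72);     # pass
my $low     = grade(12);     # fail
my $bad     = grade(250);    # invalid

print "$squared $high $low $bad\n";
```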

Since all subroutines return something in Perl, there is no distinction between "subroutines" and "functions," as there may be in other languages you are familiar with. There is only this one subroutine construct (though you are welcome to call them functions, if you like).

Prototypes

We stated earlier that all subroutines receive as their input a list of scalar values to operate on, and that if you try to pass an array or a hash it will be flattened into a simple list of values to be read one at a time by the subroutine. How, then, could we create a subroutine that looks and works like Perl's own built-in functions, which appear to be able to accept named arrays and hashes? For example, when we execute the following:
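A call of that kind might look like this; the array name matches the text, and its starting contents are placeholders.

```perl
use strict;
use warnings;

my @my_array = ('a', 'b');

# The built-in push receives the array itself, not a copy of its values.
push(@my_array, 'value1', 'value2');

print scalar(@my_array), "\n";   # 4
print "@my_array\n";             # a b value1 value2
```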

Perl does not flatten @my_array into its individual values before passing them to the push function; instead, push operates on @my_array directly, adding the two supplied values to it, just as if we actually called it this way:
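One way to picture this is with a user-defined helper that is handed a reference to the array explicitly. The name push_by_ref is hypothetical and is only a stand-in for what the built-in does behind the scenes.

```perl
use strict;
use warnings;

# Hypothetical helper: receives an array reference plus the values to
# add, and modifies the caller's array through that reference.
sub push_by_ref {
    my ($array_ref, @values) = @_;
    push @{$array_ref}, @values;
    return scalar @{$array_ref};    # new length, as push itself returns
}

my @my_array = ('a', 'b');
my $new_length = push_by_ref(\@my_array, 'value1', 'value2');

print "$new_length\n";    # 4
print "@my_array\n";      # a b value1 value2
```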

Perl provides a mechanism that allows us to define our own subroutines in the same way. When we specify a prototype for a subroutine, the Perl compiler checks every call to it so that the right types of parameters are supplied; if they are not, the compiler complains and the script isn't executed. In addition, a prototype can impose a certain context on the values sent to the subroutine. For example, forcing scalar context on an array supplied as the first parameter passes the length of that named array to the subroutine instead of its individual values. Prototypes are specified for a subroutine as part of its definition:
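In outline, the prototype sits in parentheses between the subroutine name and its body. The sketch below uses a hypothetical array_length subroutine whose ($) prototype imposes scalar context on its single argument, so a named array arrives as its length.

```perl
use strict;
use warnings;

# General shape:  sub NAME (PROTOTYPE) { BODY }
# The ($) prototype asks for exactly one argument, in scalar context.
sub array_length ($) {
    my ($length) = @_;
    return $length;
}

my @items = qw(apple banana cherry);

# @items is evaluated in scalar context, so the sub receives 3
# rather than the three strings.
my $count = array_length(@items);

print "$count\n";   # 3
```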

A couple of simple prototype examples:
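The two subroutines here are hypothetical and chosen only to show the syntax: add_two takes exactly two scalars, and nth_element takes a named array followed by one scalar.

```perl
use strict;
use warnings;

# First example: ($$) requires exactly two scalar arguments.
sub add_two ($$) {
    my ($x, $y) = @_;
    return $x + $y;
}

# Second example: (\@$) requires a named array, then one scalar.
# The array arrives as a reference.
sub nth_element (\@$) {
    my ($array_ref, $index) = @_;
    return $array_ref->[$index];
}

my @colors = qw(red green blue);

my $sum    = add_two(2, 3);             # 5
my $second = nth_element(@colors, 1);   # green

print "$sum $second\n";
```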

In the second example above, note that the backslash applied to the @ character makes the interpreter require that the first argument be a named array (with an @ sign), followed by a single argument that is forced into scalar context. When the subroutine is called, Perl automatically passes the first argument as an array reference, so that you can work with the caller's array itself within your subroutine (instead of a list of its values). For example, let's create an add_a_bar subroutine that takes both an array and the index of the element that should be adjusted:
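A sketch of add_a_bar might look like the following. What the subroutine does to the chosen element (appending the string "bar") is an assumption made for illustration, as is the sample data.

```perl
use strict;
use warnings;

# (\@$): a named array, passed as a reference, then the element index.
sub add_a_bar (\@$) {
    my ($array_ref, $index) = @_;
    $array_ref->[$index] .= 'bar';   # changes the caller's array in place
    return;
}

my @words = qw(foo crow candy);

# Called with the array itself; Perl supplies the reference for us.
add_a_bar(@words, 1);

print "@words\n";   # foo crowbar candy
```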

In all the examples above, we called the subroutine without an ampersand, because the ampersand suppresses prototype processing (recall our earlier section on the four typical ways to call subroutines). Also note that calling the subroutine as a method, or through a reference via $subroutine->(), likewise bypasses prototype processing. Keep this possibility in mind when you define your subroutine prototypes.
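The difference is easy to observe by counting the arguments that actually arrive. The count_args subroutine below is a hypothetical probe written for that purpose.

```perl
use strict;
use warnings;

# Hypothetical probe: reports how many arguments it received.
sub count_args (\@$) {
    return scalar @_;
}

my @letters = qw(a b c);

# Prototype honoured: one array reference plus one scalar.
my $with_prototype = count_args(@letters, 'x');     # 2

# Ampersand call: prototype ignored, the array is flattened.
my $with_ampersand = &count_args(@letters, 'x');    # 4

# Call through a code reference: prototype ignored as well.
my $code_ref      = \&count_args;
my $via_reference = $code_ref->(@letters, 'x');     # 4

print "$with_prototype $with_ampersand $via_reference\n";   # 2 4 4
```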

There is more to cover concerning prototypes. For example, you can force a subroutine to accept an anonymous subroutine as an argument, and you can use square brackets to accept a set of potentially valid argument types in a single variable slot. But we're wandering a bit out of the scope of this article. If you would like further information and/or clarification on prototype usage I recommend browsing the perlsub documentation (perldoc perlsub). For now, just remember that prototypes allow you to define subroutines that look and work much like Perl's own built-in subroutines.

Private Variables and Nested Subroutines

Named subroutines are visible to the entire package that they are compiled into, even if they're declared within another subroutine. That is, the following is allowed in Perl:
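A minimal sketch, with placeholder return values, shows that the nested subroutine can be called from the top level of the package without the outer one ever running.

```perl
use strict;
use warnings;

sub outer_subroutine {
    # Declared inside another subroutine, yet compiled into the
    # package like any other named subroutine.
    sub inner_subroutine {
        return 'inner called';
    }
    return 'outer called';
}

# outer_subroutine has not been called, but inner_subroutine exists.
my $result = inner_subroutine();

print "$result\n";   # inner called
```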

However, private variables, declared with my, are scoped lexically, i.e., they are visible only within the innermost enclosing block within which they are defined:
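As a small sketch of that rule, the bare block below declares its own $message, which hides the outer one only until the block ends. The variable names are placeholders.

```perl
use strict;
use warnings;

my @seen;
my $message = 'outer';

{
    # A new variable, visible only inside this block.
    my $message = 'inner';
    push @seen, $message;
}

# Back outside the block, the original variable is in view again.
push @seen, $message;

print "@seen\n";   # inner outer
```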

So what happens when a named, nested subroutine attempts to access a private variable that is within its lexical scope? Probably not what you might expect:
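The sketch below reconstructs the situation using the names from the discussion that follows. The starting value of 10 and the @seen array used to record results are assumptions. With warnings enabled, compiling it also produces a warning on STDERR.

```perl
use strict;
use warnings;

my @seen;

sub outer_subroutine {
    my $foo = 10;
    push @seen, inner_subroutine();
    $foo = 20;

    # Named and nested: bound to the first instance of $foo only.
    sub inner_subroutine {
        return $foo;
    }
    return;
}

outer_subroutine();
outer_subroutine();

# One might expect "10 10", since $foo is reset on every call.
print "@seen\n";   # 10 20
```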

To understand what is happening in the above code, you must remember that inner_subroutine is scoped to the package; it's available to the entire package (and to any other package, for that matter, so long as its name is qualified). But the my variable $foo is not; it's freshly created and initialized each time outer_subroutine is called, and it is not available outside outer_subroutine. Thus the Perl designers had a dilemma: when outer_subroutine is called repeatedly (as above, or as would happen naturally in a recursive subroutine), which instance of $foo should inner_subroutine see and use? To keep the mechanics simple, inner_subroutine is only allowed to see the first instance of $foo; that is, the one that sprang into existence when outer_subroutine was first called. Since that instance of $foo is set to 20 just before outer_subroutine exits for the first time, that's the value that inner_subroutine will see for the rest of the script. If you run your scripts with warnings enabled, Perl will tell you about this potential problem:
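The warning is issued when the nested subroutine is compiled. As a sketch, the code below compiles a pair of hypothetical subroutines (outer_demo and inner_demo) from a string at run time, purely so that the message can be captured in an array and printed; in an ordinary script it would simply appear on STDERR.

```perl
use strict;
use warnings;

my @warnings;
{
    # Collect warnings instead of letting them go to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    # String eval compiles this code now, while the handler is active.
    eval q{
        sub outer_demo {
            my $foo = 10;
            sub inner_demo { return $foo }
            return inner_demo();
        }
        1;
    } or die $@;
}

# The captured message begins: Variable "$foo" will not stay shared
print @warnings;
```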

Note that if you think you're safe from this behavior simply because you never use (or intend to use) named, nested subroutines, think again. If you are using, or plan to use, your scripts on a mod_perl enabled Web server, you might encounter this problem even if you don't explicitly use named nested subroutines. Why? Because mod_perl scripts, when run with Apache::Registry (ModPerl::Registry in mod_perl2), are executed as if they are wrapped in an outer subroutine call, i.e., something (but not exactly) like this:
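The wrapper below is a hand-written imitation, not the code that the Registry modules actually generate. The handler name follows the text, while the three-request loop and the @output array are assumptions added so that the effect can be reproduced in a plain Perl script.

```perl
use strict;
use warnings;

my @output;

# The author's script is only the statements inside handler();
# the wrapper is what turns show_foo into a nested named sub.
sub handler {
    my $foo = 10;
    show_foo();
    $foo = 20;

    sub show_foo {
        push @output, "foo is $foo";
    }
    return;
}

# Simulate three requests served by the same long-running process.
handler() for 1 .. 3;

print "$_\n" for @output;   # foo is 10, then foo is 20 twice
```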

In the above example, it's easy to see how show_foo has become a nested named subroutine, even though in its original state it was not intended to be that way. If this script were executed as a Registry script through a mod_perl Web server, $foo would initially be displayed as 10 (what you would expect); but the second and subsequent calls to the script (as long as the Web server remained running and the Perl script was unchanged) would display 20, i.e., the last known value of $foo when the handler routine was first executed (that is, when the script was first called).

There are several simple fixes to the above problem. Basically, you need to match the scope of the variable with that of the subroutine: either the variable needs to be a package or global variable (to match the scope of the nested subroutine), or the subroutine needs to be defined anonymously (to match the scope of the private variable). This is one of the reasons seasoned programmers often recommend that your subroutines be self-contained; that is, any variables they need are passed in as parameters, as opposed to relying on "global" variables defined outside the subroutine. More on this "sticky" variable problem and potential solutions can be found in this article by Stas Bekman: The Perl You Need To Know - Part 2
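The three approaches can be sketched side by side. All subroutine and variable names here are hypothetical; each outer routine records the value its helper sees before $foo is changed to 20.

```perl
use strict;
use warnings;

# Fix 1: a package variable matches the package scope of the named sub.
our $foo_global;
sub inner_global { return $foo_global }
sub outer_global {
    $foo_global = 10;
    my $seen = inner_global();
    $foo_global = 20;
    return $seen;
}

# Fix 2: an anonymous sub is created afresh on every call, so it
# closes over the current instance of the private variable.
sub outer_anonymous {
    my $foo   = 10;
    my $inner = sub { return $foo };
    my $seen  = $inner->();
    $foo = 20;
    return $seen;
}

# Fix 3: keep the helper self-contained and pass the value in.
sub inner_passing { my ($foo) = @_; return $foo }
sub outer_passing {
    my $foo  = 10;
    my $seen = inner_passing($foo);
    $foo = 20;
    return $seen;
}

my @global    = (outer_global(),    outer_global());      # 10 10
my @anonymous = (outer_anonymous(), outer_anonymous());   # 10 10
my @passing   = (outer_passing(),   outer_passing());     # 10 10

print "@global | @anonymous | @passing\n";
```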

For Further Study

If you've not used subroutines before, or even if you have used them but never delved into their inner workings, the quick taste presented in this article should give you plenty of ideas to play with in your future projects. Still, we've left out some facets and capabilities of subroutines that you may wish to continue to pursue on your own. In no particular order, here are some additional subroutine tidbits that may interest you:

Subroutines as Methods
Subroutines can be called as methods of a Perl object, in which case special parameter passing rules apply (specifically, the object reference itself is passed as the first parameter to the subroutine). This is why Perl subroutines that are designed to be used as methods often begin with statements like this:
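The usual opening statement is my $self = shift;. A minimal sketch follows, built around a hypothetical Counter class whose name and fields are invented for illustration.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class

sub new {
    my $class = shift;        # the invocant: here, the string 'Counter'
    my $self  = { count => 0 };
    return bless $self, $class;
}

sub increment {
    my $self = shift;         # the invocant: the object itself
    return ++$self->{count};
}

package main;

my $counter = Counter->new;
$counter->increment;
my $total = $counter->increment;

print "$total\n";   # 2
```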

And when called as a class method, the subroutine will automatically receive the name of the class as its first argument.

More Prototypes
We touched on the use of prototypes earlier in the article, but left out many features. For example, when you prototype a subroutine as requiring an anonymous subroutine as its first argument, you don't need the sub keyword when you pass that anonymous subroutine; a bare block will do, just as it does with the built-in sort. Other prototype possibilities include using a semicolon to separate mandatory from optional arguments, and using an asterisk to indicate that a bareword, constant, scalar expression, typeglob, or a reference to a typeglob should be passed.
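Two of those features can be sketched briefly. Both subroutines are hypothetical: apply_to_each uses the (&@) prototype to accept a bare block, and greet uses ($;$) to make its second argument optional.

```perl
use strict;
use warnings;

# (&@): a code block first, then a list of values.
sub apply_to_each (&@) {
    my ($code, @items) = @_;
    return map { $code->($_) } @items;
}

# ($;$): one mandatory scalar, then one optional scalar.
sub greet ($;$) {
    my ($name, $greeting) = @_;
    $greeting = 'Hello' unless defined $greeting;
    return "$greeting, $name";
}

# No "sub" keyword and no comma after the block.
my @doubled = apply_to_each { $_[0] * 2 } 1, 2, 3;

my $default = greet('World');
my $custom  = greet('World', 'Goodbye');

print "@doubled\n";   # 2 4 6
print "$default\n";   # Hello, World
print "$custom\n";    # Goodbye, World
```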

Symbolic References
In addition to the typical subroutine calling methods we outlined on the previous page, you can also call a subroutine indirectly through a symbolic reference, that is, through a string holding its name:
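In the sketch below, the subroutine say_hello and the variable names are hypothetical. The strict "refs" check is switched off only inside the small block that needs it.

```perl
use strict;
use warnings;

sub say_hello {
    my ($name) = @_;
    return "Hello, $name";
}

# The subroutine's name held in an ordinary string.
my $sub_name = 'say_hello';
my $greeting;

{
    no strict 'refs';                   # allow symbolic references here only
    $greeting = &$sub_name('World');    # calls the sub named by the string
}

print "$greeting\n";   # Hello, World
```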

Note that Perl won't let you get away with that if use strict or the more specific use strict "refs" is in force.

Overriding Built-In Subroutines
You can override Perl's built-in functions with the use subs pragma, which allows you to predeclare and then redefine a subroutine with your own code. This is dangerous, of course, so be sure you understand the ramifications of doing it before you plunge in. And even then, be sure to wear knee pads.
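As a cautious sketch, the code below overrides hex within the current package so that it tolerates a leading "#". The choice of hex and the "#" handling are assumptions for illustration, and the original built-in remains reachable as CORE::hex.

```perl
use strict;
use warnings;

# Predeclare hex so that our own definition overrides the built-in
# for calls compiled in this package.
use subs 'hex';

sub hex {
    my ($digits) = @_;
    $digits =~ s/^#//;             # accept values such as "#ff"
    return CORE::hex($digits);     # delegate to the real built-in
}

my $value = hex('#ff');

print "$value\n";   # 255
```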

Conclusion

Defining and using subroutines in Perl is a simple and straightforward process, but one that also has several possibilities that beginning Perl coders often miss. It's my hope that this brief introduction has provided you with the basics necessary to begin using subroutines effectively in your own scripts.

Original: February 14, 2006



