Sams Teach Yourself XML in 24 Hours, Complete Starter Kit, 3rd Edition. Part 1 | 3 | WebReference

Sams Teach Yourself XML in 24 Hours, Complete Starter Kit, 3rd Edition. Part 1 | 3

Files and Directories in Perl

Renaming Files

Renaming files or directories in Perl is simple; you use the rename function, as follows:

rename oldname, newname;

The rename function takes the file named oldname, tries to change its name to newname, and returns true if the rename is successful. If the oldname and newname are directories, those directories are renamed. If the rename is unsuccessful, rename returns false and sets $! to the reason why, as shown here:

if (! rename "myfile.txt", "archive.txt") {
    warn "Could not rename myfile.txt: $!";

The rename function also moves the file from one directory to another if you specify full pathnames instead of just filenames, as in this example:

rename "myfile.txt", "/tmp/myfile.txt"; # Effectively it moves the file.

If the file newname already exists, it is destroyed.

Unix Stuff

This portion of the hour is primarily for the users of Perl who are working under the Unix operating system. If you don't use Perl on a Unix system, you can safely skip this section and not miss anything important; you can read it if you're curious about Unix, however.

Unix users should know that Perl has a deep Unix heritage, and some Perl functions come right from Unix commands and operating system functions. Most of these functions you will not use. Some functions—like unlink—come from Unix but have meanings that are not really tied to Unix at all. Every operating system deletes files, and Perl ensures that unlink does the right thing on every operating system. Perl makes a great effort to ensure that concepts—such as file I/O—that should be portable between operating systems actually are portable, and it hides all the compatibility issues from you if it can.

The fact that many Unix functions and commands are embedded in the Perl language— which has been ported to many non-Unix operating systems—is a tribute to the desire of Unix developers and administrators to take a little bit of the Unix toolkit with them wherever they go.

A Crash Course in File Permissions

In Hour 1, to make a Perl program run as though it were a regular command, you were given the command chmod 755 scriptname without any real explanation of what it meant. First of all, chmod is a command in Unix that sets the permissions on files. Next, the 755 is a description of the permissions being given to the file scriptname. Each of the three digits represents one set of permissions, given to the owner of the file, the group that the file belongs to, and any nonowner and nongroup—called other—user. In this case, the owner has permission 7, whereas the group and others have permission 5, as shown in Figure 10.1.

Table 10.2 lists the possible values of the digit for each possible combination of permissions.

To set permissions on a file in Perl, you use the built-in function called chmod:

chmod mode, list_of_files;

The chmod function changes the permission on all the files in list_of_files and returns the number of files whose permissions were changed. The mode must be a four-digit number beginning with 0 (because it's a literal octal number, as mentioned in Hour 2 and later in this hour) followed by the digits that you want to indicate permission. The following are some examples of chmod. Note that R stands for read permission, W for write permission, and X for execute permission:

chmod 0755, ‘'; # Grants RWX to owner, RX to group and non-owners
chmod 0644, ‘mydata.txt'; # Grants RW to owner, and R to group and non-owners
chmod 0777, ‘'; # Grants RWX to everyone (usually, not very smart)
chmod 0000, ‘cia.dat'; # Nobody can do anything with this file

Earlier this hour, you learned about the mkdir function. The first argument of mkdir is a file permission—the same kind that chmod uses:

mkdir "/usr/tmp", 0777; # Publicly readable/writable/executable
mkdir "myfiles", 0700; # A very private directory

Everything You Ever Wanted to Know About THAT File

To find out, in excruciating detail, everything you might want to know about a file, you can use Perl's stat function. The stat function originated in Unix, and the return values differ slightly between Unix and non-Unix systems. The syntax for stat is as follows:

stat filehandle;
stat filename;

The stat function can retrieve information either about an open filehandle or about a particular file. Under any operating system, stat returns a 13-element list describing the attributes of the file. The actual values in the list differ slightly depending on which operating system you're running because some operating systems include features that others do not implement. Table 10.3 shows what each element in stat's return value stands for.

Many of the values in Table 10.3 you will probably never use, but they are presented for completeness. For the more obscure values—especially in the Unix return values— you might want to consult your operating system's reference manual.

The following is an example of using stat on a file:

@stuff=stat "myfile";

Normally, the returned values from stat are copied into an assignable list of scalars for clarity:

($dev, $ino, $mode, $nlink, $uid, $gid, $rdev, $size,
$atime, $mtime, $ctime, $blksize, $blocks)=stat("myfile");

To print the permissions for a file in the three-digit form described in the section "A Crash Course in File Permissions," you use this bit of code, where @stuff contains the permissions:

printf "%04o\n", $mode&0777;

The preceding snippet contains elements you might not understand. That's okay; some of it has not been presented to you yet. The permissions as they're retrieved by stat—in $mode in this case—contain lots of "extra" information. The &0777 strips out just the portion of the information you're interested in here. Finally, %o is a printf format that prints numbers in octal—the 0–7 form in which Unix expects permissions to be formatted.

The three time stamps mentioned in Table 10.3—access, modification, and change (or create) time—are stored in a peculiar format. The time stamps are stored as the number of seconds from midnight, January 1, 1970, Greenwich mean time. To print them in a usable format, you use the localtime function, as follows:

print scalar localtime($mtime);

This function prints the modification time of the file in a format similar to Sat Jul 3 23:35:11 EDT 1999. The access time is the time that the file was last read (or opened for reading). The modification time is the time when the file was last written to. Under Unix, the "change" time is the time when the information about the file— the ownership, number of links, permissions, and so on—was changed; it is not the creation time of the file but often happens to be by coincidence. Under Microsoft Windows, the ctime field actually stores the time the file was created.

Sometimes you might be interested in retrieving just one value from the list returned by stat. To do so, you can wrap the entire stat function in parentheses and use subscripts to slice out the values you want from it:

print "The file has", (stat("file"))[7], " bytes of data";


Created: March 27, 2003
Revised: February 3, 2006