
Recipe 9.6 Globbing, or Getting a List of Filenames Matching a Pattern

9.6.1 Problem

You want to get a list of filenames similar to those produced by MS-DOS's *.* and Unix's *.h. This is called globbing, and the filename wildcard expression is called a glob, or occasionally a fileglob to distinguish it from a typeglob.

9.6.2 Solution

Perl provides globbing with the semantics of the Unix C shell through the glob keyword and <>:

@list = <*.c>;
@list = glob("*.c");

You can also use readdir to extract the filenames manually:

opendir(DIR, $path)      or die "Couldn't open $path for reading: $!";
@files = grep { /\.c$/ } readdir(DIR);
closedir(DIR);

9.6.3 Discussion

In versions of Perl before v5.6, Perl's built-in glob and <WILDCARD> notation (not to be confused with <FILEHANDLE>) ran an external program (often csh) to get the list of filenames, which saddled globbing with security and performance concerns. As of v5.6, Perl uses the File::Glob module to glob files, which solves the security and performance problems of the old implementation. Globs have C shell semantics even on non-Unix systems to encourage portability. In particular, glob syntax isn't regular expression syntax: glob uses ? to mean "any single character" and * to mean "zero or more characters," so glob("f?o*") matches flo and flood but not fo.
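The wildcard rules are easy to check against a scratch directory. The following is a minimal sketch, not part of the original recipe: the helper name glob_in and the use of File::Temp and Cwd are assumptions made for illustration. It creates the three files named above and shows which of them the pattern f?o* selects.

```perl
use strict;
use warnings;
use Cwd        qw(getcwd);
use File::Temp qw(tempdir);

# Hypothetical helper: run glob() inside $dir and return the sorted matches.
# Changing directory first keeps the directory name out of the pattern.
sub glob_in {
    my ($dir, $pattern) = @_;
    my $old = getcwd();
    chdir($dir)  or die "Couldn't chdir to $dir: $!";
    my @matches = sort glob($pattern);
    chdir($old)  or die "Couldn't chdir back to $old: $!";
    return @matches;
}

# Build a scratch directory holding three empty files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(flo flood fo)) {
    open(my $fh, '>', "$dir/$name") or die "Couldn't create $dir/$name: $!";
    close($fh);
}

# ? needs exactly one character between the f and the o, so fo is skipped.
print join(" ", glob_in($dir, "f?o*")), "\n";    # prints: flo flood
```

A regular expression with the same meaning would be /^f.o.*$/, which shows how differently the two notations treat ? and *.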

For complex rules about which filenames you want, roll your own selection mechanism using readdir and regular expressions.
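As an illustration of a rule that a glob can't express, the sketch below keeps only files whose embedded number is 10 or higher. The helper select_names, the report-N.txt naming scheme, and the scratch directory are all hypothetical, chosen only to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the sorted names in $dir for which
# the code reference $rule returns true.
sub select_names {
    my ($dir, $rule) = @_;
    opendir(my $dh, $dir) or die "Couldn't open $dir for reading: $!";
    my @names = grep { $rule->($_) } readdir($dh);
    closedir($dh);
    my @sorted = sort @names;
    return @sorted;
}

# Scratch directory with a few empty files to select from.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(report-3.txt report-12.txt report-40.txt report-x.txt notes.txt)) {
    open(my $fh, '>', "$dir/$name") or die "Couldn't create $dir/$name: $!";
    close($fh);
}

# Rule: the name must look like report-N.txt, and N must be at least 10.
my @recent = select_names($dir, sub {
    my ($name) = @_;
    return $name =~ /^report-(\d+)\.txt$/ && $1 >= 10;
});

print "@recent\n";    # prints: report-12.txt report-40.txt
```

Passing the rule as a code reference keeps the directory-reading logic in one place while letting each caller supply its own test.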

At its simplest, an opendir solution uses grep to filter the list returned by readdir:

@files = grep { /\.[ch]$/i } readdir(DH);

As always, the filenames returned don't include the directory. When you use the filename, prepend the directory name to get the full pathname:

opendir(DH, $dir)        or die "Couldn't open $dir for reading: $!";

@files = ( );
while( defined ($file = readdir(DH)) ) {
    next unless $file =~ /\.[ch]$/i;

    my $filename = "$dir/$file";
    push(@files, $filename) if -T $filename;
}
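The same loop can be wrapped in a subroutine and tried against a scratch directory. This is a sketch under stated assumptions: the name text_sources, the lexical directory handle, and the sample files are illustrative choices, not part of the original recipe.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp     qw(tempdir);

# Hypothetical helper: return full pathnames of the .c and .h files
# in $dir that the -T test considers text.
sub text_sources {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Couldn't open $dir for reading: $!";
    my @files;
    while (defined(my $file = readdir($dh))) {
        next unless $file =~ /\.[ch]$/i;
        my $filename = "$dir/$file";     # readdir returned only the bare name
        push(@files, $filename) if -T $filename;
    }
    closedir($dh);
    my @sorted = sort @files;
    return @sorted;
}

# Hypothetical helper: create a file with the given contents.
sub write_file {
    my ($path, $data) = @_;
    open(my $fh, '>', $path) or die "Couldn't create $path: $!";
    binmode($fh);
    print {$fh} $data;
    close($fh)               or die "Couldn't close $path: $!";
}

my $dir = tempdir(CLEANUP => 1);
write_file("$dir/main.c",    "int main(void) { return 0; }\n");
write_file("$dir/util.H",    "#define UTIL 1\n");
write_file("$dir/blob.c",    "\0" x 64);          # null bytes make -T fail
write_file("$dir/notes.txt", "not a source file\n");

print join(" ", map { basename($_) } text_sources($dir)), "\n";
# prints: main.c util.H
```

Running -T on the bare name would test a file of that name in the current directory rather than in $dir, which is why the directory is prepended before the test.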

The following example combines directory reading and filtering with the efficient sorting technique from Recipe 4.16. It sets @dirs to a numerically sorted list of those subdirectories of $path whose names consist entirely of digits:

@dirs = map  { $_->[1] }                # extract pathnames
        sort { $a->[0] <=> $b->[0] }    # sort names numeric
        grep { -d $_->[1] }             # path is a dir
        map  { [ $_, "$path/$_" ] }     # form (name, path)
        grep { /^\d+$/ }                # just numerics
        readdir(DIR);                   # all files

Recipe 4.16 explains how to read these strange-looking constructs. As always, formatting and documenting your code can make it much easier to read and understand.
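To watch the pipeline run, it can be wrapped in a subroutine and pointed at a scratch directory. In this sketch the name numeric_subdirs, the lexical directory handle, and the sample entries are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp     qw(tempdir);

# Hypothetical wrapper around the pipeline: return the pathnames of the
# all-numeric subdirectories of $path, sorted by numeric value.
sub numeric_subdirs {
    my ($path) = @_;
    opendir(my $dh, $path) or die "Couldn't open $path for reading: $!";
    my @dirs = map  { $_->[1] }                # extract pathnames
               sort { $a->[0] <=> $b->[0] }    # sort names numeric
               grep { -d $_->[1] }             # path is a dir
               map  { [ $_, "$path/$_" ] }     # form (name, path)
               grep { /^\d+$/ }                # just numerics
               readdir($dh);                   # all files
    closedir($dh);
    return @dirs;
}

my $path = tempdir(CLEANUP => 1);

# Three numeric directories and one that is not numeric.
for my $name (qw(100 9 10 src)) {
    mkdir("$path/$name") or die "Couldn't create $path/$name: $!";
}

# A plain file with a numeric name; the -d test drops it.
open(my $fh, '>', "$path/42") or die "Couldn't create $path/42: $!";
close($fh);

print join(" ", map { basename($_) } numeric_subdirs($path)), "\n";
# prints: 9 10 100
```

A plain string sort would have put 10 and 100 before 9; comparing the names with <=> gives the numeric order.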

9.6.4 See Also

The opendir, readdir, closedir, grep, map, and sort functions in perlfunc(1) and in Chapter 29 of Programming Perl; documentation for the standard DirHandle module (also in Chapter 32 of Programming Perl); the "I/O Operators" section of perlop(1), and the "Filename Globbing Operator" section of Chapter 2 of Programming Perl; Recipe 6.9, which talks more about globbing; Recipe 9.5
