
Perl Variable Types: Scalars, Arrays, and Hashes

There are three basic variable types in Perl, the last two of which are merely collections of the first arranged in specific patterns. These three variable types are illustrated in Figure A-1; note that we've substituted a Perl camel for any kind of scalar element, such as a string, an integer, or a float.

Figure A-1. Perl's three main variable types

Scalars

Scalars are single-valued entities: numbers (integers, floats, values written in hexadecimal notation, etc.), strings, or references. (We'll describe references later in this appendix.) Scalars, which are prefixed with the dollar sign ($), are the basic building blocks of Perl, the indivisible atom from classical Greek science. Everything in Perl reduces to scalars, which bear names up to 251 characters in length. Because Perl is a weakly typed language,[1] scalars also change "automagically" between strings and numbers as you use them:

[1] A weakly typed language is one in which variables do not have to have their datatypes strictly defined (as integers, floats, strings, and so on). On the other hand, a strongly typed language is one in which all variables must be predeclared with their datatypes.

$harpo = "1"; # A previously unmentioned $harpo is 
              # set to a string value of "1"
  
$harpo++; # Perl recognizes that you wish to turn "1" into 1, and 
          # then add one on, to get to 2
  
$harpo = "Groucho";  # The $harpo variable turns dynamically 
                     # back into a string, from a numeric 2
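
The same conversion can be seen with operators: a numeric operator treats its operands as numbers, while the string concatenation operator treats them as strings. The following is a minimal sketch; the names $sum and $joined are illustrative and do not appear elsewhere in this appendix.

```perl
use strict;
use warnings;

my $harpo = "1";          # starts life as the string "1"
$harpo++;                 # incremented: now holds 2

my $sum    = "10" + 5;    # + is numeric, so "10" is treated as 10
my $joined = "10" . 5;    # . concatenates strings, so 5 is treated as "5"

print "harpo=$harpo sum=$sum joined=$joined\n";
```

Here the operator, not the variable, decides whether a value is used as a number or as a string.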

Arrays

Arrays (or list arrays) are simply lists of scalars indexed by number, starting from zero (the second element is one away from the beginning). A typical array can be set up in the following way:

@video_collection = 
   ("Day at the Races", "Duck Soup", "A Night at the Opera");

The @video_collection array has three string elements. However, an array can consist of any mixture of atomic scalar types:

@casablanca_items = ("Rick's", 2, 4000.00, "A Beautiful Friendship");

You can think of an array as being like an ice hockey team wearing shirt numbers, but no names. Each player is still an individual, but he or she is accessible within the team (or array) by number. To access an individual array element, we precede the array name with a scalar $ symbol, and follow it with the numeric position of the scalar within the array. This position, or shirt number, is held within square brackets.

Whenever you see [] square brackets in Perl, outside of regular expressions, you should think immediately in terms of arrays, array slices, anonymous arrays, or lists. There is almost certainly something array-like going on!

To demonstrate scalar notation of array elements, let's introduce a simple foreach loop in Perl to iterate through a list, from 0 to 3:

foreach $i (0..3) {
   print $i, " ", $casablanca_items[$i], "\n";
}

This prints out:

0 Rick's
1 2
2 4000
3 A Beautiful Friendship

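Notice how 4000 printed out, rather than 4000.00. Perl stores 4000.00 as a numeric value rather than as the text we typed, and when it converts that number to a string for printing, it drops the insignificant trailing zeros. The value itself is unchanged; only its printed form is shorter.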
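
If the two decimal places matter in the output, one common approach is to format the value explicitly with sprintf (or printf). This is a minimal sketch; $price, $plain, and $formatted are illustrative names.

```perl
use strict;
use warnings;

my $price = 4000.00;                       # held as a number, not as typed text

my $plain     = "$price";                  # default conversion to a string
my $formatted = sprintf("%.2f", $price);   # force two decimal places

print "plain >$plain<\n";
print "formatted >$formatted<\n";
```

The first line prints 4000 and the second prints 4000.00.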

There are two ways of finding out the size of an array. The first way is to use the $# notation in front of the array name. This provides the highest array index (the size of the array minus one). The other is to assign an array to a scalar. Perl interprets this in scalar context, and gives us the size of the array. The following code generates the two different types of figures:

$highest_index = $#casablanca_items; # Watch out for comment confusion!
$size_of_array = @casablanca_items;
  
print "highest_index >", $highest_index, "<\n";
print "size_of_array >", $size_of_array, "<\n";

This code produces the following:

highest_index >3<
size_of_array >4<

Some people avoid using the $# syntax for the highest current array index. Because # is also a Perl symbol that is used to begin a comment (which extends to the end of the line), the various # symbols can become confusing within complicated code blocks.
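
One way to sidestep $# is to derive the highest index from the element count, using the built-in scalar function to force scalar context explicitly. A minimal sketch, reusing the array defined earlier:

```perl
use strict;
use warnings;

my @casablanca_items = ("Rick's", 2, 4000.00, "A Beautiful Friendship");

my $size_of_array = scalar(@casablanca_items);   # number of elements
my $highest_index = $size_of_array - 1;          # same value $# would give

print "highest_index >$highest_index<\n";
print "size_of_array >$size_of_array<\n";
```

This produces the same two figures as before, 3 and 4, without a # character appearing in the code itself.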

Hashes

Hashes (or associative arrays) are collections of scalars indexed by string names rather than integers. Think of the ice hockey team, in the second period, now wearing shirts displaying only their names, without the numbers. In Figure A-1, the three scalar values are represented by "Fred," "Barney," and "Wilma." Although at first the concept of hashes may seem a bit confusing, you'll find that you'll tend to use it for most things in Perl once you're used to it (especially with object orientation, as we'll see later). A hash can be constructed via the following flat list initialization technique:

%middle_earth_leaders = 
   ('Saruman', 'Orthanc', 'Sauron', 'Mordor', 
    'Bombadil', 'The Old Forest');

This pattern goes in a key=>value order. To make this visually clearer, we can add some syntactic sugar, indent a little more, and rewrite:

%middle_earth_leaders = 
   (Saruman => 'Orthanc', 
    Sauron => 'Mordor', 
    Bombadil => 'The Old Forest');

The => operator behaves like a comma, while making it clear that the left-hand values are key strings. It also automatically quotes a simple bareword on its left, so the quote marks around the keys are no longer needed.
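
As a quick check that the two initializations are equivalent, the sketch below builds the hash both ways and counts any values that differ. The names %with_commas, %with_arrows, and $differences are illustrative only.

```perl
use strict;
use warnings;

# Flat list form: key, value, key, value, ...
my %with_commas = ('Saruman', 'Orthanc', 'Sauron', 'Mordor',
                   'Bombadil', 'The Old Forest');

# Same data using => between each key and its value
my %with_arrows = (Saruman  => 'Orthanc',
                   Sauron   => 'Mordor',
                   Bombadil => 'The Old Forest');

# In scalar context, grep returns how many keys had differing values
my $differences =
   grep { $with_commas{$_} ne $with_arrows{$_} } keys %with_commas;

print "differences: $differences\n";
```

The count printed is 0, since both forms store the same three key/value pairs.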

The other main difference between ordinary arrays and hash arrays is that you can always work out where the individual scalars are inside an array by knowing their numeric position. Imagine our ice hockey team lining up in a numeric order before the start of the game. Hashes are different. We can never be sure in what order the key/value pairs will come out. This time, imagine the entire team mobbing the crucial goal scorer just after the final whistle. There's no predefined order. To access each scalar in a predictable order, we generally fetch the unordered key names, sort them, and then use each key to look up its value in the hash:

foreach $key (sort keys %middle_earth_leaders) {
   print $key, " => ", $middle_earth_leaders{$key}, "\n";
}

Notice again that we use $ in front of the hash array name to get the scalar value. However, we know we're dealing with hashes because the clue is curly brackets ({ }), which contain the index string name. The above code produces the following output:

Bombadil => The Old Forest
Saruman => Orthanc
Sauron => Mordor

Incidentally, this is where we can use our $_ pronoun for the first time, as a sort of "it." Instead of using the $key variable explicitly, we could use the following code:

foreach (sort keys %middle_earth_leaders) {
   print $_, " => ", $middle_earth_leaders{$_}, "\n";
}

Notice that, unlike the earlier loop, there is no scalar variable following the foreach in the first line of code. However, $_ is being used in the same position as $key inside the loop. What's going on? Perl assumes that because foreach has no associated scalar, we really meant to use the "it" pronoun, $_. Perl therefore treats the above code as if it were the following snippet (for is simply a synonym for foreach). Notice the assumed first appearance of $_:

for $_ (sort keys %middle_earth_leaders) {     
   print $_, " => ", $middle_earth_leaders{$_}, "\n";
}
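
The equivalence of the implicit and explicit forms can be checked by collecting the lines each loop would print and comparing them. This is a minimal sketch; @implicit and @explicit are illustrative names.

```perl
use strict;
use warnings;

my %middle_earth_leaders = (Saruman  => 'Orthanc',
                            Sauron   => 'Mordor',
                            Bombadil => 'The Old Forest');

my (@implicit, @explicit);

# No loop variable named, so Perl supplies $_
foreach (sort keys %middle_earth_leaders) {
   push @implicit, "$_ => $middle_earth_leaders{$_}";
}

# Loop variable named explicitly
foreach my $key (sort keys %middle_earth_leaders) {
   push @explicit, "$key => $middle_earth_leaders{$key}";
}

print "$_\n" foreach @implicit;
```

Both arrays end up holding the same three lines, in the same sorted order.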

Revenge of the Mnemonics

Here are some easy ways to remember our Perl definitions:

Scalars

To remember scalars, think of the $ dollar sign preceding the variable name — it looks a bit like an "S" for "Scalar."

Arrays

The simplest way to remember the @ array notation is that @ has an "A" in the middle, which stands for "Array."

Hashes

To try to remember the hash symbol, think of the % character, with its slash and two small opposed circular elements, as standing for key/value. Imagine that key and value each represents one circle from the percentage division sign, with the slash dividing them into the key/value pair. (OK, it's not great, but this is the "Revenge of the Mnemonics"!)

Array and Hash Array Slices

In case you're having trouble imagining arrays and hashes in terms of hockey teams accessed by number or name, try thinking of them in more traditional pie shapes. This can make it easier to imagine array slices, which are discrete collections of scalars. The two different pie types, and slice patterns, are displayed in Figure A-2.

Figure A-2. Array slices in Perl
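
As a rough sketch of what a slice looks like in code, reusing the array and hash defined earlier in this appendix: the @ prefix selects several elements at once, with square brackets for an array slice and curly brackets for a hash slice. The names @first_two and @strongholds are illustrative only.

```perl
use strict;
use warnings;

my @video_collection =
   ("Day at the Races", "Duck Soup", "A Night at the Opera");

my %middle_earth_leaders = (Saruman  => 'Orthanc',
                            Sauron   => 'Mordor',
                            Bombadil => 'The Old Forest');

# Array slice: the elements at positions 0 and 1
my @first_two = @video_collection[0, 1];

# Hash slice: the values stored under two named keys
my @strongholds = @middle_earth_leaders{'Saruman', 'Sauron'};

print join(", ", @first_two), "\n";
print join(", ", @strongholds), "\n";
```

Each slice is itself a list of scalars, which is why it takes the @ prefix rather than $.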