Book HomeLearning Perl, 3rd EditionSearch this book

15.3. Formatting Data with sprintf

The sprintf function takes the same arguments as printf (except for the optional filehandle, of course), but it returns the requested string instead of printing it. This is handy if you want to store a formatted string into a variable for later use, or if you want more control over the result than printf alone would provide:

my $date_tag = sprintf
  "%4d/%02d/%02d %2d:%02d:%02d",
  $yr, $mo, $da, $h, $m, $s;

In that example, $date_tag gets something like "2038/01/19 3:00:08". The format string (the first argument to sprintf) used a leading zero on some of the format number, which we didn't mention when we talked about printf formats in Chapter 6, "I/O Basics". The leading zero on the format number means to use leading zeroes as needed to make the number as wide as requested. Without a leading zero in the formats, the resulting date-and-time string would have unwanted leading spaces instead of zeroes, looking like "2038/ 1/19 3: 0: 8".

15.3.1. Using sprintf with "Money Numbers"

One popular use for sprintf is when a number needs to be rendered with a certain number of places after the decimal point, such as when an amount of money needs to be shown as 2.50 and not 2.5 -- and certainly not as 2.49997! That's easy to accomplish with the "%.2f" format:

my $money = sprintf "%.2f", 2.49997;

The full implications of rounding are numerous and subtle, but in most cases you should keep numbers in memory with all of the available accuracy, rounding off only for output.

If you have a "money number" that may be large enough to need commas to show its size, you might find it handy to use a subroutine like this one.[337]

[337]Yes, we know that not everywhere in the world are commas used to separate groups of digits, not everywhere are the digits grouped by threes, and not everywhere the currency symbol appears as it does for U.S. dollars. But this is a good example anyway, so there!

sub big_money {
  my $number = sprintf "%.2f", shift @_;
  # Add one comma each time through the do-nothing loop
  1 while $number =~ s/^(-?\d+)(\d\d\d)/$1,$2/;
  # Put the dollar sign in the right place
  $number =~ s/^(-?)/$1\$/;
  $number;
}

This subroutine uses some techniques you haven't seen yet, but they logically follow from what we've shown you. The first line of the subroutine formats the first (and only) parameter to have exactly two digits after the decimal point. That is, if the parameter were the number 12345678.9, now our $number is the string "12345678.90".

The next line of code uses a while modifier. As we mentioned when we covered that modifier in Chapter 10, "More Control Structures", that can always be rewritten as a traditional while loop:

while ($number =~ s/^(-?\d+)(\d\d\d)/$1,$2/) {
  1;
}

What does that say to do? It says that, as long as the substitution returns a true value (signifying success), the loop body should run. But the loop body does nothing! That's okay with Perl, but it tells us that the purpose of that statement is to do the conditional expression (the substitution), rather than the useless loop body. The value 1 is traditionally used as this kind of a placeholder, although any other value would be equally useful.[338] This works just as well as the loop above:

[338]Which is to say, useless. By the way, in case you're wondering, Perl optimizes away the constant expression so it doesn't even take up any runtime.

'keep looping' while $number =~ s/^(-?\d+)(\d\d\d)/$1,$2/;

So, now we know that the substitution is the real purpose of the loop. But what is the substitution doing? Remember that $number will be some string like "12345678.90" at this point. The pattern will match the first part of the string, but it can't get past the decimal point. (Do you see why it can't?) Memory $1 will get "12345", and $2 will get "678", so the substitution will make $number into "12345,678.90" (remember, it couldn't match the decimal point, so the last part of the string is left untouched).

Do you see what the dash is doing near the start of that pattern? (Hint: The dash is allowed at only one place in the string.) We'll tell you at the end of this section, in case you haven't figured it out.

We're not done with that substitution statement yet. Since the substitution succeeded, the do-nothing loop goes back to try again. This time, the pattern can't match anything from the comma onward, so $number becomes "12,345,678.90". The substitution thus adds a comma to the number each time through the loop.

Speaking of the loop, it's still not done. Since the previous substitution was a success, we're back around the loop to try again. But this time, the pattern can't match at all, since it has to match at least four digits at the start of the string, so now that is the end of the loop.

Why couldn't we have simply used the /g modifier to do a "global" search-and-replace, to save the trouble and confusion of the 1 while? We couldn't use that because we're working backwards from the decimal point, rather than forward from the start of the string. Putting the commas in a number like this can't be done simply with the s///g substitution alone.[339]

[339]At least, it can't be done without some more-advanced regular expression techniques than we've shown you so far. Those darn Perl developers keep making it harder and harder to write Perl books that use the word "can't."

So, did you figure out the dash? It's allowing for a possible minus-sign at the start of the string. The next line of code makes the same allowance, putting the dollar-sign in the right place so that $number is something like "$12,345,678.90", or perhaps "-$12,345,678.90" if it's negative. Note that the dollar sign isn't necessarily the first character in the string, or that line would be a lot simpler. Finally, the last line of code returns our nicely formatted "money number," ready to be printed in the annual report.



Library Navigation Links

Copyright © 2002 O'Reilly & Associates. All rights reserved.