Learning Perl, 3rd Edition

9.6. Substitutions with s///

If you think of the m// pattern match as being like your word processor's "search" feature, the "search and replace" feature would have to be Perl's s/// substitution operator. This simply replaces whatever part of a variable[205] matches a pattern with a replacement string:

[205]Unlike m//, which can match against any string expression, s/// is modifying data that must therefore be contained in what's known as an lvalue. This is nearly always a variable, although it could actually be anything that could be used on the left side of an assignment operator.

$_ = "He's out bowling with Barney tonight.";
s/Barney/Fred/;  # Replace Barney with Fred
print "$_\n";

If the match fails, nothing happens, and the variable is untouched:

# Continuing from above; $_ has "He's out bowling with Fred tonight."
s/Wilma/Betty/;  # Replace Wilma with Betty (fails)

Of course, both the pattern and the replacement string could be more complex. Here, the replacement string uses the first memory variable, which is set by the pattern match:

s/with (\w+)/against $1/;
print "$_\n";  # says "He's out bowling against Fred tonight."

Here are some other possible substitutions. (These are here only as samples; in the real world, it would not be typical to do so many unrelated substitutions in a row.)

$_ = "green scaly dinosaur";
s/(\w+) (\w+)/$2, $1/;  # Now it's "scaly, green dinosaur"
s/^/huge, /;            # Now it's "huge, scaly, green dinosaur"
s/,.*een//;             # Empty replacement: Now it's "huge dinosaur"
s/green/red/;           # Failed match: still "huge dinosaur"
s/\w+$/($`!)$&/;        # Now it's "huge (huge !)dinosaur"
s/\s+(!\W+)/$1 /;       # Now it's "huge (huge!) dinosaur"
s/huge/gigantic/;       # Now it's "gigantic (huge!) dinosaur"

The s/// operator also returns a value: true if a substitution was made, false otherwise:

$_ = "fred flintstone";
if (s/fred/wilma/) {
  print "Successfully replaced fred with wilma!\n";
}

9.6.1. Global Replacements with /g

As you may have noticed in a previous example, s/// will make just one replacement, even if others are possible. Of course, that's just the default. The /g modifier tells s/// to make all possible nonoverlapping[206] replacements:

[206]It's nonoverlapping because each new match starts looking just beyond the latest replacement.

$_ = "home, sweet home!";
s/home/cave/g;
print "$_\n";  # "cave, sweet cave!"
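
To see what "nonoverlapping" means in practice, here is a small sketch. The sample strings are made up for illustration. It also relies on one detail not mentioned above: with /g, the return value of s/// is the number of replacements made.

```perl
use strict;
use warnings;

# Each new match starts just past the previous replacement, so "aaaa"
# holds two nonoverlapping copies of "aa", not three overlapping ones.
$_ = "aaaa";
my $count = s/aa/b/g;        # with /g, s/// returns the number of replacements
my $collapsed = $_;
print "$collapsed ($count replacements)\n";       # bb (2 replacements)

# The replacement text is never rescanned, so a replacement that contains
# the pattern does not trigger further substitutions.
$_ = "home";
my $second_count = s/home/home, sweet home/g;
my $expanded = $_;
print "$expanded ($second_count replacement)\n";  # home, sweet home (1 replacement)
```

If replacements could overlap or be rescanned, the second substitution would never finish; because matching resumes after the replaced text, it runs exactly once.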

A fairly common use of a global replacement is to collapse whitespace; that is, to turn any arbitrary whitespace into a single space:

$_ = "Input  data\t may have    extra whitespace.";
s/\s+/ /g;  # Now it says "Input data may have extra whitespace."

Once we show collapsing whitespace, everyone wants to know about stripping leading and trailing whitespace. That's easy enough, in two steps:[207]

[207]It could be done in one step, but this way is better.

s/^\s+//;  # Replace leading whitespace with nothing
s/\s+$//;  # Replace trailing whitespace with nothing
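
For comparison, here is a sketch of the one-step version next to the two-step version. The one-step pattern shown is one common way to write it (an alternation plus /g), not necessarily the one the footnote has in mind, and the sample string is made up:

```perl
use strict;
use warnings;

my $sample = "  \t fred flintstone \n";   # made-up input with padding at both ends

# Two steps, as above.
$_ = $sample;
s/^\s+//;           # strip leading whitespace
s/\s+$//;           # strip trailing whitespace
my $two_step = $_;

# One step: match either end, and use /g so that both ends get replaced.
$_ = $sample;
s/^\s+|\s+$//g;
my $one_step = $_;

print "[$two_step] [$one_step]\n";   # [fred flintstone] [fred flintstone]
```

Both give the same result. The two-step form keeps each pattern trivially simple, while the one-step form needs the /g modifier; without it, only the leading whitespace would be removed.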

9.6.2. Different Delimiters

Just as we did with m// and qw//, we can change the delimiters for s///. But the substitution uses three delimiter characters, so things are a little different.

With ordinary (non-paired) characters, which don't have a left and right variety, just use three of them, as we did with the forward slash. Here, we've chosen the pound sign[208] as the delimiter:

[208]With apologies to our British friends, to whom the pound sign is something else! Although the pound sign is generally the start of a comment in Perl, it won't start a comment when the parser knows to expect a delimiter -- in this case, immediately after the s that starts the substitution.

s#^https://#http://#;

But if you use paired characters, which have a left and right variety, you have to use two pairs: one to hold the pattern and one to hold the replacement string. In this case, the delimiters don't have to be the same kind around the string as they are around the pattern. In fact, the delimiters of the string could even be non-paired. These are all the same:

s{fred}{barney};
s[fred](barney);
s<fred>#barney#;
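
As a quick check, the following sketch runs each spelling against the same made-up string and collects the results. It then shows the usual reason to switch delimiters: a pattern full of slashes. The path in the second half is a hypothetical example.

```perl
use strict;
use warnings;

# All four spellings perform the same substitution.
my @results;

$_ = "fred flintstone";
s/fred/barney/;
push @results, $_;

$_ = "fred flintstone";
s{fred}{barney};
push @results, $_;

$_ = "fred flintstone";
s[fred](barney);
push @results, $_;

$_ = "fred flintstone";
s<fred>#barney#;
push @results, $_;

print "$_\n" for @results;   # prints "barney flintstone" four times

# With slash delimiters, every slash in the pattern or replacement
# needs a backslash.
$_ = "/usr/local/bin/perl";
s/\/usr\/local\//\/opt\//;
my $escaped = $_;

# With braces, the same substitution needs no backslashes at all.
$_ = "/usr/local/bin/perl";
s{/usr/local/}{/opt/};
my $braced = $_;

print "$escaped $braced\n";  # /opt/bin/perl /opt/bin/perl
```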

9.6.3. Option Modifiers

In addition to the /g modifier,[209] substitutions may use the /i and /s modifiers that we saw in ordinary pattern matching. The order of modifiers isn't significant.

[209]We speak of the modifiers with names like "/i" , even if the delimiter is something different than a slash.

s#wilma#Wilma#gi;  # replace every WiLmA or WILMA with Wilma
s{__END__.*}{}s;   # chop off the end marker and all following lines
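
To show what /s buys you in the second example, here is a sketch that runs the same substitution with and without it. The multi-line sample text is made up; the last part checks that the order of modifiers doesn't matter.

```perl
use strict;
use warnings;

my $source = "print 1;\n__END__\nnotes line 1\nnotes line 2\n";   # made-up sample

# Without /s, the dot stops at the first newline, so only the marker goes.
$_ = $source;
s{__END__.*}{};
my $without_s = $_;   # "print 1;\n\nnotes line 1\nnotes line 2\n"

# With /s, the dot also matches newlines, so everything after the marker goes.
$_ = $source;
s{__END__.*}{}s;
my $with_s = $_;      # "print 1;\n"

# Modifier order is not significant: "gi" and "ig" behave the same.
$_ = "WILMA and wIlMa";
s#wilma#Wilma#gi;
my $gi = $_;
$_ = "WILMA and wIlMa";
s#wilma#Wilma#ig;
my $ig = $_;

print "$gi\n";        # Wilma and Wilma
```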

9.6.4. The Binding Operator

Just as we saw with m//, we can choose a different target for s/// by using the binding operator:

$file_name =~ s#^.*/##s;  # In $file_name, remove any Unix-style path
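
Because s/// changes its target in place, binding it to a variable destroys the original value. If you need both, a common idiom is to copy first and substitute on the copy. Here is a minimal sketch; the path is a made-up example.

```perl
use strict;
use warnings;

my $path = "/home/fred/data/report.txt";   # made-up example path

# The parentheses make the assignment happen first; the substitution then
# binds to the freshly assigned $file_name, leaving $path alone.
(my $file_name = $path) =~ s#^.*/##s;

print "$path -> $file_name\n";   # /home/fred/data/report.txt -> report.txt
```

Without the parentheses, the substitution would bind to $path instead, and $file_name would receive the return value of s/// rather than the modified string. On Perl 5.14 and later, the /r modifier offers another way to get the changed string back without touching the original.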

9.6.5. Case Shifting

It often happens in a substitution that you'll want to make sure that a replacement word is properly capitalized (or not, as the case may be). That's easy to accomplish with Perl, by using some backslash escapes. The \U escape forces what follows to all uppercase:

$_ = "I saw Barney with Fred.";
s/(fred|barney)/\U$1/gi;  # $_ is now "I saw BARNEY with FRED."

Similarly, the \L escape forces lowercase. Continuing from the previous code:

s/(fred|barney)/\L$1/gi;  # $_ is now "I saw barney with fred."

By default, these affect the rest of the (replacement) string, but you can turn off case shifting with \E:

s/(\w+) with (\w+)/\U$2\E with $1/i;  # $_ is now "I saw FRED with barney."

When written in lowercase (\l and \u), they affect only the next character:

s/(fred|barney)/\u$1/ig;  # $_ is now "I saw FRED with Barney."

You can even stack them up. Using \u with \L means "all lower case, but capitalize the first letter":[210]

[210]The \L and \u may appear together in either order. Larry realized that people would sometimes get those two backwards, so he made Perl figure out that you want just the first letter capitalized and the rest lowercase. Larry is a pretty nice guy.

s/(fred|barney)/\u\L$1/ig;  # $_ is now "I saw Fred with Barney."

As it happens, although we're covering case shifting in relation to substitutions, it's available in any double-quotish string:

print "Hello, \L\u$name\E, would you like to play a game?\n";
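
Outside of a string, the same effects are available from Perl's built-in uc, lc, ucfirst, and lcfirst functions. This sketch compares the two spellings; the value of $name is a made-up sample.

```perl
use strict;
use warnings;

my $name = "fRED";   # sample value, chosen for illustration

# \L\u inside the string: lowercase everything, then capitalize the first letter.
my $greeting = "Hello, \L\u$name\E, would you like to play a game?";

# The same result spelled out with functions.
my $by_hand = "Hello, " . ucfirst(lc($name)) . ", would you like to play a game?";

my $shout   = "\U$name\E!";     # same as uc($name) . "!"
my $whisper = "\L$name\E...";   # same as lc($name) . "..."

print "$greeting\n";            # Hello, Fred, would you like to play a game?
print "$shout $whisper\n";      # FRED! fred...
```

The escapes are handy when the text is already inside a double-quoted string; the functions read better when you are working with a value on its own.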


Copyright © 2002 O'Reilly & Associates. All rights reserved.