Recipe 1.16 Indenting Here Documents
1.16.1 Problem
When using the multiline quoting mechanism called a here document, the
text must be flush against the margin, which looks out of place in the
code. You would like to indent the here document text in the code, but
not have the indentation appear in the final string value.
1.16.2 Solution
Use an s/// operator to strip out leading whitespace.
# all in one
($var = <<HERE_TARGET) =~ s/^\s+//gm;
    your text
    goes here
HERE_TARGET

# or with two steps
$var = <<HERE_TARGET;
    your text
    goes here
HERE_TARGET
$var =~ s/^\s+//gm;
1.16.3 Discussion
The substitution is straightforward. It removes leading whitespace
from the text of the here document. The /m
modifier lets the ^ character match at the start
of each line in the string, and the /g modifier
makes the pattern-matching engine repeat the substitution as often as
it can (i.e., for every line in the here document).
($definition = <<'FINIS') =~ s/^\s+//gm;
    The five varieties of camelids
    are the familiar camel, his friends
    the llama and the alpaca, and the
    rather less well-known guanaco
    and vicuña.
FINIS
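To see what each modifier contributes, the following sketch applies the same pattern with and without /m and /g. The sample string and variable names are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample: two lines, each indented by four spaces.
my $text = "    one\n    two\n";

# No modifiers: ^ matches only at the start of the string, and only once.
(my $plain = $text) =~ s/^\s+//;

# /g alone: ^ still matches in just one place, so the second line keeps its indent.
(my $g_only = $text) =~ s/^\s+//g;

# /m alone: ^ may match at any line start, but only the first match is replaced.
(my $m_only = $text) =~ s/^\s+//m;

# /g and /m together: every line loses its leading whitespace.
(my $both = $text) =~ s/^\s+//gm;

print $both;
```

Only the last form cleans every line, which is why the recipe always pairs the two modifiers.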
Be warned: all patterns in this recipe use \s,
meaning one whitespace character, which will also match newlines.
This means they will remove any blank lines in your here document. If
you don't want this, replace \s with
[^\S\n] in the patterns.
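The difference is easiest to see on a string that contains an empty line. This is a small sketch with a hypothetical sample string. The \h form is an addition that assumes Perl 5.10 or later, where \h matches horizontal whitespace only.

```perl
use strict;
use warnings;

# Hypothetical sample with an empty line between two indented lines.
my $text = "    first\n\n    second\n";

# \s+ also swallows the newline of the empty line, so the blank line vanishes.
(my $greedy = $text) =~ s/^\s+//gm;

# [^\S\n]+ means "whitespace other than newline", so the blank line survives.
(my $careful = $text) =~ s/^[^\S\n]+//gm;

# Assumption: Perl 5.10 or later, where \h is horizontal whitespace.
(my $horizontal = $text) =~ s/^\h+//gm;

print $careful;
```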
The substitution uses the property that the result of an assignment
can be used as the lefthand side of =~. This lets
us do it all in one line, but works only when assigning to a
variable. When you're using the here document directly, it would be
considered a constant value, and you wouldn't be able to modify it.
In fact, you can't change a here document's value
unless you first put it into a variable.
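The sketch below shows the copy-then-edit behavior of the parenthesized assignment. It also shows one alternative that is not part of the original recipe and assumes Perl 5.14 or later: the /r modifier returns the edited string instead of modifying its target, so it can be applied to a here document without an intermediate variable. All variable names are illustrative.

```perl
use strict;
use warnings;

my $original = "    indented\n";

# The parenthesized assignment yields $copy itself as an lvalue, so s///
# edits the copy and leaves $original untouched.
(my $copy = $original) =~ s/^\s+//gm;

# Assumption: Perl 5.14 or later. With /r the substitution returns a new
# string, so the constant here document is never modified in place.
my $direct = <<"END" =~ s/^\s+//gmr;
    no variable needed
    before the substitution
END

print $copy, $direct;
```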
Not to worry, though, because there's an easy way around this,
particularly if you're going to do this a lot in the program. Just
write a subroutine:
sub fix {
    my $string = shift;
    $string =~ s/^\s+//gm;
    return $string;
}

print fix(<<"END");
    My stuff goes here
END

# With function predeclaration, you can omit the parens:
print fix <<"END";
    My stuff goes here
END
As with all here documents, you have to place this here document's
target (the token that marks its end, END in this
case) flush against the lefthand margin. To have the target indented
also, you'll have to put the same amount of whitespace in the quoted
string as you use to indent the token.
($quote = <<'    FINIS') =~ s/^\s+//gm;
        ...we will have peace, when you and all your works have
        perished--and the works of your dark master to whom you would
        deliver us. You are a liar, Saruman, and a corrupter of men's
        hearts. --Theoden in /usr/src/perl/taint.c
    FINIS
$quote =~ s/\s+--/\n--/;      # move attribution to line of its own
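On newer interpreters there is a built-in alternative to quoting the indented terminator. This sketch assumes Perl 5.26 or later, which added the <<~ form: the terminator may be indented, and the whitespace in front of the terminator is removed from every line of the text. The subroutine name and sample text are hypothetical.

```perl
use strict;
use warnings;
use v5.26;    # assumption: the <<~ syntax needs at least this version

# Hypothetical wrapper, present only to show the here document nested in code.
sub indented_text {
    my $text = <<~'FINIS';
        first line
          second line keeps two extra spaces
        FINIS
    return $text;
}

my $text = indented_text();
print $text;
```

Indentation beyond that of the terminator is kept, so relative indentation inside the text is preserved.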
If you're doing this to strings that contain code you're building up
for an eval, or just text to print out, you might
not want to blindly strip all leading whitespace, because that would
destroy your indentation. Although eval wouldn't
care, your reader might.
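One way to keep relative indentation is to remove only the indentation that all lines share. The sketch below is a different technique from the dequote function in this recipe. The helper name strip_common_indent is hypothetical, and it counts characters, so it assumes the text is indented consistently with spaces or consistently with tabs.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper: strip the smallest indentation found on any
# non-blank line, leaving deeper indentation intact.
sub strip_common_indent {
    my ($text) = @_;
    # Collect the leading horizontal whitespace of every non-blank line.
    my @indents = map { length $_ } $text =~ /^([^\S\n]*)\S/gm;
    return $text unless @indents;
    my $n = min(@indents);
    $text =~ s/^[^\S\n]{$n}//gm;
    return $text;
}

my $code = strip_common_indent(<<'END');
    if ($ready) {
        launch();
    }
END

print $code;
```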
Another embellishment is to use a special leading string for code
that stands out. For example, here we'll prepend each line with
@@@, properly indented:
if ($REMEMBER_THE_MAIN) {
    $perl_main_C = dequote <<'    MAIN_INTERPRETER_LOOP';
        @@@ int
        @@@ runops( ) {
        @@@     SAVEI32(runlevel);
        @@@     runlevel++;
        @@@     while ( op = (*op->op_ppaddr)( ) ) ;
        @@@     TAINT_NOT;
        @@@     return 0;
        @@@ }
    MAIN_INTERPRETER_LOOP
    # add more code here if you want
}
Destroying indentation also gets you in trouble with poets.
sub dequote;
$poem = dequote <<EVER_ON_AND_ON;
    Now far ahead the Road has gone,
       And I must follow, if I can,
    Pursuing it with eager feet,
       Until it joins some larger way
    Where many paths and errands meet.
       And whither then? I cannot say.
             --Bilbo in /usr/src/perl/pp_ctl.c
EVER_ON_AND_ON
print "Here's your poem:\n\n$poem\n";
Here is its sample output:
Here's your poem:

Now far ahead the Road has gone,
   And I must follow, if I can,
Pursuing it with eager feet,
   Until it joins some larger way
Where many paths and errands meet.
   And whither then? I cannot say.
         --Bilbo in /usr/src/perl/pp_ctl.c
The following
dequote function handles all these cases. It
expects to be called with a here document as its argument. It checks
whether each line begins with a common substring, and if so, strips
that off. Otherwise, it takes the amount of leading whitespace found
on the first line and removes that much from each subsequent line.
sub dequote {
    local $_ = shift;
    my ($white, $leader);  # common whitespace and common leading string
    if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\1\2?.*\n)+$/) {
        ($white, $leader) = ($2, quotemeta($1));
    } else {
        ($white, $leader) = (/^(\s+)/, '');
    }
    s/^\s*?$leader(?:$white)?//gm;
    return $_;
}
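To make the two branches concrete, this sketch reports which leader and which whitespace the detection step would pick for two inputs. The helper name leader_and_white is hypothetical. It reuses the same pattern as dequote but returns the values instead of stripping anything.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (leader, white) as dequote would compute them.
sub leader_and_white {
    my ($text) = @_;
    if ($text =~ /^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\1\2?.*\n)+$/) {
        # Every line starts with the same punctuation leader, such as @@@.
        return (quotemeta($1), $2);
    }
    # Otherwise fall back to the indentation of the first line.
    my ($white) = $text =~ /^(\s+)/;
    return ('', defined $white ? $white : '');
}

my ($leader1, $white1) = leader_and_white(<<'END');
    @@@ int
    @@@ main
END

my ($leader2, $white2) = leader_and_white(<<'END');
    Now far ahead
       And I must follow
END

printf "leader=[%s] white=[%s]\n", $leader1, $white1;
printf "leader=[%s] white=[%s]\n", $leader2, $white2;
```

In the first call the leader is the escaped @@@ and the whitespace is the single space that follows it. In the second call there is no leader, and the whitespace is the first line's four-space indent.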
If that pattern makes your eyes glaze over, you could always break it
up and add comments by adding /x:
if (m{
        ^                   # start of line
        \s *                # 0 or more whitespace chars
        (?:                 # begin first non-remembered grouping
            (               #   begin capture group 1
                [^\w\s]     #     one character neither space nor word
                +           #     1 or more of such
            )               #   end capture group 1
            ( \s* )         #   put 0 or more white in capture group 2
            .* \n           #   match through the end of first line
        )                   # end of first grouping
        (?:                 # begin second non-remembered grouping
            \s *            #   0 or more whitespace chars
            \1              #   whatever string group 1 captured
            \2 ?            #   what group 2 captured, but optionally
            .* \n           #   match through the end of the line
        ) +                 # now repeat that group idea 1 or more
        $                   # until the end of the line
    }x
   )
{
    ($white, $leader) = ($2, quotemeta($1));
} else {
    ($white, $leader) = (/^(\s+)/, '');
}
s{
    ^                       # start of each line (due to /m)
    \s *                    # any amount of leading whitespace
       ?                    #   but minimally matched
    $leader                 # our quoted, saved per-line leader
    (?:                     # begin unremembered grouping
        \Q$white\E          #   the same amount (quoted so /x keeps its spaces)
    ) ?                     # optionalize in case EOL after leader
}{}xgm;
There, isn't that much easier to read? Well, maybe not; sometimes it
doesn't help to pepper your code with insipid comments that mirror
the code. This may be one of those cases.
1.16.4 See Also
The "Scalar Value Constructors" section of perldata(1) and the section
on "Here Documents" in Chapter 2 of Programming Perl; the s/// operator
in perlre(1) and perlop(1), and the "Pattern Matching" section in
Chapter 5 of Programming Perl.