From simple string to here-document

String management in Perl can be surprising for the casual user. But after a while you get used to this sometimes cryptic way of defining strings.

The "normal" string usage is a familiar sight for the C/C++/Java programmer:
print("Hello!\n");
Here I just print a line; I hope you already know that backslash-n is the way a newline is represented.

A first surprise comes when we rewrite the print function without using round brackets - they are not mandatory in Perl:
print "Hello!\n";
But now look here:
print 'Hello!\n', "\n";
We are passing two strings to the print function - the single quote in Perl is another delimiter for strings, not for a single character - and both of them are printed on standard output.

The single quote delimiter makes Perl interpret the string literally: \n is seen as just a pair of ordinary characters.
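
To make the difference visible, here is a small sketch that counts the characters in each version. The variable names are mine, chosen only for illustration:

```perl
use strict;
use warnings;

# Double quotes process escape sequences: \n becomes one newline character.
my $double = "Hello!\n";

# Single quotes keep the text literal: a backslash followed by the letter n.
my $single = 'Hello!\n';

print length($double), "\n";   # 7: six visible characters plus the newline
print length($single), "\n";   # 8: the backslash and the n count separately
```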

If I want to put a single backslash in a double quoted string, I escape it using another backslash:
print "backslash here \\, and at the end:\\", "\n";
In principle, single quotes spare us the double-backslash trick - but we have to make sure that our backslash is not mistaken for an escape character:
print 'backslash here \, and at the end:\\', "\n";
We had to put a double backslash at the end of the first string; otherwise the \' pair would have been read as an escaped single quote - breaking the code, since Perl wouldn't find the string terminator.
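
As a quick check of the rule, this sketch shows the only two sequences that single quotes treat specially, \\ and \', next to a backslash that stays literal. The variable names are hypothetical:

```perl
use strict;
use warnings;

# In single quotes only \\ and \' are special; any other backslash is literal.
my $middle = 'a\b';      # three characters: a, backslash, b
my $end    = 'a\\';      # two characters: a, backslash
my $quote  = 'it\'s';    # four characters: i, t, ', s

print "$middle\n$end\n$quote\n";
```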

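For more complicated strings the quote-like operators are a good alternative. We write q (which behaves like single quotes) or qq (which behaves like double quotes), followed by a delimiter of our choice: a slash, or another character such as a pipe '|', a hash '#', or a pair of brackets. The same delimiter (or the closing bracket) terminates the string. Since qq follows the double-quote rules, in this example \, is printed as a plain comma and \\ as a single backslash: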
print qq/'backslash here \, and at the end:\\'/;
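
To compare the two operators side by side, here is a minimal sketch using curly brackets as delimiters. The $name variable is my own addition, there only to show interpolation:

```perl
use strict;
use warnings;

my $name = 'World';   # hypothetical variable, only for illustration

# q behaves like single quotes: no interpolation, \n stays as two characters.
my $literal = q{'Hello, $name\n'};

# qq behaves like double quotes: variables and escapes are processed.
my $interpolated = qq{'Hello, $name'\n};

print $literal, "\n";
print $interpolated;
```

In both cases the single quotes inside the string need no escaping, which is the main reason to reach for q and qq.
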
When you have to generate a really complex string, the here-document (also known, more informally, as a heredoc) can help keep things simple.

The idea is that we start the string with a special tag: two less-than signs followed by a sequence of characters of our choice (traditionally EOF). We repeat our "magic" sequence on a line of its own at the end of the string. If we want to avoid escaping, we put our sequence in single quotes, like this:

print <<'EOF';
here-document.\' \
I could write anything I want.
EOF
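
For comparison, a here-document whose tag is in double quotes follows the double-quote rules instead. A small sketch, where $user is a hypothetical variable of mine:

```perl
use strict;
use warnings;

my $user = 'reader';   # hypothetical variable, only for illustration

# Tag in single quotes: the body is taken literally.
my $raw = <<'EOF';
Hello $user,\t no escapes here.
EOF

# Tag in double quotes: variables and escapes are processed.
my $cooked = <<"EOF";
Hello $user,\tescapes work here.
EOF

print $raw, $cooked;
```

Each here-document ends at the first line that contains only the tag, so the same EOF can be reused for both strings.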

I'm refreshing my Perl knowledge by reading Beginning Perl by Simon Cozens. Chapter 2 has more information on strings and printing.
