
Re: PATCH: perlre.pod (against 5.6.0)

From: Hugo
Date: April 29, 2000 12:53
Subject: Re: PATCH: perlre.pod (against 5.6.0)
Message ID: 200004292001.VAA28216@crypt.compulink.co.uk

In <14962.957028538@chthon>, Tom Christiansen writes:
:*** perlre56.pod	Sat Apr 29 09:42:35 2000
:--- perlre.pod	Sat Apr 29 11:13:51 2000

Below is a patch with some minor fixes. Here are some other comments:

ll153-4: Otherwise, the lefter one always wins.
Cute though it is, I'd rather see something like 'the leftmost of the
two'.

l172: octal char (think of a PDP-11);
Does 'think of a PDP-11' actually help anyone understand this?
(I appreciate this phrase was not introduced by Tom's patch.)

l179: \u      titlecase next char
I'm not sure what 'titlecase' means, why it is more accurate than
'uppercase', or why \U was not similarly changed.
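
For illustration only, here is a minimal sketch of one case where the two words differ. It assumes a perl whose case mapping follows the Unicode tables, and it uses U+01C6 (LATIN SMALL LETTER DZ WITH CARON), a digraph with separate uppercase and titlecase forms. The helper name case_forms is made up for this example; for plain ASCII letters the two mappings give the same result.

```perl
use strict;
use warnings;

# Hypothetical helper: report the code points produced by uc, ucfirst
# and the \u escape for U+01C6, which has distinct upper- and titlecase.
sub case_forms {
    my $lower = "\x{1C6}";
    return {
        upper => sprintf('U+%04X', ord(uc($lower))),       # full uppercase
        title => sprintf('U+%04X', ord(ucfirst($lower))),  # titlecase
        via_u => sprintf('U+%04X', ord("\u$lower")),       # \u in a string
    };
}

my $forms = case_forms();
print "$_: $forms->{$_}\n" for sort keys %$forms;
# title: U+01C5
# upper: U+01C4
# via_u: U+01C5
```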

ll577-580:
  For reasons of security, this construct is normally forbidden if
  the regex involves variable interpolation, unless the perilous C<use
  re 'eval'> pragma has been used (see L<re>), or the variables contain
  results of C<qr//> operator (see L<perlop/"qr/STRING/imosx">).

I don't think this is correct.

I _think_ the story is that without C<use re 'eval'>, you cannot combine
variables and code within a regexp, but that you can put code into a
qr// regexp (as long as you don't mix in variables to be interpolated),
and then interpolate the variable containing the evalable qr// pattern
along with other variables into a new pattern. Thus this is allowed:

  $y = qr/(?{ "code here" })/;
  /$x$y$z/;

... but these aren't:

  $x = qr/./; /(?{ "code here" }) $x/;
  $x = qr/(?{ "code here" }) $y/;
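
A minimal sketch of the two ends of this, for anyone who wants to try it. It checks only the allowed qr// case and the plain case of a code block arriving through an interpolated string; it does not try to reproduce the two literal-code examples above, whose behaviour may depend on the perl version. The sub names are invented for the example.

```perl
use strict;
use warnings;

# Code block written inside a qr// literal (no interpolation there),
# then interpolated alongside ordinary variables: no "use re 'eval'".
sub qr_code_interpolated {
    my ($x, $z) = ('a', 'c');
    my $y = qr/(?{ 1 })b/;
    return 'abc' =~ /\A$x$y$z\z/ ? 1 : 0;
}

# Code block that reaches the pattern through a plain string variable:
# refused at run time unless "use re 'eval'" is in effect.
sub string_code_interpolated {
    my $code = '(?{ 1 })';
    my $ok = eval { 'abc' =~ /\Aa${code}bc\z/ ? 1 : 0 };
    return defined $ok ? "matched=$ok" : "died: $@";
}

print qr_code_interpolated(), "\n";
print string_code_interpolated(), "\n";
```

The first call should print 1. The second should report that the match died with an "Eval-group not allowed at runtime" error, which is the restriction the quoted paragraph is trying to describe.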

l609: Execute I<code> and interpolate its result as more pattern.
I think 'as a subpattern' might be more accurate, since you can't
say, for example, /(??{ "(" }) . (??{ ")" })/.
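
A small sketch of the difference, with sub names invented for the example: the string returned by each (??{ }) is compiled as a pattern in its own right, so a complete group is accepted while a lone parenthesis is not.

```perl
use strict;
use warnings;

# "(.)" is a complete pattern on its own, so it works as a subpattern.
sub whole_group {
    my $ok = eval { 'a' =~ /\A(??{ "(.)" })\z/ ? 1 : 0 };
    return defined $ok ? "matched=$ok" : "died: $@";
}

# "(" and ")" are not valid patterns by themselves, so they cannot be
# spliced in as loose text around the dot.
sub split_group {
    my $ok = eval { 'a' =~ /\A(??{ "(" }).(??{ ")" })\z/ ? 1 : 0 };
    return defined $ok ? "matched=$ok" : "died: $@";
}

print whole_group(), "\n";
print split_group(), "\n";
```

The first call should report a match; the second should die while compiling the "(" fragment.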

ll638-9:
  This is mostly useful as an efficiency hack
  to optimize of what would otherwise be "eternal" matches [...]

Should this be "to optimize what would", or perhaps "to break out of
what would"?

ll695-7:
  Be aware, however, that this pattern currently
  triggers a warning message under the C<use warnings> pragma or B<-w>
  switch saying it C<"matches the null string many times">.

I was unable to find evidence of this in any version of perl I have
here. I appreciate this sentence was not introduced by Tom's patch.

Hugo
--- pod/perlre.pod.old	Sat Apr 29 19:47:02 2000
+++ pod/perlre.pod	Sat Apr 29 20:57:00 2000
@@ -59,7 +59,7 @@
 These are usually written as "the C</x> modifier", even though the
 delimiter in question might not really be a slash.  Any of these
 modifiers may also be embedded within the regex itself using the
-C<(?I<flags>...) construct.  See below.
+C<(?I<flags>...>) construct.  See below.
 
 The C</x> modifier itself needs a little more explanation.  It tells
 the regex parser to ignore whitespace that is neither backslashed
@@ -68,7 +68,7 @@
 is also treated as a metacharacter introducing a comment, just as
 in ordinary Perl code.  This also means that if you want real
 whitespace or C<#> characters in the pattern (outside a character
-class, where they are unaffected by C</x>), that you'll either have
+class, where they are unaffected by C</x>), you'll either have
 to escape them or encode them using octal or hex escapes.  Taken
 together, these features go a long way towards making Perl's patterns
 more readable.  Note that you have to be careful not to include the
@@ -94,7 +94,7 @@
     ()	Grouping
     []	Character class
 
-By default, the C<^> metacharacter is matches only the beginning
+By default, the C<^> metacharacter matches only the beginning
 of the string, the C<$> metacharacter only before an optional
 trailing newline at the end, so Perl does certain optimizations
 with the assumption that the string contains only one line.  Embedded
@@ -353,7 +353,7 @@
 interpreting C<\10> as a backreference only if at least 10 left
 parentheses have opened before it.  Likewise C<\11> is a backreference
 only if at least 11 left parentheses have opened before it.  And
-so on.  C<\1> through C<\9> are always interpreted as backreferences."
+so on.  C<\1> through C<\9> are always interpreted as backreferences.
 
 Examples:
 
@@ -377,7 +377,7 @@
 everything after the matched string.
 
 The numbered variables ($1, $2, $3, etc.) and the related punctuation
-set (C<<$+>, C<$`>, C<$&>, and C<$'>) are all automatically localized
+set (C<$+>, C<$`>, C<$&>, and C<$'>) are all automatically localized
 to the enclosing dynamic scope.  Their values are therefore ephemeral
 and best copied into more enduring variables.  (See L<perlsyn/"Compound
 Statements">.)
@@ -426,7 +426,7 @@
 Perl also defines a consistent extension syntax for features not
 found in standard tools like B<awk> and B<lex>.  The syntax is a
 pair of parentheses with a question mark as the first thing within
-the parentheses, such as C<(?I<X>...).  The value of I<X> after the
+the parentheses, such as C<(?I<X>...)>.  The value of I<X> after the
 question mark determines which extension is selected.
 
 Stability of these extensions varies widely.  Some have been part
@@ -536,7 +536,7 @@
 B<WARNING>: This extended regular expression feature is considered
 highly experimental, and may be changed or deleted without notice.
 
-This zero-width element evaluates to any embedded Perl code.
+This zero-width element evaluates any embedded Perl code.
 Currently, the rules to determine where the C<code> ends are somewhat
 convoluted.  It is not an assertion, because it does not assert
 anything: the success of the match is unrelated to the code's return
@@ -567,7 +567,7 @@
 This construct may be used as a C<(?(condition)yes-pattern|no-pattern)>
 switch.  If I<not> used in this way, the result of evaluation of
 C<code> is put into the special variable C<$^R>.  This happens
-immediately, so C<$^R> can be used from other C<(?{ code })> assertions
+immediately, so C<$^R> can be used from other C<(?{ code })> elements
 inside the same pattern.
 
 The assignment to C<$^R> above is properly localized, so the old
@@ -694,7 +694,7 @@
 finishes in a fourth the time when used on a similar string with
 1000000 C<a>s.  Be aware, however, that this pattern currently
 triggers a warning message under the C<use warnings> pragma or B<-w>
-switch saying it C<"matches the null string many times">):
+switch saying it C<"matches the null string many times">.
 
 On simple groups, such as the pattern C<< (?> [^()]+ ) >>, a comparable
 effect may be achieved by negative look-ahead, as in C<[^()]+ (?! [^()] )>.
@@ -746,7 +746,7 @@
 
 =head2 Backtracking
 
-NOTE: This section presents an abstract approximation of the how
+NOTE: This section presents an abstract approximation of how
 the regex engine behaves.  For a somewhat more rigorous (and harder
 to understand) view of the rules involved in selecting a match among
 possible alternatives, see L<Combining pieces together>.
@@ -1114,7 +1114,7 @@
     $_ = 'bar';
     s/\w??/<$&>/g;
 
-results in C<"<><b><><a><><r><>">.  At each position of the string
+results in C<<"<><b><><a><><r><>">>.  At each position of the string
 the best match given by non-greedy C<??> is the zero-length match,
 and the I<second best> match is what is matched by C<\w>.  Thus
 zero-length matches alternate with one-character-long matches.
@@ -1166,7 +1166,7 @@
 substrings that can be matched by C<S>, C<B> and C<B'> are substrings
 which can be matched by C<T>.
 
-If C<A> is better match for C<S> than C<A'>, C<AB> is a better
+If C<A> is a better match for C<S> than C<A'>, C<AB> is a better
 match than C<A'B'>.
 
 If C<A> and C<A'> coincide: C<AB> is a better match than C<AB'> if
@@ -1238,8 +1238,8 @@
 the functionality of the regex engine.
 
 Suppose that we want to enable a new regex escape-sequence C<\Y|> that
-matches at boundary between white-space characters and non-whitespace
-characters.  Note that C<(?=\S)(?<!\S)|(?!\S)(?<=\S)> matches exactly
+matches at the boundary between white-space characters and non-whitespace
+characters.  Note that C<<(?=\S)(?<!\S)|(?!\S)(?<=\S)>> matches exactly
 at these positions, so we want to have each C<\Y|> in the place of the
 more complicated version.  We can create a C<custom_re> module to do this:
 
@@ -1303,7 +1303,7 @@
 
 L<perllocale>.
 
-L<perldebugs/"Debugger Internals">.
+L<perldebguts/"Debugger Internals">.
 
 I<Mastering Regular Expressions> by Jeffrey Friedl, published
 by O'Reilly and Associates.
