[perl #76604] Inaccurate repetition examples in re tutorial docs
From: David Olsson
Date: July 21, 2010 02:09
Subject: [perl #76604] Inaccurate repetition examples in re tutorial docs
Message ID: rt-3.6.HEAD-11314-1279648211-606.76604-75-0@perl.org
# New Ticket Created by David Olsson
# Please include the string: [perl #76604]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=76604 >
Message-Id: <5.10.1_4048_1279645323@D-097-DOLSSON>
This is a bug report for perl from davidolsson@yahoo.com,
generated with the help of perlbug 1.39 running under perl 5.10.1.
-----------------------------------------------------------------
Though I am running an older Perl, this bug applies to the
current Perl documentation.
The regular expression tutorial documents -- perlrequick and
perlretut -- provide inaccurate examples of matching repeated
expressions.
At perldoc.perl.org, see
http://perldoc.perl.org/perlrequick.html#Matching-repetitions
and
http://perldoc.perl.org/perlretut.html#Matching-repetitions
These sections provide similar examples of parsing year strings.
In perlrequick:
    $year =~ /\d{2,4}/;      # make sure year is at least 2 but not more
                             # than 4 digits
    $year =~ /\d{4}|\d{2}/;  # better match; throw out 3 digit dates
Either of these expressions will match any string that contains
two or more consecutive digits, such as "123" or "12345". To match
only the digit counts implied by the comments, the expressions need
to be anchored to something that is not a digit.
Simplest might be the beginning and end of the string:
    $year =~ /^\d{2,4}$/;       # 2, 3, or 4 digits
    $year =~ /^\d{4}$|^\d{2}$/; # 2 or 4 digits (4 preferred)
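As a rough check of the difference, the sketch below runs the unanchored
perlrequick pattern and the anchored replacement against a few sample
strings. The sample strings and the variable names are made up for
illustration; they do not come from the tutorial.

```perl
use strict;
use warnings;

# Hypothetical sample inputs, chosen to show where the patterns differ.
my @samples = ('10', '123', '2010', '12345', 'abc123');

my $unanchored = qr/\d{4}|\d{2}/;      # pattern as shown in perlrequick
my $anchored   = qr/^\d{4}$|^\d{2}$/;  # proposed replacement

for my $year (@samples) {
    printf "%-8s unanchored=%-8s anchored=%s\n",
        $year,
        ($year =~ $unanchored ? 'match' : 'no match'),
        ($year =~ $anchored   ? 'match' : 'no match');
}
```

Every sample matches the unanchored pattern, while only '10' and '2010'
match the anchored one.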
In the second example, /^\d{4}|\d{2}$/ would NOT be accurate,
because alternation splits the pattern before the anchors are
applied: the first alternative is anchored only to the beginning
of the string and the second alternative only to the end of
the string.
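To illustrate that pitfall, here is a small sketch comparing the
half-anchored pattern with a grouped form. The grouped form,
/^(?:\d{4}|\d{2})$/, is not in the tutorial; it is offered here only as
an equivalent way to write /^\d{4}$|^\d{2}$/. The sample strings and
variable names are likewise hypothetical.

```perl
use strict;
use warnings;

my $loose   = qr/^\d{4}|\d{2}$/;      # each alternative carries only one anchor
my $grouped = qr/^(?:\d{4}|\d{2})$/;  # both anchors apply to each alternative

# Hypothetical sample inputs.
for my $s ('1234567', 'abc12', '2010') {
    printf "%-8s loose=%-8s grouped=%s\n",
        $s,
        ($s =~ $loose   ? 'match' : 'no match'),
        ($s =~ $grouped ? 'match' : 'no match');
}
```

The loose pattern accepts '1234567' (four leading digits) and 'abc12'
(two trailing digits); the grouped pattern accepts only '2010'.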
If the purpose were to extract a year from anywhere in the
string, the expression might be anchored to word boundaries,
or, perhaps best, to either the string edges or non-digits:
    $year =~ /(?:^|\D)(\d{4}|\d{2})(?:$|\D)/;
In list context, this expression also returns the extracted year
instead of 1 for a match. But we're probably past what we want to
put in a tutorial. I would just like to see the examples made
accurate, as above.
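For completeness, a minimal sketch of the extraction pattern used in list
context. The input strings and the names $year_re and $text are
assumptions made for this example only.

```perl
use strict;
use warnings;

# Year delimited by a string edge or a non-digit on each side.
my $year_re = qr/(?:^|\D)(\d{4}|\d{2})(?:$|\D)/;

# Hypothetical sample inputs.
for my $text ('Released in 2010.', 'vintage 98', 'id 12345', 'build 123') {
    # List-context assignment captures the year; it stays undef on failure.
    my ($year) = $text =~ $year_re;
    print "$text => ", (defined $year ? $year : 'no year found'), "\n";
}
```

The first two strings yield 2010 and 98; the runs of five and three
digits are rejected because no two- or four-digit group in them is
bounded by non-digits or string edges on both sides.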
Thank you!
-----------------------------------------------------------------
---
Flags:
category=docs
severity=low
---
Site configuration information for perl 5.10.1:
Configured by rurban at Fri Dec 18 14:51:24 GMT 2009.
Summary of my perl5 (revision 5 version 10 subversion 1) configuration:
Platform:
osname=cygwin, osvers=1.7.0(0.21853),
archname=i686-cygwin-thread-multi-64int
uname='cygwin_nt-5.1 reini 1.7.0(0.21853) 2009-12-04 17:08 i686 cygwin '
config_args='-de -Dlibperl=cygperl5_10.dll -Dmksymlinks
-Dusethreads -Doptimize=-O3'
hint=recommended, useposix=true, d_sigaction=define
useithreads=define, usemultiplicity=define
useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
use64bitint=define, use64bitall=undef, uselongdouble=undef
usemymalloc=y, bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
optimize='-O3',
cppflags='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
ccversion='', gccversion='4.3.4 20090804 (release) 1', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long long', ivsize=8, nvtype='double', nvsize=8,
Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='g++', ldflags =' -Wl,--enable-auto-import
-Wl,--export-all-symbols -Wl,--stack,8388608
-Wl,--enable-auto-image-base -fstack-protector -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib /lib
libs=-lgdbm -ldb -ldl -lcrypt -lgdbm_compat
perllibs=-ldl -lcrypt
libc=/usr/lib/libc.a, so=dll, useshrplib=true, libperl=cygperl5_10.dll
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
cccdlflags=' ', lddlflags=' --shared -Wl,--enable-auto-import
-Wl,--export-all-symbols -Wl,--stack,8388608
-Wl,--enable-auto-image-base -L/usr/local/lib -fstack-protector'
Locally applied patches:
CYG11 no-bs
CYG12 no archlib in otherlibdirs
CYG14 Dynaloader
CYG15 static-Win32CORE
CYG17 utf8-paths
CYG21 LibList-Kid.patch
CYG22 cygwin-1.7 hints
CYG23 544-stat
CYG24 build man pages
CYG26 Cwd for svk
Bug#55162 File::Spec::case_tolerant performance
disable ExtUtils::MakeMaker::Coverage in Sys-Syslog
---
@INC for perl 5.10.1:
/usr/lib/perl5/5.10/i686-cygwin
/usr/lib/perl5/5.10
/usr/lib/perl5/site_perl/5.10/i686-cygwin
/usr/lib/perl5/site_perl/5.10
/usr/lib/perl5/vendor_perl/5.10/i686-cygwin
/usr/lib/perl5/vendor_perl/5.10
/usr/lib/perl5/vendor_perl/5.10
/usr/lib/perl5/site_perl/5.8
/usr/lib/perl5/vendor_perl/5.8
.
---
Environment for perl 5.10.1:
HOME=/home/dolsson
LANG=C.UTF-8
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/usr/local/bin:/usr/bin:/bin:/cygdrive/c/Program Files/Common
Files/Microsoft Shared/Windows
Live:/cygdrive/c/WINDOWS/system32:/cygdrive/c/WINDOWS:/cygdrive/c/WINDOWS/System32/Wbem:/cygdrive/c/Program
Files/Intel/DMIX:/cygdrive/c/Program
Files/QuickTime/QTSystem/:/cygdrive/c/WINDOWS/system32/WindowsPowerShell/v1.0:/cygdrive/c/Program
Files/SlikSvn/bin/:/cygdrive/c/Python25:/cygdrive/c/Program
Files/Common Files/Microsoft Shared/Windows
Live:/cygdrive/c/Python26:/cygdrive/c/Program
Files/Vim/vim72:/cygdrive/c/Program Files/AutoIt3:/cygdrive/c/Program
Files/Google/google_appengine/:/cygdrive/c/Documents and
Settings/dolsson/My
Documents/install/xpdf-3.02pl4-win32:/usr/lib/lapack
PERL_BADLANG (unset)
SHELL (unset)