develooper Front page | perl.bootstrap | Postings from July 2000

Re: ok, fire away

From: Matthew Persico
Date: July 25, 2000 18:15
Subject: Re: ok, fire away
Message ID: 397E3C9D.71A3C2DE@acedsl.com

To get the ball rolling:

=SUBJECT - Perl regexp substitutions require extra work for parsing

=SYNOPSIS

The s/// operator, when used in conjunction with backrefs ($1, $2, etc.),
requires an extra assignment to avoid modifying the original source string
and extra work to assign the results to other variables.

=DESCRIPTION

I frequently use regexps with backrefs to parse expressions. In
particular, I have an application which (unfortunately) encodes
information in a filename. To wit:

     135amc4567m.prn_real-reprort.prn

The 135 is a sequence number. The previous version of this system did
something appalling like

     ls -c1 * | sort -n | xargs lp

to print in proper sequence.

The amcXXXX is a report name. The 'm' is the report period and can be
(m)onth, (q)uarter, (s)emi-annual, (a)nnual or f[0-9]* for weird
periods. The stuff after the first _ is optional and is used to indicate
that this file is a link to some standard report.

Anyway, the parsing expression is

($reportSeq,$reportDesignation,$reportCode,$reportFreq,$commentaryName)
=
split ',', ((($dummy = $reportFileName) =~
s/(\d+){0,1}(am[cpj])(\S+?)((?:[a-eg-z]){0,1}|f(?:[0-9]*)).prn(.*)/$1,$2,$3,$4,$5/),$dummy);

Notice that in order to get the five backrefs assigned to five vars, I
had to convert $dummy to a comma-separated list and then split it.
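
To make the workaround concrete, here is the same copy-substitute-split
sequence spelled out step by step and run against the sample filename
from above. It is only a sketch: the "my" declarations, the print, and
the step-by-step layout are additions for illustration, and the regexp
is the one quoted above, unchanged.

```perl
use strict;
use warnings;

# Sample name from the description above.
my $reportFileName = '135amc4567m.prn_real-reprort.prn';

# Copy the name first so that s/// edits the copy, not the original.
(my $dummy = $reportFileName) =~
    s/(\d+){0,1}(am[cpj])(\S+?)((?:[a-eg-z]){0,1}|f(?:[0-9]*)).prn(.*)/$1,$2,$3,$4,$5/;

# $dummy now holds the five captures as one comma-separated string,
# so a split is needed to get them back out as separate values.
my ($reportSeq, $reportDesignation, $reportCode, $reportFreq, $commentaryName)
    = split ',', $dummy;

print join('|', $reportSeq, $reportDesignation, $reportCode,
           $reportFreq, $commentaryName), "\n";
# prints: 135|amc|4567|m|_real-reprort.prn
```

The copy into $dummy and the split are exactly the two pieces of extra
work described in the synopsis.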

I propose a new syntax: if m// is for match and s/// is substitute, then

 p///

is for parse. p/// will

1) Not modify the original string
2) Accept only backrefs in the replacement clause, with optional
whitespace separation for clarity.
3) Return an array of the individual results.

I could then code my statement above as:

($reportSeq,$reportDesignation,$reportCode,$reportFreq,$commentaryName)
=
($reportFileName =~
p/(\d+){0,1}(am[cpj])(\S+?)((?:[a-eg-z]){0,1}|f(?:[0-9]*)).prn(.*)/$1 $2
$3 $4 $5/);
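
For comparison, the three behaviours listed for p/// can be roughly
approximated in current Perl with an ordinary sub, since a match in
list context returns its captures. The sketch below assumes a helper
named parse_fields, which is hypothetical and made up for this example;
it is not an existing builtin and not the proposed operator itself, and
it has no replacement clause at all.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not part of Perl):
# applies $re to $string and returns the captured groups as a list.
# The caller's string is never modified, because m// does not assign.
sub parse_fields {
    my ($string, $re) = @_;
    # In list context a match returns ($1, $2, ...), or an empty list
    # if the pattern does not match.
    my @captures = $string =~ $re;
    return @captures;
}

my $reportFileName = '135amc4567m.prn_real-reprort.prn';

# Same regexp as above, precompiled with qr// so it can be passed around.
my $re = qr/(\d+){0,1}(am[cpj])(\S+?)((?:[a-eg-z]){0,1}|f(?:[0-9]*)).prn(.*)/;

my @fields = parse_fields($reportFileName, $re);
print join('|', @fields), "\n";
# prints: 135|amc|4567|m|_real-reprort.prn
```

What this does not cover is point 2: choosing or reordering backrefs in
a replacement clause, which is the part p/// would add.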

Comments?
-- 
Matthew O. Persico
    
"If you were supposed to understand it,
we wouldn't call it code." - FedEx

