develooper Front page | perl.perl5.porters | Postings from January 2010

Re: [PATCH] New force_package_version uses scan_version

Thread Previous | Thread Next
From:
David Golden
Date:
January 10, 2010 19:53
Subject:
Re: [PATCH] New force_package_version uses scan_version
Message ID:
5d4beb41001101953v2c0dd20bmf520046b85f103f@mail.gmail.com
Hi John and P5P.

I've had a chance to read the code but not really play with it or test
it, but I'm very, very short on bandwidth so wanted to get you some
feedback sooner rather than waiting until I can do more with it. For
the same reason, I apologize for providing just text feedback and not
patches.

Overall -- I think is on the right track, particularly because it
appears to be completely independent of how version numbers are parsed
for "use NAME VERSION", which was the approach we all discussed for
Perl 5.12.  Though I think I found a few minor bugs, in general I
don't think this will break anyone's existing code.

Before I get into my list of nits, I want to play back my
understanding of the various code paths for version number strings.

(1) "use NAME VERSION" -- the version part is parsed by
S_force_version (which is unchanged by John's patch), which I think
results in a dual-var

(2) "package NAME VERSION" -- the version parsed by this calling chain:

  * S_force_package_version
    - calls Perl_prescan_version (in 'strict' mode)
    - calls Perl_scan_version
      * calls Perl_prescan_version (in 'lax' mode)
      * constructs a version object

(3) version->new, qv(), and UNIVERSAL::VERSION all call either
Perl_new_version or Perl_upg_version.  Perl_new_version itself calls
Perl_upg_version.  Perl_upg_version calls Perl_scan_version, which
follows the chain above in (2) to result in a version object

This means that (1) is unchanged, (2) is strict and (3) is lax --
albeit subject to the new, more consistent "lax" rules.  It also means
that "package NAME VERSION" assigns a version object to $NAME::VERSION
rather than a strict, but I don't think that's a big deal.  It might
even be smart, since any naive comparisons will get properly
overloaded operators.

What bothers, however, is that in (2), above, prescan_version gets
called twice, once in strict mode (by S_force_package_version) and
again in lax mode (by Perl_scan_version).  If the idea is that
eventually in Perl 5.14 or 5.16, Perl_scan_version will switch from
calling Perl_prescan_version in lax mode to strict mode and all three
paths will route through Perl_scan_version, then this is just a
temporary issues and I can live with it.

Separately, I'd like to see a major change to the testing.  I like
that John has split out a new file for version number testing, but
what I would like to see is for each test string to have a "pass/fail"
for each of strict and lax modes.  If either fails, the error string
should be the same.  E.g.

    v1.2    fail    pass    Invalid version format...
    v1.2.3 pass pass

Each string should be tested against the strict & lax regexes in
version.pm.  Strictness should also be tested with 'package NAME
VERSION' and laxness should be tested with version->new.  I suspect
that laxness should also be tested against "use NAME VERSION", except
those rules are yet different and might need a new fail/pass column
just for that.  In summary -- I'd like to have a really good list of
candidate version strings and a test that actually shows what we
expect each of the three paths above to do with them.  That will make
it *much* easier to track down bugs and catch regressions in the
future.

Now, on to the list of nits/issues.

* I don't like the name 'S_force_package_version'.  I'd rather call it
'S_force_strict_version' since that describes its effect rather than
how it happens to be used

* Rather than defining is_VERSION(a,b) as an alias to
Perl_prescan_version, I'd rather have an explicit is_STRICT_VERSION(a)
as an alias that sets the strict flag and is_LAX_VERSION(a) as an
alias that clear the strict flag.  Otherwise is_VERSION(s, TRUE) has
an obscure extra TRUE that is inscrutable unless knows how the macro
is defined.

* In the version.pm strict/lax regular expressions, there are several
repeated sections. (e.g. non-leading-zero integer).  I'd would be
easier to see the differences between regexes if those common parts
could be refactored out.

* In the lax regexes, the trailing alpha stanza has a star (*)
quantifier and I think it should have a question mark(?)

* I think I found two test cases that Perl_prescan_version doesn't
handle well -- though I admit I didn't actually test them:  v.2.3.4
and v1.2345.6

* For the error messages, I have mixed feelings about the phrase
"Invalid strict version format" as it might mislead people to thing it
has something to do with strict.pm.  I think just "Invalid version
format" is fine.

* I think the error message "(v1.2.3 required)" should be
"(dotted-decimal versions need at least three parts)"

* S_force_package_version is commented with copy/paste from
S_force_version that mentions "guessing" but has no such parameter, so
that comment needs to be re-written

* Several of the errors found in Perl_prescan_version warn with
Perl_ck_warner and others die with Perl_croak.  I don't understand
why.  Also, the actual error output looks pretty clumsy.  I think we
need better diagnostic output.  E.g.

    $ ./perl -wE 'package Foo v1'
    Invalid strict version format (v1.2.3 required) at -e line 1.
    syntax error at -e line 1, near "package Foo "
    syntax error at -e line 1, near "package Foo v1"
    Execution of -e aborted due to compilation errors.

I'm sure there's a way to avoid those repeated "syntax error" messages.

* strict dotted-decimal section needs to check that there is at least
one digit after 'v' before "while (isDIGIT(*d)) d++".  (e.g. "package
Foo v.1.2.3" is an error but the message isn't as clear as it should
be).

* I think the strict dotted-decimal section for "max 3 digits between
decimals" is breaking on j=3 when it should instead be croaking with
an error when j>3 (e.g. "package Foo v1.2345.6" succeeds when it
should be an error)

* very minor nit -- rather than "goto version_saw_decimal", I think it
would be clearer as "goto decimal_mantissa"

* I don't understand what's going on with width and *swidth, but I'm
going to guess it's part of the legacy API so we just stick with it.

* I don't understand the point of adding HvSHAREKEYS_on(hv), so I
can't comment one way or the other about it.

I hope these comments help.  Generally, I'd like to see revised
testing and some of the nits addressed and then have the patch applied
for 5.12.  I think it's a nice step forward towards formalizing the
behaviors and laying the groundwork for a consistent code path for
dealing with version numbers.

-- David

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About