
NWCLARK TPF grant report #104

From: Nicholas Clark
Date: October 3, 2013 13:49
Subject: NWCLARK TPF grant report #104
Message ID: 20131003134944.GL4940@plum.flirble.org

[Hours]		[Activity]
2013/08/26	Monday
 1.50		RT #2968
 0.25		RT #55896
 0.50		\Q \E
 2.75		reading/responding to list mail
 2.50		untangling PAUSE permissions
=====
 7.50

2013/08/27	Tuesday
 0.50		CPAN client and modules removed from core.
 0.50		RT #119445
 1.00		RT #2968
 0.25		de-duping BSD Call Back Units
=====
 2.25

2013/08/28	Wednesday
 0.25		RT #119497
 4.50		RT #2968
 2.25		atoi
 1.75		atoi ($1, $2, $& etc)
 1.00		atoi (RT #81586)
 0.25		de-duping BSD Call Back Units
=====
10.00

2013/08/29	Thursday
 0.50		${^MPEN}
 0.25		CPAN #88258, CPAN #88259, CPAN #88260
 3.50		atoi ($1, $2, $& etc)
 0.50		atoi (RT #116118)
 1.00		reading/responding to list mail
=====
 5.75

2013/08/30	Friday
 0.50		LVALUE macros
 0.50		PerlIO_sprintf, PerlIO_vsprintf
 0.25		RT #116118
 1.00		RT #116907
 0.25		RT #119445
 0.50		RT #119515
 5.00		reading/responding to list mail
=====
 8.00

2013/09/01	Sunday
 0.75		RT #119123
=====
 0.75

Which I calculate is 34.25 hours.

Did you know that ${^MPEN} used to be treated as a synonym for ${^OPEN}?
No, and I didn't either. This was caused by a missing break; statement in
Perl_gv_fetchpvn_flags(), in the check for variables starting with ^M. What
should have happened is that if the name isn't ^MATCH it behaves like any
other unused multi-character control variable. What was going wrong was that
it was falling through into the code for variables starting with ^O, so the
check there for 'are the next 3 characters "PEN"' would pass, and hence the
odd behaviour. Now fixed in blead.
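
For anyone who wants to see the shape of that bug, here is a minimal
sketch in C. It is not the real Perl_gv_fetchpvn_flags(); the enum and
the classify_buggy()/classify_fixed() functions are hypothetical names
made up for illustration, and "name" is assumed to be the part of the
variable name after the ^.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type; not taken from the perl source. */
enum ctrl_var { CTRL_NONE, CTRL_MATCH, CTRL_OPEN };

/* Sketch of the bug: the 'M' case has no break. */
static enum ctrl_var classify_buggy(const char *name)
{
    assert(name != NULL);
    switch (name[0]) {
    case 'M':
        if (strcmp(name, "MATCH") == 0)
            return CTRL_MATCH;
        /* BUG: no break, so control continues into case 'O'. */
    case 'O':
        /* Only the characters after the first are compared. */
        if (strcmp(name + 1, "PEN") == 0)
            return CTRL_OPEN;
        break;
    }
    return CTRL_NONE;
}

/* Sketch of the fix: a name starting with M that is not MATCH is
 * treated like any other unused control variable. */
static enum ctrl_var classify_fixed(const char *name)
{
    assert(name != NULL);
    switch (name[0]) {
    case 'M':
        if (strcmp(name, "MATCH") == 0)
            return CTRL_MATCH;
        break;  /* the statement that was missing */
    case 'O':
        if (strcmp(name + 1, "PEN") == 0)
            return CTRL_OPEN;
        break;
    }
    return CTRL_NONE;
}
```

With these definitions classify_buggy("MPEN") reports CTRL_OPEN, while
classify_fixed("MPEN") reports CTRL_NONE.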

And this came about as a side effect of a bug report that Carp::confess loses
$! on Windows. Which it does, as it turns out, due to an unhelpful combination
of Microsoft's atoi() function being buggy (it resets errno. That's naughty),
and the regular expression match variables making a call to atoi() each time
they are read. No, I didn't know that. That seems mighty inefficient, so I
looked to fix it, hence I ended up also looking at the code for ${^MATCH}, and
spotted the above problem.
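
The errno interaction can be sketched like this. Everything here is
hypothetical: clobbering_atoi() is a stand-in that simulates a library
atoi() which resets errno, and index_from_name() stands in for the old
"convert the name on every read" behaviour. The save-and-restore wrapper
is shown only as a generic defensive technique; it is not the fix that
went into blead, which was to stop calling atoi() at run time.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a C library atoi() that resets errno. */
static int clobbering_atoi(const char *s)
{
    int n = atoi(s);
    errno = 0;  /* simulated misbehaviour */
    return n;
}

/* Old-style read: derive the index from the name on every access. */
static int index_from_name(const char *name)
{
    assert(name != NULL);
    return clobbering_atoi(name);
}

/* Generic workaround: preserve errno across the call. */
static int index_from_name_preserving_errno(const char *name)
{
    int saved;
    int idx;

    assert(name != NULL);
    saved = errno;
    idx = clobbering_atoi(name);
    errno = saved;
    return idx;
}
```

If errno holds EDOM before the call, it is 0 after index_from_name("1")
but still EDOM after index_from_name_preserving_errno("1"). The first
case is how a mere read of $1 could lose $!.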

As to $1 and friends...

Well, it turns out that since forever (perl 5.0 alpha 2) runtime lookup of
the numbered match variables has been implemented by finding their name (as
a string) and converting it to a number (the aforementioned call to atoi())
to get the array index to use. Which works, but sure doesn't feel like the
most efficient way of doing it (even if it didn't also cause strange side
effects due to buggy libraries). I think that it was done this way because
it's the same approach as used by all the other punctuation variables - they
all have magic type '\0' attached to them, and then the routines called for
reads and writes on such variables use switches based on the name to perform
the correct action.

It happens that the "name" is stored as a pointer, length pair. It also
turns out that nothing (else) relies on the name, and that the name is never
NULL. All this is useful. We have an integer field (the "length") and no
existing flag value of NULL. So it was a S.M.O.P. to change the
representation such that for match variables the pointer is now NULL, and
the length is used to store the array index, with the string to number
conversion done once at compile time.
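
A rough sketch of the two representations, in C. The struct is a
hypothetical, cut-down stand-in for the real magic structure (which has
more fields), and all of the function and enum names are invented for
illustration. Only the idea is taken from the description above: a NULL
pointer marks a match variable, and the length field carries its index.
The values 0 for $& and -1 for $' are the ones mentioned in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified magic entry: a (pointer, length) pair. */
struct magic_sketch {
    const char *name;  /* NULL means "len is a match-variable index" */
    long len;          /* name length, or the index when name is NULL */
};

/* Indices as described in the text: $& is 0, $' is -1. */
enum { IDX_FULLMATCH = 0, IDX_POSTMATCH = -1 };

/* Old representation: keep the name, convert it on every read. */
static struct magic_sketch make_named(const char *name)
{
    struct magic_sketch mg;
    assert(name != NULL);
    mg.name = name;
    mg.len = (long)strlen(name);
    return mg;
}

static long old_match_index(const struct magic_sketch *mg)
{
    assert(mg != NULL && mg->name != NULL);
    return atoi(mg->name);  /* string to number at run time */
}

/* New representation: no name, index converted once up front. */
static struct magic_sketch make_match_var(long index)
{
    struct magic_sketch mg;
    mg.name = NULL;
    mg.len = index;
    return mg;
}

static int is_match_var(const struct magic_sketch *mg)
{
    assert(mg != NULL);
    return mg->name == NULL;
}

static long new_match_index(const struct magic_sketch *mg)
{
    assert(is_match_var(mg));
    return mg->len;  /* plain integer read, no library call */
}
```

In this sketch make_match_var() would be called once, when the variable
is compiled, and every later read goes through new_match_index().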

More pleasingly *everything* seemed to come together nicely. Other routines
that handle magic (such as thread duping) already work smoothly if the
length is non-zero even though the pointer is NULL, so nothing needed
changing there. Not needing to store the name until runtime is a memory
saving. Thanks to some refactoring Dave did for the regex engine, *all* the
match variables are actually now implemented as integers - $& is 0, $' is -1,
etc, which means that they too can be implemented this way, not just $1
upwards. Which means that the C code is 23 lines shorter as a result
(although the object code is about the same size - you can't have it all).

Nicholas Clark


