develooper Front page | perl.perl5.porters | Postings from March 2012

Re: with malice aforethought (Re: Unicode cheatsheet for Perl)

Nicholas Clark
March 2, 2012 09:26
On Fri, Mar 02, 2012 at 02:59:01AM +0100, Christian Hansen wrote:
> 26 feb 2012 kl. 20:22 skrev Tom Christiansen:
> > Christian Hansen <> wrote
> >   on Tue, 21 Feb 2012 02:07:08 +0100:
> > 
> >>>> I would love for this to happen, I have advocated this on #p5p several
> >>>> times, but there is always the battle of  "backwards compatibility
> >>>> disease". About 10 months ago I reported a security issue regarding the
> >>>> relaxed UTF-8 implementation (still undisclosed and still exploitable)
> >>>> on the perl security mailing list.
> > 
> > Then we are currently in a security-through-obscurity situation, wherein
> > only overall ignorance of an exploit "protects" us.  That's not protection;
> > it's a vulnerability.  Would you estimate the vulnerability is severe
> > enough for us to consider, in this particular case, issuing patches
> > for old releases, such as a 5.12.5 or 5.10.2?
> The vulnerability is present from early releases of 5.8.X (I haven't confirmed all perl releases, but the implementation is the same). The vulnerability makes it possible to smuggle character strings (specially crafted for malicious purposes) through the :utf8 layer, which (in this case) bypass perl's regex engine (which fails the match/validation).
> Whether or not this is severe enough to patch older releases, I'll leave unsaid.

Yes. This is real, and needs fixing. It shouldn't be conflated with the
general documented problem that :utf8 is lax.

I believe (but haven't checked) that all the code here (the specific, and the
general) has been functionally unchanged since 5.8.0. ie - all 20 stable
releases in the past 10 years have the same behaviour.

As to the security reporting - this is my take:

With volunteers, when someone says that they will do something but they don't
get on with it, it's problematic. There's a fine line between nagging someone
enough to get them to do it, and nagging too far so that they resign, leaving no-one
to do it.

In this case, "ownership" of security was person A, who had delegated it to
person B. Person B wasn't *doing* it, and person A wasn't chasing them up.
Because two things failed, it dropped on the floor. As I don't know B well,
I asked A about it. But, personally, there's only so long I am going to
point out that it was on the floor, before I consider it pointless banging
my head. Also, specifically *I* am pointing out things on the floor, rather
than picking them up and pretending that there was never a problem, as

a) I've done most of these things before, and it's someone else's turn now
   Security fixes need new maint releases - I'm done with doing releases
b) I see it as more useful to Perl 5 long term to cause shorter term pain
   to fix the problems, than to pretend that they don't exist.

I can report that there is progress generally here - one of A and B above has
handed over their position to a new individual, who is being more active.

> >>> There is absolutely no need to remain compatible with security-related
> >>> bugs, and every reason not to.  Indeed, security is the only thing that
> >>> we ever issue patches to releases that are past their end-of-life support.
> I agree!

But not forever. The support policy states:

    We "officially" support the two most recent stable release series.
    5.10.1 and earlier are now out of support.  As of the release of 5.16.0,
    we will "officially" end support for Perl 5.12.4, other than providing
    security updates as described below.

    To the best of our ability, we will attempt to fix critical issues in
    the two most recent stable 5.x release series.  Fixes for the current
    release series take precedence over fixes for the previous release
    series.

    To the best of our ability, we will provide "critical" security patches
    / releases for any major version of Perl whose 5.x.0 release was within
    the past three years.  We can only commit to providing these for the
    most recent .y release in any 5.x.y series.

    We will not provide security updates or bug fixes for development
    releases of Perl.

    We encourage vendors to ship the most recent supported release of Perl
    at the time of their code freeze.

    As a vendor, you may have a requirement to backport security fixes
    beyond our 3 year support commitment.  We can provide limited support
    and advice to you as you do so and, where possible will try to apply
    those patches to the relevant -maint branches in git, though we may or
    may not choose to make numbered releases or "official" patches
    available.  Contact us at <> to begin that
    process.
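
To make the three-year rule concrete, here is a rough sketch in Python (not part of the policy text) that checks which release series would still qualify for "critical" security patches on a given day. The 5.x.0 release dates are quoted from memory of perlhist and should be treated as assumptions to verify, as should the function names and the reading of "within the past three years" as an inclusive bound.

```python
from datetime import date

# Assumed 5.x.0 release dates; check against perlhist before relying on them.
RELEASE_DATES = {
    "5.10": date(2007, 12, 18),
    "5.12": date(2010, 4, 12),
    "5.14": date(2011, 5, 14),
}


def add_years(d, years):
    """Return d shifted by whole years, mapping Feb 29 to Feb 28 if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def in_security_window(series, today, years=3):
    """True if the series' 5.x.0 release is within `years` years of `today`."""
    return today <= add_years(RELEASE_DATES[series], years)


if __name__ == "__main__":
    today = date(2012, 3, 2)  # the date of this mail
    for series in sorted(RELEASE_DATES):
        print(series, in_security_window(series, today))
```

Under those assumed dates, 5.10.x falls outside the window on the date of this mail, while 5.12.x and 5.14.x are inside it.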

> It's quite easy, we need a Benevolent Dictator, such as Larry
> Wall. Someone who can make the tough calls. Personally I think we should
> just implement Unicode as most people expect it to work (according to the
> Unicode standard).

Tom's conclusion from his talk at OSCON last year was that Perl 5's Unicode
support is generally better than that of *every* other language. You seem to be
choosing words that make it sound as if this is not the case.

And yes, Ricardo is acting as dictator and *has* made a decision. The problem
is that people don't like it when decisions go against them. And decisions
can't be in everyone's preferred direction, as if everyone agreed, a decision
wouldn't have been needed.

Please don't confuse the specific security issue with the general documented
:utf8 laxness. The specific security issue damn well should be a release
blocker.

Ricardo's decision is that delaying 5.16 to fix the general *known*
*documented* *ten year old* laxness of :utf8 doesn't actually help get the
fix into the hands of end users any faster, but does deny them every other
bug fix currently in blead. As Ricardo stated in his e-mails, particularly
this one which no-one commented on, and no-one disputed:

* if Ricardo declares that something *is* a release blocker, then it blocks
  the release
* There isn't a pool of programmers he can direct to fix it
* He isn't able to fix it himself
* No-one was able to promise to deliver a fix in a timely fashion

result - the release would block, potentially forever.

Right now, the current stable version of perl is 5.14.2. It has lax :utf8.
If Ricardo chooses to *delay* 5.16.0, the current stable version still has
the bug.

Shipping a 5.18.0 as soon as we have it fixed gets the fix out there in
the *same* timeframe as delaying 5.16.0 until the fix is done.

I think that a lot of people reading this list don't realise how *few* people
are actually contributing *code* to the repository. Most commits are from
about half a dozen people. Which 6 people changes from month to month, and it
might be fairer to express it as 4 + 4 * 0.5, but it's surprisingly *low*,
given all the traffic on this list.

It's been like that for about 10 years, as can be seen from the graphs on

The graphs are noisy, but even so there's no obvious change point around the
switch from perforce to git, or around the start of regular blead releases.
Both *were* useful changes for other reasons, but haven't changed the
contributor makeup.

> What happened to the Perl mantra "Making Easy Things Easy and Hard Things Possible"?

I don't see what relevance that has here.

> > It seems to me that Python went through a transition where encoding-decoding
> > errors changed from some sort of non-fatal to proper exceptions.  I don't know
> > what sort of conniptions they experience there, since it's not a backwards-
> > contemptible change.  But it doesn't have to be b-c, and probably shouldn't be.
> > Jarkko is right.

I think that yes, we should make various things fatal. Including "wide
character in print" and UTF-8 laxness in the parser. But the codebase is
messy, and it all takes time.
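
For comparison, this is roughly what "fatal" looks like in a strict decoder. It is a minimal Python 3 sketch, not Perl, and not a description of how a fixed :utf8 layer would be implemented; the helper names are made up for illustration. The input is the classic overlong two-byte encoding of "/", which a conforming UTF-8 decoder has to reject.

```python
# Overlong (ill-formed) two-byte encoding of "/" (U+002F).
OVERLONG_SLASH = b"\xc0\xaf"


def decode_strict(data):
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")


def decode_replacing(data):
    """Decode UTF-8, substituting U+FFFD for ill-formed bytes (non-fatal)."""
    return data.decode("utf-8", errors="replace")


def is_well_formed(data):
    """True if data is well-formed UTF-8 according to the strict decoder."""
    try:
        decode_strict(data)
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("caf\u00e9".encode("utf-8")))  # well-formed input
    print(is_well_formed(OVERLONG_SLASH))               # ill-formed input
    print(ascii(decode_replacing(OVERLONG_SLASH)))      # never yields "/"
```

The point of the strict path is that the bad bytes never turn into a "/" that later validation code failed to see; the replacing path shows the non-fatal alternative, which still does not let the smuggled character through.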

> What you are saying is correct, Python supports two different compile options, UCS2 and UCS4. Our case is worse, we support two different internal encodings depending on platform: on EBCDIC we use UTF-EBCDIC and on US-ASCII platforms we use a relaxed UTF-8 encoding.
> Just to cut to the shit, there seems to be a group of people who like EBCDIC, but so far we haven't heard from anyone with these facilities.
> Why are we trying to support two different encodings when we can barely support the proper one?

It is politically awkward to kill EBCDIC support. Given that (so far) neither
the EBCDIC user base nor IBM have actually come good on delivering help, they
are not doing themselves any favours. (There was one individual who volunteered
to test build, but he's gone cold. Ricardo is chasing him). At some point,
soon*er* rather than later, we will lose all patience with them.

However, *Jarkko* is also against removing EBCDIC support. He rightly points
out that it would be very hard to add it back once it's removed. I can also
understand his personal concern here - he put a lot of personal effort into
getting it working, and we're wanting to throw it away. I have a lot more time
for Jarkko than for the talk-but-no-action EBCDIC user base, as even now
Jarkko is still responsible for 23% of the Perl 5 distribution, as the
ohloh link above shows.

But I believe that even Jarkko only had access to an EBCDIC system for a short
while whilst he was working on things, and no longer has access, so I don't
think that he could help *code* here, even if he had the time.

We're not going to kill EBCDIC support for 5.16.0. But at the current rate of
non-responsiveness from its userbase, its days *are* numbered.

Nicholas Clark
