Re: [perl #95160] Unicode readdir bugs
From:
Brian Fraser
Date:
November 10, 2011 15:54
Subject:
Re: [perl #95160] Unicode readdir bugs
Message ID:
CA+nL+nYW_mat9ru1ikzmfytxzm4bsR3fS=Kdk_pievSwDSaUXg@mail.gmail.com
On Mon, Oct 24, 2011 at 1:25 AM, Father Chrysostomos via RT <
perlbug-followup@perl.org> wrote:
> > That's true, but consider which one of those has the actually useful
> > behavior. How many times have you gotten a "Wide character" warning that
> > left you with mostly worthless output, and had to rerun things by adding
> > the layers?
>
> Several hundred. But those were one-time one-liners.
>
> > Also, how often do you actually want to pass the internal form of UTF-8
> > to system calls? I'm not saying it can't happen, but it's certainly not
> > the common use case. On nearly every other occasion it's a bug that Perl
> > isn't reporting, and a warning in this case is twice as useless.
>
> I think we need to warn, for backward-compatibility. I know there have
> been times that I relied on UTF-8 interfaces accepting Unicode strings,
> without even realising what I was doing. My code worked, after all.
> Then module upgrades broke things, but only every tenth time or so that
> the code ran, so it remained buggy a long time.
>
> > I don't think it would cause any more breakage than when the Fcntl
> > constant subs became actual ()-prototyped constants. The only things that
> > "broke" were already broken, but Perl wasn't reporting it.
>
> That’s my thought, but actual smoke reports tend to sway me quickly.
>
>
Actually, how about a CPAN smoke of this? If the extent of the breakage is
reasonable, I'll personally send patches to all the affected modules : )
And as an added bonus, even if the core doesn't change to croak, it'll
improve the overall robustness of CPAN!
>
> The whole point of the unicode::filenames pragma is to eliminate the
> need to have to specify encodings everywhere, at least as I envision it.
> After all, Windows, VMS and Mac OS X all have character sequences for
> file names. I think some FreeBSDs might, too, but I’m not sure. So
> your explicit encoding suggestion just seems like a can of worms to me,
> which will doubtless be misused in CPAN modules by those who don’t
> really understand the issues.
>
>
Hm.. That's true enough. I was a bit wary of something automatically
picking the fs encoding for me, but then I noticed that the most common use
case of a pragma that had you explicitly set the encodings would be to load
a module to do exactly that! (e.g. the PerlIO::fse example in my previous
mail). Having that as the default seems reasonable.
Though it would be swell if it provided a way to override those defaults.
(Would you consider calling it unicode::syscalls or somesuch? ::filenames
implies it wouldn't affect, say, qx//)
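
For reference, the explicit approach that such a pragma would make unnecessary looks roughly like the sketch below. It assumes the filesystem wants UTF-8 bytes (the `$fs_encoding` variable is purely illustrative, as is the file name) and uses only Encode and File::Temp from core. What readdir hands back can differ from what was written on filesystems that normalize names, so no particular output is promised here.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use File::Temp qw(tempdir);

# Assumption: the filesystem expects UTF-8 encoded names.
my $fs_encoding = 'UTF-8';

my $dir  = tempdir(CLEANUP => 1);
my $name = "caf\x{E9}.txt";              # a character string, not bytes

# Encode by hand before the name reaches the system call...
my $bytes = encode($fs_encoding, $name);
open my $fh, '>', "$dir/$bytes" or die "open: $!";
close $fh;

# ...and decode by hand whatever readdir returns.
opendir my $dh, $dir or die "opendir: $!";
my @entries = map  { decode($fs_encoding, $_) }
              grep { !/^\.\.?\z/ } readdir $dh;
closedir $dh;

printf "%d entry, %d characters long\n",
    scalar @entries, length $entries[0];
```

Every open, opendir, readdir, unlink and so on needs the same treatment, which is the boilerplate the pragma is meant to remove.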
>
> My initial train of thought was a little muddled. In any case, if perl
> is to make multiple attempts to load the file, using different methods,
> ignoring any pragmata, then that concern is irrelevant. But how many
> attempts should perl be making?
>
> If some OSes use Aristotle’s approach, then we only need *two* attempts,
> and Zefram’s plan, although it would have been wonderful if 5.8 had
> implemented it, will have to be discarded.
>
Yeah, you are right. I don't think I fully understand Aristotle's proposal
(though many thanks to him for taking time to explain it to me on IRC), but
it seems pretty good. Now someone just has to write it : )
>
> There are already people using ‘use Mödule’ on OS X. We shouldn’t break
> their code.
>
That probably won't work for the latin-1 range though, and the lack of
normalization on our side, while the OS does it, is and will be
troublesome. But personally, I was thinking of exempting use/require/do for
the time being, for two main reasons: first, properly overriding/encoding
those is non-trivial, and second, it's not an issue that should matter to
people writing Perl; how perl finds its stuff should concern only (mostly)
perl.
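
To make the two problems above concrete, here is a small sketch using only core functionality (Unicode::Normalize, utf8::upgrade/utf8::downgrade and the bytes pragma). The name "Mödule" is taken from the quoted example; the assumption that the OS stores a decomposed form applies to HFS+-style filesystems and is not something this script checks.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

my $name = "M\x{F6}dule";                # "Mödule", ö is in the latin-1 range

# Normalization: the same name has two different character sequences.
my $nfc = NFC($name);                    # ö as one code point
my $nfd = NFD($name);                    # o followed by a combining diaeresis
printf "NFC length: %d, NFD length: %d\n", length $nfc, length $nfd;
printf "equal without normalizing: %s\n", $nfc eq $nfd      ? "yes" : "no";
printf "equal after NFC:           %s\n", NFC($nfd) eq $nfc ? "yes" : "no";

# Latin-1 range: equal strings can have different internal bytes, and the
# internal bytes are what a system call gets when nothing encodes explicitly.
my ($down, $up) = ($name, $name);
utf8::downgrade($down);                  # stored as latin-1, one byte for ö
utf8::upgrade($up);                      # stored as UTF-8, two bytes for ö
my $down_bytes = do { use bytes; length $down };
my $up_bytes   = do { use bytes; length $up };
printf "same string: %s, internal bytes: %d vs %d\n",
    $down eq $up ? "yes" : "no", $down_bytes, $up_bytes;
```

So a name typed in NFC will not compare equal to a decomposed name coming back from the OS unless one side is normalized, and a latin-1 name may or may not reach the OS as UTF-8 depending on how the string happens to be stored.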
>
> > More boilerplate for the boilerplate god?
>
> ???
>
>
Sorry, in-joke.