Re: Don't patch perlopentut: rewrite it completely

From: Tom Christiansen
Date: February 17, 2013 17:22
Subject: Re: Don't patch perlopentut: rewrite it completely
Message ID: 24763.1361121720@chthon
In-reply-to: message dated "Sun, 17 Feb 2013 16:14:44 +0100."
    Aristotle Pagaltzis <pagaltzis@gmx.de>

>* Tom Christiansen <tchrist@perl.com> [2013-02-17 02:35]:
>> See what I mean? Short, simple; no fancy stuff. Just the facts, ma'am.

> ++

> Excellent reframing.

Thank you.

> * Konovalov, Vadim <vadim.konovalov@alcatel-lucent.com> [2013-02-17 07:10]:
>> * no more IO Layers section, fortunately.

>  Is this really a problem? Encodings are mentioned throughout, though
>  no formal explanation of the mechanism is given. But that seems like
>  a feature to me here. The only other thing I can think of is that it
>  seems necessary to mention :crlf somewhere. Even so I see no necessity
>  for a formal introduction to layers in order to mention it, though.

I thought about that.  I'm not sure of the answer, but I suspect that you
might be right.  Lately I've been seeing very strange mixes of things, like
:crlf on :utf8 files:

    open(INPUT, "< :utf8 :crlf", $filename)

I've also been told by someone who ought to know that one needs to use

    open(OUTPUT, "> :raw :encoding(UTF-16LE) :crlf", $filename)

for certain sorts of files.  It is very complicated, but it appears to be
something that many, and perhaps even many-many, people will need to do.
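
For illustration, here is a minimal sketch of that second open, using a
lexical filehandle and a scratch file.  The temporary file, the sample text,
and the variable names are assumptions made for this example only.  On
output the layers apply from right to left, so :crlf turns each "\n" into
"\r\n" before :encoding(UTF-16LE) turns the characters into bytes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration (hypothetical; any writable path would do).
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh);

# :raw removes any platform default translation, :encoding(UTF-16LE) encodes
# characters as UTF-16LE, and :crlf (topmost) rewrites "\n" as "\r\n" first.
open(my $out, "> :raw :encoding(UTF-16LE) :crlf", $filename)
    or die "can't open $filename for writing: $!";
print {$out} "hi\n";
close($out) or die "can't close $filename: $!";

# Read the bytes back untranslated to see what actually landed on disk.
open(my $in, "< :raw", $filename)
    or die "can't open $filename for reading: $!";
my $bytes = do { local $/; <$in> };
close($in);

# Show each byte in hex.
print join(" ", map { sprintf("%02X", ord($_)) } split(//, $bytes)), "\n";
```

With the layers behaving as described, this should print
68 00 69 00 0D 00 0A 00: two-byte little-endian code units, with the line
ending as CR LF.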

BUG 0
=====
There is some real confusion between :raw and :bytes, and binmode
and layers, and the open mode vs binmode.  That's a sneaky way of 
saying that it confuses *me*.  The interplay between buffering
and unbuffering, the crlf translation, the actual encoding
of binary data, and the fact that you can apply layers on
sysread/syswrite make for a dizzying set.
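
One way to see what is actually on a handle at any moment is to ask for its
layer stack.  The sketch below is only an aid to poking at the problem, not
a resolution of it: it uses PerlIO::get_layers on a scratch file (the file
and variable names are assumptions for the example), and the exact lists
printed depend on the platform and on how perl was built.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $filename) = tempfile(UNLINK => 1);   # hypothetical scratch file

# Layers on a freshly opened handle; the list is platform dependent.
my @default = PerlIO::get_layers($fh);
print "default:    @default\n";

# binmode with no layer argument is the same as binmode($fh, ":raw").
binmode($fh) or die "binmode failed: $!";
my @raw = PerlIO::get_layers($fh);
print "after raw:  @raw\n";

# Pushing an encoding layer afterwards changes the stack again.
binmode($fh, ":encoding(UTF-8)") or die "binmode failed: $!";
my @encoded = PerlIO::get_layers($fh);
print "after enc:  @encoded\n";

close($fh);
```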

BUG 1
=====
We have no input layer that reads \R-terminated lines, nor is there
anything you can set $/ to that would effect this.  You have to read the
entire file in and either split it on \R or otherwise remove them yourself.
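
The workaround just described can be sketched as follows.  The helper name
read_lines_any_eol and the in-memory sample data are made up for the
example; it assumes Perl 5.10 or later for \R and a handle that already
yields decoded text.

```perl
use strict;
use warnings;
use 5.010;    # \R needs Perl 5.10 or later

# Hypothetical helper: slurp everything from $in and return the lines
# without their terminators, whichever line ending each one uses.
sub read_lines_any_eol {
    my ($in) = @_;
    my $text = do { local $/; <$in> };   # undef $/ reads the whole file
    return () unless defined $text;
    # \R matches "\r\n" as a single unit, as well as a lone "\n" or "\r".
    # Note that split drops trailing empty fields, so blank lines at the
    # very end of the input are not returned.
    return split /\R/, $text;
}

# Mixed line endings in one in-memory "file".
my $data = "one\r\ntwo\nthree\rfour\n";
open(my $in, "<", \$data) or die "can't open in-memory file: $!";
my @lines = read_lines_any_eol($in);
close($in);

print scalar(@lines), " lines: @lines\n";   # 4 lines: one two three four
```

This is precisely the cost being complained about: the whole file has to sit
in memory before the first line is available.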

This is too hard, and it should be trivial.

We should have a \R layer, and it should perhaps be the default for 
text input. It should certainly play well and get along with chomp.

BUG 2
=====
Speaking of not playing well and getting along with others, the
:encoding(UTF-8) layer does not behave correctly with respect to utf8
warnings and the three subclasses thereof.  That is why I will not use it
myself: I want to control all my warnings and exceptions via the use
warnings mechanism.  That means I use :utf8 and use warnings "utf8", maybe
with fatals and subclasses, as needed.
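
A minimal sketch of that setup, for concreteness.  The scratch file and its
contents are assumptions for the example; the input here is well-formed, and
the expectation (not demonstrated below) is that malformed input would
surface as an exception at the read because the utf8 category is fatal.

```perl
use strict;
use warnings;
use warnings FATAL => 'utf8';   # utf8 warnings, subclasses included, become exceptions
# The three subclasses are 'surrogate', 'nonchar', and 'non_unicode';
# each can be tuned on its own, for example:  no warnings 'nonchar';
use File::Temp qw(tempfile);

# Write a small, valid UTF-8 file as raw bytes (hypothetical sample data).
my ($tmp, $filename) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "caf\xC3\xA9\n";
close($tmp) or die "can't close $filename: $!";

# Read it back through :utf8, trapping anything the fatal warnings throw.
open(my $in, "< :utf8", $filename)
    or die "can't open $filename: $!";
my $line = eval { scalar <$in> };
die "trouble reading $filename: $@" unless defined $line;
close($in);

chomp $line;
print length($line), " characters read\n";   # 4 characters read
```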

--tom
