[perl #114602] utf8 problems (still)...
From: Linda W
August 31, 2012 00:29
Message ID: 504067C9.email@example.com
Leon Timmermans wrote:
>>> When three or four clever people on this list tell you ....
>> yeah, I'm well aware this side of the problem:
>> Ability to learn and correct is inversely proportional to self perceived
>> cleverness and knowledge.
> Insulting people who are putting an effort into helping you is
> exceedingly aggravating. This habit of yours is downright abusive and
> simply not appropriate on this list, or any other place for that
It wasn't intended as an insult. It was presenting data backed by
evidence. It really wasn't intended personally -- it describes a group
of people who have certain characteristics -- most certainly those who
have a high self-perception of their ability, talent and/or knowledge.
It describes a tendency of *EVEN* those that ARE bright, to be blind to
their blind spots -- but it's much worse as the discrepancy increases
between what the person actually knows and what they think they know.
I'm sorry if you took it personally - it wasn't intended as such.
For myself, I am aware I know only a little about multiple things,
however, in my experience, the number of things and the amount usually
is enough to put most people to sleep if I go into it too much. But I
would definitely NOT think nor claim to know more than someone who is
truly a master of their field (usually I find that they are ones that
don't claim such -- but you find out in talking to them. Those that
repeatedly tell you -- are more often trying to convince you for
purposes of some point or argument).
Now anyone can take that as an insult if they want, or not. It's not
personally directed at anyone. But if it offends you, I would suggest
that perhaps it is touching on an uncomfortable truth.
> Stop playing the victim of a conspiracy, start taking responsibility
> for your own actions.
So far Nicholas Clark has been the only person who I'd say has been on
the level and truthful.
That doesn't mean I necessarily agree with his stance, but he is someone
with whom I **could** discuss the problem that I was trying to
show/demonstrate/discuss for the past week or more (months if you count
That he would 'get' what I was talking about in 1 response -- that's
someone who is able to communicate (bidirectionally) without me feeling
like someone is playing word games.
You (Leon) at least got what the essential function of 'P' was, without
me feeling like I was talking to people who had no clue of programming
or perl or had it bother them so much they couldn't understand the point
I was trying to make. I felt the details and exact implementation of P
would simply be another side track for discussion about its internals
when they had nothing to do with the point I was making.
And folks, given the directions this has gone off on -- when Nick summed
up the issue in 1 note -- you know, I'm right. Anything and everything was
picked at about how I said this or not having crossed a 't' or dotted
Eric went off on E9 vs. 0xe9... and my point wasn't about my thinking I
was writing "E9" vs. "\xe9", but that I was using chr(0xe9)... which I
would expect to produce different output than if I did a
printf("%c", 0xe9) (cf. printf("%c", chr(0xe9)) ).
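For anyone who wants to check the three calls named above, here is a minimal sketch. The variable names are only for illustration, and it uses sprintf rather than printf so the results can be inspected instead of written to a terminal. As far as I know, the first two calls build the same one-character string, while passing chr(0xe9) to %c is a different thing, because %c expects a number.

```perl
use strict;
use warnings;

# Both forms build a one-character string holding code point 0xE9.
my $from_chr     = chr(0xe9);
my $from_sprintf = sprintf("%c", 0xe9);

printf "same string: %s\n", ($from_chr eq $from_sprintf ? "yes" : "no");
printf "length: %d, code point: 0x%X\n", length($from_chr), ord($from_chr);

# %c wants a number. The string "\xe9" is not numeric, so it numifies
# to 0 and %c yields chr(0) instead of the original character.
my $from_string_arg = do {
    no warnings 'numeric';    # silence "isn't numeric" for this demo only
    sprintf("%c", chr(0xe9));
};
printf "code point from sprintf('%%c', chr(0xe9)): %d\n",
    ord($from_string_arg);
```

Whatever difference shows up on a terminal therefore comes from how the string is written to the handle, not from which of the first two calls built it.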
I could have posted the module, but I felt it would detract
focus from the essential issue: that perl -- instead of doing something
useful with output -- throws up a warning (and maybe even an error
Instead of throwing a warning on a wide char and then corrupting
output, it could do what it does for every OTHER wide char not in the
0x80-0xff range -- and put out its Unicode representation.
Whereas -- I knowingly, **for this example**, didn't set UTF-8 -- this
isn't a problem in most of my programs...
BUT it comes up frequently enough because there are many gotchas in
perl related to this problem.
Having perl knowingly do the wrong thing (as it does now), or having it
die altogether when it has a good idea of what the user wants -- cannot
be called something "serving backwards compatibility".
It's a warning that you want to make an error? How can that be backwards
compatible with any code?
I assert that people refer back to basic perl design philosophy: DWIM.
If this was cobol or fortran, I'd expect it to stay broken on
principle / standard. But being 'hard assed' and deliberately throwing
errors and warnings on output AND corrupting it to make sure they are
screwed -- rather than following perl's internal design that would normally
auto-convert to the right format.
my $a="42"; $b="43"; my $c=$a+$b; print "c=$c";
Do you get a warning for string to integer to string conversion?
It happens automatically.
Why generate a warning when printing a wide char out to a terminal -- why
not assume the user has a terminal that prints in unicode and just print it
like you do with the string? You don't print "Warning integer
encountered in string" or "strings encountered in addition".
"Perl is about helping you get from here to there with minimum fuss and
maximum enjoyment." What about generating warnings and then converting
output inconsistently is either?
"...One of the things that changes is how the community thinks Perl
should behave by **default** [emphasis mine]. (This is in conflict with
the desire for Perl to behave as it always did.)".... so added was
strict, threads came and morphed ... "Other things have come or gone.
Some experiments didn't work out and we took them out of Perl, replacing
them with other experiments. Pseudohashes, for instance..."... (Camel)
The point is perl changes and changing to default to Unicode would be a
move toward the future that wouldn't hurt compatibility -- as it's
already an "illegal case". I simply propose to make its UTF-8 output
consistent ACROSS its character set -- because right now, it throws out
a warning and only converts wide chars <0x100 (& >0x7f) to binary -- the
rest it's ALREADY PUTTING OUT IN UTF-8. So why the "deadzone" in 0x80-0xff?
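The "deadzone" can be seen with a short sketch along these lines. It assumes a handle with no encoding layer, and uses an in-memory handle instead of a terminal so the bytes can be inspected. The helper name bytes_written is made up for this example.

```perl
use strict;
use warnings;

# Print a string to an in-memory handle that has no encoding layer.
# Returns the bytes written (as hex) and the number of warnings raised.
sub bytes_written {
    my ($string) = @_;
    my $warnings = 0;
    local $SIG{__WARN__} = sub { $warnings++ };
    my $buffer = '';
    open(my $fh, '>', \$buffer) or die "cannot open in-memory handle: $!";
    print {$fh} $string;
    close($fh);
    my $hex = join ' ', map { sprintf('%02X', ord($_)) } split //, $buffer;
    return ($hex, $warnings);
}

my @cases = (
    [ 'chr(0xe9)',            chr(0xe9) ],                 # E9,          0 warnings
    [ 'chr(0x100)',           chr(0x100) ],                # C4 80,       1 warning
    [ 'chr(0xe9).chr(0x100)', chr(0xe9) . chr(0x100) ],    # C3 A9 C4 80, 1 warning
);

for my $case (@cases) {
    my ($label, $string) = @$case;
    my ($hex, $warned) = bytes_written($string);
    printf "%-22s bytes: %-12s warnings: %d\n", $label, $hex, $warned;
}
```

So the same character 0xe9 goes out as the single byte E9 on its own, but as the UTF-8 pair C3 A9 as soon as the string also holds a character above 0xff.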
It doesn't work without warnings in any program today. If some chase their
kneejerk reactions, it won't work at all -- so it CAN'T be for
What is the point?
to be something that tried to "Do what you meant" -- it was a stated
This isn't about compatibility -- as it already warns anyone who would
try to use the feature set the way I am describing it wouldn't be able
to without
On 30 August 2012 11:44, Linda W <firstname.lastname@example.org> wrote:
> demerphq wrote:
>> Let me say this once again, there is no bug.
> By bug you mean there is nothing there that isn't "intentional". I'm
> not disputing that.
>> And stop arguing with people that are trying to help you.
> By argue you mean try to get you to understand something that others have
> said they don't understand?
> By help me, what do you mean? Do you mean they are helping me to
> try to get perl to process characters uniformly on output by default?
> As the fact that perl does not do that by default is my problem. It doesn't
> generate UTF-8 for characters, it doesn't generate binary for charact
C) Unless you tell it otherwise, if you ask Perl to output a string
which is flagged as "unicode" and that string contains "wide
characters" which would require it to output octets whose values do
not correspond 1 to 1 with the codepoints of the unicode string, it
warns that it is doing so.
D) Absent requests to do otherwise, chr() returns a binary string
containing one octet for the range 0..255 and a unicode string for
codepoints of 256 and above. The internal representation of the
latter will be in UTF8 and will be multi-octet.