perl.perl5.porters | Postings from August 2013

[perl #117355] [lu]cfirst don't respect 'use bytes'

From: Victor Efimov via RT
Date: August 16, 2013 09:41
Subject: [perl #117355] [lu]cfirst don't respect 'use bytes'
Message ID: rt-3.6.HEAD-2552-1376646101-1691.117355-15-0@perl.org
UPD:
encoding::warnings is indeed broken (the bug was opened 5 years ago):
https://rt.cpan.org/Public/Bug/Display.html?id=33989
and it cannot be fixed.

On Thu Aug 15 17:45:08 2013, vsespb wrote:
> On Thu Aug 15 15:46:39 2013, aristotle wrote:
> 
> > That is really the last remnant (I think) of The Unicode Bug.
> 
> More precisely, the "When Unicode Does Not Happen" section of
> http://perldoc.perl.org/perlunicode.html
> 
> > And however useful it may be while the bug persists, a workaround is
> > all it is. It *isn’t* a legitimately good use case for bytes.pm.
> 
> Yes, workarounds are needed until the issue is fixed.
> 
> And probably until all CPAN code that misuses UTF-8 is fixed. I found
> several examples (some of them are never going to be fixed, because the
> authors refuse to do so).
> 
> https://rt.cpan.org/Public/Bug/Display.html?id=87863
> https://rt.cpan.org/Public/Bug/Display.html?id=87807
> https://rt.cpan.org/Public/Bug/Display.html?id=30271
> https://github.com/akarelas/xml-myxml/issues/2
> 
> 
> > I have no idea what the concept of assertions or that of unit tests
> > has to do with the internal representation of strings
> 
> > OK, to cut a long story short, the line is
> > ...
> 
> Exactly. Your explanation is correct.
> 
> > This is not a bug, though it certainly is suboptimal.
> 
> Of course I agree that this is a feature, not a bug.
> The point was that it's suboptimal.
> That is why I need to check for it in assertions and unit tests (using it
> in "unit tests" goes against what bytes.pm says: "use only for debugging
> purposes").
> 
> > encoding::warnings
> 
> It seems like a great module, or at least a great idea. However, in my
> case it does not work, or I have misunderstood its usage: it does not
> catch the error and actually silently fixes the "Unicode bug" with
> filenames in perl, i.e. the program behaves differently with this pragma.
> 
> =======================
> use Encode;
> use utf8;
> use strict;
> use warnings;
> use bytes (); # load bytes::length() without enabling byte semantics
> my $u = "\x{442}\x{435}\x{441}\x{442}"; # same as "тест"
> my $bin = "\xf1\xf2\xf3";
> my $ascii = "x";
> my ($ascii_u, undef) = split(/ /, "$ascii $u");
> 
> print "original bin length:\t";
> print length($bin) . "\t" . bytes::length($bin) ."\n";
> 
> my $bin_a = do {
>  use encoding::warnings 'FATAL';
>  $bin.$ascii;
> };
> 
> print "bin_a length:\t";
> print length($bin_a) . "\t" . bytes::length($bin_a) ."\n";
> 
> my $bin_u = $bin.$ascii_u; # THIS LINE CONTAINS A BUG
> 
> die unless $bin_u eq $bin_a;
> print "bin_u and bin_a are same!\n";
> 
> 
> use Devel::Peek;
> Dump $bin_a; Dump $bin_u;
> 
> open my $f, ">", "$bin_u.tmp" or die "cannot create file: $!";
> binmode $f;
> syswrite $f, "TEST";
> close $f;
> 
> open $f, "<", "$bin_a.tmp" or die "file not found $!";
> =======================
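
A minimal sketch of why the two files in the example above can end up with different names even though the strings compare equal. It assumes that perl passes the string's internal buffer to the OS when opening a file, which is my understanding of the filename "Unicode bug" discussed here. `internal_octets` is a hypothetical helper introduced only for this illustration, and `utf8::upgrade` stands in for the `split` in the example as the way of getting a UTF-8-flagged ASCII string.

```perl
use strict;
use warnings;

# Hypothetical helper: return the octets perl stores internally for $s.
# For an upgraded (UTF-8 flagged) string this is its UTF-8 encoding;
# for a non-upgraded string it is the bytes themselves.
sub internal_octets {
    my ($s) = @_;                              # work on a copy
    utf8::encode($s) if utf8::is_utf8($s);
    return $s;
}

my $bin     = "\xf1\xf2\xf3";
my $ascii   = "x";
my $ascii_u = "x";
utf8::upgrade($ascii_u);        # assumption: stands in for the split() result

my $bin_a = $bin . $ascii;      # stays a plain byte string
my $bin_u = $bin . $ascii_u;    # silently upgraded by the concatenation

printf "equal as strings: %s\n", ($bin_a eq $bin_u ? "yes" : "no");  # yes
printf "bin_a octets: %s\n", unpack("H*", internal_octets($bin_a));  # f1f2f378
printf "bin_u octets: %s\n", unpack("H*", internal_octets($bin_u));  # c3b1c3b2c3b378
```

The `eq` comparison works on characters, so it cannot see the difference; only the internal representation differs.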
> 
> > because if you try to catch it manually, you will miss places where
> > you would need to put checks
> > Also, if you *already know* (some of) the places
> 
> That's the idea of unit tests: catching bugs in known places.
> 
> > utf8::downgrade($bin, 1);
> > utf8::downgrade($ascii_u, 1);
> > my $bin_u = $bin.$ascii_u; # THIS LINE NO LONGER CONTAINS A BUG
> 
> Yes. This is a fix for the bug. Now I need to unit test the fix with
> bytes::length or encoding::warnings (i.e. the practice of writing tests
> after a bug is found).
> 
> > This is exactly the same bug as in your first comment on this issue
> 
> Yes, I reposted a slightly reworked example when answering another comment.
> 
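
A rough sketch of what a unit test for the `utf8::downgrade` fix quoted above might look like, using `bytes::length` and `utf8::is_utf8` as the checks. The test names and the use of `utf8::upgrade` to produce the UTF-8-flagged input are assumptions made for illustration; they are not taken from real code.

```perl
use strict;
use warnings;
use bytes ();                   # load bytes::length() only, no byte semantics
use Test::More tests => 3;

my $bin     = "\xf1\xf2\xf3";
my $ascii_u = "x";
utf8::upgrade($ascii_u);        # assumption: simulate a UTF-8 flagged ASCII string

# The fix under test: downgrade both operands before concatenating.
# The second argument (1) makes downgrade return false instead of dying
# if the string contains characters above 0xFF.
utf8::downgrade($bin, 1);
utf8::downgrade($ascii_u, 1);
my $bin_u = $bin . $ascii_u;

ok(!utf8::is_utf8($bin_u), "result is not upgraded");
is(bytes::length($bin_u), length($bin_u), "byte length equals character length");
is(bytes::length($bin_u), 4, "three binary bytes plus one ASCII byte");
```

Comparing `bytes::length` with `length` only detects an upgrade when the string contains characters in the 0x80..0xFF range, so the `utf8::is_utf8` check is kept as a separate assertion.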




---
via perlbug:  queue: perl5 status: open
https://rt.perl.org:443/rt3/Ticket/Display.html?id=117355
