[perl #121292] perlunicode claims about a UTF-8 BOM in perl source are incorrect

From: Tony Cook
Date: February 21, 2014 00:42
Subject: [perl #121292] perlunicode claims about a UTF-8 BOM in perl source are incorrect
Message ID: rt-4.0.18-21210-1392943324-623.121292-75-0@perl.org
# New Ticket Created by  Tony Cook 
# Please include the string:  [perl #121292]
# in the subject line of all future correspondence about this issue. 
# <URL: https://rt.perl.org/Ticket/Display.html?id=121292 >



This is a bug report for perl from tony@develop-help.com,
generated with the help of perlbug 1.40 running under perl 5.19.10.


-----------------------------------------------------------------
[Please describe your issue here]

perlunicode claims:

=item C<BOM>-marked scripts and UTF-16 scripts autodetected

If a Perl script begins marked with the Unicode C<BOM> (UTF-16LE, UTF16-BE,
or UTF-8), or if the script looks like non-C<BOM>-marked UTF-16 of either
endianness, Perl will correctly read in the script as Unicode.
(C<BOM>less UTF-8 cannot be effectively recognized or differentiated from
ISO 8859-1 or other eight-bit encodings.)

i.e. that the following code, shown here as a hex dump:

00000000  ef bb bf 70 72 69 6e 74  20 22 54 65 73 74 5c 6e  |...print "Test\n|
00000010  22 3b 0a 70 72 69 6e 74  20 6f 72 64 28 27 ce a3  |";.print ord('..|
00000020  27 29 2c 20 22 5c 6e 22  3b 0a                    |'), "\n";.|
0000002a
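
To make the test case easy to reproduce, here is a minimal sketch that rebuilds the exact bytes from the hex dump. It is written in Python only so the bytes are unambiguous; the names SCRIPT and write_script are illustrative and not part of the original report.

```python
import codecs

# Bytes of test.pl, transcribed from the hex dump above.
SCRIPT = (
    codecs.BOM_UTF8                          # ef bb bf
    + b'print "Test\\n";\n'                  # first statement, ASCII only
    + b"print ord('\xce\xa3'), \"\\n\";\n"   # literal holds the bytes ce a3
)

def write_script(path="test.pl"):
    """Write the script to disk in binary mode so no bytes are altered."""
    with open(path, "wb") as fh:
        fh.write(SCRIPT)
    return len(SCRIPT)

# The dump ends at offset 0x2a, so the file should be 42 bytes long.
print(len(SCRIPT))
```

Calling write_script() and then running perl on the resulting file should give the same input as the one used in this report.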

should be treated as Unicode, which implies, to me at least, that the
BOM should act as an implicit C<use utf8;>.

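This doesn't happen. The script prints 206, the first byte (0xCE) of the
UTF-8 encoding of U+03A3, rather than the 931 that C<use utf8;> would give: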

tony@mars:.../git/perl2$ ./perl test.pl
Test
206
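
For comparison, the two candidate results can be worked out from the bytes alone. The sketch below is Python, not Perl, and only models the two ways of reading the string literal; the variable names are illustrative.

```python
# The two bytes between the quotes in the ord('...') call, from the hex dump.
LITERAL = b"\xce\xa3"

# Read as bytes, the literal is two one-byte characters,
# so ord() only ever sees the first of them.
as_bytes = LITERAL[0]

# Read as UTF-8, the literal is the single character U+03A3.
as_chars = ord(LITERAL.decode("utf-8"))

print(as_bytes)  # 206
print(as_chars)  # 931
```

The observed output matches the byte-wise reading, so the string literal is not being decoded as UTF-8 even though the file starts with a BOM.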

Is the documentation correct, or unclear, or is the behaviour
incorrect?

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=low
---
Site configuration information for perl 5.19.10:

Configured by tony at Fri Feb 21 10:14:52 EST 2014.

Summary of my perl5 (revision 5 version 19 subversion 10) configuration:
  Commit id: 3e63bed3c572617faf16446e7b44b5ea0b78e979
  Platform:
    osname=linux, osvers=3.2.0-4-amd64, archname=x86_64-linux
    uname='linux mars 3.2.0-4-amd64 #1 smp debian 3.2.46-1+deb7u1 x86_64 gnulinux '
    config_args='-des -Dusedevel -Uusedl'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.7.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='ld', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    libc=libc-2.13.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.13'
  Dynamic Linking:
    dlsrc=dl_none.xs, dlext=none, d_dlsymun=undef, ccdlflags=''
    cccdlflags='', lddlflags=''


---
@INC for perl 5.19.10:
    lib
    /usr/local/lib/perl5/site_perl/5.19.10/x86_64-linux
    /usr/local/lib/perl5/site_perl/5.19.10
    /usr/local/lib/perl5/5.19.10/x86_64-linux
    /usr/local/lib/perl5/5.19.10
    .

---
Environment for perl 5.19.10:
    HOME=/home/tony
    LANG=en_AU.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/tony/perl5/perlbrew/bin:/home/tony/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
    PERLBREW_BASHRC_VERSION=0.43
    PERLBREW_HOME=/home/tony/.perlbrew
    PERLBREW_PATH=/home/tony/perl5/perlbrew/bin
    PERLBREW_ROOT=/home/tony/perl5/perlbrew
    PERL_BADLANG (unset)
    SHELL=/bin/bash



