
[perl.git] branch smoke-me/khw-encode created.v5.27.8-158-gbcbdaff88b

From:
Karl Williamson
Date:
February 4, 2018 05:38
Subject:
[perl.git] branch smoke-me/khw-encode created.v5.27.8-158-gbcbdaff88b
Message ID:
E1eiD0O-0000cE-Bh@git.dc.perl.space
In perl.git, the branch smoke-me/khw-encode has been created

<https://perl5.git.perl.org/perl.git/commitdiff/bcbdaff88b3e6635466de0f0be4d181d9cb870b1?hp=0000000000000000000000000000000000000000>

        at  bcbdaff88b3e6635466de0f0be4d181d9cb870b1 (commit)

- Log -----------------------------------------------------------------
commit bcbdaff88b3e6635466de0f0be4d181d9cb870b1
Author: Karl Williamson <khw@cpan.org>
Date:   Sat Feb 3 22:16:39 2018 -0700

    perlapi: Rmv nonapplicable text

commit 61164ba99bc2c9292bce1d02bf55208a79f438a5
Author: Karl Williamson <khw@cpan.org>
Date:   Sat Feb 3 22:15:51 2018 -0700

    regcomp.c: Fix comment

commit 77e891ae507d65b947e1ecfa3dcd01e1da5c82f4
Author: Karl Williamson <khw@cpan.org>
Date:   Sat Feb 3 22:14:22 2018 -0700

    Add uvchr_to_utf8_flags_msgs()
    
    This is prompted by Encode's needs.  When called with the proper
    parameter, it returns any warnings instead of displaying them directly.
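
The "return the warnings instead of displaying them" pattern can be sketched in plain C. This is only an illustration under stated assumptions: `encode_cp_msgs` and its `const char **msg` parameter are hypothetical and are not the real perlapi signature of uvchr_to_utf8_flags_msgs(); the set of warnings and the refusal to encode anything above U+10FFFF are also simplifications made for the sketch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sketch, NOT the real perlapi function.
 * Encodes cp as UTF-8 into d (room for 4 bytes) and returns the number of
 * bytes written, or 0 if cp is above U+10FFFF.
 * If msg is NULL, any warning is printed to stderr.
 * If msg is non-NULL, nothing is printed; *msg is set to the warning text,
 * or to NULL when there is no warning. */
size_t encode_cp_msgs(unsigned char *d, unsigned long cp, const char **msg)
{
    const char *warning = NULL;
    size_t n = 0;

    if (cp > 0x10FFFF) {
        warning = "code point above U+10FFFF";
    } else {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            warning = "surrogate code point";
        else if ((cp & 0xFFFE) == 0xFFFE || (cp >= 0xFDD0 && cp <= 0xFDEF))
            warning = "non-character code point";

        if (cp < 0x80) {
            d[0] = (unsigned char)cp;
            n = 1;
        } else if (cp < 0x800) {
            d[0] = (unsigned char)(0xC0 | (cp >> 6));
            d[1] = (unsigned char)(0x80 | (cp & 0x3F));
            n = 2;
        } else if (cp < 0x10000) {
            d[0] = (unsigned char)(0xE0 | (cp >> 12));
            d[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            d[2] = (unsigned char)(0x80 | (cp & 0x3F));
            n = 3;
        } else {
            d[0] = (unsigned char)(0xF0 | (cp >> 18));
            d[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            d[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            d[3] = (unsigned char)(0x80 | (cp & 0x3F));
            n = 4;
        }
    }

    if (msg)
        *msg = warning;                 /* hand the warning back to the caller */
    else if (warning)
        fprintf(stderr, "warning: %s\n", warning);
    return n;
}
```

A caller that wants to format or suppress diagnostics itself passes a non-NULL `msg` and inspects it afterwards; a caller that passes NULL gets the older print-directly behaviour.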

commit b1076691529a620b138c3bfd06e3345a00d465e5
Author: Karl Williamson <khw@cpan.org>
Date:   Sat Feb 3 22:09:28 2018 -0700

    APItest:t/utf8_warn_base.pl: Clarify some comments

commit 5d468350cd3d2c8ac6abbe176a745e06eb1bd371
Author: Karl Williamson <khw@cpan.org>
Date:   Sat Feb 3 22:06:35 2018 -0700

    APItest:t/utf8_warn_base.pl: Move a variable outside sub()
    
    This is in preparation for a future commit which will want to refer to
    this variable independently.

commit e324119d3da95fca4830533c404406ce0c5a8180
Author: Karl Williamson <khw@cpan.org>
Date:   Fri Feb 2 11:38:29 2018 -0700

    APItest:t/utf8_warn_base.pl: Fix 'ok' tests
    
    The condition passed to ok() was inside a string.  A non-empty string
    is always true, so these tests always succeeded.

commit 4a315706abca84c03d1bbff11b9c53ac74a8cf1e
Author: Karl Williamson <khw@cpan.org>
Date:   Fri Feb 2 10:43:33 2018 -0700

    utf8.c: Extract code into separate function
    
    This is in preparation for the next commit which will use this code in
    multiple places

commit 82c7742885ea8c1156f4b9603c11919b6c38734d
Author: Pali <pali@cpan.org>
Date:   Wed Sep 13 00:30:29 2017 +0200

    Rewrite encode, decode, encode_utf8, decode_utf8 and from_to functions to XS

commit a5793e3eab5a7ffff7e63db394afac308faa2c1e
Author: Karl Williamson <khw@cpan.org>
Date:   Thu Dec 28 14:57:22 2017 -0700

    encengine.c: Properly indent code within blocks
    
    This makes it much more legible

commit 649decd43c4859928e217351db0e8675093a8860
Author: Karl Williamson <khw@cpan.org>
Date:   Thu Dec 28 14:29:43 2017 -0700

    Speed up UTF-8 validation checking on modern perls
    
    Perl 5.26 introduced infrastructure in the core that can be used by
    Encode to check UTF-8 stream validity much faster than before.
    
    It is not clear when or if this functionality will be backported into
    Devel::PPPort, in part because no one currently available knows how to
    do it, and in part because everyone else may simply rely on Encode, in
    which case a general backport isn't needed.
    
    If the infrastructure is available, this commit replaces the current
    scheme for checking UTF-8 validity with one in which normal processing
    doesn't need to decode the UTF-8 into code points.  Instead of copying
    characters individually from the input to the output, each entire span
    of valid input is now copied in a single operation.
    
    Thus in the normal case, what ends up happening is a tight loop to
    check the validity, and then a memmove of the entire input to the
    output, then return.
    
    If an error is found, it copies all the valid input before the error,
    then handles the character in error, then positions to the next input
    position, and repeats the whole process starting from there.
    
    Thus, this does not need to know about the intricacies of UTF-8
    malformations, relying on the core to handle this.
    
    There are currently some problems with Encode on EBCDIC platforms.  The
    infrastructure is known to correctly work there, so I'm hopeful this
    will solve these portability issues.
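
The validate-then-copy loop described in this commit message can be sketched as follows. This is a minimal illustration, not Encode's code: `seq_len` is a simplified, hypothetical stand-in for the core's validity check (it accepts only strict Unicode UTF-8), `copy_utf8_with_replacement` is a hypothetical name, and handling a malformed byte by emitting U+FFFD and resuming at the next byte is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the core's check: length (1..4) of the
 * well-formed UTF-8 sequence starting at s, or 0 if it is malformed.
 * Requires s < end. */
static size_t seq_len(const unsigned char *s, const unsigned char *end)
{
    size_t avail = (size_t)(end - s);
    unsigned char c = s[0];

    if (c < 0x80)
        return 1;
    if (c >= 0xC2 && c <= 0xDF)
        return (avail >= 2 && (s[1] & 0xC0) == 0x80) ? 2 : 0;
    if (c >= 0xE0 && c <= 0xEF) {
        if (avail < 3 || (s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80)
            return 0;
        if (c == 0xE0 && s[1] < 0xA0) return 0;   /* overlong */
        if (c == 0xED && s[1] > 0x9F) return 0;   /* surrogate */
        return 3;
    }
    if (c >= 0xF0 && c <= 0xF4) {
        if (avail < 4 || (s[1] & 0xC0) != 0x80 ||
            (s[2] & 0xC0) != 0x80 || (s[3] & 0xC0) != 0x80)
            return 0;
        if (c == 0xF0 && s[1] < 0x90) return 0;   /* overlong */
        if (c == 0xF4 && s[1] > 0x8F) return 0;   /* above U+10FFFF */
        return 4;
    }
    return 0;
}

/* Tight loop: length of the longest well-formed prefix of [s, end). */
static size_t valid_prefix_len(const unsigned char *s, const unsigned char *end)
{
    const unsigned char *p = s;

    while (p < end) {
        size_t n = seq_len(p, end);
        if (n == 0)
            break;
        p += n;
    }
    return (size_t)(p - s);
}

/* Copies src to dst, replacing each malformed byte with U+FFFD.
 * dst must have room for 3 * len bytes.  Returns the bytes written. */
size_t copy_utf8_with_replacement(unsigned char *dst,
                                  const unsigned char *src, size_t len)
{
    static const unsigned char repl[] = { 0xEF, 0xBF, 0xBD };
    const unsigned char *s = src, *end = src + len;
    unsigned char *d = dst;

    assert(dst != NULL && src != NULL);
    while (s < end) {
        size_t span = valid_prefix_len(s, end);

        memmove(d, s, span);            /* one copy per valid span */
        d += span;
        s += span;
        if (s < end) {                  /* stopped at a malformed byte */
            memcpy(d, repl, sizeof repl);
            d += sizeof repl;
            s += 1;                     /* assumption: resume at next byte */
        }
    }
    return (size_t)(d - dst);
}
```

For fully valid input the outer loop runs once: one validity scan, one memmove, then return. The copy routine itself never inspects the structure of a malformation; that knowledge is confined to the checker.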

commit 69a6ab58b4ba5ea074be25ceee729074685f0b5a
Author: Karl Williamson <khw@cpan.org>
Date:   Thu Dec 28 14:09:06 2017 -0700

    Encode/Encode.xs: Pull condition out of loop
    
    The value for this condition is known before the loop, so move it
    outside the loop.
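
As a generic illustration of this kind of change (not the actual Encode.xs code), a test that depends only on values fixed before the loop can be evaluated once up front. The flag `SKETCH_STRICT` and the function `count_accepted` are hypothetical names invented for the sketch.

```c
#include <stddef.h>

#define SKETCH_STRICT 0x1   /* hypothetical flag bit */

/* Counts the bytes accepted from s: every byte in lax mode, only ASCII
 * bytes when SKETCH_STRICT is set. */
size_t count_accepted(const unsigned char *s, size_t len, unsigned flags)
{
    /* The flag test does not depend on the loop variable, so it is
     * evaluated once here rather than on every iteration. */
    const int strict = (flags & SKETCH_STRICT) != 0;
    size_t i, count = 0;

    for (i = 0; i < len; i++) {
        if (strict && s[i] >= 0x80)
            continue;
        count++;
    }
    return count;
}
```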

commit cd909a7a28f344e7d227c3540d70919bc6ac4212
Author: Karl Williamson <khw@cpan.org>
Date:   Thu Dec 28 14:06:45 2017 -0700

    Encode/encode.h: Use system REPLACEMENT char if available
    
    On modern perls, there is a definition for the REPLACEMENT CHARACTER
    UTF-8 string.  Use it if available, as it is portable to EBCDIC,
    whereas the definition in encode.h isn't.
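
The "use the system definition if available" approach is typically a preprocessor fallback. A minimal sketch follows, with assumed names: `SYSTEM_REPLACEMENT_UTF8` stands in for whatever macro the perl headers provide, and `FALLBACK_CHAR_UTF8` and `fallback_char` are hypothetical names, not those used in encode.h.

```c
#include <stddef.h>
#include <string.h>

/* SYSTEM_REPLACEMENT_UTF8 is a hypothetical stand-in for a macro that
 * newer system headers may define. */
#ifdef SYSTEM_REPLACEMENT_UTF8
#  define FALLBACK_CHAR_UTF8 SYSTEM_REPLACEMENT_UTF8
#else
   /* U+FFFD encoded as UTF-8; these bytes are right on ASCII platforms
    * only, which is why the system definition is preferred. */
#  define FALLBACK_CHAR_UTF8 "\xEF\xBF\xBD"
#endif

/* Returns the replacement string and stores its length in *len. */
const char *fallback_char(size_t *len)
{
    *len = sizeof(FALLBACK_CHAR_UTF8) - 1;   /* exclude the trailing NUL */
    return FALLBACK_CHAR_UTF8;
}
```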

commit be06575c3742af30afd0112b5cf2fbcfc1040a81
Author: Karl Williamson <khw@cpan.org>
Date:   Thu Dec 28 14:04:15 2017 -0700

    Encode: Add comments
    
    This documents process_utf8(), and adds another helpful comment

commit 110fae385c2995aa8354f00756146f71ed58094c
Author: Karl Williamson <khw@cpan.org>
Date:   Thu Dec 28 14:01:34 2017 -0700

    Encode: White space only
    
    This correctly indents things in blocks, and removes trailing space

-----------------------------------------------------------------------

-- 
Perl5 Master Repository


