[perl.git] branch hv/study_chunk updated. v5.27.8-173-ge924a6821c

From: Hugo Van der Sanden
Date: February 3, 2018 10:28
Subject: [perl.git] branch hv/study_chunk updated. v5.27.8-173-ge924a6821c
Message ID: E1ehv39-00009v-HW@git.dc.perl.space

In perl.git, the branch hv/study_chunk has been updated

<https://perl5.git.perl.org/perl.git/commitdiff/e924a6821c5dd3178ec97b2f153775fee695b9eb?hp=8a8e4c850828677eadf9ac7f15fde1d404530e8b>

  discards  8a8e4c850828677eadf9ac7f15fde1d404530e8b (commit)
  discards  ad84d6341dc7d3dabafb396d59c98b08e6fbc436 (commit)
  discards  7e283120f6d8661dfab266c08d1be068660f694c (commit)
  discards  52a435b2afb995163e2ac36afd7e688d30fd6a83 (commit)
  discards  2601b591976aa7294cab164057e646e427291f63 (commit)
  discards  6d2a7904cf9cb86a40707cb4e745a1b9d758983c (commit)
  discards  16035a735e540cf3b494001908727f21601773bd (commit)
  discards  4a16522f46b9e5b417ab45af5358a1f5866bbcb8 (commit)
  discards  5da789a4e725d858352f98f4bf231a1b3b13107f (commit)
  discards  64f052f58da62a726a0ff6b3729dd6b3cce4d675 (commit)
  discards  7c2692cff18aca77c1e14ecc75f294c5e42485c2 (commit)
  discards  4966c323132732d48607346d29c709ff779fa410 (commit)
  discards  ad72f835fb469a6a3e689c92afb8d0df019c54de (commit)
  discards  871dd2d391a1afacb6ad787d3f7f5f846bc1514a (commit)
  discards  1f4f8c38209e0c659e6dbef8a36f21e03824f6b9 (commit)
  discards  12d6bc3c33d2862d5bdadbdfc2d94babb6739303 (commit)
  discards  fec9f0efcc1138a13c0f520d4d88b842946537f1 (commit)
  discards  23eefa591c9d5dfb86a5f3ce317c1081b6a65e12 (commit)
  discards  d06b188fb04c07fdc1739d73aba95039f36d5b95 (commit)
  discards  6adbe7ea1eab9edc4d0db9e6af29a8918d0afbdb (commit)
  discards  324689378266c0f4a45fd8513d0b1047f61118d2 (commit)
  discards  80eb6f0d283957ccbfe381faa1d5c5707311a2ca (commit)
  discards  393188e1b8986ec23aed4d8b9a34dcf104fd03fb (commit)
  discards  01f2cc60cd5c04819d6baab61665da46cd00bb5c (commit)
  discards  140e5808ce4482dd9e6f8a0be03f9a245c710657 (commit)
  discards  388fd6de5870075d27a0cdc98b552bd3a54a28c4 (commit)
  discards  f3498cf07f339af82aafc0c9f522eee8cb1c256c (commit)
  discards  4501f682906c7400bd0de83203966dac80d5d6c1 (commit)
  discards  ac2e8c664172bcd60edc6b1c967afc1f586d8ba5 (commit)
- Log -----------------------------------------------------------------
commit e924a6821c5dd3178ec97b2f153775fee695b9eb
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Tue Dec 2 14:31:55 2014 +0000

    WIP not for merging
    
    Performance data generated with:
      Porting/bench.pl -j=1 --raw --benchfile=t/perf/regex \
          -w=t/perf/regex-results ./blead ./study_chunk
    
    Note that bench.pl doesn't count compile-time work, so we must take care
    that what we're trying to test happens at run-time.

commit 0e437983e471994cc81a9df947f71f4703b0b0c9
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Sat Jan 10 13:00:42 2015 +0000

    study_chunk: rename 'pars' to 'parens'
    
    .. to make it a bit more greppable

commit 1a297140abc40e3cbf8a5bad399b66ddfedd27fa
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Thu Jan 8 12:30:36 2015 +0000

    study_chunk: cleaner rck_curlyish
    
    We know node, so use it.

commit b3466fd6eb55a42f3a6769ef9d6b3ac691249e6e
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Mon Dec 8 14:43:42 2014 +0000

    study_chunk: clean up rck_exactfish
    
    Separate mandatory fixups and the different styles of optimization
    check.

commit c2e0ac5a79214e6e826e6bedd8133e104fc245f2
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Mon Dec 8 13:27:28 2014 +0000

    study_chunk: clean up rck_exact
    
    Separate mandatory fixups and the different styles of optimization
    check; avoid unwarranted chumminess with magic.

commit 211c968708404c541507ddb732b0be26a6b2fee0
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Wed Dec 3 13:45:00 2014 +0000

    study_chunk: pass rck_params_t to study_chunk
    
    and_withp is never set by callers, so change that also to be initialized
    by study_chunk

commit 12027d2980e4773af8303087ffe446cf450315eb
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Sun Nov 30 11:23:05 2014 +0000

    study_chunk: extract the rest of the rck types
    
    rck_endlikish, rck_logical, rck_gpos, rck_trie, and rck_default for
    everything else

commit ab995813d6d16dde0054c339d38755b93c6f0909
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Sat Nov 29 18:18:32 2014 +0000

    study_chunk: extract rck_open, rck_close, rck_eval

commit b9d961b70ab67d508133d07814b1a8b4776d7430
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Sat Nov 29 16:14:15 2014 +0000

    study_chunk: extract rck_lookaround

commit 852ada7744dae7551ae3c7c612191f8fc4b904e5
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Fri Nov 28 01:40:34 2014 +0000

    study_chunk: extract rck_eolish

commit 646f724b9eba78bee9327466b70c218a20b3eeef
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Fri Nov 28 01:26:36 2014 +0000

    study_chunk: extract rck_simple

commit 5b130927d8437a6e6376a02111a523d7a95b8f7d
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Tue Nov 25 15:36:32 2014 +0000

    study_chunk: extract rck_lnbreak

commit 95cd0e61b845cebfa9607c5bcc1093959aebe792
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Fri Jan 2 12:04:46 2015 +0000

    study_chunk: extract rck_plus, rck_star, rck_curlyish

commit 5684f60fe64cd8f46518be255388cba1edb6786c
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Fri Jan 2 12:01:53 2015 +0000

    study_chunk: extract rck_do_curly
    
    .. and simplify optimize_curly_tail to rck_elide_nothing()

commit a0c8826a5115e6c27ac6e97f4b3ef7c82affa392
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Fri Jan 2 11:26:07 2015 +0000

    study_chunk: extract rck_refish, rck_clump

commit 4810dd5e33ecdfd30bf5498f1109b39ffffceec7
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Fri Jan 2 11:19:57 2015 +0000

    study_chunk: extract rck_whilem

commit c610fc583b8044353d04d51a7dc841e5ea656443
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Tue Nov 25 12:04:24 2014 +0000

    study_chunk: extract rck_exactfish
    
    Also move the join_exact() call into the exact/exactfish handling.

commit b03117ceebd6122538af3ef77f826a9732322e92
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Mon Nov 24 12:30:10 2014 +0000

    study_chunk: extract rck_exact

commit ba9d4e8c70ef7269c825d40c73837d351fb642ea
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Mon Nov 24 12:18:10 2014 +0000

    study_chunk: extract rck_suspend, rck_gosub
    
    Add helper function rck_enframe.

commit ea9b4fa824a2f316d3b515b7b1b2bf989308e318
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Mon Nov 24 11:25:58 2014 +0000

    study_chunk: simplify PAREN_TEST and related macros
    
    Define PAREN_OFFSET to point to the bitvector for the relevant depth, update
    PAREN_TEST, PAREN_SET and PAREN_UNSET to take a depth instead of a pointer,
    and simplify the various users.
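
    Editorial illustration (not part of the commit): the message above
    describes macros that take a depth and locate the matching bitvector
    themselves. The regcomp.c hunk for this change is not shown in this
    posting, so what follows is only a minimal sketch of that idea. Every
    name prefixed SKETCH_ or sketch_, the buffer layout, and the sizes are
    assumptions made for illustration; they are not the definitions used
    in perl.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: one bitvector of paren flags per recursion depth,
 * stored back to back in a single buffer. All names are illustrative. */
#define SKETCH_MAX_DEPTH 4
#define SKETCH_NPARENS   20
#define SKETCH_BYTES     ((SKETCH_NPARENS + 7) / 8)   /* bytes per depth */

static uint8_t sketch_bits[SKETCH_MAX_DEPTH * SKETCH_BYTES];

/* Point at the start of the bitvector that belongs to the given depth. */
#define SKETCH_PAREN_OFFSET(depth) (sketch_bits + (depth) * SKETCH_BYTES)

/* Test, set and clear take a depth rather than a pointer, so callers
 * never compute the bitvector address themselves. */
#define SKETCH_PAREN_TEST(depth, paren) \
    ((SKETCH_PAREN_OFFSET(depth)[(paren) >> 3] >> ((paren) & 7)) & 1)
#define SKETCH_PAREN_SET(depth, paren) \
    (SKETCH_PAREN_OFFSET(depth)[(paren) >> 3] |= (uint8_t)(1u << ((paren) & 7)))
#define SKETCH_PAREN_UNSET(depth, paren) \
    (SKETCH_PAREN_OFFSET(depth)[(paren) >> 3] &= (uint8_t)~(1u << ((paren) & 7)))

/* Clear every flag at every depth. */
static void sketch_reset(void)
{
    memset(sketch_bits, 0, sizeof sketch_bits);
}

/* Bounds-checked wrapper around the test macro. */
static int sketch_paren_test(unsigned depth, unsigned paren)
{
    assert(depth < SKETCH_MAX_DEPTH && paren < SKETCH_NPARENS);
    return SKETCH_PAREN_TEST(depth, paren);
}
```

    In this sketch, setting a paren at one depth leaves the same paren
    number at every other depth untouched, because each depth owns its
    own SKETCH_BYTES-sized slice of the buffer.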

commit a5fe4ff7bf80e301793e96025fbe64a18a9c7ce4
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Sun Nov 23 16:07:52 2014 +0000

    study_chunk: split up rck_branch
    
    Give BRANCH, BRANCHJ and IFTHEN each their own peep routines; move the
    shared code to sc_peep_make_trie.

commit acf9e9aa5331e29c1b7a741698511874167d4571
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Sat Nov 22 14:06:23 2014 +0000

    study_chunk: extract rck_branch

commit 6a7085e2c31490bbff6c68a058082b35fe344899
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Fri Nov 21 16:25:19 2014 +0000

    study_chunk: extract rck_definep

commit 4c38558d91cfcb50e1559fdcbd3b75f820327b18
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Wed Nov 19 15:17:59 2014 +0000

    study_chunk: extract rck_elide_nothing

commit cedad1fcc5cecd7b2338f9f8fb616c7c92d47e6f
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Fri Nov 21 14:25:01 2014 +0000

    study_chunk: remove JOIN_EXACT macro
    
    It's used only once, and the code is clearer when we can see the condition
    it applies.

commit 0e294f28c90eeb41c21189c09595ee34bc2c39e4
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Wed Nov 19 14:52:25 2014 +0000

    study_chunk_one_node: reindent
    
    1900-odd lines of whitespace, deindented after removal of outer loop

commit b0af93b64e83d37381ed0b7e0df3a8b0329104fa
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Wed Nov 19 14:41:41 2014 +0000

    study_chunk: switch study_chunk_one_frame to study_chunk_one_node
    
    Move per-frame diagnostics and outer loop up to study_chunk, leaving
    just the work to do on a single node. This requires a return value
    to show if the inner loop should be terminated.
    
    The outdent of what remains is done separately in the next commit.
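
    Editorial illustration (not part of the commit): the message above
    describes a caller that owns the loop and a per-node worker whose
    return value says whether the loop should end. The sketch below shows
    that shape only. The struct fields, the function names, the use of a
    negative value as an end marker, and the choice that "false" means
    "stop" are all assumptions made for illustration; the real
    study_chunk_one_node() and rck_params_t are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parameter block, standing in for the idea of passing one
 * struct instead of a long argument list. */
typedef struct {
    const int *nodes;    /* toy "program": each entry is a node length */
    size_t     count;    /* number of entries in nodes */
    size_t     pos;      /* index of the next node to examine */
    long       min_len;  /* running total accumulated so far */
} sketch_params_t;

/* Do the work for a single node. Returns false when the caller's loop
 * should terminate (end of input or a negative end marker). */
static bool sketch_one_node(sketch_params_t *p)
{
    if (p->pos >= p->count)
        return false;

    int node = p->nodes[p->pos++];
    if (node < 0)            /* hypothetical END marker */
        return false;

    p->min_len += node;
    return true;
}

/* The outer loop lives in the caller, which only inspects the flag. */
static long sketch_study(sketch_params_t *p)
{
    p->pos = 0;
    p->min_len = 0;
    while (sketch_one_node(p)) {
        /* per-iteration diagnostics would go here */
    }
    return p->min_len;
}
```

    Keeping the loop in the caller means the worker needs no labels or
    gotos to break out; it simply reports that it is done.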

commit aa51b913d7d27b2780b7ea3549f34c3f33e1766c
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Mon Nov 17 21:31:13 2014 +0000

    study_chunk: separate study_chunk_one_frame
    
    Removes the need for labels fake_study_recurse and finish.

commit 3cde5702d5683c1a39574d845a5fefc9d355fe22
Author: Hugo van der Sanden <hv@crypt.org>
Date:   Mon Nov 17 17:39:19 2014 +0000

    study_chunk: move params into struct for refactoring

-----------------------------------------------------------------------

Summary of changes:
 charclass_invlists.h               |   14 +-
 dist/Devel-PPPort/PPPort_pm.PL     |    6 +-
 dist/Devel-PPPort/parts/inc/mess   |   21 +-
 dist/Devel-PPPort/parts/inc/misc   |    3 -
 dist/Devel-PPPort/t/mess.t         |    2 +-
 dist/if/if.pm                      |   88 +-
 dist/if/t/if.t                     |  100 +-
 dump.c                             |   10 +-
 embed.fnc                          |   28 +-
 embed.h                            |   14 +-
 embedvar.h                         |    3 +
 ext/POSIX/Makefile.PL              |    3 +-
 ext/POSIX/POSIX.xs                 |  104 +-
 ext/POSIX/lib/POSIX.pm             |    5 +-
 ext/POSIX/t/export.t               |    6 +-
 ext/XS-APItest/APItest.xs          |   53 +-
 ext/XS-APItest/t/locale.t          |   47 +-
 ext/XS-APItest/t/utf8_warn_base.pl | 1400 ++++++++++++++------------
 hints/hpux.sh                      |    7 +
 inline.h                           |   38 +-
 intrpvar.h                         |    8 +
 locale.c                           | 1942 +++++++++++++++++++++---------------
 mg.c                               |   18 +-
 numeric.c                          |   17 +-
 perl.c                             |   58 +-
 perl.h                             |   93 +-
 perl_langinfo.h                    |   14 +-
 pod/perldebguts.pod                |    2 +
 pod/perlembed.pod                  |    4 +-
 pp_sys.c                           |    2 +-
 proto.h                            |   49 +-
 regcomp.c                          |  205 +++-
 regcomp.sym                        |    1 +
 regen/mk_invlists.pl               |   23 +-
 regexec.c                          |  626 +++++++++---
 regnodes.h                         |  321 +++---
 sv.c                               |   28 +-
 t/TEST                             |    3 +-
 t/loc_tools.pl                     |    4 +
 t/op/warn.t                        |   21 +-
 t/re/anyof.t                       |    2 +-
 t/re/re_tests                      |    4 +
 t/re/regexp.t                      |   11 +-
 t/re/script_run.t                  |    9 +-
 toke.c                             |    2 -
 utf8.c                             |  304 +++++-
 utf8.h                             |    2 +
 vutil.c                            |   31 +-
 48 files changed, 3598 insertions(+), 2158 deletions(-)

diff --git a/charclass_invlists.h b/charclass_invlists.h
index bbff009d8f..60c64ef157 100644
--- a/charclass_invlists.h
+++ b/charclass_invlists.h
@@ -20393,7 +20393,7 @@ static const UV _Perl_SCX_invlist[] = { /* for ASCII/Latin1 */
 
 #endif	/* defined(PERL_IN_PERL_C) */
 
-#if defined(PERL_IN_REGEXEC_C)
+#if defined(PERL_CORE) || defined(PERL_EXT)
 
 /* Negative enum values indicate the need to use an auxiliary table
  * consisting of the list of enums this one expands to.  The absolute
@@ -22754,7 +22754,7 @@ static const SCX_enum _Perl_SCX_invmap[] = { /* for ASCII/Latin1 */
 	SCX_Unknown
 };
 
-#endif	/* defined(PERL_IN_REGEXEC_C) */
+#endif	/* defined(PERL_CORE) || defined(PERL_EXT) */
 
 #if defined(PERL_IN_PERL_C)
 
@@ -56784,7 +56784,7 @@ static const UV _Perl_SCX_invlist[] = { /* for EBCDIC 1047 */
 
 #endif	/* defined(PERL_IN_PERL_C) */
 
-#if defined(PERL_IN_REGEXEC_C)
+#if defined(PERL_CORE) || defined(PERL_EXT)
 
 /* Negative enum values indicate the need to use an auxiliary table
  * consisting of the list of enums this one expands to.  The absolute
@@ -59171,7 +59171,7 @@ static const SCX_enum _Perl_SCX_invmap[] = { /* for EBCDIC 1047 */
 	SCX_Unknown
 };
 
-#endif	/* defined(PERL_IN_REGEXEC_C) */
+#endif	/* defined(PERL_CORE) || defined(PERL_EXT) */
 
 #if defined(PERL_IN_PERL_C)
 
@@ -93385,7 +93385,7 @@ static const UV _Perl_SCX_invlist[] = { /* for EBCDIC 037 */
 
 #endif	/* defined(PERL_IN_PERL_C) */
 
-#if defined(PERL_IN_REGEXEC_C)
+#if defined(PERL_CORE) || defined(PERL_EXT)
 
 /* Negative enum values indicate the need to use an auxiliary table
  * consisting of the list of enums this one expands to.  The absolute
@@ -95768,7 +95768,7 @@ static const SCX_enum _Perl_SCX_invmap[] = { /* for EBCDIC 037 */
 	SCX_Unknown
 };
 
-#endif	/* defined(PERL_IN_REGEXEC_C) */
+#endif	/* defined(PERL_CORE) || defined(PERL_EXT) */
 
 #if defined(PERL_IN_PERL_C)
 
@@ -109537,5 +109537,5 @@ static const U8 WB_table[24][24] = {
  * 5671c3de473b25e7ea47097e4906260624dfabe3e9b1739f490aecbc3d858459 lib/unicore/mktables
  * 21653d2744fdd071f9ef138c805393901bb9547cf3e777ebf50215a191f986ea lib/unicore/version
  * 913d2f93f3cb6cdf1664db888bf840bc4eb074eef824e082fceda24a9445e60c regen/charset_translations.pl
- * 46df154c4b2265cab87816c85428df795cd652e193330c00a0463257b2cee92f regen/mk_invlists.pl
+ * 4898ec84e2b81e8bf948dcdb1c015c14f258cc652337122719885a276ea46d7b regen/mk_invlists.pl
  * ex: set ro: */
diff --git a/dist/Devel-PPPort/PPPort_pm.PL b/dist/Devel-PPPort/PPPort_pm.PL
index 983b6b44bc..15cfe63405 100644
--- a/dist/Devel-PPPort/PPPort_pm.PL
+++ b/dist/Devel-PPPort/PPPort_pm.PL
@@ -539,7 +539,7 @@ package Devel::PPPort;
 use strict;
 use vars qw($VERSION $data);
 
-$VERSION = '3.38';
+$VERSION = '3.39';
 
 sub _init_data
 {
@@ -620,6 +620,8 @@ __DATA__
 
 %include memory
 
+%include magic
+
 %include misc
 
 %include format
@@ -660,8 +662,6 @@ __DATA__
 
 %include pvs
 
-%include magic
-
 %include cop
 
 %include grok
diff --git a/dist/Devel-PPPort/parts/inc/mess b/dist/Devel-PPPort/parts/inc/mess
index d73b4e5605..eb2de7b15e 100644
--- a/dist/Devel-PPPort/parts/inc/mess
+++ b/dist/Devel-PPPort/parts/inc/mess
@@ -19,10 +19,8 @@ mess_nocontext
 mess
 
 warn_nocontext
-Perl_warn_nocontext
 
 croak_nocontext
-Perl_croak_nocontext
 
 croak_no_modify
 Perl_croak_no_modify
@@ -184,28 +182,20 @@ mess_sv(pTHX_ SV *basemsg, bool consume)
 #define warn_nocontext warn
 #endif
 
-#ifndef Perl_warn_nocontext
-#define Perl_warn_nocontext warn_nocontext
-#endif
-
 #ifndef croak_nocontext
 #define croak_nocontext croak
 #endif
 
-#ifndef Perl_croak_nocontext
-#define Perl_croak_nocontext croak_nocontext
-#endif
-
 #ifndef croak_no_modify
-#define croak_no_modify() croak("%s", PL_no_modify)
+#define croak_no_modify() croak_nocontext("%s", PL_no_modify)
 #define Perl_croak_no_modify() croak_no_modify()
 #endif
 
 #ifndef croak_memory_wrap
 #if { VERSION >= 5.9.2 } || ( { VERSION >= 5.8.6 } && { VERSION < 5.9.0 } )
-#  define croak_memory_wrap() croak("%s", PL_memory_wrap)
+#  define croak_memory_wrap() croak_nocontext("%s", PL_memory_wrap)
 #else
-#  define croak_memory_wrap() croak("panic: memory wrap")
+#  define croak_memory_wrap() croak_nocontext("panic: memory wrap")
 #endif
 #endif
 
@@ -216,8 +206,9 @@ mess_sv(pTHX_ SV *basemsg, bool consume)
 #ifndef croak_xs_usage
 #if { NEED croak_xs_usage }
 void
-croak_xs_usage(pTHX_ const CV *const cv, const char *const params)
+croak_xs_usage(const CV *const cv, const char *const params)
 {
+    dTHX;
     const GV *const gv = CvGV(cv);
 
     PERL_ARGS_ASSERT_CROAK_XS_USAGE;
@@ -435,7 +426,7 @@ if ($] ge '5.006') {
 
 if (ord('A') != 65) {
     skip 'skip: no ASCII support', 0 for 1..24;
-} elsif ($] ge '5.008') {
+} elsif ($] ge '5.008' && $] ne '5.012000') {
     undef $die;
     ok !defined eval { Devel::PPPort::croak_sv(eval '"\N{U+E1}\n"') };
     ok $@, "\xE1\n";
diff --git a/dist/Devel-PPPort/parts/inc/misc b/dist/Devel-PPPort/parts/inc/misc
index 36ee57fe49..949c481088 100644
--- a/dist/Devel-PPPort/parts/inc/misc
+++ b/dist/Devel-PPPort/parts/inc/misc
@@ -43,7 +43,6 @@ C_ARRAY_LENGTH
 C_ARRAY_END
 SvRX
 SvRXOK
-PERL_MAGIC_qr
 cBOOL
 OpHAS_SIBLING
 OpSIBLING
@@ -53,8 +52,6 @@ OpMAYBESIB_set
 
 =implementation
 
-__UNDEFINED__ PERL_MAGIC_qr             'r'
-
 __UNDEFINED__ cBOOL(cbool) ((cbool) ? (bool)1 : (bool)0)
 __UNDEFINED__ OpHAS_SIBLING(o)      (cBOOL((o)->op_sibling))
 __UNDEFINED__ OpSIBLING(o)          (0 + (o)->op_sibling)
diff --git a/dist/Devel-PPPort/t/mess.t b/dist/Devel-PPPort/t/mess.t
index e0c746d6ed..9a9822ade0 100644
--- a/dist/Devel-PPPort/t/mess.t
+++ b/dist/Devel-PPPort/t/mess.t
@@ -191,7 +191,7 @@ if ($] ge '5.006') {
 
 if (ord('A') != 65) {
     skip 'skip: no ASCII support', 0 for 1..24;
-} elsif ($] ge '5.008') {
+} elsif ($] ge '5.008' && $] ne '5.012000') {
     undef $die;
     ok !defined eval { Devel::PPPort::croak_sv(eval '"\N{U+E1}\n"') };
     ok $@, "\xE1\n";
diff --git a/dist/if/if.pm b/dist/if/if.pm
index d1cbd00f35..166de7bb10 100644
--- a/dist/if/if.pm
+++ b/dist/if/if.pm
@@ -1,6 +1,6 @@
 package if;
 
-$VERSION = '0.0607';
+$VERSION = '0.0608';
 
 sub work {
   my $method = shift() ? 'import' : 'unimport';
@@ -25,76 +25,70 @@ __END__
 
 =head1 NAME
 
-if - C<use> a Perl module if a condition holds (also can C<no> a module)
+if - C<use> a Perl module if a condition holds
 
 =head1 SYNOPSIS
 
-  use if CONDITION, MODULE => ARGUMENTS;
-  no if CONDITION, MODULE => ARGUMENTS;
+    use if CONDITION, "MODULE", ARGUMENTS;
+    no  if CONDITION, "MODULE", ARGUMENTS;
 
 =head1 DESCRIPTION
 
-The C<if> module is used to conditionally load or unload another module.
-The construct
+=head2 C<use if>
 
-  use if CONDITION, MODULE => ARGUMENTS;
+The C<if> module is used to conditionally load another module.  The construct:
 
-will load MODULE only if CONDITION evaluates to true.
-The above statement has no effect unless C<CONDITION> is true.
-If the CONDITION does evaluate to true, then the above line has
-the same effect as:
+    use if CONDITION, "MODULE", ARGUMENTS;
 
-  use MODULE ARGUMENTS;
+... will load C<MODULE> only if C<CONDITION> evaluates to true; it has no
+effect if C<CONDITION> evaluates to false.  (The module name, assuming it
+contains at least one C<::>, must be quoted when C<'use strict "subs";'> is in
+effect.)  If the CONDITION does evaluate to true, then the above line has the
+same effect as:
 
-The use of C<< => >> above provides necessary quoting of C<MODULE>.
-If you don't use the fat comma (eg you don't have any ARGUMENTS),
-then you'll need to quote the MODULE.
+    use MODULE ARGUMENTS;
 
-If you wanted ARGUMENTS to be an empty list, i.e. have the effect of:
+For example, the F<Unicode::UCD> module's F<charinfo> function will use two functions from F<Unicode::Normalize> only if a certain condition is met:
+
+    use if defined &DynaLoader::boot_DynaLoader,
+        "Unicode::Normalize" => qw(getCombinClass NFD);
+
+Suppose you wanted C<ARGUMENTS> to be an empty list, I<i.e.>, to have the
+effect of:
 
     use MODULE ();
 
-you can't do this with the C<if> pragma; however, you can achieve
+You can't do this with the C<if> pragma; however, you can achieve
 exactly this effect, at compile time, with:
 
     BEGIN { require MODULE if CONDITION }
 
-=head2 EXAMPLES
-
-The following line is taken from the testsuite for L<File::Map>:
-
-  use if $^O ne 'MSWin32', POSIX => qw/setlocale LC_ALL/;
-
-If run on any operating system other than Windows,
-this will import the functions C<setlocale> and C<LC_ALL> from L<POSIX>.
-On Windows it does nothing.
-
-The following is used to L<deprecate> core modules beyond a certain version of Perl:
+=head2 C<no if>
 
-  use if $] > 5.016, 'deprecate';
+The C<no if> construct is mainly used to deactivate categories of warnings
+when those categories would produce superfluous output under specified
+versions of F<perl>.
 
-This line is taken from L<Text::Soundex> 3.04,
-and marks it as deprecated beyond Perl 5.16.
-If you C<use Text::Soundex> in Perl 5.18, for example,
-and you have used L<warnings>,
-then you'll get a warning message
-(the deprecate module looks to see whether the
-calling module was C<use>'d from a core library directory,
-and if so, generates a warning),
-unless you've installed a more recent version of L<Text::Soundex> from CPAN.
+For example, the C<redundant> category of warnings was introduced in
+Perl-5.22.  This warning flags certain instances of superfluous arguments to
+C<printf> and C<sprintf>.  But if your code was running warnings-free on
+earlier versions of F<perl> and you don't care about C<redundant> warnings in
+more recent versions, you can call:
 
-You can also specify to NOT use something:
+    use warnings;
+    no if $] >= 5.022, q|warnings|, qw(redundant);
 
- no if $] ge 5.021_006, warnings => "locale";
+    my $test    = { fmt  => "%s", args => [ qw( x y ) ] };
+    my $result  = sprintf $test->{fmt}, @{$test->{args}};
 
-This warning category was added in the specified Perl version (a development
-release).  Without the C<'if'>, trying to use it in an earlier release would
-generate an unknown warning category error.
+The C<no if> construct assumes that a module or pragma has correctly
+implemented an C<unimport()> method -- but most modules and pragmata have not.
+That explains why the C<no if> construct is of limited applicability.
 
 =head1 BUGS
 
-The current implementation does not allow specification of the
-required version of the module.
+The current implementation does not allow specification of the required
+version of the module.
 
 =head1 SEE ALSO
 
@@ -105,8 +99,8 @@ Unlike C<if> though, L<Module::Requires> is not a core module.
 L<Module::Load::Conditional> provides a number of functions you can use to
 query what modules are available, and then load one or more of them at runtime.
 
-L<provide> can be used to select one of several possible modules to load,
-based on what version of Perl is running.
+The L<provide> module from CPAN can be used to select one of several possible
+modules to load based on the version of Perl that is running.
 
 =head1 AUTHOR
 
diff --git a/dist/if/t/if.t b/dist/if/t/if.t
index 4a2b351aaf..827d93cbcb 100644
--- a/dist/if/t/if.t
+++ b/dist/if/t/if.t
@@ -1,9 +1,9 @@
 #!./perl
 
 use strict;
-use Test::More tests => 10;
+use Test::More tests => 18;
 
-my $v_plus = $] + 1;
+my $v_plus  = $] + 1;
 my $v_minus = $] - 1;
 
 unless (eval 'use open ":std"; 1') {
@@ -12,29 +12,85 @@ unless (eval 'use open ":std"; 1') {
   eval 'sub open::foo{}';		# Just in case...
 }
 
-no strict;
+{
+    no strict;
 
-is( eval "use if ($v_minus > \$]), strict => 'subs'; \${'f'} = 12", 12,
-    '"use if" with a false condition, fake pragma');
-is( eval "use if ($v_minus > \$]), strict => 'refs'; \${'f'} = 12", 12,
-    '"use if" with a false condition and a pragma');
+    is( eval "use if ($v_minus > \$]), strict => 'subs'; \${'f'} = 12", 12,
+        '"use if" with a false condition, fake pragma');
+    is( eval "use if ($v_minus > \$]), strict => 'refs'; \${'f'} = 12", 12,
+        '"use if" with a false condition and a pragma');
 
-is( eval "use if ($v_plus > \$]), strict => 'subs'; \${'f'} = 12", 12,
-    '"use if" with a true condition, fake pragma');
+    is( eval "use if ($v_plus > \$]), strict => 'subs'; \${'f'} = 12", 12,
+        '"use if" with a true condition, fake pragma');
 
-is( eval "use if ($v_plus > \$]), strict => 'refs'; \${'f'} = 12", undef,
-    '"use if" with a true condition and a pragma');
-like( $@, qr/while "strict refs" in use/, 'expected error message'),
+    is( eval "use if ($v_plus > \$]), strict => 'refs'; \${'f'} = 12", undef,
+        '"use if" with a true condition and a pragma');
+    like( $@, qr/while "strict refs" in use/, 'expected error message'),
 
-# Old version had problems with the module name 'open', which is a keyword too
-# Use 'open' =>, since pre-5.6.0 could interpret differently
-is( (eval "use if ($v_plus > \$]), 'open' => IN => ':crlf'; 12" || 0), 12,
-    '"use if" with open');
+    # Old version had problems with the module name 'open', which is a keyword too
+    # Use 'open' =>, since pre-5.6.0 could interpret differently
+    is( (eval "use if ($v_plus > \$]), 'open' => IN => ':crlf'; 12" || 0), 12,
+        '"use if" with open');
 
-is(eval "use if ($v_plus > \$])", undef,
-   "Too few args to 'use if' returns <undef>");
-like($@, qr/Too few arguments to 'use if'/, "  ... and returns correct error");
+    is(eval "use if ($v_plus > \$])", undef,
+       "Too few args to 'use if' returns <undef>");
+    like($@, qr/Too few arguments to 'use if'/, "  ... and returns correct error");
 
-is(eval "no if ($v_plus > \$])", undef,
-   "Too few args to 'no if' returns <undef>");
-like($@, qr/Too few arguments to 'no if'/, "  ... and returns correct error");
+    is(eval "no if ($v_plus > \$])", undef,
+       "Too few args to 'no if' returns <undef>");
+    like($@, qr/Too few arguments to 'no if'/, "  ... and returns correct error");
+}
+
+{
+    note(q|RT 132732: strict 'subs'|);
+    use strict "subs";
+
+    {
+        SKIP: {
+            unless ($] >= 5.018) {
+                skip "bigrat apparently not testable prior to perl-5.18", 4;
+            }
+            note(q|strict "subs" : 'use if' : condition false|);
+            eval "use if (0 > 1), q|bigrat|, qw(hex oct);";
+            ok (! main->can('hex'), "Cannot call bigrat::hex() in importing package");
+            ok (! main->can('oct'), "Cannot call bigrat::oct() in importing package");
+
+            note(q|strict "subs" : 'use if' : condition true|);
+            eval "use if (1 > 0), q|bigrat|, qw(hex oct);";
+            ok (  main->can('hex'), "Can call bigrat::hex() in importing package");
+            ok (  main->can('oct'), "Can call bigrat::oct() in importing package");
+        }
+    }
+
+    {
+        note(q|strict "subs" : 'no if' : condition variable|);
+        note(($] >= 5.022) ? "Recent enough Perl: $]" : "Older Perl: $]");
+        use warnings;
+        SKIP: {
+            unless ($] >= 5.022) {
+                skip "Redundant argument warning not available in pre-5.22 perls", 4;
+            }
+
+            {
+                no if $] >= 5.022, q|warnings|, qw(redundant);
+                my ($test, $result, $warn);
+                local $SIG{__WARN__} = sub { $warn = shift };
+                $test = { fmt  => "%s", args => [ qw( x y ) ] };
+                $result = sprintf $test->{fmt}, @{$test->{args}};
+                is($result, $test->{args}->[0], "Got expected string");
+                ok(! $warn, "Redundant argument warning suppressed");
+            }
+
+            {
+                use if $] >= 5.022, q|warnings|, qw(redundant);
+                my ($test, $result, $warn);
+                local $SIG{__WARN__} = sub { $warn = shift };
+                $test = { fmt  => "%s", args => [ qw( x y ) ] };
+                $result = sprintf $test->{fmt}, @{$test->{args}};
+                is($result, $test->{args}->[0], "Got expected string");
+                like($warn, qr/Redundant argument in sprintf/,
+                    "Redundant argument warning generated and capture");
+            }
+        }
+    }
+}
diff --git a/dump.c b/dump.c
index bdf285303d..41b5f377b5 100644
--- a/dump.c
+++ b/dump.c
@@ -493,9 +493,10 @@ Perl_sv_peek(pTHX_ SV *sv)
 	}
     }
     else if (SvNOKp(sv)) {
-	STORE_LC_NUMERIC_UNDERLYING_SET_STANDARD();
+        DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
+        STORE_LC_NUMERIC_SET_STANDARD();
 	Perl_sv_catpvf(aTHX_ t, "(%" NVgf ")",SvNVX(sv));
-	RESTORE_LC_NUMERIC_UNDERLYING();
+        RESTORE_LC_NUMERIC();
     }
     else if (SvIOKp(sv)) {
 	if (SvIsUV(sv))
@@ -1826,9 +1827,10 @@ Perl_do_sv_dump(pTHX_ I32 level, PerlIO *file, SV *sv, I32 nest, I32 maxnest, bo
 		&& type != SVt_PVCV && type != SVt_PVFM  && type != SVt_REGEXP
 		&& type != SVt_PVIO && !isGV_with_GP(sv) && !SvVALID(sv))
 	       || type == SVt_NV) {
-	STORE_LC_NUMERIC_UNDERLYING_SET_STANDARD();
+        DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
+        STORE_LC_NUMERIC_SET_STANDARD();
 	Perl_dump_indent(aTHX_ level, file, "  NV = %.*" NVgf "\n", NV_DIG, SvNVX(sv));
-	RESTORE_LC_NUMERIC_UNDERLYING();
+        RESTORE_LC_NUMERIC();
     }
 
     if (SvROK(sv)) {
diff --git a/embed.fnc b/embed.fnc
index aa7c9d3d00..ae073ffc43 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -806,9 +806,7 @@ AndmoR	|bool	|is_utf8_invariant_string|NN const U8* const s		    \
 AnidR	|bool	|is_utf8_invariant_string_loc|NN const U8* const s	    \
 		|STRLEN len						    \
 		|NULLOK const U8 ** ep
-#ifndef EBCDIC
 AniR	|unsigned int|_variant_byte_number|PERL_UINTMAX_T word
-#endif
 #if defined(PERL_CORE) || defined(PERL_EXT)
 EinR	|Size_t	|variant_under_utf8_count|NN const U8* const s		    \
 		|NN const U8* const e
@@ -898,8 +896,11 @@ ADMpR	|bool	|is_utf8_punct	|NN const U8 *p
 ADMpR	|bool	|is_utf8_xdigit	|NN const U8 *p
 AMpR	|bool	|_is_utf8_mark	|NN const U8 *p
 ADMpR	|bool	|is_utf8_mark	|NN const U8 *p
-EXdpR	|bool	|isSCRIPT_RUN	|NN const U8 *s|NN const U8 *send    \
-				|const bool utf8_target
+#if defined(PERL_CORE) || defined(PERL_EXT)
+EXdpR	|bool	|isSCRIPT_RUN	|NN const U8 *s|NN const U8 *send   \
+				|const bool utf8_target		    \
+				|NULLOK SCX_enum * ret_script
+#endif
 : Used in perly.y
 p	|OP*	|jmaybe		|NN OP *o
 : Used in pp.c 
@@ -1860,11 +1861,17 @@ Adop	|UV	|utf8n_to_uvchr	|NN const U8 *s				    \
 				|STRLEN curlen				    \
 				|NULLOK STRLEN *retlen			    \
 				|const U32 flags
-Adp	|UV	|utf8n_to_uvchr_error|NN const U8 *s			    \
+Adop	|UV	|utf8n_to_uvchr_error|NN const U8 *s			    \
 				|STRLEN curlen				    \
 				|NULLOK STRLEN *retlen			    \
 				|const U32 flags			    \
 				|NULLOK U32 * errors
+AMdp	|UV	|utf8n_to_uvchr_msgs|NN const U8 *s			    \
+				|STRLEN curlen				    \
+				|NULLOK STRLEN *retlen			    \
+				|const U32 flags			    \
+				|NULLOK U32 * errors			    \
+				|NULLOK AV ** msgs
 AipnR	|UV	|valid_utf8_to_uvchr	|NN const U8 *s|NULLOK STRLEN *retlen
 Ap	|UV	|utf8n_to_uvuni|NN const U8 *s|STRLEN curlen|NULLOK STRLEN *retlen|U32 flags
 
@@ -2450,6 +2457,7 @@ Es	|SSize_t|study_chunk	|NN RExC_state_t *pRExC_state \
 				|NN rck_params_t *params
 Es	|bool	|study_chunk_one_node|NN RExC_state_t *pRExC_state \
 				|NN rck_params_t *params
+EsR	|SV *	|get_ANYOFM_contents|NN const regnode * n
 Es	|void	|rck_elide_nothing|NN regnode *node
 Es	|bool	|rck_definep	|NN RExC_state_t *pRExC_state \
 				|NN rck_params_t *params
@@ -2545,7 +2553,7 @@ Es	|const regnode*|dumpuntil|NN const regexp *r|NN const regnode *start \
 				|NN SV* sv|I32 indent|U32 depth
 Es	|void	|put_code_point	|NN SV* sv|UV c
 Es	|bool	|put_charclass_bitmap_innards|NN SV* sv		    \
-				|NN char* bitmap		    \
+				|NULLOK char* bitmap		    \
 				|NULLOK SV* nonbitmap_invlist	    \
 				|NULLOK SV* only_utf8_locale_invlist\
 				|NULLOK const regnode * const node  \
@@ -2588,6 +2596,12 @@ ERp	|bool	|_is_grapheme	|NN const U8 * strbeg|NN const U8 * s|NN const U8 *stren
 ERs	|bool	|isFOO_utf8_lc	|const U8 classnum|NN const U8* character
 ERns	|char *|find_next_ascii|NN char* s|NN const char * send|const bool is_utf8
 ERns	|char *|find_next_non_ascii|NN char* s|NN const char * send|const bool is_utf8
+ERns	|char *	|find_next_masked|NN char * s				\
+				 |NN const char * send			\
+				 |const U8 byte|const U8 mask
+ERns	|char *|find_span_end	|NN char* s|NN const char * send|const char span_byte
+ERns	|U8 *|find_span_end_mask|NN U8 * s|NN const U8 * send	\
+				|const U8 span_byte|const U8 mask
 ERs	|SSize_t|regmatch	|NN regmatch_info *reginfo|NN char *startpos|NN regnode *prog
 WERs	|I32	|regrepeat	|NN regexp *prog|NN char **startposp \
 				|NN const regnode *p \
@@ -2824,6 +2838,8 @@ s	|bool	|isa_lookup	|NN HV *stash|NN const char * const name \
 
 #if defined(PERL_IN_LOCALE_C)
 sn	|const char*|category_name |const int category
+s	|const char*|switch_category_locale_to_template|const int switch_category|const int template_category|NULLOK const char * template_locale
+s	|void	|restore_switched_locale|const int category|NULLOK const char * const original_locale
 #  ifdef HAS_NL_LANGINFO
 sn	|const char*|my_nl_langinfo|const nl_item item|bool toggle
 #  else
diff --git a/embed.h b/embed.h
index ecb4d798bf..4a3fa99825 100644
--- a/embed.h
+++ b/embed.h
@@ -46,6 +46,7 @@
 #define _to_utf8_lower_flags(a,b,c,d,e,f,g)	Perl__to_utf8_lower_flags(aTHX_ a,b,c,d,e,f,g)
 #define _to_utf8_title_flags(a,b,c,d,e,f,g)	Perl__to_utf8_title_flags(aTHX_ a,b,c,d,e,f,g)
 #define _to_utf8_upper_flags(a,b,c,d,e,f,g)	Perl__to_utf8_upper_flags(aTHX_ a,b,c,d,e,f,g)
+#define _variant_byte_number	S__variant_byte_number
 #define amagic_call(a,b,c,d)	Perl_amagic_call(aTHX_ a,b,c,d)
 #define amagic_deref_call(a,b)	Perl_amagic_deref_call(aTHX_ a,b)
 #define apply_attrs_string(a,b,c,d)	Perl_apply_attrs_string(aTHX_ a,b,c,d)
@@ -736,7 +737,7 @@
 #define utf8_to_uvchr(a,b)	Perl_utf8_to_uvchr(aTHX_ a,b)
 #define utf8_to_uvuni(a,b)	Perl_utf8_to_uvuni(aTHX_ a,b)
 #define utf8_to_uvuni_buf(a,b,c)	Perl_utf8_to_uvuni_buf(aTHX_ a,b,c)
-#define utf8n_to_uvchr_error(a,b,c,d,e)	Perl_utf8n_to_uvchr_error(aTHX_ a,b,c,d,e)
+#define utf8n_to_uvchr_msgs(a,b,c,d,e,f)	Perl_utf8n_to_uvchr_msgs(aTHX_ a,b,c,d,e,f)
 #define utf8n_to_uvuni(a,b,c,d)	Perl_utf8n_to_uvuni(aTHX_ a,b,c,d)
 #define uvoffuni_to_utf8_flags(a,b,c)	Perl_uvoffuni_to_utf8_flags(aTHX_ a,b,c)
 #define uvuni_to_utf8(a,b)	Perl_uvuni_to_utf8(aTHX_ a,b)
@@ -774,9 +775,6 @@
 #if !(defined(HAS_SIGACTION) && defined(SA_SIGINFO))
 #define csighandler		Perl_csighandler
 #endif
-#if !defined(EBCDIC)
-#define _variant_byte_number	S__variant_byte_number
-#endif
 #if !defined(HAS_TRUNCATE) && !defined(HAS_CHSIZE) && defined(F_FREESP)
 #define my_chsize(a,b)		Perl_my_chsize(aTHX_ a,b)
 #endif
@@ -920,7 +918,6 @@
 #define current_re_engine()	Perl_current_re_engine(aTHX)
 #define cv_ckproto_len_flags(a,b,c,d,e)	Perl_cv_ckproto_len_flags(aTHX_ a,b,c,d,e)
 #define grok_atoUV		Perl_grok_atoUV
-#define isSCRIPT_RUN(a,b,c)	Perl_isSCRIPT_RUN(aTHX_ a,b,c)
 #define mg_find_mglob(a)	Perl_mg_find_mglob(aTHX_ a)
 #define multiconcat_stringify(a)	Perl_multiconcat_stringify(aTHX_ a)
 #define multideref_stringify(a,b)	Perl_multideref_stringify(aTHX_ a,b)
@@ -999,6 +996,7 @@
 #define sv_or_pv_pos_u2b(a,b,c,d)	S_sv_or_pv_pos_u2b(aTHX_ a,b,c,d)
 #  endif
 #  if defined(PERL_CORE) || defined(PERL_EXT)
+#define isSCRIPT_RUN(a,b,c,d)	Perl_isSCRIPT_RUN(aTHX_ a,b,c,d)
 #define variant_under_utf8_count	S_variant_under_utf8_count
 #  endif
 #  if defined(PERL_IN_REGCOMP_C)
@@ -1012,6 +1010,7 @@
 #define compute_EXACTish	S_compute_EXACTish
 #define construct_ahocorasick_from_trie(a,b,c)	S_construct_ahocorasick_from_trie(aTHX_ a,b,c)
 #define edit_distance		S_edit_distance
+#define get_ANYOFM_contents(a)	S_get_ANYOFM_contents(aTHX_ a)
 #define get_ANYOF_cp_list_for_ssc(a,b)	S_get_ANYOF_cp_list_for_ssc(aTHX_ a,b)
 #define get_invlist_iter_addr	S_get_invlist_iter_addr
 #define grok_bslash_N(a,b,c,d,e,f,g)	S_grok_bslash_N(aTHX_ a,b,c,d,e,f,g)
@@ -1150,7 +1149,10 @@
 #define backup_one_WB(a,b,c,d)	S_backup_one_WB(aTHX_ a,b,c,d)
 #define find_byclass(a,b,c,d,e)	S_find_byclass(aTHX_ a,b,c,d,e)
 #define find_next_ascii		S_find_next_ascii
+#define find_next_masked	S_find_next_masked
 #define find_next_non_ascii	S_find_next_non_ascii
+#define find_span_end		S_find_span_end
+#define find_span_end_mask	S_find_span_end_mask
 #define isFOO_utf8_lc(a,b)	S_isFOO_utf8_lc(aTHX_ a,b)
 #define isGCB(a,b,c,d,e)	S_isGCB(aTHX_ a,b,c,d,e)
 #define isLB(a,b,c,d,e,f)	S_isLB(aTHX_ a,b,c,d,e,f)
@@ -1656,7 +1658,9 @@
 #  endif
 #  if defined(PERL_IN_LOCALE_C)
 #define category_name		S_category_name
+#define restore_switched_locale(a,b)	S_restore_switched_locale(aTHX_ a,b)
 #define save_to_buffer		S_save_to_buffer
+#define switch_category_locale_to_template(a,b,c)	S_switch_category_locale_to_template(aTHX_ a,b,c)
 #    if defined(USE_LOCALE)
 #define new_collate(a)		S_new_collate(aTHX_ a)
 #define new_ctype(a)		S_new_ctype(aTHX_ a)
diff --git a/embedvar.h b/embedvar.h
index 0922ee45d1..fe33c86ccc 100644
--- a/embedvar.h
+++ b/embedvar.h
@@ -187,6 +187,7 @@
 #define PL_lastgotoprobe	(vTHX->Ilastgotoprobe)
 #define PL_laststatval		(vTHX->Ilaststatval)
 #define PL_laststype		(vTHX->Ilaststype)
+#define PL_locale_utf8ness	(vTHX->Ilocale_utf8ness)
 #define PL_localizing		(vTHX->Ilocalizing)
 #define PL_localpatches		(vTHX->Ilocalpatches)
 #define PL_lockhook		(vTHX->Ilockhook)
@@ -221,6 +222,7 @@
 #define PL_numeric_radix_sv	(vTHX->Inumeric_radix_sv)
 #define PL_numeric_standard	(vTHX->Inumeric_standard)
 #define PL_numeric_underlying	(vTHX->Inumeric_underlying)
+#define PL_numeric_underlying_is_standard	(vTHX->Inumeric_underlying_is_standard)
 #define PL_ofsgv		(vTHX->Iofsgv)
 #define PL_oldname		(vTHX->Ioldname)
 #define PL_op			(vTHX->Iop)
@@ -336,6 +338,7 @@
 #define PL_tmps_stack		(vTHX->Itmps_stack)
 #define PL_top_env		(vTHX->Itop_env)
 #define PL_toptarget		(vTHX->Itoptarget)
+#define PL_underlying_numeric_obj	(vTHX->Iunderlying_numeric_obj)
 #define PL_unicode		(vTHX->Iunicode)
 #define PL_unitcheckav		(vTHX->Iunitcheckav)
 #define PL_unitcheckav_save	(vTHX->Iunitcheckav_save)
diff --git a/ext/POSIX/Makefile.PL b/ext/POSIX/Makefile.PL
index 1ed4d32982..5d5c009c3c 100644
--- a/ext/POSIX/Makefile.PL
+++ b/ext/POSIX/Makefile.PL
@@ -50,7 +50,8 @@ my @names =
       ESOCKTNOSUPPORT ESPIPE ESRCH ESTALE ETIME ETIMEDOUT ETOOMANYREFS ETXTBSY
       EUSERS EWOULDBLOCK EXDEV FILENAME_MAX F_OK HUPCL ICANON ICRNL IEXTEN
       IGNBRK IGNCR IGNPAR INLCR INPCK INT_MAX INT_MIN ISIG ISTRIP IXOFF IXON
-      LC_ALL LC_COLLATE LC_CTYPE LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME
+      LC_ADDRESS LC_ALL LC_COLLATE LC_CTYPE LC_IDENTIFICATION LC_MEASUREMENT
+      LC_MESSAGES LC_MONETARY LC_NUMERIC LC_PAPER LC_TELEPHONE LC_TIME
       LINK_MAX LONG_MAX LONG_MIN L_ctermid L_cuserid MAX_CANON
       MAX_INPUT MB_LEN_MAX MSG_CTRUNC MSG_DONTROUTE MSG_EOR MSG_OOB MSG_PEEK
       MSG_TRUNC MSG_WAITALL NAME_MAX NCCS NGROUPS_MAX NOFLSH OPEN_MAX OPOST
diff --git a/ext/POSIX/POSIX.xs b/ext/POSIX/POSIX.xs
index a70ec21c93..1dbcd076e4 100644
--- a/ext/POSIX/POSIX.xs
+++ b/ext/POSIX/POSIX.xs
@@ -1598,8 +1598,8 @@ static const struct lconv_offset lconv_strings[] = {
 
 /* The Linux man pages say these are the field names for the structure
  * components that are LC_NUMERIC; the rest being LC_MONETARY */
-#   define isLC_NUMERIC_STRING(name) (strEQ(name, "decimal_point")     \
-                                      || strEQ(name, "thousands_sep")  \
+#   define isLC_NUMERIC_STRING(name) (   strEQ(name, "decimal_point")   \
+                                      || strEQ(name, "thousands_sep")   \
                                                                         \
                                       /* There should be no harm done   \
                                        * checking for this, even if     \
@@ -2124,7 +2124,12 @@ localeconv()
 	localeconv(); /* A stub to call not_here(). */
 #else
 	struct lconv *lcbuf;
-
+#  if defined(USE_ITHREADS)                                             \
+   && defined(HAS_POSIX_2008_LOCALE)                                    \
+   && defined(HAS_LOCALECONV_L) /* Prefer this thread-safe version */
+        bool do_free = FALSE;
+        locale_t cur = uselocale((locale_t) 0);
+#  endif
         DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
 
         /* localeconv() deals with both LC_NUMERIC and LC_MONETARY, but
@@ -2144,9 +2149,22 @@ localeconv()
 
 	RETVAL = newHV();
 	sv_2mortal((SV*)RETVAL);
+#  if defined(USE_ITHREADS)                         \
+   && defined(HAS_POSIX_2008_LOCALE)                \
+   && defined(HAS_LOCALECONV_L)
 
-        lcbuf = localeconv();
+        if (cur == LC_GLOBAL_LOCALE) {
+            cur = duplocale(LC_GLOBAL_LOCALE);
+            do_free = TRUE;
+        }
 
+        lcbuf = localeconv_l(cur);
+#  else
+        LOCALE_LOCK;    /* Prevent interference with other threads using
+                           localeconv() */
+
+        lcbuf = localeconv();
+#  endif
 	if (lcbuf) {
 	    const struct lconv_offset *strings = lconv_strings;
 	    const struct lconv_offset *integers = lconv_integers;
@@ -2171,19 +2189,19 @@ localeconv()
 		const char *value = *((const char **)(ptr + strings->offset));
 
 		if (value && *value) {
+                    const STRLEN value_len = strlen(value);
+
+                    /* We mark it as UTF-8 if a utf8 locale and is valid and
+                     * variant under UTF-8 */
+                    const bool is_utf8 = is_utf8_locale
+                                     &&  is_utf8_non_invariant_string(
+                                                                (U8*) value,
+                                                                value_len);
 		    (void) hv_store(RETVAL,
-                        strings->name,
-                        strlen(strings->name),
-                        newSVpvn_utf8(
-                                value,
-                                strlen(value),
-
-                                /* We mark it as UTF-8 if a utf8 locale and is
-                                 * valid and variant under UTF-8 */
-                                     is_utf8_locale
-                                && ! is_utf8_invariant_string((U8 *) value, 0)
-                                &&   is_utf8_string((U8 *) value, 0)),
-                    0);
+                                    strings->name,
+                                    strlen(strings->name),
+                                    newSVpvn_utf8(value, value_len, is_utf8),
+                                    0);
             }
                 strings++;
 	    }
@@ -2197,8 +2215,16 @@ localeconv()
                 integers++;
             }
 	}
-
-        RESTORE_LC_NUMERIC_STANDARD();
+#  if defined(USE_ITHREADS)                         \
+   && defined(HAS_POSIX_2008_LOCALE)                \
+   && defined(HAS_LOCALECONV_L)
+        if (do_free) {
+            freelocale(cur);
+        }
+#  else
+        LOCALE_UNLOCK;
+#  endif
+        RESTORE_LC_NUMERIC();
 #endif  /* HAS_LOCALECONV */
     OUTPUT:
 	RETVAL
@@ -3252,10 +3278,27 @@ write(fd, buffer, nbytes)
 void
 abort()
 
+#ifdef I_WCHAR
+#  include <wchar.h>
+#endif
+
 int
 mblen(s, n)
 	char *		s
 	size_t		n
+    PREINIT:
+#if defined(USE_ITHREADS) && defined(HAS_MBRLEN)
+        mbstate_t ps;
+#endif
+    CODE:
+#if defined(USE_ITHREADS) && defined(HAS_MBRLEN)
+        PERL_UNUSED_RESULT(mbrlen(NULL, 0, &ps));   /* Initialize state */
+        RETVAL = mbrlen(s, n, &ps); /* Prefer reentrant version */
+#else
+        RETVAL = mblen(s, n);
+#endif
+    OUTPUT:
+        RETVAL
 
 size_t
 mbstowcs(s, pwcs, n)
@@ -3268,6 +3311,21 @@ mbtowc(pwc, s, n)
 	wchar_t *	pwc
 	char *		s
 	size_t		n
+    PREINIT:
+#if defined(USE_ITHREADS) && defined(HAS_MBRTOWC)
+        mbstate_t ps;
+#endif
+    CODE:
+#if defined(USE_ITHREADS) && defined(HAS_MBRTOWC)
+        memset(&ps, 0, sizeof(ps));;
+        PERL_UNUSED_RESULT(mbrtowc(pwc, NULL, 0, &ps));/* Reset any shift state */
+        errno = 0;
+        RETVAL = mbrtowc(pwc, s, n, &ps);   /* Prefer reentrant version */
+#else
+        RETVAL = mbtowc(pwc, s, n);
+#endif
+    OUTPUT:
+        RETVAL
 
 int
 wcstombs(s, pwcs, n)
@@ -3295,6 +3353,7 @@ strtod(str)
         DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
         STORE_LC_NUMERIC_FORCE_TO_UNDERLYING();
 	num = strtod(str, &unparsed);
+        RESTORE_LC_NUMERIC();
 	PUSHs(sv_2mortal(newSVnv(num)));
 	if (GIMME_V == G_ARRAY) {
 	    EXTEND(SP, 1);
@@ -3303,7 +3362,6 @@ strtod(str)
 	    else
 		PUSHs(&PL_sv_undef);
 	}
-        RESTORE_LC_NUMERIC_STANDARD();
 
 #ifdef HAS_STRTOLD
 
@@ -3317,6 +3375,7 @@ strtold(str)
         DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
         STORE_LC_NUMERIC_FORCE_TO_UNDERLYING();
 	num = strtold(str, &unparsed);
+        RESTORE_LC_NUMERIC();
 	PUSHs(sv_2mortal(newSVnv(num)));
 	if (GIMME_V == G_ARRAY) {
 	    EXTEND(SP, 1);
@@ -3325,7 +3384,6 @@ strtold(str)
 	    else
 		PUSHs(&PL_sv_undef);
 	}
-        RESTORE_LC_NUMERIC_STANDARD();
 
 #endif
 
@@ -3571,6 +3629,12 @@ strftime(fmt, sec, min, hour, mday, mon, year, wday = -1, yday = -1, isdst = -1)
                     || (   is_utf8_non_invariant_string((U8*) buf, len)
 #ifdef USE_LOCALE_TIME
                         && _is_cur_LC_category_utf8(LC_TIME)
+#else   /* If can't check directly, at least can see if script is consistent,
+           under UTF-8, which gives us an extra measure of confidence. */
+
+                        && isSCRIPT_RUN((const U8 *) buf, buf + len,
+                                        TRUE, /* Means assume UTF-8 */
+                                        NULL)
 #endif
                 )) {
 		    SvUTF8_on(sv);
diff --git a/ext/POSIX/lib/POSIX.pm b/ext/POSIX/lib/POSIX.pm
index 1270fc9dcc..8f61f6ede9 100644
--- a/ext/POSIX/lib/POSIX.pm
+++ b/ext/POSIX/lib/POSIX.pm
@@ -4,7 +4,7 @@ use warnings;
 
 our ($AUTOLOAD, %SIGRT);
 
-our $VERSION = '1.81';
+our $VERSION = '1.82';
 
 require XSLoader;
 
@@ -306,7 +306,8 @@ my %default_export_tags = ( # cf. exports policy below
 		_POSIX_STREAM_MAX _POSIX_TZNAME_MAX)],
 
     locale_h =>	[qw(LC_ALL LC_COLLATE LC_CTYPE LC_MESSAGES
-		    LC_MONETARY LC_NUMERIC LC_TIME NULL
+		    LC_MONETARY LC_NUMERIC LC_TIME LC_IDENTIFICATION
+                    LC_MEASUREMENT LC_PAPER LC_TELEPHONE LC_ADDRESS NULL
 		    localeconv setlocale)],
 
     math_h =>   [qw(FP_ILOGB0 FP_ILOGBNAN FP_INFINITE FP_NAN FP_NORMAL
diff --git a/ext/POSIX/t/export.t b/ext/POSIX/t/export.t
index 6637fa6452..50648c8b33 100644
--- a/ext/POSIX/t/export.t
+++ b/ext/POSIX/t/export.t
@@ -45,8 +45,10 @@ my %expect = (
             FLT_ROUNDS F_DUPFD F_GETFD F_GETFL F_GETLK F_OK F_RDLCK
             F_SETFD F_SETFL F_SETLK F_SETLKW F_UNLCK F_WRLCK HUGE_VAL
             HUPCL ICANON ICRNL IEXTEN IGNBRK IGNCR IGNPAR INLCR INPCK
-            INT_MAX INT_MIN ISIG ISTRIP IXOFF IXON LC_ALL LC_COLLATE
-            LC_CTYPE LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME LDBL_DIG
+            INT_MAX INT_MIN ISIG ISTRIP IXOFF IXON
+            LC_ADDRESS LC_ALL LC_COLLATE LC_CTYPE LC_IDENTIFICATION
+            LC_MEASUREMENT LC_MESSAGES LC_MONETARY LC_NUMERIC LC_PAPER
+            LC_TELEPHONE LC_TIME LDBL_DIG
             LDBL_EPSILON LDBL_MANT_DIG LDBL_MAX LDBL_MAX_10_EXP
             LDBL_MAX_EXP LDBL_MIN LDBL_MIN_10_EXP LDBL_MIN_EXP LINK_MAX
             LONG_MAX LONG_MIN L_ctermid L_cuserid MAX_CANON
diff --git a/ext/XS-APItest/APItest.xs b/ext/XS-APItest/APItest.xs
index 0ad08237af..0be5d95310 100644
--- a/ext/XS-APItest/APItest.xs
+++ b/ext/XS-APItest/APItest.xs
@@ -1379,16 +1379,55 @@ bytes_cmp_utf8(bytes, utf8)
     OUTPUT:
 	RETVAL
 
+AV *
+test_utf8n_to_uvchr_msgs(s, len, flags)
+        char *s
+        STRLEN len
+        U32 flags
+    PREINIT:
+        STRLEN retlen;
+        UV ret;
+        U32 errors;
+        AV *msgs = NULL;
+
+    CODE:
+        RETVAL = newAV();
+        sv_2mortal((SV*)RETVAL);
+
+        ret = utf8n_to_uvchr_msgs((U8*)  s,
+                                         len,
+                                         &retlen,
+                                         flags,
+                                         &errors,
+                                         &msgs);
+
+        /* Returns the return value in [0]; <retlen> in [1], <errors> in [2] */
+        av_push(RETVAL, newSVuv(ret));
+        if (retlen == (STRLEN) -1) {
+            av_push(RETVAL, newSViv(-1));
+        }
+        else {
+            av_push(RETVAL, newSVuv(retlen));
+        }
+        av_push(RETVAL, newSVuv(errors));
+
+        /* And any messages in [3] */
+        if (msgs) {
+            av_push(RETVAL, newRV_noinc((SV*)msgs));
+        }
+
+    OUTPUT:
+        RETVAL
+
 AV *
 test_utf8n_to_uvchr_error(s, len, flags)
 
-        SV *s
-        SV *len
-        SV *flags
+        char *s
+        STRLEN len
+        U32 flags
     PREINIT:
         STRLEN retlen;
         UV ret;
-        STRLEN slen;
         U32 errors;
 
     CODE:
@@ -1401,10 +1440,10 @@ test_utf8n_to_uvchr_error(s, len, flags)
         RETVAL = newAV();
         sv_2mortal((SV*)RETVAL);
 
-        ret = utf8n_to_uvchr_error((U8*) SvPV(s, slen),
-                                         SvUV(len),
+        ret = utf8n_to_uvchr_error((U8*) s,
+                                         len,
                                          &retlen,
-                                         SvUV(flags),
+                                         flags,
                                          &errors);
 
         /* Returns the return value in [0]; <retlen> in [1], <errors> in [2] */
diff --git a/ext/XS-APItest/t/locale.t b/ext/XS-APItest/t/locale.t
index 7f9915d2b3..064627d424 100644
--- a/ext/XS-APItest/t/locale.t
+++ b/ext/XS-APItest/t/locale.t
@@ -33,7 +33,7 @@ SKIP: {
 }
 
 my %correct_C_responses = (
-        # Commented out entries are ones which there is room for variation
+        # Entries that are undef could have varying returns
                             ABDAY_1 => 'Sun',
                             ABDAY_2 => 'Mon',
                             ABDAY_3 => 'Tue',
@@ -55,8 +55,8 @@ my %correct_C_responses = (
                             ABMON_9 => 'Sep',
                             ALT_DIGITS => '',
                             AM_STR => 'AM',
-                            #CODESET => 'ANSI_X3.4-1968',
-                            #CRNCYSTR => '-',
+                            CODESET => undef,
+                            CRNCYSTR => undef,
                             DAY_1 => 'Sunday',
                             DAY_2 => 'Monday',
                             DAY_3 => 'Tuesday',
@@ -64,12 +64,12 @@ my %correct_C_responses = (
                             DAY_5 => 'Thursday',
                             DAY_6 => 'Friday',
                             DAY_7 => 'Saturday',
-                            #D_FMT => '%m/%d/%y',
-                            #D_T_FMT => '%a %b %e %H:%M:%S %Y',
+                            D_FMT => undef,
+                            D_T_FMT => undef,
                             ERA => '',
-                            #ERA_D_FMT => '',
-                            #ERA_D_T_FMT => '',
-                            #ERA_T_FMT => '',
+                            ERA_D_FMT => undef,
+                            ERA_D_T_FMT => undef,
+                            ERA_T_FMT => undef,
                             MON_1 => 'January',
                             MON_10 => 'October',
                             MON_11 => 'November',
@@ -82,13 +82,15 @@ my %correct_C_responses = (
                             MON_7 => 'July',
                             MON_8 => 'August',
                             MON_9 => 'September',
-                            #NOEXPR => '^[nN]',
+                            NOEXPR => undef,
+                            NOSTR => undef,
                             PM_STR => 'PM',
                             RADIXCHAR => '.',
                             THOUSEP => '',
-                            #T_FMT => '%H:%M:%S',
-                            #T_FMT_AMPM => '%I:%M:%S %p',
-                            #YESEXPR => '^[yY]',
+                            T_FMT => undef,
+                            T_FMT_AMPM => undef,
+                            YESEXPR => undef,
+                            YESSTR => undef,
                         );
 
 my $hdr = "../../perl_langinfo.h";
@@ -111,14 +113,15 @@ SKIP: {
     # For non-nl_langinfo systems, those values are arbitrary negative numbers
     # set in the header.  Otherwise they are the nl_langinfo approved values,
     # which for the moment is the item name.
+    # The relevant lines look like: #  define PERL_YESSTR -54
     while (<$fh>) {
         chomp;
         next unless / - \d+ $ /x;
-        s/ ^ .* PERL_//x;
+        s/ ^ .* PERL_ //x;
         m/ (.*) \  (.*) /x;
         $items{$1} = ($has_nl_langinfo)
-                     ? $1
-                     : $2;
+                     ? $1       # Yields 'YESSTR'
+                     : $2;      # Yields -54
     }
 
     # Get the translation from item name to numeric value.
@@ -127,10 +130,16 @@ SKIP: {
     foreach my $formal_item (sort keys %items) {
         if (exists $correct_C_responses{$formal_item}) {
             my $item = eval $items{$formal_item};
-            next if $@;
-            is (test_Perl_langinfo($item),
-                $correct_C_responses{$formal_item},
-                "Returns expected value for $formal_item");
+            skip "This platform apparently doesn't support $formal_item", 1 if $@;
+            if (defined $correct_C_responses{$formal_item}) {
+                is (test_Perl_langinfo($item),
+                    $correct_C_responses{$formal_item},
+                    "Returns expected value for $formal_item");
+            }
+            else {
+                ok (defined test_Perl_langinfo($item),
+                    "Returns a value for $formal_item");
+            }
         }
     }
 }
diff --git a/ext/XS-APItest/t/utf8_warn_base.pl b/ext/XS-APItest/t/utf8_warn_base.pl
index 91de8a8711..6c3b04afeb 100644
--- a/ext/XS-APItest/t/utf8_warn_base.pl
+++ b/ext/XS-APItest/t/utf8_warn_base.pl
@@ -702,653 +702,673 @@ sub do_warnings_test(@)
 my $num_test_files = $ENV{TEST_JOBS} || 1;
 $num_test_files = 10 if $num_test_files > 10;
 
+# We only really need to test utf8n_to_uvchr_msgs() once with this flag.
+my $tested_CHECK_ONLY = 0;
+
 my $test_count = -1;
 foreach my $test (@tests) {
-    $test_count++;
-    next if $test_count % $num_test_files != $::TEST_CHUNK;
-
-    my ($testname, $bytes, $allowed_uv, $needed_to_discern_len) = @$test;
-
-    my $length = length $bytes;
-    my $initially_overlong = $testname =~ /overlong/;
-    my $initially_orphan   = $testname =~ /orphan/;
-    my $will_overflow = $allowed_uv < 0;
-
-    my $uv_string = sprintf(($allowed_uv < 0x100) ? "%02X" : "%04X", $allowed_uv);
-    my $display_bytes = display_bytes($bytes);
-
-    my $controlling_warning_category;
-    my $utf8n_flag_to_warn;
-    my $utf8n_flag_to_disallow;
-    my $uvchr_flag_to_warn;
-    my $uvchr_flag_to_disallow;
-
-    # We want to test that the independent flags are actually independent.
-    # For example, that a surrogate doesn't trigger a non-character warning,
-    # and conversely, turning off an above-Unicode flag doesn't suppress a
-    # surrogate warning.  Earlier versions of this file used nested loops to
-    # test all possible combinations.  But that creates lots of tests, making
-    # this run too long.  What is now done instead is to use the complement of
-    # the category we are testing to greatly reduce the combinatorial
-    # explosion.  For example, if we have a surrogate and we aren't expecting
-    # a warning about it, we set all the flags for non-surrogates to raise
-    # warnings.  If one shows up, it indicates the flags aren't independent.
-    my $utf8n_flag_to_warn_complement;
-    my $utf8n_flag_to_disallow_complement;
-    my $uvchr_flag_to_warn_complement;
-    my $uvchr_flag_to_disallow_complement;
-
-    # Many of the code points being tested are middling in that if code point
-    # edge cases work, these are very likely to as well.  Because this test
-    # file takes a while to execute, we skip testing the edge effects of code
-    # points deemed middling, while testing their basics and continuing to
-    # fully test the non-middling code points.
-    my $skip_most_tests = 0;
-
-    my $cp_message_qr;      # Pattern that matches the message raised when
-                            # that message contains the problematic code
-                            # point.  The message is the same (currently) both
-                            # when going from/to utf8.
-    my $non_cp_trailing_text;   # The suffix text when the message doesn't
-                                # contain a code point.  (This is a result of
-                                # some sort of malformation that means we
-                                # can't get an exact code poin
-    my $extended_cp_message_qr = qr/\QCode point 0x$uv_string is not Unicode,\E
-                        \Q requires a Perl extension, and so is not\E
-                        \Q portable\E/x;
-    my $extended_non_cp_trailing_text
-                        = "is a Perl extension, and so is not portable";
-
-    # What bytes should have been used to specify a code point that has been
-    # specified as an overlong.
-    my $correct_bytes_for_overlong;
-
-    # Is this test malformed from the beginning?  If so, we know to generally
-    # expect that the tests will show it isn't valid.
-    my $initially_malformed = 0;
-
-    if ($initially_overlong || $initially_orphan) {
-        $non_cp_trailing_text = "if you see this, there is an error";
-        $cp_message_qr = qr/\Q$non_cp_trailing_text\E/;
-        $initially_malformed = 1;
-        $utf8n_flag_to_warn     = 0;
-        $utf8n_flag_to_disallow = 0;
-
-        $utf8n_flag_to_warn_complement =     $::UTF8_WARN_SURROGATE;
-        $utf8n_flag_to_disallow_complement = $::UTF8_DISALLOW_SURROGATE;
-        if (! $will_overflow && $allowed_uv <= 0x10FFFF) {
-            $utf8n_flag_to_warn_complement     |= $::UTF8_WARN_SUPER;
-            $utf8n_flag_to_disallow_complement |= $::UTF8_DISALLOW_SUPER;
-            if (($allowed_uv & 0xFFFF) != 0xFFFF) {
-                $utf8n_flag_to_warn_complement      |= $::UTF8_WARN_NONCHAR;
-                $utf8n_flag_to_disallow_complement  |= $::UTF8_DISALLOW_NONCHAR;
-            }
-        }
-        if (! is_extended_utf8($bytes)) {
-            $utf8n_flag_to_warn_complement |= $::UTF8_WARN_PERL_EXTENDED;
-            $utf8n_flag_to_disallow_complement  |= $::UTF8_DISALLOW_PERL_EXTENDED;
-        }
-
-        $controlling_warning_category = 'utf8';
-
-        if ($initially_overlong) {
-            if (! defined $needed_to_discern_len) {
-                $needed_to_discern_len = overlong_discern_len($bytes);
-            }
-            $correct_bytes_for_overlong = display_bytes_no_quotes(chr $allowed_uv);
-        }
-    }
-    elsif($will_overflow || $allowed_uv > 0x10FFFF) {
-
-        # Set the SUPER flags; later, we test for PERL_EXTENDED as well.
-        $utf8n_flag_to_warn     = $::UTF8_WARN_SUPER;
-        $utf8n_flag_to_disallow = $::UTF8_DISALLOW_SUPER;
-        $uvchr_flag_to_warn     = $::UNICODE_WARN_SUPER;
-        $uvchr_flag_to_disallow = $::UNICODE_DISALLOW_SUPER;;
-
-        # Below, we add the flags for non-perl_extended to the code points
-        # that don't fit that category.  Special tests are done for this
-        # category in the inner loop.
-        $utf8n_flag_to_warn_complement     = $::UTF8_WARN_NONCHAR
-                                            |$::UTF8_WARN_SURROGATE;
-        $utf8n_flag_to_disallow_complement = $::UTF8_DISALLOW_NONCHAR
-                                            |$::UTF8_DISALLOW_SURROGATE;
-        $uvchr_flag_to_warn_complement     = $::UNICODE_WARN_NONCHAR
-                                            |$::UNICODE_WARN_SURROGATE;
-        $uvchr_flag_to_disallow_complement = $::UNICODE_DISALLOW_NONCHAR
-                                            |$::UNICODE_DISALLOW_SURROGATE;
-        $controlling_warning_category = 'non_unicode';
-
-        if ($will_overflow) {  # This is realy a malformation
-            $non_cp_trailing_text = "if you see this, there is an error";
-            $cp_message_qr = qr/\Q$non_cp_trailing_text\E/;
-            $initially_malformed = 1;
-            if (! defined $needed_to_discern_len) {
-                $needed_to_discern_len = overflow_discern_len($length);
-            }
-        }
-        elsif (requires_extended_utf8($allowed_uv)) {
-            $cp_message_qr = $extended_cp_message_qr;
-            $non_cp_trailing_text = $extended_non_cp_trailing_text;
-            $needed_to_discern_len = 1 unless defined $needed_to_discern_len;
-        }
-        else {
-            $cp_message_qr = qr/\QCode point 0x$uv_string is not Unicode,\E
-                                \Q may not be portable\E/x;
-            $non_cp_trailing_text = "is for a non-Unicode code point, may not"
-                                . " be portable";
-            $utf8n_flag_to_warn_complement     |= $::UTF8_WARN_PERL_EXTENDED;
-            $utf8n_flag_to_disallow_complement
-                                           |= $::UTF8_DISALLOW_PERL_EXTENDED;
-            $uvchr_flag_to_warn_complement |= $::UNICODE_WARN_PERL_EXTENDED;
-            $uvchr_flag_to_disallow_complement
-                                        |= $::UNICODE_DISALLOW_PERL_EXTENDED;
-        }
-    }
-    elsif ($allowed_uv >= 0xD800 && $allowed_uv <= 0xDFFF) {
-        $cp_message_qr = qr/UTF-16 surrogate U\+$uv_string/;
-        $non_cp_trailing_text = "is for a surrogate";
-        $needed_to_discern_len = 2 unless defined $needed_to_discern_len;
-        $skip_most_tests = 1 if $allowed_uv > 0xD800 && $allowed_uv < 0xDFFF;
-
-        $utf8n_flag_to_warn     = $::UTF8_WARN_SURROGATE;
-        $utf8n_flag_to_disallow = $::UTF8_DISALLOW_SURROGATE;
-        $uvchr_flag_to_warn     = $::UNICODE_WARN_SURROGATE;
-        $uvchr_flag_to_disallow = $::UNICODE_DISALLOW_SURROGATE;;
-
-        $utf8n_flag_to_warn_complement     = $::UTF8_WARN_NONCHAR
-                                            |$::UTF8_WARN_SUPER
-                                            |$::UTF8_WARN_PERL_EXTENDED;
-        $utf8n_flag_to_disallow_complement = $::UTF8_DISALLOW_NONCHAR
-                                            |$::UTF8_DISALLOW_SUPER
-                                            |$::UTF8_DISALLOW_PERL_EXTENDED;
-        $uvchr_flag_to_warn_complement     = $::UNICODE_WARN_NONCHAR
-                                            |$::UNICODE_WARN_SUPER
-                                            |$::UNICODE_WARN_PERL_EXTENDED;
-        $uvchr_flag_to_disallow_complement = $::UNICODE_DISALLOW_NONCHAR
-                                            |$::UNICODE_DISALLOW_SUPER
-                                            |$::UNICODE_DISALLOW_PERL_EXTENDED;
-        $controlling_warning_category = 'surrogate';
-    }
-    elsif (   ($allowed_uv >= 0xFDD0 && $allowed_uv <= 0xFDEF)
-           || ($allowed_uv & 0xFFFE) == 0xFFFE)
-    {
-        $cp_message_qr = qr/\QUnicode non-character U+$uv_string\E
-                            \Q is not recommended for open interchange\E/x;
-        $non_cp_trailing_text = "if you see this, there is an error";
-        $needed_to_discern_len = $length unless defined $needed_to_discern_len;
-        if (   ($allowed_uv > 0xFDD0 && $allowed_uv < 0xFDEF)
-            || ($allowed_uv > 0xFFFF && $allowed_uv < 0x10FFFE))
-        {
-            $skip_most_tests = 1;
-        }
-
-        $utf8n_flag_to_warn     = $::UTF8_WARN_NONCHAR;
-        $utf8n_flag_to_disallow = $::UTF8_DISALLOW_NONCHAR;
-        $uvchr_flag_to_warn     = $::UNICODE_WARN_NONCHAR;
-        $uvchr_flag_to_disallow = $::UNICODE_DISALLOW_NONCHAR;;
-
-        $utf8n_flag_to_warn_complement     = $::UTF8_WARN_SURROGATE
-                                            |$::UTF8_WARN_SUPER
-                                            |$::UTF8_WARN_PERL_EXTENDED;
-        $utf8n_flag_to_disallow_complement = $::UTF8_DISALLOW_SURROGATE
-                                            |$::UTF8_DISALLOW_SUPER
-                                            |$::UTF8_DISALLOW_PERL_EXTENDED;
-        $uvchr_flag_to_warn_complement     = $::UNICODE_WARN_SURROGATE
-                                            |$::UNICODE_WARN_SUPER
-                                            |$::UNICODE_WARN_PERL_EXTENDED;
-        $uvchr_flag_to_disallow_complement = $::UNICODE_DISALLOW_SURROGATE
-                                            |$::UNICODE_DISALLOW_SUPER
-                                            |$::UNICODE_DISALLOW_PERL_EXTENDED;
-
-        $controlling_warning_category = 'nonchar';
-    }
-    else {
-        die "Can't figure out what type of warning to test for $testname"
-    }
-
-    die 'Didn\'t set $needed_to_discern_len for ' . $testname
-                                        unless defined $needed_to_discern_len;
-
-    # We try various combinations of malformations that can occur
-    foreach my $short (0, 1) {
-      next if $skip_most_tests && $short;
-      foreach my $unexpected_noncont (0, 1) {
-        next if $skip_most_tests && $unexpected_noncont;
-        foreach my $overlong (0, 1) {
-          next if $overlong && $skip_most_tests;
-          next if $initially_overlong && ! $overlong;
-
-          # If we're creating an overlong, it can't be longer than the
-          # maximum length, so skip if we're already at that length.
-          next if   (! $initially_overlong && $overlong)
-                   &&  $length >= $::max_bytes;
-
-          my $this_cp_message_qr = $cp_message_qr;
-          my $this_non_cp_trailing_text = $non_cp_trailing_text;
-
-          foreach my $malformed_allow_type (0..2) {
-            # 0 don't allow this malformation; ignored if no malformation
-            # 1 allow, with REPLACEMENT CHARACTER returned
-            # 2 allow, with intended code point returned.  All malformations
-            #   other than overlong can't determine the intended code point,
-            #   so this isn't valid for them.
-            next if     $malformed_allow_type == 2
-                    && ($will_overflow || $short || $unexpected_noncont);
-            next if $skip_most_tests && $malformed_allow_type;
-
-            # Here we are in the innermost loop for malformations.  So we
-            # know which ones are in effect.  Can now change the input to be
-            # appropriately malformed.  We also can set up certain other
-            # things now, like whether we expect a return flag from this
-            # malformation, and which flag.
-
-            my $this_bytes = $bytes;
-            my $this_length = $length;
-            my $this_expected_len = $length;
-            my $this_needed_to_discern_len = $needed_to_discern_len;
-
-            my @malformation_names;
-            my @expected_malformation_warnings;
-            my @expected_malformation_return_flags;
-
-            # Contains the flags for any allowed malformations.  Currently no
-            # combinations of on/off are tested for.  It's either all are
-            # allowed, or none are.
-            my $allow_flags = 0;
-            my $overlong_is_in_perl_extended_utf8 = 0;
-            my $dont_use_overlong_cp = 0;
-
-            if ($initially_orphan) {
-                next if $overlong || $short || $unexpected_noncont;
-            }
-
-            if ($overlong) {
-                if (! $initially_overlong) {
-                    my $new_expected_len;
-
-                    # To force this malformation, we convert the original start
-                    # byte into a continuation byte with the same data bits as
-                    # originally. ...
-                    my $start_byte = substr($this_bytes, 0, 1);
-                    my $converted_to_continuation_byte
-                                            = start_byte_to_cont($start_byte);
-
-                    # ... Then we prepend it with a known overlong sequence.
-                    # This should evaluate to the exact same code point as the
-                    # original.  We try to avoid an overlong using Perl
-                    # extended UTF-8.  The code points are the highest
-                    # representable as overlongs on the respective platform
-                    # without using extended UTF-8.
-                    if (native_to_I8($start_byte) lt "\xFC") {
-                        $start_byte = I8_to_native("\xFC");
-                        $new_expected_len = 6;
-                    }
-                    elsif (! isASCII && native_to_I8($start_byte) lt "\xFE") {
-
-                        # FE is not extended UTF-8 on EBCDIC
-                        $start_byte = I8_to_native("\xFE");
-                        $new_expected_len = 7;
-                    }
-                    else {  # Must use extended UTF-8.  On ASCII platforms, we
-                            # could express some overlongs here starting with
-                            # \xFE, but there's no real reason to do so.
-                        $overlong_is_in_perl_extended_utf8 = 1;
-                        $start_byte = I8_to_native("\xFF");
-                        $new_expected_len = $::max_bytes;
-                        $this_cp_message_qr = $extended_cp_message_qr;
-
-                        # The warning that gets raised doesn't include the
-                        # code point in the message if the code point can be
-                        # expressed without using extended UTF-8, but the
-                        # particular overlong sequence used is in extended
-                        # UTF-8.  To do otherwise would be confusing to the
-                        # user, as it would claim the code point requires
-                        # extended, when it doesn't.
-                        $dont_use_overlong_cp = 1
-                                    unless requires_extended_utf8($allowed_uv);
-                        $this_non_cp_trailing_text
-                                              = $extended_non_cp_trailing_text;
-                    }
-
-                    # Splice in the revised continuation byte, preceded by the
-                    # start byte and the proper number of the lowest
-                    # continuation bytes.
-                    $this_bytes =   $start_byte
-                                . ($native_lowest_continuation_chr
-                                    x (  $new_expected_len
-                                       - 1
-                                       - length($this_bytes)))
-                                .  $converted_to_continuation_byte
-                                .  substr($this_bytes, 1);
-                    $this_length = length($this_bytes);
-                    $this_needed_to_discern_len =    $new_expected_len
-                                                - (  $this_expected_len
-                                                - $this_needed_to_discern_len);
-                    $this_expected_len = $new_expected_len;
-                }
-            }
-
-            if ($short) {
-
-                # To force this malformation, just tell the test to not look
-                # as far as it should into the input.
-                $this_length--;
-                $this_expected_len--;
-
-                $allow_flags |= $::UTF8_ALLOW_SHORT if $malformed_allow_type;
-            }
+  $test_count++;
+  next if $test_count % $num_test_files != $::TEST_CHUNK;
+
+  my ($testname, $bytes, $allowed_uv, $needed_to_discern_len) = @$test;
+
+  my $length = length $bytes;
+  my $initially_overlong = $testname =~ /overlong/;
+  my $initially_orphan   = $testname =~ /orphan/;
+  my $will_overflow = $allowed_uv < 0;
+
+  my $uv_string = sprintf(($allowed_uv < 0x100) ? "%02X" : "%04X", $allowed_uv);
+  my $display_bytes = display_bytes($bytes);
+
+  my $controlling_warning_category;
+  my $utf8n_flag_to_warn;
+  my $utf8n_flag_to_disallow;
+  my $uvchr_flag_to_warn;
+  my $uvchr_flag_to_disallow;
+
+  # We want to test that the independent flags are actually independent.
+  # For example, that a surrogate doesn't trigger a non-character warning,
+  # and conversely, turning off an above-Unicode flag doesn't suppress a
+  # surrogate warning.  Earlier versions of this file used nested loops to
+  # test all possible combinations.  But that creates lots of tests, making
+  # this run too long.  What is now done instead is to use the complement of
+  # the category we are testing to greatly reduce the combinatorial
+  # explosion.  For example, if we have a surrogate and we aren't expecting
+  # a warning about it, we set all the flags for non-surrogates to raise
+  # warnings.  If one shows up, it indicates the flags aren't independent.
+  my $utf8n_flag_to_warn_complement;
+  my $utf8n_flag_to_disallow_complement;
+  my $uvchr_flag_to_warn_complement;
+  my $uvchr_flag_to_disallow_complement;
+
+  # Many of the code points being tested are middling in that if code point
+  # edge cases work, these are very likely to as well.  Because this test
+  # file takes a while to execute, we skip testing the edge effects of code
+  # points deemed middling, while testing their basics and continuing to
+  # fully test the non-middling code points.
+  my $skip_most_tests = 0;
+
+  my $cp_message_qr;      # Pattern that matches the message raised when
+                          # that message contains the problematic code
+                          # point.  The message is the same (currently) both
+                          # when going from/to utf8.
+  my $non_cp_trailing_text;   # The suffix text when the message doesn't
+                              # contain a code point.  (This is a result of
+                              # some sort of malformation that means we
+                              # can't get an exact code point.)
+  my $extended_cp_message_qr = qr/\QCode point 0x$uv_string is not Unicode,\E
+                      \Q requires a Perl extension, and so is not\E
+                      \Q portable\E/x;
+  my $extended_non_cp_trailing_text
+                      = "is a Perl extension, and so is not portable";
+
+  # What bytes should have been used to specify a code point that has been
+  # specified as an overlong.
+  my $correct_bytes_for_overlong;
+
+  # Is this test malformed from the beginning?  If so, we know to generally
+  # expect that the tests will show it isn't valid.
+  my $initially_malformed = 0;
+
+  if ($initially_overlong || $initially_orphan) {
+      $non_cp_trailing_text = "if you see this, there is an error";
+      $cp_message_qr = qr/\Q$non_cp_trailing_text\E/;
+      $initially_malformed = 1;
+      $utf8n_flag_to_warn     = 0;
+      $utf8n_flag_to_disallow = 0;
+
+      $utf8n_flag_to_warn_complement =     $::UTF8_WARN_SURROGATE;
+      $utf8n_flag_to_disallow_complement = $::UTF8_DISALLOW_SURROGATE;
+      if (! $will_overflow && $allowed_uv <= 0x10FFFF) {
+          $utf8n_flag_to_warn_complement     |= $::UTF8_WARN_SUPER;
+          $utf8n_flag_to_disallow_complement |= $::UTF8_DISALLOW_SUPER;
+          if (($allowed_uv & 0xFFFF) != 0xFFFF) {
+              $utf8n_flag_to_warn_complement      |= $::UTF8_WARN_NONCHAR;
+              $utf8n_flag_to_disallow_complement  |= $::UTF8_DISALLOW_NONCHAR;
+          }
+      }
+      if (! is_extended_utf8($bytes)) {
+          $utf8n_flag_to_warn_complement |= $::UTF8_WARN_PERL_EXTENDED;
+          $utf8n_flag_to_disallow_complement  |= $::UTF8_DISALLOW_PERL_EXTENDED;
+      }
 
-            if ($unexpected_noncont) {
+      $controlling_warning_category = 'utf8';
 
-                # To force this malformation, change the final continuation
-                # byte into a start byte.
-                my $pos = ($short) ? -2 : -1;
-                substr($this_bytes, $pos, 1) = $known_start_byte;
-                $this_expected_len--;
-            }
+      if ($initially_overlong) {
+          if (! defined $needed_to_discern_len) {
+              $needed_to_discern_len = overlong_discern_len($bytes);
+          }
+          $correct_bytes_for_overlong = display_bytes_no_quotes(chr $allowed_uv);
+      }
+  }
+  elsif($will_overflow || $allowed_uv > 0x10FFFF) {
+
+      # Set the SUPER flags; later, we test for PERL_EXTENDED as well.
+      $utf8n_flag_to_warn     = $::UTF8_WARN_SUPER;
+      $utf8n_flag_to_disallow = $::UTF8_DISALLOW_SUPER;
+      $uvchr_flag_to_warn     = $::UNICODE_WARN_SUPER;
+      $uvchr_flag_to_disallow = $::UNICODE_DISALLOW_SUPER;;
+
+      # Below, we add the flags for non-perl_extended to the code points
+      # that don't fit that category.  Special tests are done for this
+      # category in the inner loop.
+      $utf8n_flag_to_warn_complement     = $::UTF8_WARN_NONCHAR
+                                          |$::UTF8_WARN_SURROGATE;
+      $utf8n_flag_to_disallow_complement = $::UTF8_DISALLOW_NONCHAR
+                                          |$::UTF8_DISALLOW_SURROGATE;
+      $uvchr_flag_to_warn_complement     = $::UNICODE_WARN_NONCHAR
+                                          |$::UNICODE_WARN_SURROGATE;
+      $uvchr_flag_to_disallow_complement = $::UNICODE_DISALLOW_NONCHAR
+                                          |$::UNICODE_DISALLOW_SURROGATE;
+      $controlling_warning_category = 'non_unicode';
+
+      if ($will_overflow) {  # This is really a malformation
+          $non_cp_trailing_text = "if you see this, there is an error";
+          $cp_message_qr = qr/\Q$non_cp_trailing_text\E/;
+          $initially_malformed = 1;
+          if (! defined $needed_to_discern_len) {
+              $needed_to_discern_len = overflow_discern_len($length);
+          }
+      }
+      elsif (requires_extended_utf8($allowed_uv)) {
+          $cp_message_qr = $extended_cp_message_qr;
+          $non_cp_trailing_text = $extended_non_cp_trailing_text;
+          $needed_to_discern_len = 1 unless defined $needed_to_discern_len;
+      }
+      else {
+          $cp_message_qr = qr/\QCode point 0x$uv_string is not Unicode,\E
+                              \Q may not be portable\E/x;
+          $non_cp_trailing_text = "is for a non-Unicode code point, may not"
+                              . " be portable";
+          $utf8n_flag_to_warn_complement     |= $::UTF8_WARN_PERL_EXTENDED;
+          $utf8n_flag_to_disallow_complement
+                                          |= $::UTF8_DISALLOW_PERL_EXTENDED;
+          $uvchr_flag_to_warn_complement |= $::UNICODE_WARN_PERL_EXTENDED;
+          $uvchr_flag_to_disallow_complement
+                                      |= $::UNICODE_DISALLOW_PERL_EXTENDED;
+      }
+  }
+  elsif ($allowed_uv >= 0xD800 && $allowed_uv <= 0xDFFF) {
+      $cp_message_qr = qr/UTF-16 surrogate U\+$uv_string/;
+      $non_cp_trailing_text = "is for a surrogate";
+      $needed_to_discern_len = 2 unless defined $needed_to_discern_len;
+      $skip_most_tests = 1 if $allowed_uv > 0xD800 && $allowed_uv < 0xDFFF;
+
+      $utf8n_flag_to_warn     = $::UTF8_WARN_SURROGATE;
+      $utf8n_flag_to_disallow = $::UTF8_DISALLOW_SURROGATE;
+      $uvchr_flag_to_warn     = $::UNICODE_WARN_SURROGATE;
+      $uvchr_flag_to_disallow = $::UNICODE_DISALLOW_SURROGATE;;
+
+      $utf8n_flag_to_warn_complement     = $::UTF8_WARN_NONCHAR
+                                          |$::UTF8_WARN_SUPER
+                                          |$::UTF8_WARN_PERL_EXTENDED;
+      $utf8n_flag_to_disallow_complement = $::UTF8_DISALLOW_NONCHAR
+                                          |$::UTF8_DISALLOW_SUPER
+                                          |$::UTF8_DISALLOW_PERL_EXTENDED;
+      $uvchr_flag_to_warn_complement     = $::UNICODE_WARN_NONCHAR
+                                          |$::UNICODE_WARN_SUPER
+                                          |$::UNICODE_WARN_PERL_EXTENDED;
+      $uvchr_flag_to_disallow_complement = $::UNICODE_DISALLOW_NONCHAR
+                                          |$::UNICODE_DISALLOW_SUPER
+                                          |$::UNICODE_DISALLOW_PERL_EXTENDED;
+      $controlling_warning_category = 'surrogate';
+  }
+  elsif (   ($allowed_uv >= 0xFDD0 && $allowed_uv <= 0xFDEF)
+          || ($allowed_uv & 0xFFFE) == 0xFFFE)
+  {
+      $cp_message_qr = qr/\QUnicode non-character U+$uv_string\E
+                          \Q is not recommended for open interchange\E/x;
+      $non_cp_trailing_text = "if you see this, there is an error";
+      $needed_to_discern_len = $length unless defined $needed_to_discern_len;
+      if (   ($allowed_uv > 0xFDD0 && $allowed_uv < 0xFDEF)
+          || ($allowed_uv > 0xFFFF && $allowed_uv < 0x10FFFE))
+      {
+          $skip_most_tests = 1;
+      }
 
-            # The whole point of a test that is malformed from the beginning
-            # is to test for that malformation.  If we've modified things so
-            # much that we don't have enough information to detect that
-            # malformation, there's no point in testing.
-            next if    $initially_malformed
-                    && $this_expected_len < $this_needed_to_discern_len;
-
-            # Here, we've transformed the input with all of the desired
-            # non-overflow malformations.  We are now in a position to
-            # construct any potential warnings for those malformations.  But
-            # it's a pain to get the detailed messages exactly right, so for
-            # now XXX, only do so for those that return an explicit code
-            # point.
-
-            if ($initially_orphan) {
-                push @malformation_names, "orphan continuation";
-                push @expected_malformation_return_flags,
-                                                    $::UTF8_GOT_CONTINUATION;
-                $allow_flags |= $::UTF8_ALLOW_CONTINUATION
-                                                    if $malformed_allow_type;
-                push @expected_malformation_warnings, qr/unexpected continuation/;
-            }
+      $utf8n_flag_to_warn     = $::UTF8_WARN_NONCHAR;
+      $utf8n_flag_to_disallow = $::UTF8_DISALLOW_NONCHAR;
+      $uvchr_flag_to_warn     = $::UNICODE_WARN_NONCHAR;
+      $uvchr_flag_to_disallow = $::UNICODE_DISALLOW_NONCHAR;;
+
+      $utf8n_flag_to_warn_complement     = $::UTF8_WARN_SURROGATE
+                                          |$::UTF8_WARN_SUPER
+                                          |$::UTF8_WARN_PERL_EXTENDED;
+      $utf8n_flag_to_disallow_complement = $::UTF8_DISALLOW_SURROGATE
+                                          |$::UTF8_DISALLOW_SUPER
+                                          |$::UTF8_DISALLOW_PERL_EXTENDED;
+      $uvchr_flag_to_warn_complement     = $::UNICODE_WARN_SURROGATE
+                                          |$::UNICODE_WARN_SUPER
+                                          |$::UNICODE_WARN_PERL_EXTENDED;
+      $uvchr_flag_to_disallow_complement = $::UNICODE_DISALLOW_SURROGATE
+                                          |$::UNICODE_DISALLOW_SUPER
+                                          |$::UNICODE_DISALLOW_PERL_EXTENDED;
+
+      $controlling_warning_category = 'nonchar';
+  }
+  else {
+      die "Can't figure out what type of warning to test for $testname"
+  }
+
+  die 'Didn\'t set $needed_to_discern_len for ' . $testname
+                                      unless defined $needed_to_discern_len;
+
+  # We try various combinations of malformations that can occur
+  foreach my $short (0, 1) {
... 6588 lines suppressed ...
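
The comment in the diff about testing that the independent flags are actually independent describes using the complement of the category under test rather than looping over every flag combination. A minimal sketch of that idea follows. It is not the code from the test file: the `%WARN_FLAG` table, its bit values, and the `complement_flags` helper are hypothetical names chosen for illustration, and the real constants (`$::UTF8_WARN_SURROGATE` and friends) have their own values. The sketch also ignores the special handling the diff gives to the above-Unicode category, where the Perl-extended flag is only added conditionally.

```perl
use strict;
use warnings;

# Hypothetical bit values, for illustration only.  The real flags are
# supplied by perl's test infrastructure and differ from these.
my %WARN_FLAG = (
    surrogate     => 0x01,
    nonchar       => 0x02,
    super         => 0x04,
    perl_extended => 0x08,
);

# Return the OR of the flags for every category except the one being tested.
# Turning all of these on while testing $category means that any warning
# from another category reveals that the flags are not independent.
sub complement_flags {
    my ($category) = @_;
    die "Unknown category '$category'" unless exists $WARN_FLAG{$category};

    my $flags = 0;
    for my $other (keys %WARN_FLAG) {
        next if $other eq $category;
        $flags |= $WARN_FLAG{$other};
    }
    return $flags;
}

printf "complement of surrogate: 0x%02X\n", complement_flags('surrogate');
```

With four categories, one complement mask per category replaces the sixteen on/off combinations a nested loop would visit, which is the reduction the comment in the diff is after.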

-- 
Perl5 Master Repository


