
[perl.git] branch blead updated. v5.29.8-74-g465848b5c5

Karl Williamson
March 12, 2019 16:01
In perl.git, the branch blead has been updated


- Log -----------------------------------------------------------------
commit 465848b5c535041b50179c7fe361c169bd817143
Author: Karl Williamson <>
Date:   Thu Mar 7 15:31:58 2019 -0700

    is_invlist(): Allow NULL input
    For generality, it should allow a NULL and return FALSE.

commit 526f2ca9d7c56ba5eaab5dd85f67f05cdd6cead6
Author: Karl Williamson <>
Date:   Mon Mar 11 17:10:06 2019 -0600

    perlunicode: Update, clarify
    This updates to match the latest Unicode document on regular
    expressions, and to incorporate changes that have happened to Perl that
    didn't get updated here.  It also includes new clarifications about some
    of the Unicode requirements.

commit 83082a1983fd9a12e9f69740f60a1ce0897b68c8
Author: Karl Williamson <>
Date:   Thu Mar 7 15:29:28 2019 -0700

    perlapi: Clarify entry for hv_store()


Summary of changes:
 embed.fnc           |  2 +-
 hv.c                |  4 +++-
 invlist_inline.h    |  4 +---
 pod/perlunicode.pod | 40 ++++++++++++++++++++++++----------------
 proto.h             |  2 --
 5 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index 2f8dd63487..4b04389c20 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -1742,7 +1742,7 @@ EMpX	|SV*	|invlist_clone	|NN SV* const invlist|NULLOK SV* newlist
 #if defined(PERL_IN_REGCOMP_C) || defined(PERL_IN_REGEXEC_C) || defined(PERL_IN_TOKE_C) || defined(PERL_IN_UTF8_C) || defined(PERL_IN_PP_C)
 EiMRn	|UV*	|invlist_array	|NN SV* const invlist
-EiMRn	|bool	|is_invlist	|NN SV* const invlist
+EiMRn	|bool	|is_invlist	|NULLOK SV* const invlist
 EiMRn	|bool*	|get_invlist_offset_addr|NN SV* invlist
 EiMRn	|UV	|_invlist_len	|NN SV* const invlist
 EMiRn	|bool	|_invlist_contains_cp|NN SV* const invlist|const UV cp
diff --git a/hv.c b/hv.c
index fc90a5146b..1371b2f99a 100644
--- a/hv.c
+++ b/hv.c
@@ -260,7 +260,9 @@ if all your code does is create SVs then store them in a hash, C<hv_store>
 will own the only reference to the new SV, and your code doesn't need to do
 anything further to tidy up.  Note that C<hv_store_ent> only reads the C<key>;
 unlike C<val> it does not take ownership of it, so maintaining the correct
-reference count on C<key> is entirely the caller's responsibility.  C<hv_store>
+reference count on C<key> is entirely the caller's responsibility.  The reason
+it does not take ownership, is that C<key> is not used after this function
+returns, and so can be freed immediately.  C<hv_store>
 is not implemented as a call to C<hv_store_ent>, and does not create a temporary
 SV for the key, so if your key data is not already in SV form then use
 C<hv_store> in preference to C<hv_store_ent>.
diff --git a/invlist_inline.h b/invlist_inline.h
index 1304b4543a..4bab3d83a6 100644
--- a/invlist_inline.h
+++ b/invlist_inline.h
@@ -23,9 +23,7 @@
 S_is_invlist(SV* const invlist)
-    return SvTYPE(invlist) == SVt_INVLIST;
+    return invlist != NULL && SvTYPE(invlist) == SVt_INVLIST;
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index d6931e4d02..955893f690 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -37,7 +37,7 @@ implement the Unicode standard or the accompanying technical reports
 from cover to cover, Perl does support many Unicode features.
 Also, the use of Unicode may present security issues that aren't
-obvious, see L</Security Implications of Unicode>.
+obvious, see L</Security Implications of Unicode> below.
 =over 4
@@ -853,8 +853,8 @@ L<perlrecharclass/POSIX Character Classes>.
 This property is used when you need to know in what Unicode version(s) a
 character is.
-The "*" above stands for some two digit Unicode version number, such as
-C<1.1> or C<4.0>; or the "*" can also be C<Unassigned>.  This property will
+The "*" above stands for some Unicode version number, such as
+C<1.1> or C<12.0>; or the "*" can also be C<Unassigned>.  This property will
 match the code points whose final disposition has been settled as of the
 Unicode release given by the version number; C<\p{Present_In: Unassigned}>
 will match those code points whose meaning has yet to be assigned.
@@ -1089,7 +1089,7 @@ The following list of Unicode supported features for regular expressions describ
 all features currently directly supported by core Perl.  The references
 to "Level I<N>" and the section numbers refer to
 L<UTS#18 "Unicode Regular Expressions"|>,
-version 13, November 2013.
+version 18, October 2016.
 =head3 Level 1 - Basic Unicode Support
@@ -1244,28 +1244,36 @@ L<UAX#29 "Unicode Text Segmentation"|>,
 =head3 Level 3 - Tailored Support
  RL3.1   Tailored Punctuation            - Missing
- RL3.2   Tailored Grapheme Clusters      - Missing       [12]
+ RL3.2   Tailored Grapheme Clusters      - Missing       [13]
  RL3.3   Tailored Word Boundaries        - Missing
  RL3.4   Tailored Loose Matches          - Retracted by Unicode
  RL3.5   Tailored Ranges                 - Retracted by Unicode
- RL3.6   Context Matching                - Missing       [13]
+ RL3.6   Context Matching                - Partial       [14]
  RL3.7   Incremental Matches             - Missing
- RL3.8   Unicode Set Sharing             - Unicode is proposing
-                                           to retract this
+ RL3.8   Unicode Set Sharing             - Retracted by Unicode
  RL3.9   Possible Match Sets             - Missing
  RL3.10  Folded Matching                 - Retracted by Unicode
- RL3.11  Submatchers                     - Missing
+ RL3.11  Submatchers                     - Partial       [15]
 =over 4
-=item [12]
+=item [13]
 Perl has L<Unicode::Collate>, but it isn't integrated with regular
 expressions.  See
 L<UTS#10 "Unicode Collation Algorithms"|>.
-=item [13]
-Perl has C<(?<=x)> and C<(?=x)>, but lookaheads or lookbehinds should
-see outside of the target substring
+=item [14]
+Perl has C<(?<=x)> and C<(?=x)>, but this requirement says that it
+should be possible to specify that matches may occur only in a substring
+with the lookaheads and lookbehinds able to see beyond that matchable
+=item [15]
+Perl has user-defined properties (L</"User-Defined Character
+Properties">) to look at single code points in ways beyond Unicode, and
+it might be possible, though probably not very clean, to use code blocks
+and things like C<(?(DEFINE)...)> (see L<perlre> to do more specialized
@@ -1326,10 +1334,10 @@ encoding of numbers up to C<0x7FFF_FFFF>.  Perl continues to allow those,
 and has extended that up to 13 bytes to encode code points up to what
 can fit in a 64-bit word.  However, Perl will warn if you output any of
 these as being non-portable; and under strict UTF-8 input protocols,
-they are forbidden.  In addition, it is deprecated to use a code point
+they are forbidden.  In addition, it is now illegal to use a code point
 larger than what a signed integer variable on your system can hold.  On
 32-bit ASCII systems, this means C<0x7FFF_FFFF> is the legal maximum
-going forward (much higher on 64-bit systems).
+(much higher on 64-bit systems).
 =item *
@@ -1513,7 +1521,7 @@ noncharacters.
 The maximum Unicode code point is C<U+10FFFF>, and Unicode only defines
 operations on code points up through that.  But Perl works on code
-points up to the maximum permissible unsigned number available on the
+points up to the maximum permissible signed number available on the
 platform.  However, Perl will not accept these from input streams unless
 lax rules are being used, and will warn (using the warning category
 C<"non_unicode">, which is a sub-category of C<"utf8">) if any are output.
diff --git a/proto.h b/proto.h
index 500c5813c6..31d77c1e4e 100644
--- a/proto.h
+++ b/proto.h
@@ -5696,8 +5696,6 @@ PERL_STATIC_INLINE UV*	S_invlist_array(SV* const invlist)
 PERL_STATIC_INLINE bool	S_is_invlist(SV* const invlist)
-	assert(invlist)
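
The is_invlist() change above (NN to NULLOK in embed.fnc, the added NULL test in invlist_inline.h, and the dropped assert(invlist) in proto.h) can be illustrated with a minimal standalone sketch. The mock_* names and the two-member enum below are hypothetical stand-ins made up for illustration; they are not Perl's real SV or svtype definitions, and only the shape of the check mirrors the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Perl's svtype and SV; NOT the real definitions. */
typedef enum { MOCK_SVt_PV, MOCK_SVt_INVLIST } mock_svtype;
typedef struct { mock_svtype type; } mock_sv;

/* Old shape (NN): the caller must guarantee a non-NULL argument. */
static bool mock_is_invlist_nn(const mock_sv *sv)
{
    assert(sv);
    return sv->type == MOCK_SVt_INVLIST;
}

/* New shape (NULLOK): NULL is accepted and simply yields false. */
static bool mock_is_invlist_nullok(const mock_sv *sv)
{
    return sv != NULL && sv->type == MOCK_SVt_INVLIST;
}
```

With the NULLOK form, a caller holding a possibly-NULL pointer can test it directly instead of writing its own guard (sv && ...) before the call.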
