[perl.git] branch blead updated. v5.31.4-37-g59e5493636

From: Karl Williamson
Date: September 27, 2019 17:20
Subject: [perl.git] branch blead updated. v5.31.4-37-g59e5493636
Message ID: E1iDtv6-0004zW-2c@git.dc.perl.space

In perl.git, the branch blead has been updated

<https://perl5.git.perl.org/perl.git/commitdiff/59e54936361dbc8a6aa3224d5456d809c079d269?hp=76d3ad4c2443f94d2d636a40a01762c27bbf1c10>

- Log -----------------------------------------------------------------
commit 59e54936361dbc8a6aa3224d5456d809c079d269
Author: Karl Williamson <khw@cpan.org>
Date:   Sat Sep 21 13:18:12 2019 -0600

    regcomp.h: Parenthesize param in macro expansion
    
    This is always a good idea

commit c5184715c0018eac1440599795d6341d07559dd4
Author: Karl Williamson <khw@cpan.org>
Date:   Sat Sep 21 13:14:25 2019 -0600

    regcomp.h: Remove duplicate macro expansion
    
    This macro has the same definition as another.

commit b61e55cb1695ff940310c75f08e41cfbfc16d73c
Author: Karl Williamson <khw@cpan.org>
Date:   Thu Sep 26 22:09:51 2019 -0600

    regcomp.c: Clarify some comments

commit a2f213ef6995b39265d4ac5097a63ca063dbb346
Author: Karl Williamson <khw@cpan.org>
Date:   Sun Sep 22 15:09:05 2019 -0600

    regcomp.sym: Update and improve descriptions of some nodes
    
    EXACTFU nodes now always fold their strings; the information here had
    not been updated to reflect that change.
    
    And the descriptions of several EXACTish nodes are now changed to be
    slightly shorter and to remove mention of the string length, which is
    problematic, and is covered in the description for EXACT.

commit 484678fc0e05755eaaecb74c8b1cf89e1e54984b
Author: Karl Williamson <khw@cpan.org>
Date:   Thu Sep 26 16:30:21 2019 -0600

    regen/regcomp.pl: Rename variable
    
    The old name was misleading.

commit e21ef6928fa32f8c21414f00ec4a6cae741dec7a
Author: Karl Williamson <khw@cpan.org>
Date:   Thu Sep 26 16:23:33 2019 -0600

    regen/regcomp.pl, regcomp.sym: Comments
    
    I spent some time in this code trying to understand some things, and as
    a result I'm commenting previously undocumented features.  The comments
    about what an entry in regcomp.sym should look like are moved to that
    file, rather than the file that reads it.  The former is most often
    touched, and they had gotten out-of-sync in the latter.  Things now make
    more sense to me, and hopefully will to anyone using this in the future.

commit 27c3e5ad94fad01593474ee3038849be74be86a0
Author: Karl Williamson <khw@cpan.org>
Date:   Thu Sep 26 20:49:53 2019 -0600

    Silence verbatim line pod warning in perldebguts
    
    This generated pod has many lines that really can't be wrapped.  So
    change the podcheck.t db to ignore these errors in this file.

commit cac0218b84f9ae47a2369c80f167b508534e0351
Author: Karl Williamson <khw@cpan.org>
Date:   Thu Sep 26 13:14:14 2019 -0600

    Add note to debugging output if regex already compiled
    
    Prior to this commit, the debugging output said "Compiling REx foo",
    with no indication that compilation was then skipped because the
    pattern had already been compiled.  That confused people, and prompted
    a Stack Overflow question asking what was going on.  Now there's an
    extra message saying that the recompilation is skipped.

commit 7cb9b5f3b2a5b765e3399f08c283a1156931be4e
Author: Karl Williamson <khw@cpan.org>
Date:   Sun Sep 22 15:48:51 2019 -0600

    perlrun: Note that -W can't be in PERL5OPT

-----------------------------------------------------------------------

Summary of changes:
 pod/perldebguts.pod            | 32 +++++++++++++++----------------
 pod/perlrun.pod                |  2 ++
 regcomp.c                      | 15 +++++++++++----
 regcomp.h                      |  5 +++--
 regcomp.sym                    | 37 +++++++++++++++++++++++++-----------
 regen/regcomp.pl               | 43 +++++++++++++++++++-----------------------
 regnodes.h                     | 14 +++++++-------
 t/porting/known_pod_issues.dat |  2 +-
 8 files changed, 85 insertions(+), 65 deletions(-)

diff --git a/pod/perldebguts.pod b/pod/perldebguts.pod
index b439380d8a..1e23b84af4 100644
--- a/pod/perldebguts.pod
+++ b/pod/perldebguts.pod
@@ -562,7 +562,7 @@ will be lost.
 
 =for regcomp.pl begin
 
- # TYPE arg-description [num-args] [longjump-len] DESCRIPTION
+ # TYPE arg-description [regnode-struct-suffix] [longjump-len] DESCRIPTION
 
  # Exit points
 
@@ -663,25 +663,25 @@ will be lost.
  EXACTL           str        Like EXACT, but /l is in effect (used so
                              locale-related warnings can be checked
                              for).
- EXACTF           str        Match this string using /id rules (w/len);
+ EXACTF           str        Like EXACT, but match using /id rules;
                              (string not UTF-8, not guaranteed to be
                              folded).
- EXACTFL          str        Match this string using /il rules (w/len);
-                             (string not guaranteed to be folded).
- EXACTFU          str        Match this string using /iu rules (w/len);
-                             (string folded iff in UTF-8; non-UTF8
-                             folded length <= unfolded).
- EXACTFAA         str        Match this string using /iaa rules (w/len)
-                             (string folded iff in UTF-8; non-UTF8
-                             folded length <= unfolded).
-
- EXACTFUP         str        Match this string using /iu rules (w/len);
+ EXACTFL          str        Like EXACT, but match using /il rules;
+                             (string not likely to be folded).
+ EXACTFU          str        Like EXACT, but match using /iu rules;
+                             (string folded).
+ EXACTFAA         str        Like EXACT, but match using /iaa rules;
+                             (string folded iff pattern is UTF8; folded
+                             length <= unfolded).
+
+ EXACTFUP         str        Like EXACT, but match using /iu rules;
                              (string not UTF-8, not guaranteed to be
-                             folded; and its Problematic).
+                             folded; and it is Problematic).
 
- EXACTFLU8        str        Like EXACTFU, but use /il, UTF-8, folded,
-                             and everything in it is above 255.
- EXACTFAA_NO_TRIE str        Match this string using /iaa rules (w/len)
+ EXACTFLU8        str        Like EXACTFU, but use /il, UTF-8, (string
+                             is folded, and everything in it is above
+                             255.
+ EXACTFAA_NO_TRIE str        Like EXACT, but match using /iaa rules
                              (string not UTF-8, not guaranteed to be
                              folded, not currently trie-able).
 
diff --git a/pod/perlrun.pod b/pod/perlrun.pod
index 2a32976c01..b32598424f 100644
--- a/pod/perlrun.pod
+++ b/pod/perlrun.pod
@@ -949,6 +949,8 @@ X<-X>
 Disables all warnings regardless of C<use warnings> or C<$^W>.
 See L<warnings>.
 
+Forbidden in L</C<PERL5OPT>>.
+
 =item B<-x>
 X<-x>
 
diff --git a/regcomp.c b/regcomp.c
index b389f9ec7f..e74f4d8fab 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -7584,6 +7584,12 @@ Perl_re_op_compile(pTHX_ SV ** const patternp, int pat_count,
         && memEQ(RX_PRECOMP(old_re), exp, plen)
 	&& !runtime_code /* with runtime code, always recompile */ )
     {
+        DEBUG_COMPILE_r({
+            SV *dsv= sv_newmortal();
+            RE_PV_QUOTED_DECL(s, RExC_utf8, dsv, exp, plen, PL_dump_re_max_len);
+            Perl_re_printf( aTHX_  "%sSkipping recompilation of unchanged REx%s %s\n",
+                          PL_colors[4], PL_colors[5], s);
+        });
         return old_re;
     }
 
@@ -19601,8 +19607,9 @@ S_nextchar(pTHX_ RExC_state_t *pRExC_state)
 STATIC void
 S_change_engine_size(pTHX_ RExC_state_t *pRExC_state, const Ptrdiff_t size)
 {
-    /* 'size' is the delta to add or subtract from the current memory allocated
-     * to the regex engine being constructed */
+    /* 'size' is the delta number of smallest regnode equivalents to add or
+     * subtract from the current memory allocated to the regex engine being
+     * constructed. */
 
     PERL_ARGS_ASSERT_CHANGE_ENGINE_SIZE;
 
@@ -19634,8 +19641,8 @@ S_change_engine_size(pTHX_ RExC_state_t *pRExC_state, const Ptrdiff_t size)
 STATIC regnode_offset
 S_regnode_guts(pTHX_ RExC_state_t *pRExC_state, const U8 op, const STRLEN extra_size, const char* const name)
 {
-    /* Allocate a regnode for 'op', with 'extra_size' extra space.  It aligns
-     * and increments RExC_size and RExC_emit
+    /* Allocate a regnode for 'op', with 'extra_size' extra (smallest) regnode
+     * equivalents space.  It aligns and increments RExC_size and RExC_emit
      *
      * It returns the regnode's offset into the regex engine program */
 
diff --git a/regcomp.h b/regcomp.h
index 62f4398ed1..d9f2cbe63e 100644
--- a/regcomp.h
+++ b/regcomp.h
@@ -331,11 +331,12 @@ struct regnode_ssc {
 #define FLAGS(p)	((p)->flags)	/* Caution: Doesn't apply to all      \
 					   regnode types.  For some, it's the \
 					   character set of the regnode */
-#define	OPERAND(p)	(((struct regnode_string *)p)->string)
+#define	OPERAND(p)	STRING(p)
+
 #define MASK(p)		((char*)OPERAND(p))
 #define	STR_LEN(p)	(((struct regnode_string *)p)->str_len)
 #define	STRING(p)	(((struct regnode_string *)p)->string)
-#define STR_SZ(l)	((l + sizeof(regnode) - 1) / sizeof(regnode))
+#define STR_SZ(l)	(((l) + sizeof(regnode) - 1) / sizeof(regnode))
 #define NODE_SZ_STR(p)	(STR_SZ(STR_LEN(p))+1)
 
 #undef NODE_ALIGN
diff --git a/regcomp.sym b/regcomp.sym
index c69e4c9452..8a2fb240f1 100644
--- a/regcomp.sym
+++ b/regcomp.sym
@@ -11,14 +11,29 @@
 # Note that the order in this file is important.
 #
 # Format for first section: 
-# NAME \s+ TYPE, arg-description [num-args] [flags] [longjump] ; DESCRIPTION
+# NAME \s+ TYPE, arg-description [struct regnode suffix] [flags] [longjump] ; DESCRIPTION
+#   arg-description is currently unused
+#   suffix is appended to 'struct_regnode_' giving which one to use.  If empty,
+#       it means plain 'struct regnode'.  If the regnode is a string one, this
+#       should instead refer to the base regnode, without the char[1] element
+#       of the structure
 #   flag <S> means is REGNODE_SIMPLE; flag <V> means is REGNODE_VARIES; <.> is
-#   a placeholder
-#   longjump is 1 if the (first) argument holds the next offset.
-#
+#       a placeholder
+#   longjump is 1 if the (first) argument holds the next offset (instead of the
+#       usual 'next_offset' field
 #
 # run perl regen.pl after editing this file
 
+#                             +- suffix of which struct regnode to use e.g.,
+#                             | +- flags  (S or V)               struct regnode_1
+#                         un- | | +- longjmp (0, blank, or 1)  blank means 0
+# Name        Type       used | | | ; comment
+# --------------------------------------------------------------------------
+# IFMATCH     BRANCHJ,    off 1 . 1 ; Succeeds if the following matches.
+# UNLESSM     BRANCHJ,    off 1 . 1 ; Fails if the following matches.
+# SUSPEND     BRANCHJ,    off 1 V 1 ; "Independent" sub-RE.
+# IFTHEN      BRANCHJ,    off 1 V 1 ; Switch, should be preceded by switcher.
+# GROUPP      GROUPP,     num 1     ; Whether the group matched.
 
 
 #* Exit points
@@ -103,22 +118,22 @@ BRANCH      BRANCH,     node 0 V  ; Match this alternative, or the next...
 
 EXACT       EXACT,      str       ; Match this string (flags field is the length).
 EXACTL      EXACT,      str       ; Like EXACT, but /l is in effect (used so locale-related warnings can be checked for).
-EXACTF      EXACT,      str       ; Match this string using /id rules (w/len); (string not UTF-8, not guaranteed to be folded).
-EXACTFL     EXACT,      str       ; Match this string using /il rules (w/len); (string not guaranteed to be folded).
-EXACTFU     EXACT,      str	  ; Match this string using /iu rules (w/len); (string folded iff in UTF-8; non-UTF8 folded length <= unfolded).
-EXACTFAA    EXACT,      str	  ; Match this string using /iaa rules (w/len) (string folded iff in UTF-8; non-UTF8 folded length <= unfolded).
+EXACTF      EXACT,      str       ; Like EXACT, but match using /id rules; (string not UTF-8, not guaranteed to be folded).
+EXACTFL     EXACT,      str       ; Like EXACT, but match using /il rules; (string not likely to be folded).
+EXACTFU     EXACT,      str	  ; Like EXACT, but match using /iu rules; (string folded).
+EXACTFAA    EXACT,      str	  ; Like EXACT, but match using /iaa rules; (string folded iff pattern is UTF8; folded length <= unfolded).
 
 # End of important relative ordering.
 
-EXACTFUP    EXACT,      str	  ; Match this string using /iu rules (w/len); (string not UTF-8, not guaranteed to be folded; and its Problematic).
+EXACTFUP    EXACT,      str	  ; Like EXACT, but match using /iu rules; (string not UTF-8, not guaranteed to be folded; and it is Problematic).
 # In order for a non-UTF-8 EXACTFAA to think the pattern is pre-folded when
 # matching a UTF-8 target string, there would have to be something like an
 # EXACTFAA_MICRO which would not be considered pre-folded for UTF-8 targets,
 # since the fold of the MICRO SIGN would not be done, and would be
 # representable in the UTF-8 target string.
 
-EXACTFLU8   EXACT,      str	  ; Like EXACTFU, but use /il, UTF-8, folded, and everything in it is above 255.
-EXACTFAA_NO_TRIE  EXACT, str	  ; Match this string using /iaa rules (w/len) (string not UTF-8, not guaranteed to be folded, not currently trie-able).
+EXACTFLU8   EXACT,      str	  ; Like EXACTFU, but use /il, UTF-8, (string is folded, and everything in it is above 255.
+EXACTFAA_NO_TRIE  EXACT, str	  ; Like EXACT, but match using /iaa rules (string not UTF-8, not guaranteed to be folded, not currently trie-able).
 
 
 EXACT_ONLY8 EXACT,      str       ; Like EXACT, but only UTF-8 encoded targets can match
diff --git a/regen/regcomp.pl b/regen/regcomp.pl
index cb9861318d..2eac179684 100644
--- a/regen/regcomp.pl
+++ b/regen/regcomp.pl
@@ -49,14 +49,17 @@ use strict;
 # name          Both    Name of op/state
 # id            Both    integer value for this opcode/state
 # optype        Both    Either 'op' or 'state'
-# line_num          Both    line_num number of the input file for this item.
+# line_num      Both    line_num number of the input file for this item.
 # type          Op      Type of node (aka regkind)
-# code          Op      what code is associated with this node (???)
-# args          Op      what type of args the node has (which regnode struct)
-# flags         Op      (???)
+# code          Op      Apparently not used
+# suffix        Op      which regnode struct this uses, so if this is '1', it
+#                       uses 'struct regnode_1'
+# flags         Op      S for simple; V for varies
 # longj         Op      Boolean as to if this node is a longjump
-# comment       Both    Comment about node, if any
+# comment       Both    Comment about node, if any.  Placed in perlredebguts
+#                       as its description
 # pod_comment   Both    Special comments for pod output (preceding lines in def)
+#                       Such lines begin with '#*'
 
 # Global State
 my @all;    # all opcodes/state
@@ -97,23 +100,15 @@ sub register_node {
 }
 
 # Parse and add an opcode definition to the global state.
-# An opcode definition looks like this:
+# What an opcode definition looks like is given in regcomp.sym.
 #
-#                             +- args
-#                             | +- flags
-#                             | | +- longjmp
-# Name        Type       code | | | ; comment
-# --------------------------------------------------------------------------
-# IFMATCH     BRANCHJ,    off 1 . 2 ; Succeeds if the following matches.
-# UNLESSM     BRANCHJ,    off 1 . 2 ; Fails if the following matches.
-# SUSPEND     BRANCHJ,    off 1 V 1 ; "Independent" sub-RE.
-# IFTHEN      BRANCHJ,    off 1 V 1 ; Switch, should be preceded by switcher.
-# GROUPP      GROUPP,     num 1     ; Whether the group matched.
-#
-# Not every opcode definition has all of these. We should maybe make this
-# nicer/easier to read in the future. Also note that the above is tab
+# Not every opcode definition has all of the components. We should maybe make
+# this nicer/easier to read in the future. Also note that the above is tab
 # sensitive.
 
+# Special comments for an entry precede it, and begin with '#*' and are placed
+# in the generated pod file just before the entry.
+
 sub parse_opcode_def {
     my ( $text, $line_num, $pod_comment )= @_;
     my $node= {
@@ -129,10 +124,10 @@ sub parse_opcode_def {
         or die "Failed to match $_";
 
     # the content of the "desc" field from the first step is extracted here:
-    @{$node}{qw(type code args flags longj)}= split /[,\s]\s*/, $node->{desc};
+    @{$node}{qw(type code suffix flags longj)}= split /[,\s]\s*/, $node->{desc};
 
     defined $node->{$_} or $node->{$_} = ""
-        for qw(type code args flags longj);
+        for qw(type code suffix flags longj);
 
     register_node($node); # has to be before the type_alias code below
 
@@ -368,7 +363,7 @@ EOP
 
     foreach my $node (@ops) {
         my $size= 0;
-        $size= "EXTRA_SIZE(struct regnode_$node->{args})" if $node->{args};
+        $size= "EXTRA_SIZE(struct regnode_$node->{suffix})" if $node->{suffix};
 
         printf $out "\t%*s\t/* %*s */\n", -37, "$size,", -$rwidth, $node->{name};
     }
@@ -635,11 +630,11 @@ EOD
 
     print <<'END_OF_DESCR';
 
- # TYPE arg-description [num-args] [longjump-len] DESCRIPTION
+ # TYPE arg-description [regnode-struct-suffix] [longjump-len] DESCRIPTION
 END_OF_DESCR
     for my $n (@ops) {
         $node= $n;
-        $code= "$node->{code} " . ( $node->{args} || "" );
+        $code= "$node->{code} " . ( $node->{suffix} || "" );
         $code .= " $node->{longj}" if $node->{longj};
         if ( $node->{pod_comment} ||= "" ) {
 
diff --git a/regnodes.h b/regnodes.h
index 3b93b85aa2..a1929b823f 100644
--- a/regnodes.h
+++ b/regnodes.h
@@ -50,13 +50,13 @@
 #define	BRANCH                	36	/* 0x24 Match this alternative, or the next... */
 #define	EXACT                 	37	/* 0x25 Match this string (flags field is the length). */
 #define	EXACTL                	38	/* 0x26 Like EXACT, but /l is in effect (used so locale-related warnings can be checked for). */
-#define	EXACTF                	39	/* 0x27 Match this string using /id rules (w/len); (string not UTF-8, not guaranteed to be folded). */
-#define	EXACTFL               	40	/* 0x28 Match this string using /il rules (w/len); (string not guaranteed to be folded). */
-#define	EXACTFU               	41	/* 0x29 Match this string using /iu rules (w/len); (string folded iff in UTF-8; non-UTF8 folded length <= unfolded). */
-#define	EXACTFAA              	42	/* 0x2a Match this string using /iaa rules (w/len) (string folded iff in UTF-8; non-UTF8 folded length <= unfolded). */
-#define	EXACTFUP              	43	/* 0x2b Match this string using /iu rules (w/len); (string not UTF-8, not guaranteed to be folded; and its Problematic). */
-#define	EXACTFLU8             	44	/* 0x2c Like EXACTFU, but use /il, UTF-8, folded, and everything in it is above 255. */
-#define	EXACTFAA_NO_TRIE      	45	/* 0x2d Match this string using /iaa rules (w/len) (string not UTF-8, not guaranteed to be folded, not currently trie-able). */
+#define	EXACTF                	39	/* 0x27 Like EXACT, but match using /id rules; (string not UTF-8, not guaranteed to be folded). */
+#define	EXACTFL               	40	/* 0x28 Like EXACT, but match using /il rules; (string not likely to be folded). */
+#define	EXACTFU               	41	/* 0x29 Like EXACT, but match using /iu rules; (string folded). */
+#define	EXACTFAA              	42	/* 0x2a Like EXACT, but match using /iaa rules; (string folded iff pattern is UTF8; folded length <= unfolded). */
+#define	EXACTFUP              	43	/* 0x2b Like EXACT, but match using /iu rules; (string not UTF-8, not guaranteed to be folded; and it is Problematic). */
+#define	EXACTFLU8             	44	/* 0x2c Like EXACTFU, but use /il, UTF-8, (string is folded, and everything in it is above 255. */
+#define	EXACTFAA_NO_TRIE      	45	/* 0x2d Like EXACT, but match using /iaa rules (string not UTF-8, not guaranteed to be folded, not currently trie-able). */
 #define	EXACT_ONLY8           	46	/* 0x2e Like EXACT, but only UTF-8 encoded targets can match */
 #define	EXACTFU_ONLY8         	47	/* 0x2f Like EXACTFU, but only UTF-8 encoded targets can match */
 #define	EXACTFU_S_EDGE        	48	/* 0x30 /di rules, but nothing in it precludes /ui, except begins and/or ends with [Ss]; (string not UTF-8; compile-time only). */
diff --git a/t/porting/known_pod_issues.dat b/t/porting/known_pod_issues.dat
index 36ac9e4797..b2dd8df6d8 100644
--- a/t/porting/known_pod_issues.dat
+++ b/t/porting/known_pod_issues.dat
@@ -367,7 +367,7 @@ install	? Should you be using F<...> or maybe L<...> instead of	1
 pod/perl.pod	Verbatim line length including indents exceeds 79 by	8
 pod/perlandroid.pod	Verbatim line length including indents exceeds 79 by	3
 pod/perlbook.pod	Verbatim line length including indents exceeds 79 by	1
-pod/perldebguts.pod	Verbatim line length including indents exceeds 79 by	27
+pod/perldebguts.pod	Verbatim line length including indents exceeds 79 by	-1
 pod/perldebtut.pod	Verbatim line length including indents exceeds 79 by	3
 pod/perldtrace.pod	Verbatim line length including indents exceeds 79 by	7
 pod/perlgit.pod	? Should you be using F<...> or maybe L<...> instead of	1

-- 
Perl5 Master Repository


