develooper Front page | perl.perl5.porters | Postings from February 2003

Re: [PATCH] pack/unpack

From:
hv
Date:
February 10, 2003 16:33
Subject:
Re: [PATCH] pack/unpack
Message ID:
200302110035.h1B0Z8217540@crypt.compulink.co.uk
Wolfgang Laun <laune@alcatel.at> wrote:
:=head1 Code C<@> 
:
:Template code C<@n> in a ()-group is now relative to the position where
:(un)pack was when the last C<(> was encountered. 

This is a change in behaviour from 5.8.0, but I guess it makes sense on
the grounds that the 5.8.0 behaviour isn't particularly useful. What
happens if C<@n> appears _after_ a C<()> group? Current behaviour may
again not be very useful unless you happen to know how often the C<()>
will execute.

An alternative definition would be "counting from the last opened or
closed C<()> group, or the beginning of string if no such group has
been encountered". But I'm just guessing at the utility of this, since
I've not yet ever had cause to use C<@>.

:=item Deep nesting of ()-groups
: 
:Deep nesting (~5000 on my Linux system) of ()-groups resulted
:in a stackoverflow/segmentation fault. Fixed by limiting
:()-group nesting to 100 which should be sufficient for most cases ;-).

.. but'll be very irritating the first time you need 101, and _know_
you have the stack to handle it. Even worse if this is code that was
running fine on 5.8.0 ...

I'm not convinced an arbitrary limit is the right approach. A better
answer is probably for pp_(un)pack to handle their own resizable stacks
for template nesting, which is roughly what we plan to do to fix the
same problem in the regexp engine.

:=head1 To Do
:
:Discuss whether strict semantics concerning unpack going beyond the end 
:of the string-to-be-unpacked need to be defined, and implemented accordingly.
:There are several scenarios how this might happen, and one might
:argue pro or con silence/warn/die in each individual case. Also, should the
:result be undef() or that whatever '' can be converted to?

I rarely use pack/unpack, so I may misguess, but I feel that most uses of
unpack believe they know exactly what packed the string, and never expect
an error. Accordingly I feel that dying is the right approach, and letting
code handling more arbitrary inputs eval() the unpack and warn or not on
error as they choose.

Please let diagnostics.pm know about new warnings/errors, and let
t/lib/warnings know about new warnings. Note also that proto.h is
generated from embed.fnc and others using regen_headers.pl, which
also creates other files that need to know changes in function
prototypes. The patch below to embed.fnc matches the changes you
put in proto.h. Note however that some of the modified lines start
with an "A", which means they are part of the public API. That means
you can't go changing their interfaces willy-nilly; best is probably
to create a new named function with the desired prototype and leave
the existing one as a stub calling into the new one.

If you can resolve this issue with changes to the API and supply
a new patch, I'll be happy to apply it and worry later about patches
to make any other changes that might arise from my comments above.
Oh, it does pass all tests for me here, but you probably should test
it under threading at some point - I think that would have picked up
the failure to patch embed.fnc and regen_headers, and might catch
some function call misbehaviour.

Hugo
--- embed.fnc.old	Sun Feb  9 23:24:33 2003
+++ embed.fnc	Tue Feb 11 01:16:47 2003
@@ -579,7 +579,7 @@
 Ap	|void	|set_numeric_radix
 Ap	|void	|set_numeric_standard
 Apd	|void	|require_pv	|const char* pv
-Apd	|void	|pack_cat	|SV *cat|char *pat|char *patend|SV **beglist|SV **endlist|SV ***next_in_list|U32 flags
+Apd	|void	|pack_cat	|SV *cat|char *pat|char *patend|SV **beglist|SV **endlist|SV ***next_in_list|U32 flags|int level
 p	|void	|pidgone	|Pid_t pid|int status
 Ap	|void	|pmflag		|U32* pmfl|int ch
 p	|OP*	|pmruntime	|OP* pm|OP* expr|OP* repl
@@ -799,7 +799,7 @@
 #if defined(UNLINK_ALL_VERSIONS)
 Ap	|I32	|unlnk		|char* f
 #endif
-Apd	|I32	|unpack_str	|char *pat|char *patend|char *s|char *strbeg|char *strend|char **new_s|I32 ocnt|U32 flags
+Apd	|I32	|unpack_str	|char *pat|char *patend|char *s|char *strbeg|char *strend|char **new_s|U32 flags|int level
 Ap	|void	|unsharepvn	|const char* sv|I32 len|U32 hash
 p	|void	|unshare_hek	|HEK* hek
 p	|void	|utilize	|int aver|I32 floor|OP* version|OP* id|OP* arg
@@ -1055,7 +1055,8 @@
 s	|SV*	|mul128		|SV *sv|U8 m
 s	|SV*	|is_an_int	|char *s|STRLEN l
 s	|int	|div128		|SV *pnum|bool *done
-s	|char *	|next_symbol	|char *pat|char *patend
+s	|I32	|next_symbol	|char **pat|char *patend
+s	|I32	|is_next_symbol	|char **pat|char *patend|char c
 s	|I32	|find_count	|char **ppat|char *patend|int *star
 s	|char *	|group_end	|char *pat|char *patend|char ender
 s	|I32	|measure_struct	|char *pat|char *patend



nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About