On 4/10/07, Tels <nospam-abuse@bloodgate.com> wrote:
> But what if you need negative integers, too? Expression signed numbers in an
> unsigned type like size_t is quite awkward.

I'll explain this a bit better, if only to get it clear in my own head.

The core engine expresses capture variables with an array of I32 start/end pairs. So a match like:

    "bwbs" =~ /(.)(..)/;

would result in the following structure:

    regexp_paren_pair pairs[] = { {0, 3}, {0, 1}, {1, 3}, {-1, -1} };

I.e. $& = "bwb", $1 = "b", $2 = "wb", and $3 = undef, $4 = undef, ....

The only thing using an I32 gives us here is having -1 mark the end of the capture buffers. dmq had the idea of using 0 instead, which of course would entail storing every offset plus one, so that what was {0, 3} becomes {1, 4}. Unless anyone else has any better ideas.

Using a size_t-ish type such as STRLEN would allow us to have giant capture vars and to wrap regexp libs that allow them, such as the POSIX regex library.

I32 is also used for functions such as Perl_reg_numbered_buff_fetch(), which take an I32 paren argument indicating which capture buffer should be retrieved. -2 is used for $`, -1 for $' and 0 for $&. This limits the number of capture vars to around 2**31 (that is, 2**32/2).

In any case, other parts of the interface present bigger issues, but this is one of the things that would be nice to get right the first time.
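
To make the two encodings concrete, here is a rough sketch in C of the current signed layout next to the "0 means unset" idea. The type and function names (pair_i32, pair_biased, to_biased, biased_is_set, biased_len) are made up for illustration and are not part of the core; int32_t and size_t stand in for I32 and STRLEN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the current layout: signed offsets,
 * with -1 marking a pair that did not participate in the match. */
typedef struct { int32_t start, end; } pair_i32;

/* Hypothetical stand-in for the proposed layout: unsigned offsets
 * stored plus one, so that 0 can mark an unset pair. */
typedef struct { size_t start, end; } pair_biased;

/* Convert a signed pair to the biased unsigned form: {0, 3} -> {1, 4},
 * and {-1, -1} -> {0, 0}. */
pair_biased to_biased(pair_i32 p)
{
    pair_biased b = { 0, 0 };
    if (p.start >= 0) {
        assert(p.end >= p.start);      /* a set pair must be well ordered */
        b.start = (size_t)p.start + 1;
        b.end   = (size_t)p.end + 1;
    }
    return b;
}

/* A biased pair is set exactly when its start is non-zero. */
int biased_is_set(pair_biased b)
{
    return b.start != 0;
}

/* Length of the captured substring; the bias cancels out in the
 * subtraction. Unset pairs report a length of 0. */
size_t biased_len(pair_biased b)
{
    return biased_is_set(b) ? b.end - b.start : 0;
}
```

The cost of the bias is that every consumer has to subtract one before indexing into the string, and the largest representable offset drops by one, which is negligible next to what a size_t-wide field gains over I32.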