
Re: [perl #116086] split "\x20" doesn't work as documented

From: Dr.Ruud
Date: February 17, 2013 14:30
Subject: Re: [perl #116086] split "\x20" doesn't work as documented
Message ID: 5120E986.2070601@isolution.nl

On 2013-02-17 13:43, demerphq wrote:
> On 17 February 2013 13:14, Dr.Ruud <rvtol+usenet@isolution.nl> wrote:
>> On 2013-02-17 02:15, demerphq wrote:

>>> We changed which level of the perl parser handles escapes intended for
>>> the regex engine.
>>>
>>> Previous to this the \x20 would be resolved to a space, and as far as
>>> the regex engine was concerned the pattern would be " ".
>>>
>>> After this change the \x20 would be delivered to the regex engine
>>> verbatim and the \x20 form would not be recognized by the heuristic
>>> that handles the " " case.
>>>
>>> This change was very desirable for many reasons, and as it doesn't
>>> actually contradict the docs, unless Ricardo says otherwise I consider
>>> this Not A Bug.
>>
>>
>> See some split() cases below. So, #4 should behave as #7..10.
>>
>> So, the PATTERN "\x20" should be compiled as /\x20/, not as " ".
>
> So you don't agree that the original ticket is a bug?

Heheh, "the ticket is a bug".
I am really not sure whether the ticket is wrong or not.
So I will just live with any outcome.

- - - - - - -

If the compile-time "" operation (AKA qq: explicit string interpolation)
is always done first, like at 'pre-processor' level, then that is
easiest to explain and document, I think.
" " always becomes ' ', and "$x\n" always becomes ($x."\n"), etc.

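A minimal sketch of the "interpolate first" model described above. It only shows that the double-quoted forms are ordinary strings once interpolation has run; it makes no claim about what split() does with them afterwards. The variable names are illustrative, not from the ticket.

```perl
use strict;
use warnings;

my $x = "abc";

# After interpolation, "\x20" is a one-character string holding a space,
# indistinguishable from the literal ' '.
my $space_same = ("\x20" eq ' ') ? 1 : 0;

# "$x\n" is the same string as the explicit concatenation.
my $concat_same = ("$x\n" eq ($x . "\n")) ? 1 : 0;

print "\"\\x20\" eq ' '        : $space_same\n";
print "\"\$x\\n\" eq (\$x.\"\\n\") : $concat_same\n";
```
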
But then we should compile the split-PATTERN "a*" as /a[*]/.
Also because the split-PATTERN " $" currently leads to:
'Final $ should be \$ or $name at -e line 5, within string'.
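
A hedged sketch of how the message quoted above can be observed without aborting a script. It assumes the perl in use still rejects a trailing $ in a double-quoted string when the string is compiled; the string eval is only there to capture the error in $@. The input 'a b ' and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# The string eval defers compilation of the pattern to run time, so a
# compile error ends up in $@ instead of stopping this script.
my @from_double  = eval q{ split " $", 'a b ' };
my $double_error = $@;

# The single-quoted pattern is a plain two-character string; split then
# uses it as a regex (a space followed by the end-of-string anchor).
my @from_single  = eval q{ split ' $', 'a b ' };
my $single_error = $@;

print "double-quoted: ", ($double_error || "no error\n");
print "single-quoted: ", ($single_error || "no error\n");
```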

So the split-PATTERN ' $' currently behaves differently from " $".
Should split( q{\x20} ) then not also differ from split( qq{\x20} )?
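
A small sketch for comparing the spellings side by side. The first two calls show the documented difference between the single-space string and a one-space regex. The last two are the q{} and qq{} forms asked about above; how split treats them has varied between perl releases, so the script only prints what the running perl does and claims no expected result for them. The sample text is made up.

```perl
use strict;
use warnings;

my $text = "  a b  c";

# Documented special case: a single-space string splits on runs of
# whitespace and skips leading whitespace (awk-style).
my @awk = split ' ', $text;

# A regex matching exactly one space keeps the empty leading fields.
my @re = split / /, $text;

# q{\x20} is the four characters backslash, x, 2, 0;
# qq{\x20} is a single space once interpolation has run.
my @single = split q{\x20},  $text;
my @double = split qq{\x20}, $text;

print "split ' '      : ", join('|', @awk),    "\n";
print "split / /      : ", join('|', @re),     "\n";
print "split q{\\x20}  : ", join('|', @single), "\n";
print "split qq{\\x20} : ", join('|', @double), "\n";
```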

The other way is to defer interpolation completely in split-PATTERN
context, and add a flag to the PATTERN recording whether "" or qq was used.

So I see issues with both ways. IMO the clearest option is to allow no
literal string other than a single space, which then rejects " $" as a
bad split-PATTERN rather than as a bad string. Sure, that will break
some code.

-- 
Ruud

