On 17/02/13 01:16, yves orton via RT wrote:
> On 13 December 2012 18:32, Daniel Lukasiak <perlbug-followup@perl.org> wrote:
>> # New Ticket Created by Daniel Lukasiak
>> # Please include the string: [perl #116086]
>> # in the subject line of all future correspondence about this issue.
>> # <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=116086 >
>>
>>
>> This is a bug report for perl from estrai@estrai.com,
>> generated with the help of perlbug 1.39 running under perl 5.17.7.
>>
>>
>> -----------------------------------------------------------------
>>
>> Hi,
>> split() has a special case for " " and "\x20" so they work like \s+
>
> Umm. I wasn't aware that we document that "\x20" works the same as " ".
>
> It used to, as an implementation accident, but I don't believe that we
> document that it should.
>
> The docs look like this:
>
>     As a special case, specifying a PATTERN of space (' ')
>     will split on white space just as "split" with no arguments does.
>     Thus, "split(' ')" can be used to emulate awk's default
>     behavior, whereas "split(/ /)" will give you as many initial null
>     fields (empty string) as there are leading spaces. A "split" on
>     "/\s+/" is like a "split(' ')" except that any leading whitespace
>     produces a null first field. A "split" with no arguments really
>     does a "split(' ', $_)" internally.
>
> That doesn't say "\x20" works the same.
>
> We changed which level of the perl parser handles escapes intended for
> the regex engine.
>
> Previous to this the \x20 would be resolved to a space, and as far as
> the regex engine was concerned the pattern would be " ".
>
> After this change the \x20 would be delivered to the regex engine
> verbatim and the \x20 form would not be recognized by the heuristic
> that handles the " " case.
>
> This change was very desirable for many reasons, and as it doesn't
> actually contradict the docs, unless Ricardo says otherwise I consider
> this Not A Bug.

Hi,

it looks like split's documentation was reworded around 5.16 and now
explicitly mentions "\x20". See perldoc -f split:

    As another special case, "split" emulates the default behavior of
    the command line tool awk when the PATTERN is either omitted or a
    literal string composed of a single space character (such as ' ' or
    "\x20", but not e.g. "/ /"). In this case, any leading whitespace in
    EXPR is removed before splitting occurs, and the PATTERN is instead
    treated as if it were "/\s+/"; in particular, this means that any
    contiguous whitespace (not just a single space character) is used as
    a separator. However, this special treatment can be avoided by
    specifying the pattern "/ /" instead of the string " ", thereby
    allowing only a single space character to be a separator.

-- 
Daniel Łukasiak
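
For anyone comparing the cases discussed in this thread, a minimal sketch
along these lines may help. The sample string and the variable names are
only illustrative and are not taken from the ticket. The results noted
for ' ', / / and /\s+/ follow from the documentation quoted in this
thread; the result for "\x20" is deliberately left unstated, since how
that form is treated on a given perl is the question under discussion.

```perl
use strict;
use warnings;

# Illustrative input: two leading spaces, a double space in the
# middle, and one trailing space.
my $line = "  foo bar  baz ";

# awk-style: leading whitespace is dropped and runs of whitespace
# act as a single separator.
my @awk = split ' ', $line;        # ('foo', 'bar', 'baz')

# Exactly one space per separator: each leading space yields an
# empty leading field, and the double space yields an empty field.
my @single = split / /, $line;     # ('', '', 'foo', 'bar', '', 'baz')

# Like ' ' except that leading whitespace gives an empty first field.
my @ws = split /\s+/, $line;       # ('', 'foo', 'bar', 'baz')

# The form this ticket is about; the result may differ between
# perl versions, so no expected value is given here.
my @hex = split "\x20", $line;

for my $case ( [ q{' '} => \@awk ], [ q{/ /} => \@single ],
               [ q{/\s+/} => \@ws ], [ q{"\x20"} => \@hex ] ) {
    my ( $label, $fields ) = @$case;
    printf "%-7s %d field(s): %s\n", $label, scalar @$fields,
        join( ',', map { "[$_]" } @$fields );
}
```

If the "\x20" line prints the same fields as the ' ' line, the running
perl treats the two forms alike; if it matches the / / line instead, the
escape reached the regex engine verbatim, as described above.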