develooper Front page | perl.perl6.language | Postings from April 2005

nbsp in \s, <?ws> and <>

April 15, 2005 14:44
Is there a <?ws>-like thingy that is always \s+?

Do \s and <?ws> match non-breaking whitespace, U+00A0?

How about:

    U+0008  backspace
    U+00A0  no break space (Repeated for overview)
    U+1361  ethiopic wordspace
    U+2000  en quad
    U+2001  em quad
    U+2002  en space
    U+2003  em space
    U+2004  three per em space
    U+2005  four per em space
    U+2006  six per em space
    U+2007  figure space
    U+2008  punctuation space
    U+2009  thin space 
    U+200A  hair space
    U+200B  zero width space
    U+202F  narrow no break space
    U+205F  medium mathematical space
    U+2060  word joiner (What is that, anyway?)
    U+3000  ideographic space
    U+FEFF  zero width non-breaking space

\s is said (in S05) to match any Unicode whitespace, but letting it
match NBSP and then using \s for splitting things is wrong, I think.
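
To make the splitting concern concrete, here is a minimal sketch in Python 3 rather than Perl 6, since Python's \s also matches U+00A0 in Unicode strings. The name split_words and the choice of which code points count as non-breaking are assumptions made for illustration; nothing here reflects what S05 specifies.

```python
import re

# Code points from the list above treated as "no-break" or joining
# characters. Which ones belong here is an assumption for illustration.
NON_BREAKING = "\u00A0\u2007\u202F\u2060\uFEFF"

# [^\S...] reads as "not non-whitespace and not one of these", i.e. any
# whitespace character except the ones listed in NON_BREAKING.
BREAKING_WS = re.compile("[^\\S" + NON_BREAKING + "]+")

def split_words(text):
    """Split on runs of breaking whitespace, dropping empty fields."""
    return [field for field in BREAKING_WS.split(text) if field]

sample = "foo\u00A0bar baz"
plain = re.split(r"\s+", sample)   # \s also matches U+00A0 here
careful = split_words(sample)      # U+00A0 stays inside its field

print(plain)
print(careful)
```

With plain \s+ the sample falls apart into three fields, while the restricted class keeps "foo" and "bar" together around the NBSP, which is the behaviour I would expect from a word splitter.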

Are the contents of <> split using <?ws>? (Is <<$foo>>, where $foo is
"foo\xA0bar", one or two elements?)

