From: Tom Christiansen
Date: July 7, 2011 09:59
Subject: Re: Solving the *real* Dot Problem (was: Is 5.16 the time to remove \N, the complement of \n, from being experimental?)
Message ID: 16481.1310057915@chthon

Abigail <abigail@abigail.be> wrote on Thu, 07 Jul 2011 11:51:58 +0200:

> While Unicode is possible, almost all data I'm applying regexes to is
> ASCII data. I use /./ all the time, and for me, it just works. Where it
> doesn't, /(?s:.)/ does. /./ and /(?s:.)/ even works fine if I have mostly
> ASCII data with some Unicode characters or words thrown in.
> Full blown Unicode, which uses stuff where /./ or /(?s:.)/ won't work,
> I've yet to have the need to parse it.

Do you realize how "lucky" you are? And, perhaps, how unusual?

The data we work with in biomedical text mining is asymptotically close to
being 100% Unicode data. Think about the dozenish gigabytes of the PubMed
Open Access collection alone. That's all in UTF-8 XML. When we convert it
to "plain text" for mining, we *must* handle the α stuff correctly. Wrong
is not an option.

>> Here are 5 possible meanings for dot. I start with the original and *LEAST
>> USEFUL OF ALL POSSIBLE MEANINGS*, and progress to the most useful ones, the
>> ones that I think people should usually be using these days:
>>
>>     1 = no re /s       (traditional and annoying)
>>     2 = use re /s      (necessary but insufficient)
>>     3 = \V             (improved #1)
>>     4 = \X             (improved #2)
>>     5 = \X unless \R   (improved #2, #3)
>>
>> See? How often do you guys write the *wrong* one of those?

> Never.
> Abigail

Abigail, you are not just "one of the guys". You are one of the only
people who understand all these differences. I would be sad if you had
written the wrong one.

But I still bet most people do. Please see my next letter.

--tom
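
For readers who want to see the five meanings side by side, here is a minimal sketch that counts how many matches each pattern finds in one short string. The sample string and the helper count_matches are made up for illustration, and spelling meaning 5 as a negative lookahead, (?!\R)\X, is an assumption about what "\X unless \R" means, not something stated in the message. It needs a perl recent enough to support \V, \R, and extended grapheme clusters for \X.

```perl
use strict;
use warnings;

# Sample text (an illustrative assumption, not data from the thread):
# "e" + COMBINING ACUTE ACCENT, then CR LF, then LINE SEPARATOR, then "z".
my $text = "e\x{301}\r\n\x{2028}z";

# The five meanings from the list, as patterns. Number 5 is written as a
# negative lookahead for \R followed by \X; that spelling is an assumption.
my @meanings = (
    [ 'dot without /s' => qr/./        ],
    [ 'dot with /s'    => qr/./s       ],
    [ '\V'             => qr/\V/       ],
    [ '\X'             => qr/\X/       ],
    [ '\X unless \R'   => qr/(?!\R)\X/ ],
);

# Count how many separate matches a pattern finds in a string.
sub count_matches {
    my ($re, $string) = @_;
    my $count = () = $string =~ /$re/g;
    return $count;
}

my %count;
for my $m (@meanings) {
    my ($name, $re) = @$m;
    $count{$name} = count_matches($re, $text);
    printf "%-16s %d\n", $name, $count{$name};
}
```

On this string the counts come out as 5, 6, 3, 4, and 2. Plain dot skips only the LF, dot with /s takes all six code points, \V also skips CR and LINE SEPARATOR, \X sees four grapheme clusters (the accented e, the CR LF pair, the LINE SEPARATOR, and z), and the last pattern keeps only the two clusters that are not linebreaks.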

- Is 5.16 the time to remove \N, the complement of \n, from being experimental? by Karl Williamson
- Re: Is 5.16 the time to remove \N, the complement of \n, from being experimental ? by Jesse Vincent
- Re: Is 5.16 the time to remove \N, the complement of \n, from being experimental ? by Zsbán Ambrus
- Solving the *real* Dot Problem (was: Is 5.16 the time to remove \N, the complement of \n, from being experimental?) by Tom Christiansen
- Re: Solving the *real* Dot Problem (was: Is 5.16 the time to remove \N, the complement of \n, from being experimental?) by Brian Fraser
- Re: Solving the *real* Dot Problem by Karl Williamson
- Re: Solving the *real* Dot Problem (was: Is 5.16 the time to remove \N, the complement of \n, from being experimental?) by H.Merijn Brand
- Re: Solving the *real* Dot Problem (was: Is 5.16 the time to remove \N, the complement of \n, from being experimental?) by Abigail
- Re: Solving the *real* Dot Problem by Johan Vromans
- **Re: Solving the *real* Dot Problem (was: Is 5.16 the time to remove \N, the complement of \n, from being experimental?)** by Tom Christiansen
- Re: Solving the *real* Dot Problem by Johan Vromans
- FMTEYEWTK about Unicode Grapheme Matching (was: Solving the *real* Dot Problem) by Tom Christiansen
- Re: Solving the *real* Dot Problem by David Golden
- Re: Is 5.16 the time to remove \N, the complement of \n, from being experimental ? by Abigail
- Re: Is 5.16 the time to remove \N, the complement of \n, from being experimental ? by Karl Williamson

nntp.perl.org: Perl Programming lists via nntp and http.
