On 3/20/08 5:05 PM, Gunnar Hjalmarsson wrote:
> David Newman wrote:
>> I have some CSV input files that contain control and extended ASCII
>> characters,
>
> <snip>
>
>> The Text::CSV or Tie::Handle::CSV modules don't like these characters;
>> the snippets below both return errors when they get to one.
>
> <snip>
>
>> my $csv = Text::CSV->new();
>
> In the docs for Text::CSV, that way of creating a new object is
> mentioned at the top of the SYNOPSIS section. The solution to your
> problem is stated right after that.
>
> So, the usual recommendation:
>
> "Read the docs for the module you are using."
>
> is very much applicable. ;-)

<time passes, seasons change, children grow up>

OK, thanks for this polite RTFM. However, it doesn't answer the root
question, namely how to parse text that contains Western European
characters such as accents and umlauts.

I see from the Text::CSV documentation that this module handles only
characters between 0x20 and 0x7e. I also see there is a binary mode for
any character, but the documentation does not describe whether the
module parses binary-mode characters the same way as ASCII characters.

This seems like a fairly standard problem. What's the "right" way (or,
given Perl culture, "a" way) to handle text outside the 0x20 to 0x7e
range?

Many thanks!

dn