perl.beginners | Postings from March 2008

parsing CSV files with control and extended ASCII characters

From: David Newman
Date: March 20, 2008 16:38
Subject: parsing CSV files with control and extended ASCII characters

I have some CSV input files that contain control and extended ASCII 
characters, including:

- vertical tabs (0x0B)

- acute and grave accents

- tildes

- circumflexes

- umlauts

- nonbreaking spaces (0xA0)

Neither the Text::CSV nor the Tie::Handle::CSV module likes these
characters; both snippets below fail when they reach one.

Is there some other method for stuffing comma-separated ASCII (*any* 
ASCII) into a hash or list?

thanks

dn



snippet 1:

use strict;
use warnings;
use Text::CSV;

my $file = 'foo.csv';
my $csv = Text::CSV->new();

open(my $in, '<', $file) or die "Cannot open $file: $!";

while (my $line = <$in>) {
    if ($csv->parse($line)) {
        my @columns = $csv->fields();
        print "$columns[0] $columns[1] $columns[6]\n";
    } else {
        my $err = $csv->error_input;
        print "Failed to parse line: $err"; # some characters hit this
    }
}
close $in;

snippet 2:

use strict;
use warnings;
use Tie::Handle::CSV;

my $file = 'foo.csv';
my $fh = Tie::Handle::CSV->new($file, header => 1);

while (my $csv_line = <$fh>) {
    # program dies on first line with 'bad' ASCII characters
    print $csv_line->{'First Name'} . " " . $csv_line->{'Last Name'} . "\n";
}

close $fh;
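
For reference, a minimal sketch of one possible approach, not a tested fix for the files described above: Text::CSV documents a `binary` attribute that lets the parser accept bytes outside printable ASCII. The helper name `parse_row` and the sample row are made up for illustration, and the sketch assumes the fields are raw single-byte data (for example Latin-1) rather than UTF-8.

```perl
use strict;
use warnings;
use Text::CSV;

# binary => 1 asks the parser to accept bytes outside printable ASCII,
# such as 0x0B, 0xA0 and accented Latin-1 letters.
my $csv = Text::CSV->new({ binary => 1 })
    or die "Cannot create Text::CSV parser";

# parse_row is a hypothetical helper: one raw CSV line in, list of fields out.
sub parse_row {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # strip the line ending before parsing
    $csv->parse($line)
        or die "Failed to parse line: " . $csv->error_input;
    return $csv->fields();
}

# Made-up sample row: an e-acute (0xE9), a vertical tab, a nonbreaking space.
my @columns = parse_row("Ren\xE9,foo\x0Bbar,a\xA0b\n");
printf "%d fields\n", scalar @columns;
```

In a real script the call to parse_row would sit inside the `while` loop from snippet 1, one call per line read from the file.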
