develooper Front page | perl.beginners | Postings from February 2002

Parsing a .csv file

From: Steven Arbitman
Date: February 11, 2002 18:25
Subject: Parsing a .csv file
Message ID: NEBBLKKJOLEDBEIHKGPEEEDCCHAA.info@starbits.com

Hi all,

I know parsing a comma-separated value file should be easy:
@array = split /,/;  # just split the line on commas

However, my input csv file looks like this:
Name,"street,city,state,zip",phone,email,"comments, may have commas, 2"

Note that not all fields are quoted; only those which contain commas are
wrapped in quotes.
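
To make the problem concrete, here is a small sketch that runs the plain
split on the sample line above. The names $line and @naive are only for
illustration. The quoted address and comments get chopped up, so the line
comes back as 10 pieces rather than the 5 fields I want:

```perl
use strict;
use warnings;

# Sample line copied from the message above.
my $line = 'Name,"street,city,state,zip",phone,email,"comments, may have commas, 2"';

# A plain split treats every comma as a separator, quoted or not.
my @naive = split /,/, $line;

# Print the piece count, then each piece in brackets so stray
# quotes and leading spaces are visible.
printf "%d fields\n", scalar @naive;
print "[$_]\n" for @naive;
```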

Even if I could get the input revised to split the address into several
different fields (which I know would be a good idea), the comments remain a
problem.

I can solve the problem using the substr function to examine the incoming
text char by char, replacing commas outside quotes with something else
(tabs), and leaving commas inside quotes, then splitting the line on tabs:

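	$len = length($_);    # length of the current input line in $_
	$in_quotes = 0;
	for ($i = 0; $i < $len; $i++) {
		$char = substr($_, $i, 1);
		if ($char eq "," and !$in_quotes) {
			# comma outside quotes: turn it into the field separator
			substr($_, $i, 1) = "\t";
		} elsif ($char eq '"') {
			# blank out the quote (this leaves a stray space in the field)
			substr($_, $i, 1) = " ";
			$in_quotes = !$in_quotes;    # toggle the quoted state
		}
	}
	@infields = split /\t/;    # commas inside quotes are still intact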

This has got to be the slowest, most inelegant way possible, but I don't see
another.  Is there a better way?
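
For comparison, a minimal sketch of one possible alternative, offered under
assumptions rather than as a tested answer: the Text::ParseWords module that
ships with Perl provides parse_line, which splits on a delimiter while
respecting quotes. The variable names are illustrative. It is not a full CSV
parser: it treats apostrophes as quote characters too, and it does not
handle doubled quotes inside a field the way CSV does, so a dedicated CSV
module may be the safer choice for messy data.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Sample line copied from the message above.
my $line = 'Name,"street,city,state,zip",phone,email,"comments, may have commas, 2"';

# parse_line(DELIMITER, KEEP, LINE): with KEEP false the surrounding
# quotes are stripped, and commas inside quotes stay in the field.
my @infields = parse_line(',', 0, $line);

print scalar(@infields), " fields\n";
print "[$_]\n" for @infields;
```

On the sample line this gives five fields, with the address and the comments
each kept whole and no stray spaces where the quotes used to be.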

Thanks,
Steve

