
Re: putting file columns into arrays

From: Uri Guttman
Date: May 20, 2011 22:10
Subject: Re: putting file columns into arrays
Message ID: 87liy0ljum.fsf@quad.sysarch.com

>>>>> "EM" == Eric Mooshagian <ericmooshagian@gmail.com> writes:

  EM> I would like a subroutine that will allow me to easily put columns
  EM> of a tab delimited file into their own arrays.

  EM> I've been calling the following repeatedly for each column:

  EM> my @array1 = getcolvals($filehandle, 0);
  EM> my @array2 = getcolvals($filehandle, 1);  ...etc.

whenever you think you need to name things with numeric parts, you
usually need an array. since you want arrays, then you really want an
array of arrays.
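
a minimal sketch of what an array of arrays looks like, assuming two
made-up columns (the @columns name and its values are only for
illustration, they are not from the original post):

```perl
use strict ;
use warnings ;

# one outer array; each element is a reference to one column's values
my @columns = (
	[ 'a', 'b' ],	# what would have been @array1
	[ 1, 2 ],	# what would have been @array2
) ;

# a single value: column 1, row 0
print "$columns[1][0]\n" ;		# prints "1"

# a whole column, by dereferencing its array reference
print "@{ $columns[0] }\n" ;		# prints "a b"

# loop over the columns by index instead of by numbered names
for my $i ( 0 .. $#columns ) {
	print "column $i: @{ $columns[$i] }\n" ;
}
```

the index replaces the number in the variable name, so adding a column
needs no new variable.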

  EM> sub getcolvals {
  EM> 	@_ and not @_ % 2 or die "Incorrect number of arguments to getcolvals!\n";

that is sort of clunky. why not just check @_ == 2?

	@_ == 2 or die ...

  EM> 	my $myfile = shift;
  EM> 	my $mycol = shift;

it is usually better to assign from @_. i posted not too long ago several
reasons why. check the archives for it.

	my( $myfile, $mycol ) = @_ ;

and in this case you won't need a $mycol since the code will load all
the columns into arrays.
	
  EM> 	my @column = ();

you don't need to initialize my arrays to () as my does that for you.

  EM> 	while (<$myfile>) {

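this will fail on the second and later calls. the first call reads the
handle to the end of the file, so later calls get no lines unless you
reopen the file each time you call the sub or you seek to the beginning
of the file.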
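
a small sketch of the problem and the seek fix, assuming an in-memory
file with made-up data in place of the real one:

```perl
use strict ;
use warnings ;

# in-memory file standing in for the real one (sample data is made up)
my $data = "a\t1\nb\t2\n" ;
open( my $fh, '<', \$data ) or die "can't open in-memory file: $!" ;

my @first = <$fh> ;	# reads both lines
my @second = <$fh> ;	# handle is at end of file, so this is empty

# rewind to byte 0 from the start of the file (whence 0)
seek( $fh, 0, 0 ) or die "seek failed: $!" ;

my @third = <$fh> ;	# reads both lines again

print scalar( @first ), ' ', scalar( @second ), ' ', scalar( @third ), "\n" ;
# prints "2 0 2"
```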

  EM>     		my ($field) = (split /\s/, $_)[$mycol]; 

since you are slicing the split and getting one value, you don't need
the () around $field. 
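
a quick sketch comparing the two forms, assuming a made-up sample line
and column index:

```perl
use strict ;
use warnings ;

my $line = "a\t1\tx" ;	# made-up sample line
my $mycol = 1 ;

# list assignment: the parens around the variable give list context
my( $with_parens ) = ( split /\s/, $line )[$mycol] ;

# scalar assignment: a one element slice yields that one element
my $without_parens = ( split /\s/, $line )[$mycol] ;

print "$with_parens $without_parens\n" ;	# prints "1 1"
```

the parens around the split are still needed in both forms, since they
are what makes it a list slice.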

  EM>         	push @column, $field;     

and you can combine both of those lines into one:

		push @column, (split /\s/, $_)[$mycol] ; 
  EM> 	}

  EM> 	return @column;
  EM> } 

this is untested:

# this is a faster and easier way to get lines from a file
use File::Slurp ;

sub load_columns {

	my( $file_name ) = @_ ;

	$file_name or die 'load_columns: missing file name' ;

	my @lines = read_file $file_name ;

	my $matrix ;

	foreach my $line ( @lines ) {

		my @fields = split ' ', $line ;

		for my $i ( 0 .. $#fields ) {

# build up the array of arrays here. each array gets the next field value

# $matrix is a reference, so index it with ->. the array is @fields

			push( @{$matrix->[$i]}, $fields[$i] ) ;
		}
	}

	return $matrix ;
}
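
a sketch of how the returned reference might be used. to keep it
runnable without installing File::Slurp, this variant reads from an
already open filehandle with core perl only. load_columns_from_fh is a
hypothetical name, not the sub above, and the sample data is made up:

```perl
use strict ;
use warnings ;

# hypothetical variant of the sub above: takes an open filehandle
sub load_columns_from_fh {

	my( $fh ) = @_ ;

	$fh or die 'load_columns_from_fh: missing file handle' ;

	my @matrix ;

	while ( my $line = <$fh> ) {

		# split ' ' skips leading whitespace and drops the newline
		my @fields = split ' ', $line ;

		for my $i ( 0 .. $#fields ) {
			push( @{ $matrix[$i] }, $fields[$i] ) ;
		}
	}

	return \@matrix ;
}

# in-memory file standing in for the real tab delimited file
my $data = "a\t1\nb\t2\n" ;
open( my $fh, '<', \$data ) or die "can't open in-memory file: $!" ;

my $columns = load_columns_from_fh( $fh ) ;

print "@{ $columns->[0] }\n" ;	# prints "a b"
print "@{ $columns->[1] }\n" ;	# prints "1 2"
```

note that split ' ' splits on runs of whitespace, so an empty field
between two tabs would be skipped. if the file can have empty fields,
splitting on /\t/ after a chomp is probably the safer choice.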

for more on references and perl data structures read:

	perlreftut
	perllol
	perldsc

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------
