develooper Front page | perl.beginners | Postings from February 2009

processing large datafiles

Pedro Soto
February 17, 2009 09:07
Dear all,
I need to read a huge file and then write out only the columns whose
ids match those in another file (which has fewer ids), sorted by id.
I wrote a script that does the work, but it takes a lot of time. I tried
the script with a few columns from the huge file and it took 5 seconds
to do the job. Because I have over 403,000 ids, I estimated roughly 3
hours to run the complete files, but the script is taking longer than that.
I wonder if someone has a better way to do this. I really need to
write the huge file sorted by ids. Any help will be greatly
appreciated.
Here is the code:

use warnings;
use strict;

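# The map file name is missing from the post, so it is left empty here.
open(MAP, "<", "") || die "Cannot open map file: $!";
my %map;      # chromosome => list of positions (assumed, see below)
my %locus;    # position => id

while (<MAP>) {
    my @snp = split /\s+/;
    next if $snp[0] =~ /Chromosome/;    # skip the header line
    $locus{$snp[3]} = $snp[2];
    # Assumption: the statement that filled %map was lost from the post.
    push @{ $map{$snp[0]} }, $snp[3];
}
close MAP;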

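open(IN, "<", "trialped.csv") || die "Cannot open trialped.csv: $!";
my @AoA = ();    # one array ref per csv row; row 0 is the header
while (<IN>) {
    chomp;
    my @temp = split /,/;
    # Assumption: the statement that stored each row was lost from the post.
    push @AoA, [@temp];
}
close IN;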

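my $out1 = "outfile.txt";
# Assumption: $sca is the number of columns in the header row; its
# definition is missing from the post.
my $sca = @AoA ? scalar @{ $AoA[0] } : 0;

open(OUT1, ">", $out1) || die "Cannot open $out1: $!";
for (my $x = 1; $x <= $#AoA; $x++) {
    print OUT1 "$x $AoA[$x][0] 0 0 0 1\t";
    foreach my $k (sort { $a <=> $b } keys %map) {
        foreach my $val (sort { $a <=> $b } @{ $map{$k} }) {
            # Linear search of the header for every id in every row.
            for (my $y = 1; $y < $sca; $y++) {
                if ($locus{$val} eq $AoA[0][$y]) {
                    print OUT1 "$AoA[$x][$y]";    # to OUT1, not STDOUT
                }
            }
        }
    }
    print OUT1 "\n";
}
close OUT1;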
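
One likely reason for the long run time is that the innermost loop searches the whole header for every id and repeats that search, plus both sorts, for every row. A possible alternative, shown here only as a minimal sketch, is to build a hash from header id to column index once, work out the sorted column order once, and then take an array slice of each row. The helper name ordered_columns and the small in-memory data are hypothetical stand-ins for the two files. The sketch also assumes that header ids are unique and that a space separator between values is acceptable.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the header column indices in sorted order.
#   $header : array ref holding the header row of the csv
#   $map    : chromosome => [positions]
#   $locus  : position   => id
sub ordered_columns {
    my ($header, $map, $locus) = @_;

    # Build the id => column index lookup once (column 0 is skipped).
    my %col_of;
    $col_of{ $header->[$_] } = $_ for 1 .. $#$header;

    my @cols;
    for my $chr (sort { $a <=> $b } keys %$map) {
        for my $pos (sort { $a <=> $b } @{ $map->{$chr} }) {
            my $id = $locus->{$pos};
            # Keep only ids that are present in the header.
            push @cols, $col_of{$id} if defined $id && exists $col_of{$id};
        }
    }
    return @cols;
}

# Small in-memory stand-ins for the map file and the csv header.
my %map    = (1 => [300, 100], 2 => [50]);
my %locus  = (100 => 'rs1', 300 => 'rs3', 50 => 'rs2');
my @header = ('sample', 'rs3', 'rs2', 'rs1', 'rs9');

my @cols = ordered_columns(\@header, \%map, \%locus);

# Each data row is then written with one slice, with no inner search.
my @row  = ('ind1', 'AA', 'CC', 'GG', 'TT');
my $line = join ' ', @row[@cols];
print "$line\n";    # prints "GG AA CC"
```

With the column order computed up front, each row of the csv could be split, sliced and printed as it is read, so the whole file would not need to be held in @AoA.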
