perl.beginners | Postings from March 2002

Fuzzy Matching

From: paul.beckett
Date: March 21, 2002 01:32
Subject: Fuzzy Matching
Message ID: E4D0A20B9E9ED4118E3C00508BEED171016B3982@jimserv2.jic.bbsrc.ac.uk

I am attempting to do a "fuzzy match" with the String::Approx (v.3) module,
with very limited success.
I am working with biological genome sequence: a 30136242 character long
string (which I load into $seq), in which each character is an A, T, G or C
(or, more rarely, an N to denote that it could be any of A, T, G or C). I
want to match 15 to 20 characters against this 30136242 character string.

I have written the code below, but it generally stops after finding only one
hit when I know there are more in there. The aindex and aslice functions do
not seem to take an offset, so I am trimming the searched string myself to
progress along it. From the documentation I expected aslice to return a
two-element list, which would be placed into $index and $size. Instead I get
an array reference in $index, and $size is left undefined.
Any help or advice on this would be greatly appreciated.
Cheers

Paul


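#!/usr/bin/perl -w
use strict;
use String::Approx qw(amatch aindex aslice); # Fuzzy matching

die "Syntax: primerSearch Chromosome_number, Number_Point_mutations, Primer_Sequence\n"
  if (@ARGV != 3);

# Load the chromosome sequence (a single line) into $seq
open (CHR, "<chromo$ARGV[0]_pseudo_v080501.seq")
  or die "Cannot open chromo$ARGV[0]_pseudo_v080501.seq: $!\n";
my $seq = <CHR>;
close (CHR);

my $primer = $ARGV[2];
# Reverse complement of the primer (not yet used in the search below)
my ($ra) = &rev($primer);


my $addf = 0;   # offset of the trimmed $seq within the original sequence
my $indx;
my $flag;

do {
  undef $indx;
  undef $flag;
  my ($index, $size) = aslice($primer, ["$ARGV[1]"], $seq);

  # $index arrives as an array reference holding the index and the size
  while ( $indx = shift(@$index)) {
    $flag = 1;
    my $sizx = shift(@$index);
    my $sq = substr($seq,$indx,$sizx);
    print ("\t" , $indx+$addf , "\t($sizx)\tSeq: $sq\n");
    # Trim the sequence so that the next search starts further along
    $addf += ($indx + 1);
    $seq = substr($seq,$indx,length($seq));
  }

} while ( defined $flag );


# Return the reverse complement of a sequence
sub rev {
  my $reversed_seq = reverse $_[0];
  $reversed_seq =~ tr/ATGC/TACG/;
  return $reversed_seq;
}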
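
One possible reading of the early stop, offered as a sketch rather than a confirmed diagnosis: the loop adds $indx + 1 to the running offset but trims the sequence at $indx, so the next search finds the same hit again at offset 0, and an index of 0 is false in the while test. The plain-Perl sketch below shows only the restart bookkeeping. first_hit and all_hits are hypothetical names, and first_hit is a stand-in that counts substitutions only (Hamming distance), which is narrower than the edit distance String::Approx works with.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the fuzzy matcher: returns (index, size) of the
# first window of $text, at or after $start, that differs from $pattern in at
# most $max_mismatch positions. Returns an empty list when nothing matches.
sub first_hit {
    my ($pattern, $max_mismatch, $text, $start) = @_;
    my $len = length $pattern;
    for my $pos ($start .. length($text) - $len) {
        # XOR of two equal-length strings gives "\0" wherever they agree,
        # so counting the non-"\0" characters counts the mismatches.
        my $diff = (substr($text, $pos, $len) ^ $pattern) =~ tr/\0//c;
        return ($pos, $len) if $diff <= $max_mismatch;
    }
    return;
}

# Collect every hit by restarting one character past the previous hit.
# The sequence itself is never copied or trimmed; only $start moves.
sub all_hits {
    my ($pattern, $max_mismatch, $text) = @_;
    my @hits;
    my $start = 0;
    # The loop tests the list assignment, not the index, so a hit at
    # offset 0 is not mistaken for "no match".
    while (my ($pos, $size) = first_hit($pattern, $max_mismatch, $text, $start)) {
        push @hits, [$pos, $size];
        $start = $pos + 1;
    }
    return @hits;
}

# Toy sequence: two exact copies of ACGT plus one copy with a substitution.
my @hits = all_hits('ACGT', 1, 'ACGTTTACGAACGT');
print join("\t", @$_), "\n" for @hits;
```

Keeping a start position instead of trimming a 30 million character string on every hit also avoids recopying the sequence each time round the loop.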



