develooper Front page | perl.beginners | Postings from March 2002

Re: Books on advanced text processing

From: Peter Scott
Date: March 30, 2002 08:36
Subject: Re: Books on advanced text processing
Message ID: 4.3.2.7.2.20020330083142.00a81260@shell2.webquarry.com

At 11:36 PM 3/29/02 -0500, Jim Witte wrote:
> I'm contemplating writing some software to scan through a large volume
> of email (over 95 MB) to identify threads and remove quoted material.
> Does anyone have any good references on algorithms to do text processing
> like this for such a massive amount of data?

Is this something you're planning on doing once, or many times?  95MB is 
nothing; right now I'm scanning through several hundred gigabytes of 
text.  Do you need sub-second response on this?  If not, I don't see the 
need for advanced algorithms.
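
To illustrate why a plain approach is probably enough at this size, here is a minimal sketch (in Python, using only the standard library) of a single pass that drops quoted lines and groups messages into threads. It rests on assumptions that are not part of the original question: that quoted material is marked with a leading ">", that replies carry an In-Reply-To header pointing at their parent's Message-ID, and that everything fits in memory. The names strip_quoted and group_by_thread are made up for this example.

```python
import email
from collections import defaultdict


def strip_quoted(body):
    """Drop lines that begin with '>' (assumed quoting convention)."""
    kept = [line for line in body.splitlines()
            if not line.lstrip().startswith(">")]
    return "\n".join(kept)


def group_by_thread(raw_messages):
    """Group messages by following In-Reply-To up to a root Message-ID.

    raw_messages is an iterable of complete single-part messages as strings.
    Returns (threads, bodies): threads maps a root Message-ID to the list
    of Message-IDs in that thread; bodies maps each Message-ID to its
    body with quoted lines removed.
    """
    parent = {}  # Message-ID -> In-Reply-To value (None if absent)
    bodies = {}
    for raw in raw_messages:
        msg = email.message_from_string(raw)
        mid = msg["Message-ID"]
        if mid is None:
            continue  # nothing to key on; skipped in this sketch
        parent[mid] = msg["In-Reply-To"]
        bodies[mid] = strip_quoted(msg.get_payload())

    def root(mid):
        # Walk up while the parent is a message we actually have;
        # 'seen' guards against reference loops in malformed mail.
        seen = set()
        while parent.get(mid) in parent and mid not in seen:
            seen.add(mid)
            mid = parent[mid]
        return mid

    threads = defaultdict(list)
    for mid in parent:
        threads[root(mid)].append(mid)
    return dict(threads), bodies
```

Each message is parsed once and thread lookup is a dictionary walk, so the work grows roughly linearly with the amount of mail. Real mail is messier than this (missing or inconsistent headers, multipart bodies, other quoting styles), so treat it as a starting point rather than a finished tool.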


--
Peter Scott
Pacific Systems Design Technologies
http://www.perldebugged.com



