At 11:36 PM 3/29/02 -0500, Jim Witte wrote:
> I'm contemplating writing some software to scan through a large volume
> of email (over 95 MB) to identify threads and remove quoted material.
> Does anyone have any good references on algorithms to do text processing
> like this for such a massive amount of data?

Is this something you're planning on doing once, or many times? 95MB is nothing; right now I'm scanning through several hundred gigabytes of text. Do you need sub-second response on this? If not, I don't see the need for advanced algorithms.

-- 
Peter Scott
Pacific Systems Design Technologies
http://www.perldebugged.com
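
To illustrate the point about not needing advanced algorithms, here is a minimal sketch of a single linear pass, written in Python. It is not from the original thread, and it rests on loud assumptions: quoted lines are taken to be those starting with ">", and messages are grouped into threads by subject with any leading "Re:" prefixes removed. The names strip_quoted, thread_key, and group_threads are hypothetical. A more careful version would probably follow the In-Reply-To and References headers rather than subjects.

```python
import re
from collections import defaultdict

# Assumption: a quoted line starts with optional whitespace, then '>'.
QUOTE_RE = re.compile(r"^\s*>")
# Assumption: replies carry one or more "Re:" prefixes on the subject.
REPLY_PREFIX_RE = re.compile(r"^\s*(re\s*:\s*)+", re.IGNORECASE)


def strip_quoted(body):
    """Drop every line that looks quoted; keep the rest in order."""
    kept = [line for line in body.splitlines() if not QUOTE_RE.match(line)]
    return "\n".join(kept)


def thread_key(subject):
    """Normalize a subject so replies group with the original message."""
    return REPLY_PREFIX_RE.sub("", subject).strip().lower()


def group_threads(messages):
    """Group (subject, body) pairs by normalized subject in one pass."""
    threads = defaultdict(list)
    for subject, body in messages:
        threads[thread_key(subject)].append(strip_quoted(body))
    return dict(threads)
```

Each message is touched once and each line is tested against one small regular expression, so the work grows linearly with the size of the mail archive.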