[Fwd: Re: (SPAM?) space-separated tokens (FAQ?)]

From: Ron Smith
Date: June 30, 2005 11:29
Subject: [Fwd: Re: (SPAM?) space-separated tokens (FAQ?)]
Message ID: 42C43A14.9010200@sedona.intel.com

Scott wrote:

> 
> 1pu2nmmu5cni
> 
> thumb pick up  near  forefinger  string  
> 
>   And doesn't work because no break.
> 
> 
> 
>>Way number two is:
>>
>>lat:	("i" | "o" | "m" ...!rel_move ) {$SFNParse::abbrevs{$item[1]};}
>>
>>which is the "lookahead" I mentioned previously.
>>
> 
> 
> Whoops! Now for the two previous lines I get:
> 
>   thumb pick up  near  forefinger  string   
> 
> for both, which is incorrect. Looks like I'll have to adopt method 1.
> 

I would still recommend method 2.  The example fix was not meant as a
bulletproof fix that solves all of your problems, but as an illustration
of how "lookahead" can help resolve some of the ambiguities.  How helpful
lookahead can be really depends on the exact nature of the ambiguity, as
your "together to get her" example illustrates.  To parse your example
sentence correctly you have to not only tokenize it correctly and
interpret the semantics correctly, but also *understand* that the
sentence is probably referring to a "stream" as a sequence of things
together rather than as a flowing body of water that you "get her" in,
using a bunch of glommed things to do it with.

As you can tell by now, I *really* don't like depending on white space
as a token separator.  And yes, it does take on something of a religious
bent... It takes a bit more effort to work out the grammar, but one can
usually resolve the ambiguity without enforced white space.

I don't know all of your grammar, and given your "simple" test case my
guess would be that it is relatively complex.  Figuring out exactly how
and where to put in the lookahead conditions takes a bit of thought.

I made a small change to my way #2:

lat: ("i"|"o"|(...!rel_move  "m")){$SFNParse::abbrevs{$item[1]};}

Notice what a difference it makes:

1 pu 5f
   thumb pick up  far  little finger  string
1 pu 5tfo
   thumb pick up  top far outer  little finger  string
1 mu 2n pu 5cni
   thumb move under near  forefinger  string, pick up  center near inner  little finger  string
1 pu 2n mu 5cni # this is wrong
   thumb pick up  near  forefinger  string, move under center near inner  little finger  string  (this is wrong)
1pu2nmu5cni # this is wrong
   thumb pick up  near  forefinger  string, move under center near inner  little finger  string  (this is wrong)
1mu2npu5cni     # Yucch! But it parses....
   thumb move under near  forefinger  string, pick up  center near inner  little finger  string  (Yucch! But it parses....)
1 pu 2nm mu 5cni
   thumb pick up  near middle  forefinger  string, move under center near inner  little finger  string
1pu2nmmu5cni
   thumb pick up  near middle  forefinger  string, move under center near inner  little finger  string
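
The effect of putting the lookahead in front of "m" can be sketched
outside the grammar with a plain Perl regex.  This is only an
illustration, not the Parse::RecDescent rule itself: take_modifiers is a
hypothetical helper, the letter set is inferred from the sample output
above, and "mu" stands in for the whole rel_move rule, whose other
alternatives are not shown in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the SFN grammar: strip position
# modifiers (n, f, c, i, t, o, m) from the front of $text.
# When $guard is true, "m" is taken as "middle" only if the move
# abbreviation "mu" does not start at that position (the regex
# equivalent of writing ...!rel_move before "m").
sub take_modifiers {
    my ($text, $guard) = @_;
    my $middle = $guard ? qr/(?!mu)m/ : qr/m/;
    my ($mods) = $text =~ /\A((?:[nfcito]|$middle)*)/;
    return ($mods, substr($text, length $mods));
}

for my $case (['nmu5cni', 0], ['nmu5cni', 1], ['nmmu5cni', 1]) {
    my ($text, $guard) = @$case;
    my ($mods, $rest) = take_modifiers($text, $guard);
    printf "%-9s guard=%d  modifiers=%-3s rest=%s\n",
        $text, $guard, $mods, $rest;
}
```

Without the guard, "nmu5cni" gives modifiers "nm" and leaves "u5cni",
which no move can match.  With the guard it gives "n" and leaves
"mu5cni", while "nmmu5cni" still gives "nm" followed by "mu5cni".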