Chaim Frenkel wrote: > > >>>>> "SF" == Steve Fink <sfink@digital-integrity.com> writes: > > SF> Handling the parser's state can be done in a backtracking DFA-like or a > SF> direct NFA-like way. The NFA way is to keep track of all possible parse > SF> states and advance each one in parallel based on the next token. The DFA > SF> way is recursive descent, backing out of blind alleys and trying again, > SF> keeping a single working hypothesis alive at a time. The DFA approach is > SF> probably easier to undo user code in, because in the NFA case you have > SF> to consider each token under the assumptions of all possible parses up > SF> to that point. The NFA case has the advantage that you never have to > SF> back up, so you can permanently forget about a token as soon as it > SF> whizzes by. > > Can an NFA actually handle the interaction between the lexer and tokenizer? > > $foo = s;foo;bar;g; > $foo = s#foo#bar#g; # change the world First, I said NFA-like, not NFA. The parser is actually a pushdown automoton (which is more powerful than an NFA, which is the same power as a DFA). And I was abusing the terms NFA and DFA; the two possibilities I was describing were actually two alternative implementations of an NFA. One just feels like the obvious DFA implementation with backtracking stuck on top. Second, the interaction must be dealt with however the parser works internally, and probably isn't all that different for the major possibilities. But assuming you know all that, I'd say that either approach handles your example equally well. It doesn't cause any backtracking or state-splitting weirdness, it just requires wrapping the parser and lexer up together with a rubber hose. As soon as the lexer sees "s#", it starts treating # as a delimiter -- it doesn't need to conditionally treat the # as either a delimiter or a comment. (Especially since there's nothing following it that could resolve the ambiguity!)Thread Previous | Thread Next