On Wed, 2004-10-27 at 14:01, Luke Palmer wrote: > Aaron Sherman writes: > > > /ab(c|b) {fail unless $1 eq 'c'}/ > > > > Now, what does "fail" mean? I can think of two definitions: > > > > 1. proceed to trap state (backtracking then happens) > > 2. exit (probably using an exception) the rule? > > The first one. Great. In that case, that's what I expected. Number 2 might require my having more control over exceptions than I really want to. > > NDFAs (or NFAs as you say) > > Last time I checked, "nondeterministic" was a word all by itself. :-p Tomato, tomatoe.... Yeah, I know, but I grew up hearing NDFA at school, so I'm stuck. :-) > > should be the same as DFAs for this purpose. I think it's just a > > difference in the way we look at the token-matching process. I see it > > as part of the state, so to say "we both begin and end subexpression > > storage here" is to say, "only the token matched here will be stored > > as the subexpression." This restricts you, though, and I may find that > > in order to say: > > > > (a)((b)c) > > > > I need too much hair and want to put the markers elsewhere.... I'll know > > that by next week. > > I've given this a lot of thought in the past. Way too much, actually. > I think you'll find your DFA approach to fail when you deal with the > generic case of perl 6 rules. Hmmm, actually I thought Perl 6 rules would be trivial because almost all of the non-regular-expression parts are so perfectly impossible to model in an FSA. There's no ambiguity as to how that's done that I can see, it simply has to be a call to external code. Its in places where I think I have to try to cram difficult concepts into the FSA that it gets hard. > DFAs can't adequately model context free grammars. I think it's best > for your brain if you leave it at that. Certainly. Let's just stop here if you thought I was saying anything differently. I *am* talking about using the API for building an FSA to also describe where the non-finite bits live, but that's just an API, and I take no responsibility for deciding how you generate what code to handle the rest, I'm just telling you what bag to drop it in once you do. This, of course, branches off into compiler topics that I didn't want to get into yet. Let me finish the FSA API and the trivial regex parser plugin before we start talking about how a compiler will use this too deeply (and we can have that chat on p6c... I really do have to remember to subscribe to p6c). > Perl 6 also allows conditionals within rules. So if you have an > alternation: > > / [ <( cond_1() )> | <( cond_2() )> | <( cond_3() )> ] / > > You're aware that you need to make those conditions orthogonal, right? > You can only proceed to one of those states if you're sure that it's the > _only_ one that matches. It turns out that this is going to take as > much time _all the time_ as the NFA takes in its worst case. Well, most of that's a matter for the back-end to deal with, and the relatively optimization-free back end I'm planning for a start will output code which does exactly what you describe. Obviously if Perl places restrictions on how optimizations can be applied to such cases, then it will need its own back-end or a generic back-end which has some sort of tagging that turns off optimizations, all of which is fine. None of that touches how we store and represent the FSA before it's compiled to Parrot. > DFAs are a bad idea for Perl 6. You seem to have taken a left turn somewhere. Let's get our assumptions above cleared up before you start leaping to the conclusion that what I'm in the middle of won't work. > On the other hand, optimizing sufficiently simple parts of regular > expressions to DFAs and embedding them in a more general system is a > pretty good idea, IMO. I think that's what I said. -- ☎ 781-324-3772 ✉ ajs@ajs.com ☷ http://www.ajs.com/~ajsThread Previous | Thread Next