Parsers Progress

http://www.metawrap.com/tests/dot/test9.png

Minimal test case for the simplest production I could think of, which hit a visitor type that was unhandled at the time (now implemented, of course). The wonderful thing about applying such a complicated visitor pattern to this type of problem is that even though the first type is hard and laborious, the rest seem to drop quickly into place.

space => ‘ ‘
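For readers who have not met the pattern, here is a rough sketch of a visitor walking that production. It is not the actual MetaWrap code: `Literal`, `Production`, `Visitor` and `StateLister` are all made-up names, and the only point is to show how an unhandled node type fails loudly until its handler is written, after which the remaining types follow the same shape.

```python
from dataclasses import dataclass


@dataclass
class Literal:
    """A quoted lexeme such as ' ' (hypothetical node type)."""
    text: str


@dataclass
class Production:
    """A rule such as: space => ' ' (hypothetical node type)."""
    name: str
    lexemes: list


class Visitor:
    """Dispatches on the node's class name; unknown types fail loudly."""

    def visit(self, node):
        method = getattr(self, "visit_" + type(node).__name__, None)
        if method is None:
            raise NotImplementedError(
                "unhandled visitor type: " + type(node).__name__
            )
        return method(node)


class StateLister(Visitor):
    """Lists one state per character, followed by a terminator marker."""

    def visit_Production(self, node):
        states = []
        for lexeme in node.lexemes:
            states.extend(self.visit(lexeme))
        return states + ["<end of " + node.name + ">"]

    def visit_Literal(self, node):
        return [repr(ch) for ch in node.text]


# The production from this post: space => ' '
space = Production("space", [Literal(" ")])
states = StateLister().visit(space)
```

Adding a new node type then means adding one more `visit_...` method, which is roughly why the later types drop into place once the first is done.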

My mission tonight: get rid of that superfluous end group before the terminator.

But this

space => ‘_’ ‘ ‘

http://www.metawrap.com/tests/dot/test10.png

is of course just plain wrong.

This is better.

http://www.metawrap.com/tests/dot/test11.png

But we are left with a problem: how do we know that we are at the last character of the lexeme, and in the last lexeme of a production with no further substitutions, and thus at a potential terminator e? This is non-trivial to compute at every symbol. It may be better to gather all these dangling groups as part of garbage collection and make them all point to a single terminator group. I will investigate both paths and see what works best. But now for some sleep.
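A minimal sketch of the second path, assuming the state graph can be modelled as a plain dictionary mapping each group to its list of successors, with every group present as a key. The function name `merge_dangling` and the `"END"` label are invented for illustration; the real structure is certainly richer than this.

```python
def merge_dangling(graph, terminator="END"):
    """Return a copy of graph in which every group with no successors
    points to one shared terminator group.

    Assumes every group appears as a key in graph (hypothetical model).
    """
    # Copy the successor lists so the caller's graph is left untouched.
    merged = {node: list(successors) for node, successors in graph.items()}
    for node, successors in merged.items():
        # A dangling group is one that leads nowhere.
        if not successors and node != terminator:
            successors.append(terminator)
    # The shared terminator itself has no successors.
    merged[terminator] = []
    return merged


# space => '_' ' ' modelled as a chain: start -> '_' -> ' '
chain = {"start": ["_"], "_": [" "], " ": []}
merged_chain = merge_dangling(chain)

# Two alternatives that both dangle end up sharing one terminator.
fork = {"start": ["a", "b"], "a": [], "b": []}
merged_fork = merge_dangling(fork)
```

The appeal of this route is that nothing has to decide, symbol by symbol, whether it is the last character of the last lexeme; the cleanup pass only looks for groups that lead nowhere.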

 

About James McParlane

CTO Massive Interactive. Ex Computer Whiz Kid - Now Grumpy Old Guru.
This entry was posted in Parsing Theory, XPath.
