grammar::peg::interp - Interpreter for parsing expression grammars
package require Tcl 8.4
package require grammar::mengine ?0.1?
package require grammar::peg::interp ?0.1.1?
::grammar::peg::interp::setup peg
::grammar::peg::interp::parse nextcmd errorvar astvar
This package provides commands for the controlled matching of a
character stream via a parsing expression grammar and the creation
of an abstract syntax tree for the stream and partials.
It is built on top of the virtual machine provided by the package
grammar::me::tcl and directly interprets the parsing
expression grammar given to it.
In other words, the grammar is not pre-compiled but used as is.
The grammar to be interpreted is taken from a container object
following the interface specified by the package
grammar::peg::container. Only the relevant parts
are copied into the state of this package.
Note that the package provides exactly one instance of the
interpreter. Interpreting a second grammar therefore requires the
user either to abort or complete the running interpretation first,
or to run each interpretation in its own Tcl interpreter.
Note also that the implementation assumes pull-type handling of
the input. In other words, the interpreter pulls characters from
the input stream as it needs them. For usage in a push
environment, i.e. one where the environment pushes new characters
as they arrive, the engine has to be put into its own thread.
The package exports the following API:
-
::grammar::peg::interp::setup peg
-
This command (re)initializes the interpreter. It returns the
empty string. This command has to be invoked first, before any
matching run.
Its argument peg is the handle of an object containing the
parsing expression grammar to interpret. This grammar has to be
valid, or an error will be thrown.
-
::grammar::peg::interp::parse nextcmd errorvar astvar
-
This command interprets the loaded grammar and tries to match it
against the stream of characters represented by the command prefix
nextcmd.
The command prefix nextcmd represents the input stream of
characters and is invoked by the interpreter whenever a new
character from the stream is required.
The callback has to return either the empty list, or a list of 4
elements containing the token, its lexeme attribute, and its location
as line number and column index, in this order.
The empty list is the signal that the end of the input stream has been
reached. The lexeme attribute is stored in the terminal cache, but
otherwise not used by the machine.
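The callback contract above can be illustrated with a minimal sketch of a string-backed character source. The namespace charsrc and its commands load and next are hypothetical names chosen for this example and are not part of the package. The sketch also assumes that the character doubles as its own lexeme, that line numbers start at 1, and that column indices start at 0. None of these conventions is stated by this document.

```tcl
# Hypothetical helper namespace, not part of grammar::peg::interp.
namespace eval charsrc {
    variable text ""
    variable pos  0
    variable line 1
    variable col  0
}

# Load a new input string and rewind the position counters.
proc charsrc::load {s} {
    variable text $s
    variable pos  0
    variable line 1
    variable col  0
    return
}

# Return the next character as {token lexeme line column},
# or the empty list once the input is exhausted.
proc charsrc::next {} {
    variable text
    variable pos
    variable line
    variable col

    if {$pos >= [string length $text]} {
        return {}
    }

    set ch [string index $text $pos]
    # ASSUMPTION: the character serves as both token and lexeme.
    set result [list $ch $ch $line $col]

    incr pos
    if {$ch eq "\n"} {
        # ASSUMPTION: lines count from 1, columns from 0.
        incr line
        set col 0
    } else {
        incr col
    }
    return $result
}
```

After loading some text with charsrc::load, the command name ::charsrc::next could be passed as the nextcmd argument of ::grammar::peg::interp::parse. Keeping the position in namespace variables makes the command prefix a single word, at the cost of supporting only one input at a time, which matches the single-instance nature of the interpreter.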
The result of the command is a boolean value indicating whether the
matching process was successful (true), or not
(false). In the case of a match failure, error information will
be stored into the variable referenced by errorvar. The variable
referenced by astvar will always contain the generated abstract
syntax tree. In the case of an error, however, the tree will be
only partial and possibly malformed.
The abstract syntax tree is represented by a nested list, as
described in section AST VALUES of the
document grammar::me_ast.
This document, and the package it describes, will undoubtedly contain
bugs and other problems.
Please report such in the category
grammar_peg of the tracker at
http://sourceforge.net/tracker/?group_id=12883.
Please also report any ideas for enhancements you may have for either
the package or its documentation.
grammar, expression, push down automaton, state, parsing expression, parsing expression grammar, context-free languages, parsing, transducer, LL(k), TDPL, top-down parsing languages, recursive descent, virtual machine, matching