typed-peg: A parsing library for parsing expression grammars
(require typed-peg) | package: typed-peg |
1 Introduction
This package defines a racket language for parsing expression grammars that creates a parse function for this grammar and it returns a parse tree for the input string. The language also provide a type directed semantics for pretty printing them. The library also uses a type inference algorithm which determines if the PEG is complete, i.e. terminates its execution on all inputs.
2 Requirements
In order to type check, the tool need a working installation of Z3 SMT Solver. The project is known to work with Z3 version 4.8.14.
3 The language typed-peg
Programs written in typed-peg language consist of a parsing expression grammar close to Fordβs original notations for PEGs. When imported, the module will provide two functions:
parse: this function takes an input string and returns a parse tree which corresponds to how the string was matched by the grammar.
pretty: this functions takes a parse tree as input and returns its underlying string.
Parse trees for PEGs follow the structure of the parsing expressions, similarly for regular expression parser trees.
4 Limitations
Currently, the library only supports the core syntax for parsing expressions. We intend to implement other operators in future versions of the library.
5 Languages
Following the racket approach to build small languages, we have build some auxiliar languages to ease the task of use/debug the tool.
peg: default language, provides a parse and pretty printing function for the specified PEG, after infering types for the input PEG.
peg/untyped: disable the type-inference engine. Use at your own risk!
peg/debug/tokenize-only: outputs the result of the lexical analyser.
peg/debug/parse-only: outputs the result of the parser.
peg/debug/constraints-only: outputs the constraints generated by the algorithm.
peg/debug/z3-script-only: outputs the z3 script that encode the constraints.
peg/debug/infer-only: outputs the infered types for each grammar non-terminal.