peggen
1 Overview
2 Generators reference
gen:  Γ
initΔ
gen:  peg
gen:  peg-s
gen:  grm
gen:  expr
gen:  var
gen:  symbol  Var
3  Changing the output structure of the generator
gpeg->string
gpexp->string
4 Generating Ill-typed PEGs
gen:  ill-peg
gen:  ill-peg-s
8.12

peggen🔗ℹ

Elton M. Cardoso, Rodrigo G. Ribeiro, Leonardo V. S. Reis

()

 (require peg-gen) package: peg-gen

1 Overview🔗ℹ

This module defines a collection of PEG rackcheck generators. To generate well-formed PEGs, each term is always generated as a tern (peg, nullable, headset). This approach allows for incremental type correct construction of the term.

When generating a grammar, the context of all possible variables (Γ) is kept and updated accordingly. An additional context called Δ is used to avoid generated calls to non-terminals that would result in left recursion.

The general use case for this library is the function gen:peg whose arguments are the maximum number of variables, the maximum number of terminal symbols, and the maximum depth of a peg.

> (sample (gen:peg 3 2 1) 3)

(list

 '(∅ 0 ())

 '(∅ 0 ())

 (list

  '(R (• 2 1) (S (• R R) (F (/ 2 ϵ) ∅)))

  '(/ 0 0)

  (list

   (cons 'F (TyPEG #t '()))

   (cons 'S (TyPEG #f '(R)))

   (cons 'R (TyPEG #f '())))))

The output of the generator is a list of triple (G, e, Context)
  • G (Grammar) : A list of variables followe by a peg. The list ends with ∅ (empty grammar)

  • e: A peg expression

  • Context : The type context for gamma

2 Generators reference🔗ℹ

procedure

(gen:Γ maxVars [#:varSize varSize])  gen?

  maxVars : exact-positive-integer?
  varSize : exact-positive-integer? = 0
Creates a generator that generates a list of variable names (as symbols) with no more than maxVars elements. The varSize is the number of letters in the variable (0 means one letter, 1 means two letters, and so on).

procedure

(initΔ Γ)  (hash/c symbol? (listof symbol?))

  Γ : (list/c symbol? boolean? (listof symbol?))
Constructs an initial correct Δ table from the given Γ.

procedure

(gen:peg maxVars maxLits maxDepth)  gen?

  maxVars : exact-positive-integer?
  maxLits : exact-positive-integer?
  maxDepth : exact-positive-integer?
Creates a PEG generator that generates a PEG grammar, a start expression, and the type context for each variable in grammar.

procedure

(gen:peg-s maxVars maxLits maxDepth b)  gen?

  maxVars : exact-positive-integer?
  maxLits : exact-positive-integer?
  maxDepth : exact-positive-integer?
  b : boolean?
Creates a PEG generator that generates a PEG grammar, a start expression that is nullable if b is true, and the type context for each variable in grammar.

procedure

(gen:grm G Γ Δ Σ n pmax)  gen?

  G : (list/c symbol? peg G)
  Γ : (list/c symbol? boolean? (listof symbol?))
  Δ : (hash/c symbol? (listof symbol?))
  Σ : (listof natural?)
  n : exact-positive-integer?
  pmax : exact-positive-integer?
Construc a generator for grammar from a context Γ. The Δ is a hashtable that maps each variable to its respective forbidden variables list. The G parameter is accumulative and will contain the generated grammar at the end. the Σ is the alphabet. The n parameter is from which variable the generation should start and pmax is the depth.

procedure

(gen:expr Γ Δ Σ b p)  gen?

  Γ : (list/c symbol? boolean? (listof symbol?))
  Δ : (listof symbol?)
  Σ : (listof natural?)
  b : boolean?
  p : exact-positive-integer?
  • Γ: is a list of terns: v is a variable name; nullable is a value that determines whether or not that variable is nullable; italic{headset} is a list of possible variables starting the body of v.

  • Δ: is a list of constraints - forbidden variables

  • Σ: is an alphabet

  • b: Should be #t whenever the generated expression must be nullable.

  • p: Depth of the expression. If p is 0, only generate terminals or allowed variables. Samples n values from g.

procedure

(gen:var varSize)  gen?

  varSize : exact-positive-integer?
Creates a generator for a variable (as a string) the varSize is the maximum length of the string.

procedure

(gen:symbolVar varSize)  gen?

  varSize : exact-positive-integer?
Creates a generator for a variable (as a symbol) the varSize is the maximum length of the symbol.

3  Changing the output structure of the generator🔗ℹ

In some cases, it may be helpful to have the generator produce an output format more suitable for other applications, such as creating a proper source code or testing a specific library.

The generator uses a structure (referred to as a factory) that contains the constructors for all the parsing expressions and the grammar, and this structure can be set with setSynFactory. The module peg-gen-syntax contains an alternative structure called PEGStructF.

> (require peg-gen
           peg-gen/peg-gen-syntax)
> (setSynFactory PEGStructF)
> (sample (gen:peg 3 2 1) 3)

(list

 (GPEG '#hash() (GLit 0) '())

 (GPEG '#hash() (GLit 0) '())

 (GPEG

  (hash

   'F

   (GAlt (GLit 2) (GEps))

   'R

   (GSeq (GLit 2) (GLit 1))

   'S

   (GSeq (GVar 'R) (GVar 'R)))

  (GAlt (GLit 0) (GLit 0))

  (list

   (cons 'F (TyPEG #t '()))

   (cons 'S (TyPEG #f '(R)))

   (cons 'R (TyPEG #f '())))))

The structure is defined as follows:
(struct PEGFSyn
        ( mkEps
          mkLit
          mkVar
          mkNot
          mkKle
          mkSeq
          mkAlt
          addRule
          mkEmptyGrm
          mkPEG))

where:
  • mkEps : A function with no parameter that builds an empty PEG symbol.

  • mkLit : A function that takes one parameter (from Σ) and builds a terminal symbol.

  • mkVar : A function that takes one parameter (a racket symbol from the set of non-terminals) and builds a no-terminal (or a variable) symbol.

  • mkSeq and mkAlt : Takes two parameters and builds a sequence and an alternative constructions respectively.

  • mkNot and mkKle : Takes one parameter and builds a not predicate and a repetition construction , respectively.

  • addRule : Takes three parameters: A representation of the Grammar (G), a representation of non-terminal name (NT) and a parsing expression (E) and produces a new grammar adding the rule NT <- E to it.

  • mkEmptyGrm : Takes no parameters and returns a representation of the empty grammar.

  • mKPEG : Takes a representation of grammar, one parsing expression (as the start parsing expression) and a type context (as a list of pairs) and builds a structure for the PEG.

As an example consider the follwing definition for the default factory used by peg-gen:

(define defaultFactory
     (PEGFSyn
      (lambda () 'ϵ)
      (lambda (x) x)
      (lambda (x) x)
      (lambda (e) (list '! e))
      (lambda (e) (list '* e))
      (lambda (x y) (list ' x y))
      (lambda (x y) (list '/ x y))
      (lambda (g nt e) (list nt e g))
      (lambda () ')
      (lambda (g start gamma) (list g start gamma))))

Note that in this syntax, we represent a PEG as a recursive tuple of a Non-terminal, its right-hand side followed by the rest of the grammar. The symbol ∅ represents the empty grammar. Notice that the constructors only expect to build PEGs without types. The generation algorithm produces the types. The type context is given to the constructor mkPEG at the end of the generation process.

The peg-gen-syntax module contains the definition of an alternative structure that can be used as the output of the generator:
(struct GPEG (nt start gamma)  #:transparent)
(struct GEps ()                #:transparent)
(struct GLit (chr)        #:transparent)
(struct GVar (var)        #:transparent)
(struct GAlt (left right) #:transparent)
(struct GSeq (left right) #:transparent)
(struct GNot (exp)        #:transparent)
(struct GKle (exp)        #:transparent)

As an second example, here is the definition of the factory for that structure:
(define PEGStructF
    (PEGFSyn
      (lambda ()  (GEps))
      (lambda (x) (GLit x))
      (lambda (x) (GVar x))
      (lambda (e) (GNot e))
      (lambda (e) (GKle e))
      (lambda (x y) (GSeq x y))
      (lambda (x y) (GAlt x y))
      (lambda (g nt e) (hash-set g nt e))
      (lambda () (make-immutable-hash))
      (lambda (g start gamma) (GPEG g start gamma))))

The module that defines this data type also define some pretty print functions:

procedure

(gpeg->string GPEG)  String?

  GPEG : GPEG?
Converts a PEG into a string.

procedure

(gpexp->string exp)  String?

  exp : any?
Converts a parse expression into a string.

4 Generating Ill-typed PEGs🔗ℹ

As an experimental feature, we add the two new generators that generate PEGs that are ill-typed (not well-formed according to Ford’s definition).

procedure

(gen:ill-peg maxVars maxLits maxDepth)  gen?

  maxVars : exact-positive-integer?
  maxLits : exact-positive-integer?
  maxDepth : exact-positive-integer?
It has the same parameters as gen:peg but generates an ill-typed PEG instead. The gamma context this generator returns will have explicit annotations on the intended ill-typed non-terminals. However, note that the start expression does not have a type annotation.

procedure

(gen:ill-peg-s maxVars maxLits maxDepth b)  gen?

  maxVars : exact-positive-integer?
  maxLits : exact-positive-integer?
  maxDepth : exact-positive-integer?
  b : boolean?
It has the same parameters as gen:peg-s but generates an ill-typed PEG instead. The gamma context this generator returns will have explicit annotations on the intended ill-typed non-terminals. However, note that the start expression does not have a type annotation.