1.10 Syntax Objects

If we write pi * pi, then given our earlier definition of pi, the result is as you’d expect, even if we include a lot of extra space around the *:

> pi    *  pi

- Int

9

We could add string quotes around the expression and get a completely different result:

> "pi    *  pi"

- String

"pi    *  pi"

If you’re studying programming languages and building interpreters, then a string representation of program text is potentially useful. In particular, you might want to treat that text as being a program in a different language than Shplait. Still, parsing the pi identifiers and * operator out of that string, including ignoring the unimportant whitespace, is a lot of work.

Shplait offers a different kind of quoting via single-quote marks '':

> 'pi    *  pi'

- Syntax

'pi * pi'

Superficially, this interaction is similar to the one with a string, but there are two differences: the type was reported as Syntax, and the whitespace has been normalized. Whitespace is normalized because a syntax object is not just a sequence of characters, but something that has been parsed into structured components.

When you compare syntax objects with ==, the comparison is based on that structure, ignoring whitespace differences or, say, the particular way that a number value is written.

> '1 + 2.0' == '1  +  2.0000'

- Boolean

#true

> '1 + 2' == '3 + 0'

- Boolean

#false

While the syntaxes 2.0000 and 2.0 are equivalent ways of writing the inexact number 2.0, note that == here is not checking whether the interpreted values of the two quoted expressions would be the same. The syntax objects might not even be intended as Shplait expressions. For example, maybe '1 + 2.0' is meant to represent a set that contains two numbers, instead of adding them.

To give a different interpretation to a syntax object, you would need to inspect its pieces. One way to inspect them is with syntax_split, which extracts the pieces of a single-line syntax object.

> syntax_split('pi * pi')

- Listof(Syntax)

['pi', '*', 'pi']

Parenthesized terms and blocks formed with : count as individual terms for syntax_split, although they also have nested structures.

> syntax_split('1 * (3 + 4)')

- Listof(Syntax)

['1', '*', '(3 + 4)']
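
Because syntax_split produces an ordinary list, the usual list functions apply to its result. As a small sketch, assuming the standard list functions length and first, we can confirm that the parenthesized term counts as one piece and that each piece is itself a syntax object that can be compared with ==:

```shplait
> length(syntax_split('1 * (3 + 4)'))
- Int
3
> first(syntax_split('pi * pi')) == 'pi'
- Boolean
#true
```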

The structure implemented by syntax objects is shrubbery notation. Shrubbery notation defines the syntax of numbers, operators, and identifiers, and it defines how newlines and indentation work with | and :, but it doesn’t give an interpretation to those forms.

Splitting a syntax object like '1 2 3' produces three syntax objects, but those syntax objects are still distinct from Shplait numbers. Clearly, there is a correspondence between the syntax object '1' and the number 1, the syntax object '#false' and the boolean #false, and the syntax object 'x' and the symbol #'x. Functions like syntax_is_integer, syntax_to_integer, and integer_to_syntax let you move between program representations as syntax objects and values that you can compute with at the Shplait level.

> syntax_is_integer('1')

- Boolean

#true

> syntax_to_integer('1')

- Int

1

> integer_to_syntax(1)

- Syntax

'1'

> syntax_is_integer('x')

- Boolean

#false
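
These conversion functions combine naturally with syntax_split. The following is a minimal sketch, not a definitive implementation, of giving a syntax object like '1 2 3' the interpretation "add up the integers." The names sum_terms and sum_integers are made up for this illustration; only syntax_split, syntax_is_integer, and syntax_to_integer come from the discussion above.

```shplait
#lang shplait

// Hypothetical helper: add up the terms that are integer literals,
// skipping any term for which syntax_is_integer returns #false.
fun sum_terms(terms :: Listof(Syntax)) :: Int:
  match terms
  | []: 0
  | cons(t, rst):
      if syntax_is_integer(t)
      | syntax_to_integer(t) + sum_terms(rst)
      | sum_terms(rst)

// Hypothetical entry point: split a single-line syntax object
// into its terms, then sum the integer ones.
fun sum_integers(stx :: Syntax) :: Int:
  sum_terms(syntax_split(stx))
```

With these definitions, sum_integers('1 2 3') should produce 6, and sum_integers('1 x 3') should produce 4, since 'x' is not an integer and is skipped. Note that the interpretation lives entirely in sum_terms: the syntax object itself says nothing about addition.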

In principle, you can use syntax_split and conversion functions like syntax_is_integer to pull apart syntax objects in any way. That quickly gets tedious, however, and Shplait offers better support for manipulating syntax objects with patterns and templates, as we see in the next section.