6.52 Syntax Objects
A syntax object encapsulates a shrubbery term, group, or multi-group sequence with binding scopes and other metadata on individual terms, and metadata potentially on individual syntax objects. See Shrubbery Notation for information on shrubbery notation, and specifically Parsed Representation for information on representing shrubbery terms as Rhombus values. The Syntax.make function takes such a value and wraps it as a syntax object, so that it can accumulate binding scopes or hold other metadata, and functions like Syntax.unwrap expose that structure.
In addition to normal shrubbery structure, a syntax object can contain parsed terms, which are opaque. The meaning and internal structure of a parsed term depends on the parser that produced it. In the case of parsing a syntax object as a Rhombus expression via expr_meta.Parsed, a parsed term encapsulates a Racket expression. Pattern matching and functions like Syntax.unwrap treat parsed terms as opaque.
A quoted sequence of terms using '…' is parsed as an implicit use of the #%quotes form, which is normally bound to create a syntax object. For example, '1.000' is a syntax object that wraps the number 1.0.
Metadata for a syntax object can include a source location and the raw source text for a term, such as "1.000" for a 1.0 that was written originally as 1.000. Raw-source metadata is used when printing a syntax error for a syntax object. Besides the main text of a term, metadata can include a prefix string and/or suffix string, which are used when printing a sequence of terms to reflect the original layout. A group syntax object internally starts with a group tag that normally contains only prefix and suffix text, leaving the group elements to supply their own text forms. Finally, a syntax object can contain a tail string and/or a tail suffix; those normally appear only on a tag at the start of a syntax object that represents a pair of parentheses, brackets, braces, or quotes, where the tail string corresponds to the closer, and the tail suffix corresponds to text after the closer.
stx.unwrap()
is
Syntax.unwrap(stx)
stx.unwrap_op()
is
Syntax.unwrap_op(stx)
stx.unwrap_group()
is
Syntax.unwrap_group(stx)
stx.unwrap_sequence()
is
Syntax.unwrap_sequence(stx)
stx.unwrap_all()
is
Syntax.unwrap_all(stx)
stx.srcloc()
is
Syntax.srcloc(stx)
stx.is_original()
is
Syntax.is_original(stx)
stx.strip_scopes()
is
Syntax.strip_scopes(stx)
stx.replace_scopes(like_stx)
is
Syntax.replace_scopes(stx, like_stx)
stx.relocate(to)
is
Syntax.relocate(stx, to)
stx.relocate_span(like_stxes)
is
Syntax.relocate_span(stx, like_stxes)
stx.property(key, ...)
is
Syntax.property(stx, key, ...)
stx.to_source_string()
is
Syntax.to_source_string(stx)
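As a small sketch of the equivalences listed above, a method call on a syntax object and the corresponding Syntax function should produce the same result. The name stx is hypothetical and chosen only for this illustration.

```rhombus
#lang rhombus

// stx is a hypothetical name used only for this sketch
def stx = '1 2 3'

// The method form and the function form are interchangeable
stx.unwrap_group()         // prints ['1', '2', '3']
Syntax.unwrap_group(stx)   // prints ['1', '2', '3']
```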
Constructs a syntax object. When a single term is present, the result is a single-term syntax object. When a single term ... group is present with multiple terms, the result is a group syntax object. The general case is a multi-group syntax object.
The #%quotes form is implicitly used when '…' is used in an expression position. See also Implicit Forms.
> '1'
'1'
> 'pi'
'pi'
> '1 + 2'
'1 + 2'
> '1 + 2
3 + 4'
'1 + 2; 3 + 4'
A $ as a term unquotes (i.e., escapes) the expression afterward; the value of that expression replaces the $ term and expression. The value is normally a syntax object, but except for lists, other kinds of values are coerced to a syntax object. Nested '…' forms are allowed around $ and do not change whether the $ escapes.
'x y z'
'x 3 z'
'«x '3' z»'
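A minimal sketch of a $ escape inside a quoted template, assuming a hypothetical variable n that holds a plain number rather than a syntax object. Because non-list values are coerced, the number is inserted as a term.

```rhombus
#lang rhombus

// n is a hypothetical name for this sketch; its value is the integer 3
def n = 1 + 2

'x $n z'         // the integer is coerced to a term: 'x 3 z'
'x $(1 + 2) z'   // a parenthesized expression is escaped the same way: 'x 3 z'
```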
The result of the expression after $ can be a list, in which case the elements of the list are spliced as terms in place of the $ term and expression within the enclosing group. If the result is a syntax object, it can be a single-term syntax object or a group syntax object; in the latter case, the group terms are spliced in place of the escape.
> 'x $[1, 2, 3] z'
'x 1 2 3 z'
> 'x $('1 2 3') z'
'x 1 2 3 z'
Similarly, when an $ escape is alone within its enclosing group, then the result of the expression after $ can be a multi-group syntax object, in which case the group sequence is spliced in place of the escape.
> 'x; $('1; 2 3; 4'); z'
'x; 1; 2 3; 4; z'
A ... as a term must follow a term that includes at least one escape, and each of those escapes must contain a repetition instead of an expression. The preceding term is replaced as many times as the repetition supplies values, where each value is inserted or spliced into the enclosing sequence.
'(1 + 1) (1 + 2) (1 + 3)'
'0 + 1 + 2 + 3'
'0 + 1 + 2 + 3'
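The following sketch shows one way a ... in a template can follow a term that contains an escape. The repetition x is a hypothetical binding introduced here for illustration.

```rhombus
#lang rhombus

// Bind x as a repetition over three values (name chosen for this sketch)
def [x, ...] = [1, 2, 3]

// The parenthesized term before ... is instantiated once per value of x
'(1 + $x) ...'   // '(1 + 1) (1 + 2) (1 + 3)'
```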
Multiple escapes can appear in the term before ..., in which case the repetitions are drawn in parallel (assuming that they are at the same repetition depth). Repetition ... can also be nested around escapes, consecutive ... splice deeper repetitions, and so on, following the normal rules of repetitions.
Quotes work as a repetition to construct multiple syntax objects within another kind of repetition context, such as forming a list. All escapes must then be repetitions, instead of just expressions, and the depth of the repetition is the amount of repetition depth left over from the deepest escape.
> ['[$x, ...]', ...]
['[1, 2, 3]', '[4]', '[5, 6]']
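A sketch of quotes acting as a repetition, assuming x is bound at repetition depth 2 by a hypothetical definition. The inner ... inside the quotes uses one level of depth, and the outer ... in the list uses the remaining level.

```rhombus
#lang rhombus

// x has repetition depth 2 (the binding is an assumption for this sketch)
def [[x, ...], ...] = [[1, 2, 3], [4], [5, 6]]

// One syntax object is built per inner list
['[$x, ...]', ...]   // ['[1, 2, 3]', '[4]', '[5, 6]']
```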
binding operator
#%quotes 'term ...; ...'
Matches a syntax object consistent with terms. Identifiers and operators are matched symbolically (unrelated to binding), and other atomic terms are matched using == on unwrapped syntax objects.
A $ within term escapes to a subsequent unquoted binding that is matched against the corresponding portion of a candidate syntax object. A ... in term following a subpattern matches any number of instances of the preceding subpattern, and escapes in the pattern are bound as repetitions. Unlike binding forms such as List, ... can appear before the end of a sequence, and multiple ... can be used in the same group; when matching is ambiguous, matching prefers earlier ... repetitions to later ones.
A $ or ... as the only term matches each of those literally. To match $ or ... literally within a larger sequence of terms, use $ to escape to a nested pattern, such as $('$').
To match identifiers or operators based on binding instead of symbolically, use $ to escape, and then use bound_as within the escape.
The #%quotes form is implicitly used when '…' is used in a binding position. See also Implicit Forms.
['1', '2']
| '($x/1) ...': [x, ...]
['1', '2', '3']
| '$x ... * 3': [x, ...]
['1', '+', '2']
['1', '+', '2']
['3']
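A sketch of syntax patterns used with match. The input syntax objects are assumptions chosen to fit the patterns; the patterns with ... show that a repetition can appear before the end of a sequence.

```rhombus
#lang rhombus

// Escapes bind pattern variables to the matching terms
match '1 + 2'
| '$a + $b': [a, b]          // ['1', '2']

// A parenthesized subpattern followed by ... matches any number of instances
match '(1/1) (2/1) (3/1)'
| '($x/1) ...': [x, ...]     // ['1', '2', '3']

// ... can be followed by more literal terms in the same group
match '1 + 2 * 3'
| '$x ... * 3': [x, ...]     // ['1', '+', '2']
```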
annotation Term
annotation Group
annotation Block
annotation TermSequence
annotation Identifier
annotation Operator
annotation Name
annotation IdentifierName
Term matches only a single-term syntax object.
Group matches only a single-group syntax object.
Block matches only a block (which is a single-term syntax object).
TermSequence matches only a single-group syntax object or a multi-group sequence with zero groups.
Identifier matches only an identifier (which is a single-term syntax object).
Operator matches only an operator (which is a single-term syntax object).
Name matches a syntax object that is an identifier, operator, or dotted multi-term group that fits the shape of an op_or_id_name.
IdentifierName matches a syntax object that is an identifier or dotted multi-term group that fits the shape of an id_name.
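As a rough illustration of these annotations, the sketch below checks a few syntax objects with is_a. The particular syntax objects are chosen here only as examples.

```rhombus
#lang rhombus

'x' is_a Identifier      // #true
'1' is_a Identifier      // #false: a number is not an identifier
'+' is_a Operator        // #true
'x + y' is_a Term        // #false: the group has three terms
'x + y' is_a Group       // #true
```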
Only allowed within a '…' expression form, escapes so that the value of expr is used in place of the $ form.
The expr must be either a single term or a sequence of .-separated identifiers. To escape only an identifier (or .-separated identifier sequence) with an unescaped . afterward, use parentheses around the identifier (or sequence).
binding operator
Only allowed within a '…' binding pattern, escapes to an unquoted binding pattern. Typically, the unquoted pattern has an id that is not bound as an unquote binding operator; the id is then bound to the corresponding portion of the syntax object that matches the '…' form.
['1', '2', '3']
A _ as a syntax pattern binding matches any input, like an identifier does, but without binding an identifier.
'2'
A parenthesized escape is the same as the escape itself. Parentheses are needed to use the :: operator, since a $ escape must be followed by a single term, and a use of the :: operator consists of three terms.
'2'
'2'
Empty parentheses as an escape, $(), serve as a group pattern that is only useful as a group tail, where it matches an empty tail. This escape is primarily intended for use with macro-definition forms like macro.
An escape that contains a '…'-quoted term matches the term as a nested syntax-object pattern. In a term context, a multi-term escape is spliced into the enclosing group. One use of a quoted escape is to match a literal $ or ... so that it is not treated as an escape or binding repetition in an enclosing pattern.
'3'
'3'
['1', '2', '3']
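The escape forms described above can be sketched as follows. The inputs and the name stx are assumptions for illustration; Syntax.literal is used to build an input containing a literal $, since a plain '…' expression would treat $ as an escape.

```rhombus
#lang rhombus

// _ matches any term without binding it
match '1 + 2'
| '$_ + $y': y                 // '2'

// Parentheses are needed around a use of ::
match '1 + 2'
| '$_ + $(y :: Term)': y       // '2'

// A quoted escape matches a literal $ inside a larger pattern
def stx = Syntax.literal '1 $ 2'
match stx
| '$a $('$') $b': [a, b]       // ['1', '2']
```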
The :: operator is used to associate a syntax class with an identifier. See :: for more information.
The &&, ||, and ! operators combine matches. See &&, ||, and ! for more information.
The pattern form is a shorthand for using :: with an inline syntax_class form. See pattern for more information.
Other syntax pattern binding forms can be defined with unquote_bind.macro.
For use within a $ escape within a syntax pattern. See $.
unquote binding
#%quotes 'term ...; ...'
For use within a $ escape for a nested binding pattern. See $.
A && binds all variables from its arguments, while || and ! bind none of them.
Independent matching for && means that in a term context, combining a variable binding with a splicing multi-term binding will not enable a multi-term splicing match for the variable; instead, the pattern will fail to match a multi-term splice.
A ! can only be used in a term context, negating a term binding.
> a
'(1 2 3)'
> b
'1 2 3'
> def '$(b && '$_ $_ $_') done' = '1 2 3 done' // b in term context
def: value does not satisfy annotation
  value: '1 2 3 done'
  annotation: '$(b && '$_ $_ $_') done'
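A sketch of the combining operators in a term context, under the assumption that the definitions below are representative uses. With &&, the variable a and the nested pattern must both match the same single term; with ||, no variables are bound.

```rhombus
#lang rhombus

// a is bound to the parenthesized term, which must also contain three terms
def '$(a && '($_ $_ $_)') done' = '(1 2 3) done'
a   // '(1 2 3)'

// Each term must be either 1 or 2; || binds no variables
def '$('1' || '2') ...' = '1 2 2 1 1'
```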
Unquote binding operator for use with $ that binds id for a match to syntax_class.
The syntax_class_ref can be a predefined class such as Term, Identifier, or Group, among others; it can be a class defined with syntax_class; or it can be a parenthesized inline syntax_class form that omits the class name. A class defined with syntax_class may expect arguments, which must be supplied after the syntax class name.
The id before :: refers to the matched input, and it is a repetition if the syntax class has classification ~sequence. The identifier can be combined with . to access fields (if any) of the syntax class. If id is _, then it is not bound.
A block supplied after syntax_class_ref exposes fields of the match as directly bound pattern identifiers. For each field_id as pattern_id that is supplied, pattern_id is bound directly to the named field’s value. Supplying just a field_id binds using the same identifier. Supplying open is a shorthand for listing every field to bind using its own name, and it cannot appear multiple times or be combined with expose clauses for individual fields.
syntax_class Wrapped:
  kind: ~term
| '($content)'
['(2)', '2']
'2'
'2'
'2'
> match '(hello there)'
  | '$(whole :: (syntax_class:
                   kind: ~term
                 | '($content)'))':
      [whole, whole.content]
['(hello there)', 'hello there']
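The ways of reaching a syntax class field can be sketched as follows, assuming a Wrapped syntax class like the one described above and an input chosen for illustration. Field access uses the dot notation, a block with a field name, a renaming with as, or open.

```rhombus
#lang rhombus

syntax_class Wrapped:
  kind: ~term
| '($content)'

// Field access through the bound identifier
match '1 + (2) + 3'
| '1 + $(y :: Wrapped) + 3': [y, y.content]        // ['(2)', '2']

// Exposing a field directly under its own name
match '1 + (2) + 3'
| '1 + $(_ :: Wrapped: content) + 3': content      // '2'

// Exposing a field under a different name (c is a hypothetical name)
match '1 + (2) + 3'
| '1 + $(_ :: Wrapped: content as c) + 3': c       // '2'

// open exposes every field under its own name
match '1 + (2) + 3'
| '1 + $(_ :: Wrapped: open) + 3': content         // '2'
```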
binding operator
unquote binding
When directly used in a binding context, pattern acts as a shorthand for a syntax pattern with the pattern form as the only term.
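fun simplify(e):
  match e
  | '($e)': simplify(e)
  | '0 + $e': simplify(e)
  | '$e + 0': simplify(e)
  | (pattern
     | '$a + $b - $c':
         // pattern line reconstructed (assumed); it must bind a, b, and c
         match_when same(simplify(b), simplify(c))):
      simplify(a)
  | (pattern
     | '$a - $b + $c':
         // pattern line reconstructed (assumed); it must bind a, b, and c
         match_when same(simplify(b), simplify(c))):
      simplify(a)
  | ~else: e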
expression
Syntax.literal 'term ...; ...'
expression
Syntax.literal (term ..., ...)
There’s no difference in result between using '…' or (…) after Syntax.literal.
Metadata, such as raw source text, is preserved for the term sequence, but not any metadata that might be on the group as a whole when the terms form a single group.
> Syntax.literal 'x'
'x'
> Syntax.literal (x)
'x'
> Syntax.literal '1 ... 2'
'1 ... 2'
> Syntax.literal '$ $ $'
'$ $ $'
expression
Syntax.literal_group 'term ...'
expression
Syntax.literal_group (term ...)
Metadata, such as raw source text, is preserved for the term sequence, but not any metadata that might be on the group as a whole when the terms form a single group.
> Syntax.make(1.0)
'1.0'
> Syntax.make([#'parens, '1.0', '2', '"c"'])
'(1.0, 2, "c")'
> Syntax.make([#'alts, ': result1', ': result2'])
'|« result1 » |« result2 »'
> Syntax.make(['1.0', '2', '"c"'])
Syntax.make: invalid as a shrubbery term representation
  value: ['1.0', '2', '"c"']
> Syntax.make_group([1.0, 2, "c"])
'1.0 2 "c"'
> Syntax.make_group(['if', 'test', [#'alts, ': result1', ': result2']])
'if test |« result1 » |« result2 »'
> Syntax.make_group(['1 2'])
Syntax.make_group: invalid as a shrubbery term representation
  value: '1 2'
> Syntax.make_sequence(['1 2 3', 'a b'])
'1 2 3; a b'
> Syntax.make_op(#'#{+})
'+'
> Syntax.make_id("hello" +& 7, 'here')
'hello7'
Unless keep_name is true, the name argument can be any value, and the name of the generated identifier may be derived from name for debugging purposes (especially if it is a string, symbol, or identifier). If keep_name is true, the name argument must be an identifier, symbol, or (readable) string, and the result identifier has exactly the given name.
> Syntax.make_temp_id("hello")
'hello1'
> Syntax.make_temp_id("hello", ~keep_name: #true)
'hello'
> Syntax.unwrap('1.0')
1.0
> Syntax.unwrap('(a, "b", ~c)')
['parens', 'a', '"b"', '~c']
> Syntax.unwrap(': b; c')
['block', 'b', 'c']
> Syntax.unwrap('| a | b')
['alts', ':« a »', ':« b »']
> Syntax.unwrap('1 2 3')
Syntax.unwrap: multi-term syntax not allowed in term context
  syntax: '1 2 3'
> Syntax.unwrap_op('+')
#'#{+}
Following the usual coercion conventions, a term syntax object for stx is acceptable as a group syntax object.
> Syntax.unwrap_group('1.0')
['1.0']
> Syntax.unwrap_group('1 2 3')
['1', '2', '3']
> Syntax.unwrap_group('a: b; c')
['a', ':« b; c »']
> Syntax.unwrap_group('1; 2; 3')
Syntax.unwrap_group: multi-group syntax not allowed in group context
  syntax: '1; 2; 3'
Following the usual coercion conventions, a term or group syntax object for stx is acceptable as a multi-group syntax object.
> Syntax.unwrap_sequence('1.0')
['1.0']
> Syntax.unwrap_sequence('1 2 3')
['1 2 3']
> Syntax.unwrap_sequence('1; 2; 3')
['1', '2', '3']
> Syntax.unwrap_all('(1 + 2)')
[#'parens, [#'group, 1, [#'op, #'#{+}], 2]]
When to is a syntax object, the specific source of metadata from to depends on its shape. If it is a single-term parentheses, brackets, braces, quotes, block, or alternatives form, then metadata is taken from the leading tag in the representation of the form. In the case of a single-term operator, metadata is taken from the operator token, not the op tag. In the case of a group syntax object, metadata is taken from the group tag.
In the same way, metadata is applied to stx based on its shape. Transferring metadata thus makes the most sense when stx and to have the same shape.