2 Motivation for Punctaffy

    2.1 Hyperbracketed Notations Predating Punctaffy
    2.2 Potential Application Areas
      2.2.1 Potential Application: Hygiene
      2.2.2 Potential Application: Incremental Compilation
      2.2.3 Potential Application: Abstraction-Respecting Call Stacks
      2.2.4 Potential Application: Interactions Between unsyntax and Ellipses
      2.2.5 Potential Application: Opetopes and Higher Categories
      2.2.6 Potential Application: Type Theories with Transpension
      2.2.7 Potential Application: User-Defined Escapes from Nestable Quotation

2.1 Hyperbracketed Notations Predating Punctaffy

None of these existing examples involves a hypersnippet of degree 3 or greater. So, aside from the trivial degree-0 and degree-1 examples, all the examples above have degree 2.

2.2 Potential Application Areas

2.2.1 Potential Application: Hygiene

Hygienic macroexpansion usually generates code where certain variables are only in scope across some degree-2 hypersnippet of the code.

For instance, Racket first colors all the input subforms of a macro call using a new scope tag, and then it inverts that color in the result so that the color winds up only occurring on the code the macro generates itself, not the code it merely passes through.
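As a reminder of what this machinery accomplishes, consider the classic swap! macro (ordinary Racket, not Punctaffy code). Hygiene keeps the macro’s tmp distinct from any tmp at the call site, precisely because the two occurrences end up with different scope tags:

(define-syntax-rule (swap! a b)
  (let ([tmp a])
    (set! a b)
    (set! b tmp)))

; The caller's tmp is a different binding from the macro's tmp:
(define tmp 1)
(define other 2)
(swap! tmp other)
(list tmp other)  ; => '(2 1)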

Racket’s strategy works, but it relies on the generation of unique tags and the creation of invisible annotations throughout the generated code.

If we were to approach hygiene using explicit hypersnippets instead, it might lead to more straightforward or less error-prone implementations of macro hygiene. If enough people find it convenient to do structural recursion over nested hypersnippets, then they may find this skill lets them easily keep a lid on the local details of each hypersnippet, just as traditional forms of structural recursion make it easy to keep the details of one branch of a tree data structure from interfering with unrelated branches.

2.2.2 Potential Application: Incremental Compilation

In a language like Racket, where programmers can write arbitrarily complex custom syntaxes, compilation can be expensive. This can drag down the experience of editing code in the DrRacket IDE, where features like jump-to-definition can depend on performing background expansion to process the user’s custom syntaxes.

If macros weren’t tree-to-tree transformations, but instead consumed only some small part of the tree and generated a degree-2 hypersnippet, then a modification of one local part of a file could lead to a pinpoint re-expansion of the specific macro call that processed that part of the file, rather than a costly re-expansion of every macro call in the whole file.

Incidentally, this would bring s-expression macro calls into a closer analogy with reader macro calls, which themselves consume some bounded part of the source text stream and generate an s-expression.
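As a sketch of that boundedness (the $ notation and the name dollar-readtable are made up for this example; this isn’t Punctaffy’s API), a reader macro consumes just enough of the text stream to produce one s-expression:

(define dollar-readtable
  (make-readtable (current-readtable)
    #\$ 'terminating-macro
    (lambda (ch in . _)
      ; Consume exactly one datum after the $ and wrap it.
      (list 'wrapped (read/recursive in)))))

(parameterize ([current-readtable dollar-readtable])
  (read (open-input-string "$foo")))
; => '(wrapped foo)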

2.2.3 Potential Application: Abstraction-Respecting Call Stacks

When using higher-order functional programming techniques, the call stack can sometimes be cluttered with the implementation details of higher-order functions.

For instance, if filter were implemented in terms of append-map, then call stacks observed during the execution of the filter predicate might look like:

  ...
  <filter-callback>
  <append-map-callback>
  append-map
  filter
  ...
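The indirect definition in question might look like this sketch (filter/indirect is a hypothetical name; Racket’s filter isn’t actually defined this way):

(define (filter/indirect pred lst)
  ; Keep an element by mapping it to a singleton list;
  ; drop it by mapping it to an empty list.
  (append-map (lambda (x) (if (pred x) (list x) '())) lst))

(filter/indirect even? '(1 2 3 4))  ; => '(2 4)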

But this may be too much information if someone’s not trying to debug the filter function itself. If filter were implemented in a more direct way, the call stack would only show:

  ...
  <filter-callback>
  filter
  ...

If call stacks were hyperstacks, then the filter and append-map calls could be degree-2 hyperstack frames, essentially giving the call stack a nested structure:

  ...
  {
    {
    } append-map
  } filter
  ...

With tools built on this infrastructure, the user could view the stack in as much or as little detail as they’d like, using UI toggles or a verbosity-controlling stylesheet to collapse and expand sections of interest.

2.2.4 Potential Application: Interactions Between unsyntax and Ellipses

In Introduction to Punctaffy, we talk about the relationship between hypersnippets and Racket’s syntax DSL for syntax templates (or, more specifically, its datum DSL for s-expression templates).

The quasisyntax form (abbreviated #`) introduces the quasiquotation-like ability to use unsyntax and unsyntax-splicing (abbreviated #, and #,@) to interpolate arbitrary expression results into a template.
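For example, in a macro transformer (assuming racket/base is required for-syntax; two-stx is a made-up name), #` and #, let us compute part of a template at expansion time:

(define-syntax (two-stx stx)
  ; (+ 1 1) runs at expansion time, so this template is
  ; the same syntax object as #'(quote 2).
  #`(quote #,(+ 1 1)))

(two-stx)  ; => 2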

There’s a noticeable missed opportunity in the current design of the DSL, and it has to do with how these interpolated expressions interact with ellipses (...). Usually, ellipses iterate the template they apply to, and template variables have an associated ellipsis depth that specifies how many dimensions of nested iteration they should undergo (usually because they were created by matching a syntax-parse or syntax-case pattern with the same number of nested ellipses).
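For instance (assuming syntax/parse is required), the pattern below gives xs an ellipsis depth of 1, so the template must iterate it with exactly one ellipsis:

(syntax-parse #'(1 2 3)
  ; xs matches 1, 2, and 3 at ellipsis depth 1.
  [(xs ...) #'(list xs ...)])
; => syntax for (list 1 2 3)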

However, expressions interpolated with unsyntax don’t have any ellipsis depth information. Even if they contain another syntax or quasisyntax call inside, the variables in that second call can’t interact with the ellipses in the first call. For instance, the following doesn’t work, because pat and val only occur in the template #'(pat val), not in the template that has the ellipsis:

(define-syntax (match-let/derived stx)
  (syntax-parse stx
    [(_ orig-stx ([pat val] ...) body ...)
     (syntax-protect
       #`(let ()
           #,(datum->syntax
               #f
               `(,#'match-define ,@#'(pat val))
               #'orig-stx)
           ...
           (let ()
             body
             ...)))]))

(TODO: See if we can make this a better example. The idea here is that the macro defines something like a cross between match/derived and match-let, and it manipulates the call to match-define to insert the source location of the given term to help with accurate error-reporting. However, we haven’t tested that this manipulation would actually improve the error-reporting this way, and using the actual match/derived would likely be a better idea than trying to make match-define work just right.)

Incidentally, the syntax DSL already has an experimental feature that caters to this situation: template metafunctions. A template metafunction can run arbitrary code during the iteration of a template:

(define-syntax (match-let/derived stx)
  (syntax-parse stx
    [(_ orig-stx ([pat val] ...) body ...)
 
     (define-template-metafunction (reattributed-match-define stx)
       (syntax-parse stx
         [(_ pat val)
          (datum->syntax
            #f
            `(,#'match-define ,@#'(pat val))
            #'orig-stx)]))
 
     (syntax-protect
       #'(let ()
           (reattributed-match-define pat val)
           ...
           (let ()
             body
             ...)))]))

However, using define-template-metafunction substantially rearranges the code. That can be good in cases like this one, where the concept can be associated with a simple name and interface. On the other hand, unsyntax comes in handy in situations where the code’s navigability benefits from having some structural resemblance to the results it produces, or in situations where multiple DSLs have synergy together but haven’t yet been fused into a single monolithic DSL.

If for some reason we have a strong preference to arrange this code in the unsyntax style, then what we really need here is for #'(pat val) not to be a new template but a resumption of the original syntax. We need some kind of un-unsyntax, perhaps abbreviated #,!. Then we could replace #'(pat val) with #,!(pat val) and be on our way.

The concept of un-unsyntax fits neatly into the hypersnippet concept we explore in Punctaffy. We can consider quasisyntax to be opening a degree-3 hypersnippet, unsyntax to be opening a degree-2 hole in that hypersnippet, and un-unsyntax to be opening a degree-1 hole in that degree-2 hole.

Punctaffy currently defines a taffy-quote-syntax operation, but it corresponds to quote-syntax rather than the syntax template DSL. Suppose Punctaffy were to define a taffy-syntax operation in that combined spirit, which used Punctaffy’s baseline hyperbrackets and had support for syntax DSL features like ellipses. Using that operation, the above code could look like this:

(define-syntax (match-let/derived stx)
  (syntax-parse stx
    [(_ orig-stx ([pat val] ...) body ...)
     (syntax-protect
       (taffy-syntax
         (^<d 3
           (let ()
             (^>d 2
               (list
                 (datum->syntax
                   #f
                   `(,#'match-define ,@(^> (pat val)))
                   #'orig-stx)))
             ...
             (let ()
               body
               ...)))))]))

This seems to be a rare example of where a degree-3 hypersnippet specifically would come in handy. Most of the things we say about hypersnippets of degree 3 aren’t so specific; they would make just as much sense at degree 4, 5, etc.

This example seems like an instance of a pattern, though: The basic principle of this example is that we want two different invocations of an embedded DSL to be part of the same "whole program" to allow some nonlocal interaction between them. The nonlocal interactions here aren’t too exotic; they’re basically lexically scoped variables, with ellipses acting as the binding sites.

Other embedded DSLs with lexically scoped interactions, such as type inference and type-directed elaboration, may similarly benefit from degree-3 lexical structure.

2.2.5 Potential Application: Opetopes and Higher Categories

An opetope is a kind of geometric shape that arises in certain formulations of higher category theory.

A category is like the collection of paths in a directed graph: Two paths (called morphisms) can match up end-to-end, and if they do, together they make a single path. Path-joining like this is associative; if we join some number of paths together this way, it doesn’t matter which ones we join first.
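In symbols, for paths f : A → B, g : B → C, and h : C → D, joining is an operation ∘ that’s defined whenever endpoints line up, and associativity says:

  g ∘ f : A → C
  h ∘ (g ∘ f) = (h ∘ g) ∘ f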

Higher category theory extends this to higher-dimensional shapes. If two paths share both their endpoints, we can think of a soap bubble where current flows across from one of them to the other, a path between paths (called a 2-cell). And given two of these, we can imagine a geometric solid where current flows from one of these faces to the other (called a 3-cell).

We can study more subtle category-like systems this way: Instead of simply knowing whether two paths are equal or not, we can point to a 2-cell and say they’re equivalent according to this specific equivalence that translates between them. This is appealing to many category theorists, because a lot of interesting category-theoretic results don’t depend at all on whether paths have particular endpoints (called objects), only that when joining paths, there’s some equivalence-like way to join up one endpoint to another. Once we’re looking at 2-cells, a lot of interesting results have nothing to do with what specific 1-cells/paths/morphisms they begin and end at.

However, this kind of generalization also makes it quite a bit more difficult to formalize the meaning of a higher category. A number of different formalizations exist.

What we’ve described just now is a globular approach, where each cell of dimension (N + 1) begins at a cell of dimension N and ends at another cell of dimension N with the same endpoints.

Perhaps we should instead take a cubical approach, where the source and target don’t necessarily have "the same" endpoints, just edges that are related by other cells. Then a 2-cell is shaped like a square, a 3-cell is a cube, and higher cells are hypercubes.

Or perhaps we could take a simplicial approach, where the higher cells that mediate compositions of two paths are triangles, the ones that mediate compositions of three paths are tetrahedra, and the ones that mediate compositions of more paths are higher-dimensional simplex polytopes.

But one of the approaches is what we’re here for, because it resembles Punctaffy’s hypersnippets: The opetopic approach.

Opetopes are another set of higher-dimensional shapes like hypercubes or simplexes. An opetopic cell of dimension (N + 1) has a target cell of dimension N and any number of source cells of dimension N that are in a composable arrangement. For instance, an opetopic 2-cell ends at a path, and it begins with any number of paths aligned end-to-end. This makes opetopic 2-cells just the right shape for saying "this arrangement of paths composes to make something that’s related to this other path in this particular way." Opetopic 3-cells are good for the same thing, but they relate a composable arrangement of 2-cells (i.e. a tree) to a single 2-cell. Since these are many-input, single-output relations, a composition of 2-cells tends to look like a tree, and higher cells are composed along "higher-dimensional trees."

The peculiar shape of opetopes lets us talk about composition at the same time as we talk about cells being related to each other. Instead of postulating ideas of composition and equivalence as separate ideas, we can define them both in terms of the existence of certain opetopic cells. If we’re looking for a cell that acts as the composition of some composable arrangement of cells, all we need is for the arrangement to be the source of some equivalence-like cell; that way the target of that cell is the composition we’re looking for (or equivalent to it, which is just as good). Once we have composition, we can also use cells like these to relate one arrangement to another, by first taking the composition of the target arrangement and making that the target cell of another opetope. In this fashion, we can build up a theory of weak higher categories by merely requiring that enough equivalence-like cells exist (and are actually equivalence-like in their relationships with each other).

Now we can finally relate opetopes to hypersnippets, specifically hypertees. At low dimensions, there’s a striking resemblance: Degree-2 hypertees are shaped like (^< (^> _) (^> _) (^> _) ...) with some number of degree-1 holes in sequence, which looks just like the way an opetopic 2-cell represents a composition of any number of paths. There’s only one degree-0 hypertee, and only one degree-1 hypertee, and the same is true for dimension-0 and dimension-1 opetopes. At degree 3, the degree-2 holes of a hypertee can be arranged like a tree, and the source cells of a dimension-3 opetope are also arranged like a tree.

However, there does seem to be a discrepancy. This is apparent when we look at algebraic data types that represent opetopes and hypersnippets.

The slide deck "Type Theory and the Opetopes" by Eric Finster gives a richly illustrated overview of opetopes as they’re used in opetopic higher category theory and opetopic type theory. One slide mentions a type MTree representing "possible ill-typed A-labeled pasting diagrams":

data MTree (A : Set) : N → Set where
  obj : MTree A 0
  drop : {n : N} → MTree ⊤ n → MTree A (n + 2)
  node : {n : N} → A → MTree (MTree A (n + 1)) n → MTree A (n + 1)

This corresponds to three mutually exclusive kinds of opetopes, matching the three constructors: an opetope is either the trivial dimension-0 opetope (obj), where sources aren’t allowed at all; an opetope with no sources (drop); or an opetope with at least one source (node).

As for hypersnippets, an analogous data type definition for possible ill-typed A-labeled hypertees would look like this:

data Hypertee (A : Set) : N → Set where
  zero : Hypertee A 0
  hole :
    {m, n : N} → {m < n} → A → Hypertee (Hypertee A n) m →
      Hypertee A n

These cases correspond to hypertee-coil-zero and hypertee-coil-hole: a hypertee is either the trivial degree-0 hypertee (zero), or a hypertee of some degree n built around a labeled hole of some lesser degree m (hole).

Going from opetopes to hypertees, we notably lose the existence of "drop" opetopes, but the "at least one" case has become more lenient and admits more hypertees to take the place of what we lost. Specifically, the "at least one" case no longer limits us to incrementing the dimension by precisely 1.

This gives us all the shapes we had before. The degree-2 hypertee (^<) has zero degree-1 holes (^> _), and in this way it is exactly the same shape as the "drop" dimension-2 opetope, which has zero dimension-1 source cells. However, we don’t represent it by a zero-hole "drop" case; we represent it as having at least one hole: its degree-0 hole. The degree-0 hole corresponds not to one of the dimension-2 opetope’s dimension-1 source cells, but to its dimension-1 target cell’s dimension-0 source cell. We can similarly encode any other "drop" opetope as a hypertee by encoding its target cell (which is not itself a "drop" opetope, since it has one source cell) and then adjusting that encoding to have a degree 1 greater.

In fact, we have some hypertees that don’t correspond to any opetopes. For instance, the degree-2 hypertee (^< (^> _) (^> _)) corresponds to an opetope, but setting its degree to 3 gives us a degree-3 hypertee (^<d 3 (^> _) (^> _)) that has no such correspondence. This hypertee isn’t a drop, because a drop’s target must have exactly one source (of the same shape as its target, so that it can represent an identity element that unimposingly slots into compositions along that shape).

On the other hand, if we turn our attention to the role of these shapes as conceptual notations for higher category theory, then in some sense it makes sense for (^<d 3 (^> _) (^> _)) to be a "drop." That’s because, while the degree-2 target (^< (^> _) (^> _)) isn’t syntactically identity-shaped the way (^< (^> _)) is, it’s still semantically identity-shaped: A higher category has an equivalence-like cell of this shape—namely, one of the 2-cells that relates two composable paths to the combined path they become.

The story isn’t as simple as saying that there are more hypertees than opetopes; the two approaches seem to have affinities for different kinds of generalization.

Still, hypertees and opetopes are two systems that seem to have a lot in common. This finally brings us to the potential applications:

It’s quite likely Punctaffy will be easier to explain if we borrow some of the graphical visualization techniques from the existing body of work on opetopes. Conversely, we suspect opetopic type theory in particular would benefit from hyperbrackets as a more intuitive notation for specifying opetopic shapes in textual code.

If hypertees aren’t quite the same as opetopes, then perhaps hypertees constitute yet another geometry for higher category theory, alongside shapes like hypercubes and simplexes. Perhaps this approach to higher category theory would be more appealing than the others in some ways, at least thanks to the ease of representing its higher-dimensional shapes in textual code if nothing else.

If we pursue hypersnippet-shaped higher category theory, then Punctaffy’s progress so far can give us some clues as to how that theory might be formulated. In particular, the snippet system interface likely bears a strong resemblance to the interface of an internalized hypersnippet-based higher category, similar to the way the Haskell Category type class can be seen as the interface of an internalized 1-category.

2.2.6 Potential Application: Type Theories with Transpension

The paper "Transpension: The Right Adjoint to the Pi-type" by Andreas Nuyts and Dominique Devriese discusses several type theories that have operations that we might hope to connect with what Punctaffy is doing. Transpension appears to be making use of degree-2 hypersnippets in its syntax.

Essentially (and if we understand correctly), a transpension operation declares a variable that represents some unknown coordinate along a new dimension. At some point in the scope of that dimension variable, another operation takes ownership of it, taking the original dimension variable and all the variables that depended on it since then out of scope, but replacing the latter with reusable functions that can be applied repeatedly to different coordinate values of the user’s choice.

From a Punctaffy perspective, the dimension variable’s original scope is a degree-2 hypersnippet, and the operation that takes it out of scope (and converts other variables to functions) is located at one of the degree-1 closing hyperbrackets of that hypersnippet.

Curiously, the degree-2 hypersnippet also gets closed by degree-1 closing hyperbrackets at the type level; we might say these type theories assign types to terms that have unmatched closing hyperbrackets. They also have lambdas that abstract over terms that have unmatched closing hyperbrackets, so the journey of a closing hyperbracket through the codebase to find its match can potentially be rather circuitous.

At any rate, these dimension variables have affine (use-at-most-once) types, which can sometimes seem odd in the context of a type theory where the rest of the variables are Cartesian (use-any-number-of-times). By relating transpension operators to hyperbrackets, we may give the affine typing situation some more clarity: A "closing bracket" can’t match up with an "opening bracket" that’s already been closed.

And conversely, by relating the two, we may find techniques for understanding Punctaffy’s hyperbrackets in terms of affine variables in a type theory rather than the other way around. This kind of insight may come in handy for studying the categorical semantics of a calculus that has hyperbracketed operations, or even for implementing hyperbracket libraries like Punctaffy for typed languages.

2.2.7 Potential Application: User-Defined Escapes from Nestable Quotation

Pairs of quasiquote and unquote can be seamlessly nested within each other as long as they match up. The situation is less convenient for string literal syntaxes, which typically require nested occurrences to be escaped:

(displayln "(displayln \"(displayln \\\"Hello, world!\\\")\")")

A rethought design for string syntaxes, using more distinctive string delimiters like "{ and }", can give strings the ability to recognize nested occurrences of "{ the way quasiquote does:

(displayln "{(displayln "{(displayln "{Hello, world!}")}")}")

However, string syntaxes also complicate matters in a way that doesn’t come up as often with quasiquote: String syntaxes have a wide variety of escape sequences.

In fact, if we take a look at s-expression quotation DSLs other than quasiquote, this does come up: For instance, syntax templates have the ..., ~@, and ~? directives, and syntax-parse patterns have a wide variety of directives like ~seq and ~optional. Users can even define their own operations like ~seq and ~optional by using prop:pattern-expander. (And from a certain point of view, most languages are trivial "quotation DSLs" where nothing at all is quoted and every single term is an escape sequence.)

When an escape sequence appears inside multiple layers of quotation, there’s an ambiguity issue: Which layer of quotation does it escape from? Should a string escape sequence be processed right away as part of the first string literal, or should it become part of the generated code to be processed as part of the next one (or the one after that)?

While this suggests some ambiguity in the user’s intent, there’s no corresponding wiggle room in the design. When we consider the point of recognizing nested occurrences in the first place, one design stands out: If we associate escape sequences specifically with the innermost quotation, we can successfully write a string literal the same way whether we’re inside or outside another string literal.

That ambiguity in the user’s intent does exist, though. Sometimes the user may intend to escape from an outer quotation. In that case, they can use a more elaborate notation to specify what layer they have in mind, such as writing \[LABEL]x20 instead of \x20:

(displayln "[OUTER]{(displayln "{Hello,\[OUTER]x20world!}")}")

One implication of this design is that when we process an escape sequence like \x20, we might treat it in two different ways depending on context: If it’s associated with the outermost stage of quotation, we fully process it, turning \x20 into a space character. If it’s associated with some inner stage of quotation, we suppress it, treating \x20 as the four characters \ x 2 0.

Note that even when we suppress an escape sequence, we still need to recognize it to know where it ends. That means in the implementation of even our simplest escape sequences, we’ll need a couple of behaviors: a first step that determines the extent of the escape sequence’s bounded input (performed whether or not the escape sequence is suppressed), and a second step that processes that bounded input to determine what it represents (performed only when the escape sequence belongs to the outermost stage of quotation).

For string escape sequences, expressing the boundary is simple, and the bounded input is some substring of the string literal’s code. For instance, in "Hello,\x20world...", the \x escape sequence might process the unbounded input 20world... and determine that its bounded input is 20. If the escape sequence is being suppressed, then we stop there and treat that bounded input as literal text. Otherwise, we invoke the \x escape sequence’s second step to determine what it represents (a space character). Either way, we then turn our attention to the rest of the input (world...) and process the escape sequences we find there.
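Here’s a sketch of those two steps for the \x escape sequence (hypothetical names, in ordinary Racket with racket/match required; this isn’t Punctaffy’s API):

; Step 1: find where the bounded input ends.
; "20world..." -> (values "20" "world...")
(define (x-escape-split unbounded)
  (match (regexp-match #px"^([0-9a-fA-F]+)(.*)$" unbounded)
    [(list _ hex rest) (values hex rest)]))

; Step 2: interpret the bounded input. We run this only when the escape
; sequence belongs to the outermost stage of quotation; otherwise we
; suppress it and keep the bounded input as literal text.
(define (x-escape-interpret hex)
  ; "20" -> " " (hex 20 is the space character)
  (string (integer->char (string->number hex 16))))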

Punctaffy’s infrastructure starts to come in handy when we apply this design to s-expression escape sequences. There, the bounded input is a degree-2 hypersnippet of code.

This quotation DSL design—specifically this particular way it leads to degree-2 hypersnippets—is actually the original motivating force behind Punctaffy.

Even with Punctaffy’s hypersnippets in our toolkit, the infrastructure we need to complete a DSL like this is still not straightforward to build.

Nestable quasiquotation with depth-labeled escape sequences seems to be a complex problem. While Punctaffy’s notion of hyperbracketed code doesn’t straightforwardly apply in all the ways we might hope for it to, the slightly more complex variations we’ve described seem viable, and Punctaffy’s hyperbrackets are a milestone in those directions. Hypersnippet-shaped data does seem to come in handy at least for representing the bounded input of an escape sequence.