Fast Sequence

8.12

Fast Sequence🔗ℹ

The library provides efficient sequence operations that have good performance when used inside a for clause.

1 Basic Concepts🔗ℹ

The operations provided by this library have high performance when applied to a fast sequence. A fast sequence uses syntactic information about the shape of a sequence to apply compiler optimizations, such as inlining and using specialized functions for extracting data. Examples of fast sequences are applications of a sequence macro, such as in-range, in-list, etc. (see Sequences).

For a way of defining new fast sequences, see define-sequence-syntax.

A slow sequence—all other sequences, such as applications of list, vector, etc.—provides to the compiler a generic interface for data accession and has lesser performance at run-time.

An element of a sequence can be a single value or it can consist of multiple values.

2 Motivating Example🔗ℹ

As a running example, consider the following task: calculate a sum and a product of squares of even positive numbers up to 10000.

There are several ways of doing it. The simplest one uses range and a composition of filter and map to define a sequence of squares of even positive numbers up to 10000 and then reuses it:

> (define (squares-of-evens-up-to-10000)
    (map sqr (filter even? (range 1 10000))))
> (for/sum ([x (squares-of-evens-up-to-10000)])
    x)
166616670000
> (void
   ; a very long number
   (for/product ([x (squares-of-evens-up-to-10000)])
     x))

A downside of this solution is that it eagerly allocates a big intermediate list. We can fix it by using the lazy in-range sequence instead, with appropriate changes to the used operations. We only need to change the sequence definition; that does not affect the rest of the solution. The improved version is

> (define (squares-of-evens-up-to-10000)
    (sequence-map
     sqr
     (sequence-filter
      even?
      (in-range 1 10000))))

But what about its performance? When we use specialized sequence constructors like in-range, the Racket compiler can run them very fast because it uses inlining and specialization. And functions sequence-filter and sequence-map return some generic sort of a sequence as a result. Therefore, we lose the performance.

Another way of doing it is to apply even? and sqr in a #:when clause or directly in the body of a loop:

> (for/sum ([x (in-range 1 10000)]
             #:when (even? x))
    (sqr x))
166616670000
> (void
   ; a very long number
   (for/product ([x (in-range 1 10000)])
     (if (even? x)
         (sqr x)
         1)))

This will run fast, but the code was nicer in the previous solution. Here we lose abstraction of working with sequences, and we can no longer define a reusable sequence.

Using fast sequence operations provided by this library, we can take the best of these two worlds:

> (define-sequence-rule (squares-of-evens-up-to-10000)
    (fast-sequence-map
     sqr
     (fast-sequence-filter
      even?
      (in-range 1 10000))))

In this code, fast-sequence-filter and fast-sequence-map may be considered as versions of sequence-filter and sequence-map that, being applied to a specialized sequence, still run fast. Note that as like as specialized sequences, their applications are efficient only when applied in a clause of a for loop or its variants, so it is necessary to define a fast sequence form instead of merely using define. The define-sequence-rule form is similar to define-syntax-rule, but it defines a sequence macro, which can be efficiently used in a for loop clause.

Alternatively, you can use the do/sequence form whose syntax is similar to the syntax of a for loop:

> (define-sequence-rule (squares-of-evens-up-to-10000)
    (do/sequence ([x (in-range 1 10000)]
                  #:when (even? x))
      (sqr x)))

The do/sequence form is a more powerful operation, which, in particular, allows nested iterations.

3 Fast Sequence Operations🔗ℹ

syntax
(fast-sequence-map f seq-expr ...+)

f : procedure?
seq-expr : sequence?

Returns a fast sequence form whose elements are results of applying f to the elements of seq-exprs. The number of elements is determined by the length of the shortest seq-expr. The number of seq-exprs must match the number of arguments that f accepts, and all the elements of each sequence must be single values.

The fast-sequence-map form is similar to sequence-map provided by racket/sequence but it accepts multiple sequence expressions, whereas sequence-map accepts a single sequence. When used directly in a for loop clause, fast-sequence-map has better performance than sequence-map. The best performance is provided when seq-expr is a fast sequence.

Examples:

> (for/list ([(x) (fast-sequence-map
                   add1
                   (in-list '(1 2 3 4 5)))])
    x)
'(2 3 4 5 6)
> (for/list ([(x) (fast-sequence-map
                   symbol->string
                   (in-vector #(a b c)))])
    x)
'("a" "b" "c")

syntax
(fast-sequence-filter pred seq-expr)

pred : (-> any/c ... boolean?)
seq-expr : sequence?

Returns a fast sequence form whose elements are the elements of seq-expr that satisfy the predicate function pred.

When used directly in a for loop clause, fast-sequence-filter has better performance than sequence-filter provided by racket/sequence. The best performance is provided when seq-expr is a fast sequence.

Examples:

> (for/list ([c (fast-sequence-filter
                 char-alphabetic?
                 (in-string "a, b, c"))])
    c)
'(#\a #\b #\c)
> (for/list ([(k v) (fast-sequence-filter
                     (lambda (k v) (odd? v))
                     (in-hash (hash 'a 1 'b 2 'c 3 'd 4 'e 5)))])
    (list k v))
'((c 3) (a 1) (e 5))

syntax
(do/sequence (binding-or-when-chunk ...) body ...+)

binding-or-when-chunk = binding-clause ...+
| when-clause ...+

binding-clause = [(id ...) seq-expr]
| [id seq-expr]

when-clause = #:when guard-expr

seq-expr : sequence?

When used directly in a for loop, returns a fast sequence form whose elements are results of evaluating the last body on each iteration of the loop. Using outside a for loop is forbidden; in that case, it raises an error.

The do/sequence form provides two kinds of clauses: a binding-clause and a when-clause, which specify the number of elements of the sequence. These clauses have the same meaning as the corresponding clauses of the for loop. Other kinds of for clauses (e.g., #:break) are not supported.

Clauses are grouped into chunks of clauses of the same kind. Within the same chunk, expressions, seq-exprs or guard-exprs, are evaluated left-to-right and elements of the sequences are extracted in parallel, one per iteration of the loop, and stored in locations generated for the corresponding ids. The ids are bound in all expressions in the succeeding chunks and in the bodys.

If there are more than one chunk of clauses then the for loop performs nested iterations for sequences in each succeeding chunk. (This behavior is the same as the behavior of the for loop’s clauses.)

With no clauses, do/sequence returns a sequence with a single element.

The do/sequence form is as expressive as both fast-sequence-map and fast-sequence-filter, but it additionally allows iterating over nested sequences.

When do/sequence returns a sequence whose elements are all single values, it is equivalent to using for/list in place of do/sequence, except that do/sequence does not construct a list.

The do/sequence form provides the best performance when it is used directly in a for loop clause and all seq-exprs are fast sequences.

Examples:

> (for/list ([(x) (do/sequence () 13)])
    x)
'(13)
> (for/list ([x (do/sequence ([x (in-list '(1 2 3))]
                              [y (in-list '(a b c))]
                              #:when (odd? x))
                  (list x y))])
    x)
'((1 a) (3 c))
> (for ([(x y) (do/sequence ([x (in-list '(1 2 3))]
                             #:when (odd? x)
                             [y (in-list '(a b c))])
                 (values x y))])
    (printf "~a " (list x y)))
(1 a) (1 b) (1 c) (3 a) (3 b) (3 c)
> (for ([x (do/sequence ([x (in-list '((1 2) (3 7) () (5 6)))]
                         #:when #t
                         [y (in-list x)]
                         #:when (even? y))
             y)]
        [y (for/list ([x (in-list '((1 2) (3 7) () (5 6)))]
                      #:when #t
                      [y (in-list x)]
                      #:when (even? y))
             y)])
    (printf "~a ~a\n" x y))
2 2
6 6

4 Defining New Sequence Forms🔗ℹ

syntax
(define-sequence-rule (id . pattern) template)

template : sequence?

The define-sequence-rule form binds id as a sequence macro that matches a single pattern. Using (id . pattern) is only allowed directly in a clause of a for loop (or its variants). A template must be a sequence expression. It provides better performance when template is a fast sequence expression.

Examples:

> (struct 2-level-list (lst-of-lsts))
> (define-sequence-rule (in-2-level-list lst-of-lsts)
    (do/sequence ([(x) (in-list lst-of-lsts)]
                  #:when #t
                  [(y) (in-list x)])
      y))

1	Basic Concepts
2	Motivating Example
3	Fast Sequence Operations
4	Defining New Sequence Forms