avro:   Apache Avro
1 Schema Evolution
2 Missing Features
2.1 Aliases
2.2 JSON Encoding
2.3 RPC
3 Reference
codec?
make-codec
codec-read
codec-write
3.1 Object Container Format
read-container
write-container

avro: Apache Avro

Bogdan Popa <bogdan@defn.io>

This package implements support for decoding and encoding data specified using Apache Avro schemas.

1 Schema Evolution

I agree with the authors of the goavro package that Avro schema evolution seems broken. While this package supports field defaults, I would caution against using them. Instead, version your schemas by tagging your data.

2 Missing Features

2.1 Aliases

Aliasing schemas during read is not currently supported, but I may add support if there is interest.

2.2 JSON Encoding

JSON encoding is not currently supported, and it is unlikely to be unless someone else is interested in adding it.

2.3 RPC

I don’t plan on supporting the RPC features at the moment.

3 Reference

 (require avro) package: avro-lib

Codecs are opaque values that can be used to read and write data according to an Avro schema.

Primitive values are mapped between Avro and Racket according to the following table:

Avro Type   Racket Contract
null        'null
boolean     boolean?
int         (integer-in (- (expt 2 31)) (sub1 (expt 2 31)))
long        (integer-in (- (expt 2 63)) (sub1 (expt 2 63)))
float       real?
double      real?
bytes       bytes?
string      string?

Records are represented by hasheq hashes with symbols for keys. Enums are represented by the symbols that make up their variants. Unions are represented by hasheq hashes with two keys: 'type representing the fully-qualified name of the variant and 'value, containing the value.
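To illustrate the enum representation, here is a hedged sketch: the "Suit" schema below is invented for illustration, while make-codec, codec-write, and codec-read are the procedures documented in this section.

> (require avro json)
> (define suit-codec
    (make-codec
     (jsexpr->string
      (hasheq
       'type "enum"
       'name "Suit"
       'symbols '("SPADES" "HEARTS" "DIAMONDS" "CLUBS")))))
> (define buf (open-output-bytes))
> (codec-write suit-codec 'HEARTS buf)
> (codec-read suit-codec (open-input-bytes (get-output-bytes buf)))
'HEARTS

Note that the value passed to codec-write and returned by codec-read is the bare symbol 'HEARTS, per the representation rules above.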

procedure

(codec? v)  boolean?

  v : any/c
Returns #t when v is a codec.

procedure

(make-codec schema)  codec?

  schema : string?
Converts the given Avro schema to a codec. The schema must be a valid JSON string in Avro definition format.

Examples:
> (require avro json)
> (define c
    (make-codec
     (jsexpr->string
      (hasheq
       'type "record"
       'name "LongList"
       'fields (list
                (hasheq
                 'name "value"
                 'type "long")
                (hasheq
                 'name "next"
                 'type '("null" "LongList")))))))
> (define v
    (hasheq
     'value 1
     'next (hasheq
            'type "LongList"
            'value (hasheq
                    'value 2
                    'next (hasheq
                           'type "null"
                           'value 'null)))))
> (define buf (open-output-bytes))
> (codec-write c v buf)
4
> (codec-read c (open-input-bytes (get-output-bytes buf)))
'#hasheq((next
          .
          #hasheq((type . "LongList")
                  (value
                   .
                   #hasheq((next . #hasheq((type . "null") (value . null)))
                           (value . 2)))))
         (value . 1))

procedure

(codec-read c in)  any/c

  c : codec?
  in : input-port?
Reads a value from in according to c.

procedure

(codec-write c v out)  exact-nonnegative-integer?

  c : codec?
  v : any/c
  out : output-port?
Writes v to out according to c. Returns the number of bytes written.

3.1 Object Container Format

 (require avro/container) package: avro-lib

procedure

(read-container in)  list?

  in : input-port?
Reads a list of objects from in using the Avro Object Container Format.

procedure

(write-container schema
                 values
                 out
                 [#:block-size block-size
                  #:compression compression])  void?

  schema : string?
  values : list?
  out : output-port?
  block-size : exact-positive-integer? = (* 100 1024 1024)
  compression : (or/c 'none 'deflate) = 'deflate
Writes values to out using the Avro Object Container Format and the given schema. The block-size argument is a hint instructing the writer to start a new data block once the current block’s size (before compression) exceeds block-size.
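
A round trip through the container format can be sketched as follows. This is a hedged example: the "Weather" record schema and its sample values are made up for illustration, and the sketch only relies on write-container and read-container as documented above, here over an in-memory port.

> (require avro/container json)
> (define schema
    (jsexpr->string
     (hasheq
      'type "record"
      'name "Weather"
      'fields (list
               (hasheq 'name "station" 'type "string")
               (hasheq 'name "temp" 'type "long")))))
> (define buf (open-output-bytes))
> (write-container
   schema
   (list (hasheq 'station "VLI" 'temp 22)
         (hasheq 'station "MUC" 'temp 11))
   buf
   #:compression 'none)
> (read-container (open-input-bytes (get-output-bytes buf)))

The call to read-container should return the same two hasheq hashes, in order.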