0.5.91

7 TEI Element Representation

This section documents the representations used by this library for individual TEI elements. TEI elements are translated from native XML form into raw x-expressions that satisfy any-tei-xexpr/c. Internally, this library uses TEI element structs to provide a further layer of abstraction.

Many of the functions documented in this section expose details of the current schema used by Digital Ricœur for our TEI XML documents. The details of this schema are subject to change: indeed, a major purpose of this library is to provide clients with an API that remains stable across changes to the schema. Programmers are strongly advised to use higher-level abstractions instead of the low-level operations documented in this section whenever possible.

7.1 TEI X-Expression Contracts

The contract any-tei-xexpr/c is similar to raw-xexpr-element/c, but it recognizes only valid TEI elements that satisfy the additional requirements documented in TEI Encoding Guidelines for Digital Ricœur.

Currently, any-tei-xexpr/c and related contracts check all of Digital Ricœur’s project-specific requirements and most other requirements inherited from the TEI standard. However, ricoeur/tei does not currently implement a full XML validator. Nonetheless, values that are not valid according to all XML and DR-TEI.dtd rules should never be constructed: they may cause subtle errors with, at best, obscure error messages.

Full XML validation may be added to any-tei-xexpr/c in the future. Currently, high-level clients (like directory-corpus%) should use valid-xml-file? or directory-validate-xml for validation and ensure that (xmllint-available?) returns #true.
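To make the raw x-expression form concrete, the following minimal sketch uses only Racket's standard xml library to read a small XML fragment into an x-expression. The pb fragment and the name pb-xexpr are chosen purely for illustration. Nothing here checks any-tei-xexpr/c: xexpr? only confirms that the value is a well-formed x-expression, which is a much weaker property than validity under Digital Ricœur's schema.

```racket
#lang racket/base
(require xml)

;; Illustrative input only: a page-break element written as native XML.
(define pb-xexpr
  (string->xexpr "<pb n=\"xiv\"/>"))

;; The element name becomes a symbol and the attributes become an
;; association list of (name value) pairs.
pb-xexpr          ; evaluates to '(pb ((n "xiv")))

;; Well-formedness as an x-expression says nothing about TEI validity.
(xexpr? pb-xexpr) ; evaluates to #t
```

A value like this is what the contracts in this section are applied to. Whether a given x-expression actually satisfies any-tei-xexpr/c depends on the schema rules described above.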

syntax

(tei-xexpr/c elem-name-id)

Produces a contract similar to any-tei-xexpr/c, but which recognizes only elements named (quote elem-name-id).

Using (tei-xexpr/c elem-name-id) produces the same contract as (dynamic-tei-xexpr/c (quote elem-name-id)), but tei-xexpr/c expands to the specific contract at compile-time, and a syntax error is raised if (quote elem-name-id) would not satisfy tei-element-name/c.

procedure

(dynamic-tei-xexpr/c name) → flat-contract?

  name : tei-element-name/c
Like tei-xexpr/c, but dispatches to the specific contract dynamically at run-time.

The contract tei-element-name/c recognizes the names of valid Digital Ricœur TEI XML elements. All values that satisfy tei-element-name/c also satisfy symbol?.

7.2 Common Element Interface

procedure

(tei-element? v) → any/c

  v : any/c

procedure

(content-containing-element? v) → any/c

  v : any/c

procedure

(elements-only-element? v) → any/c

  v : any/c
This library uses TEI element structs, a layer of abstraction over normalized x-expressions, to build higher-level interfaces to TEI XML documents. All TEI element structs are recognized by the predicate tei-element?.

Internally, there is a distinct TEI element struct type for each type of element in Digital Ricœur’s customized TEI schema. See Formal Specification in TEI Encoding Guidelines for Digital Ricœur for a complete listing. However, the specific representations of most TEI element struct types are kept private to this library: for robustness against future changes to Digital Ricœur’s TEI schema, clients are urged to use high-level interfaces that abstract over the details of the document structure.

Every TEI element struct satisfies either content-containing-element? or elements-only-element? (but not both) depending on whether the element type of which it is an instance may ever contain textual data directly.

The function element-or-xexpr->plain-text is like tei-document->plain-text, but it accepts any TEI element struct or non-element raw x-expression and does not support an #:include-header? argument.

For implementation details, see prop:element->plain-text.

7.2.1 Struct–X-Expression Conversion

procedure

(xexpr->tei-element xs) → tei-element?

  xs : any-tei-xexpr/c
The primitive function for converting a raw xexpr representation of a TEI XML element to a TEI element struct. The raw xexpr is effectively converted to a normalized xexpr as part of this process.

Any TEI element struct may be converted to a normalized x-expression using tei-element->xexpr. XML is the canonical serialized form of a TEI element struct: TEI element structs are not serializable in the sense of racket/serialize.

Do not attempt to use this function as a substitute for write-tei-document.

The function tei-element->xexpr* is like tei-element->xexpr, but it also accepts normalized xexprs satisfying normalized-xexpr-atom/c, which are returned directly.

Do not attempt to use this function as a substitute for write-tei-document.

7.2.2 Traversing TEI Element Structs

The accessors documented in this subsection are especially likely to expose brittle details that will break upon changes to Digital Ricœur’s schema for our TEI XML documents.

All TEI element struct types support a common set of operations for traversing an instance’s attributes and contents: tei-element-get-name, tei-element-get-attributes, tei-element-get-body, and tei-get-body/elements-only. However, these functions are quite low-level and should primarily be used to implement higher-level abstractions, ordinarily as part of this library.

For a TEI element struct that satisfies elements-only-element?, the list returned by tei-element-get-body will never contain any strings: any insignificant whitespace inside such elements is dropped when the TEI element struct is constructed. However, the result of tei-element-get-body may still differ from the result of tei-get-body/elements-only, as the list returned by tei-element-get-body may contain values satisfying normalized-comment/c or normalized-p-i/c.

match expander

(tei-element name-pat attributes-pat body-pat)

match expander

(content-containing-element name-pat attributes-pat body-pat)

match expander

(elements-only-element name-pat
                       attributes-pat
                       body-pat
                       maybe-elements-only)
 
maybe-elements-only = 
  | #:elements-only body/elements-only-pat
For the match expanders tei-element, content-containing-element, and elements-only-element, the patterns name-pat, attributes-pat, and body-pat are matched against the results of tei-element-get-name, tei-element-get-attributes, and tei-element-get-body, respectively. A tei-element pattern can match any TEI element struct, whereas content-containing-element and elements-only-element patterns only match values that satisfy the corresponding predicate. If a body/elements-only-pat pattern appears, it is matched against the result of tei-get-body/elements-only.

7.3 Specialized Element Interfaces

Functions for working with a few specific TEI element struct types are provided by this library; however, such functions are especially brittle and may change in incompatible ways, or even be removed entirely, in future versions of this library.

For most purposes, the segment interface is a much better choice than the functions documented below (which are in fact used in the implementation of tei-document-segments). However, they do serve some specific use-cases that have not yet motivated a higher-level interface: most prominently, “TEI Lint” uses these functions to generate warnings about likely numbering errors.

7.3.1 Elements with Responsible Parties

procedure

(tei-element-can-have-resp? v) → any/c

  v : any/c

procedure

(tei-element-resp elem [default])
 → (if default
       symbol?
       (or/c symbol? #f))
  elem : tei-element-can-have-resp?
  default : (or/c 'ricoeur #f) = 'ricoeur
A uniform interface for accessing the resp attribute of div and note elements and the who attribute of sp elements.

Note that tei-element-resp only accesses the resp attribute (if any) of the specific TEI element struct elem. Actually determining the “responsible party” for an element also requires consideration of its parent elements. This resolution is performed for segments and can be accessed with segment-resp-string and segment-by-ricoeur?: the primary purpose of tei-element-resp is to implement those higher-level functions.

For implementation details, see declare-resp-field.
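The point about parent elements can be illustrated with a small, self-contained sketch. It is not the library's implementation: the function resolve-resp and the rule that the nearest enclosing value wins are assumptions made only for this example, and the symbol 'editor is a made-up value. The input is a list of the values that a call such as (tei-element-resp elem #f) might yield for an element and then for each of its ancestors, innermost first.

```racket
#lang racket/base

;; resolve-resp : (listof (or/c symbol? #f)) [symbol?] -> symbol?
;; Hypothetical helper, not part of ricoeur/tei.
;; `chain` holds the resp value of an element followed by those of its
;; ancestors, innermost first, with #f wherever no value is present.
;; Assumed rule: the first non-#f value wins; otherwise use `default`.
(define (resolve-resp chain [default 'ricoeur])
  (or (for/first ([r (in-list chain)]
                  #:when r)
        r)
      default))

;; The element has no value of its own, but its parent does.
(resolve-resp '(#f editor #f)) ; evaluates to 'editor

;; No value anywhere in the chain: fall back to the default.
(resolve-resp '(#f #f))        ; evaluates to 'ricoeur
```

In practice, segment-resp-string and segment-by-ricoeur? already perform the real resolution, so client code should rarely need anything like this.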

7.3.2 Page-break Elements

procedure

(tei-pb? v) → any/c

  v : any/c
Recognizes TEI element structs that represent pb (page-break) elements.

procedure

(pb-get-kind pb) → (or/c 'none 'number 'roman 'other)

  pb : tei-pb?

procedure

(pb-get-numeric pb) → (maybe/c natural-number/c)

  pb : tei-pb?

procedure

(pb-get-page-string pb) → (maybe/c string-immutable/c)

  pb : tei-pb?
This library groups page-breaks into several kinds based on their number, i.e. the n attribute of the pb element. The kind of number can be identified by the result of pb-get-kind:
  • 'none: The page was not numbered.

  • 'number: The page was numbered with an Arabic numeral.

  • 'roman: The page was numbered with a Roman numeral.

  • 'other: The page has a “number” according to the n attribute, but the n attribute value is not in a format this library can understand.

When the kind is 'number or 'roman, pb-get-numeric returns a just value containing the page number as a Racket integer.

Unless the kind is 'none, pb-get-page-string returns a just value containing the raw string given as the n attribute.

Recall that a pb element marks the beginning of the specified page.
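The four kinds described above can be modeled with a short, self-contained sketch in plain Racket. This is not how the library is implemented: classify-n and roman->natural are hypothetical helpers, #f stands in for both a missing n attribute and a (nothing) result, and the Roman-numeral reader is deliberately lenient. The exact strings that the library itself accepts as 'number or 'roman may differ.

```racket
#lang racket/base

;; Hypothetical helpers for illustration; not part of ricoeur/tei.

(define roman-digits
  (hash #\i 1 #\v 5 #\x 10 #\l 50 #\c 100 #\d 500 #\m 1000))

;; roman->natural : string? -> (or/c exact-integer? #f)
;; Lenient: adds up the digits, subtracting any digit that is smaller
;; than its successor. Malformed numerals such as "iiii" are not rejected.
(define (roman->natural str)
  (define digits
    (for/list ([c (in-string (string-downcase str))])
      (hash-ref roman-digits c #f)))
  (and (pair? digits)
       (andmap values digits)
       (for/sum ([d (in-list digits)]
                 [next (in-list (append (cdr digits) '(0)))])
         (if (< d next) (- d) d))))

;; classify-n : (or/c string? #f)
;;              -> (values symbol? (or/c exact-integer? #f))
;; Returns the kind and, for 'number and 'roman, the numeric value.
(define (classify-n n)
  (cond
    [(not n) (values 'none #f)]
    [(regexp-match? #px"^[0-9]+$" n)
     (values 'number (string->number n))]
    [(roman->natural n)
     => (lambda (num) (values 'roman num))]
    [else (values 'other #f)]))

(classify-n "xiv") ; two values: 'roman and 14
(classify-n "42")  ; two values: 'number and 42
(classify-n "A-3") ; two values: 'other and #f
(classify-n #f)    ; two values: 'none and #f
```

Note how the raw string stays meaningful even for 'other: that is the case pb-get-page-string covers, returning the n attribute unchanged whenever one is present.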

procedure

(tei-get-page-breaks elem) → (listof tei-pb?)

  elem : tei-element?
Returns a list, in order, of all of the TEI element structs recursively contained by elem that represent pb (page-break) elements. If elem itself represents a page-break element, the result is (list elem).

This function is most often used with TEI document values.

7.3.3 Footnote & Endnote Elements

procedure

(tei-note? v) → any/c

  v : any/c
Recognizes TEI element structs that represent note elements, which are used for footnotes and endnotes.

procedure

(tei-note-get-place note) → (or/c 'foot 'end)

  note : tei-note?
Indicates whether note represents a footnote or an endnote.

procedure

(tei-note-get-n note) → string-immutable/c

  note : tei-note?
Returns a string representing how note was identified in the original, e.g. "1" or "*". This corresponds to the value of the n attribute of the note element.

procedure

(tei-note-get-transl? note) → (or/c #f 'transl)

  note : tei-note?
Returns 'transl if note is a translation note, or #f otherwise.

7.3.4 Chapter & Section Elements

procedure

(div? v) → any/c

  v : any/c
Recognizes TEI element structs that represent div elements, which are used for chapters, sections, and other structural divisions.

procedure

(div-get-n elem) → (maybe/c string-immutable/c)

  elem : div?
Returns either a just value containing the string value of the n attribute of the represented div element, or (nothing) if the n attribute was not present.

procedure

(div-get-type elem) → div-type/c

  elem : div?

value

div-type/c : flat-contract?

 = 
(or/c 'chapter 'part 'section 'dedication
      'contents 'intro 'bibl 'ack 'index)
Returns a symbol corresponding to the type attribute of the represented div element. See the documentation for div for details about the meanings of specific type values.