0.5.91

5 Document-level Functions

procedure

(tei-document? v) → any/c

  v : any/c
Recognizes TEI document values.

A TEI document is a TEI element struct that represents the root TEI element of a document. TEI document values implement the instance info interface for bibliographic information.

procedure

(tei-document-checksum doc) → symbol?

  doc : tei-document?
Returns a checksum calculated based on a standardized XML representation of doc. The checksum is returned as a symbol to facilitate inexpensive comparisons.
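As an illustration of the comparison this enables, the following minimal sketch tests whether two TEI document values have the same checksum. It assumes that the functions documented on this page are exported by a module named ricoeur/tei and that the checksum symbols are interned, so that eq? suffices; same-document? is a hypothetical helper, not part of the library.

```racket
#lang racket/base
;; Assumption: ricoeur/tei is the module that exports these functions.
(require ricoeur/tei)

;; same-document? : tei-document? tei-document? -> boolean?
;; Hypothetical helper: compares two documents by checksum
;; rather than by walking both element trees.
(define (same-document? a b)
  ;; Assumption: the checksums are interned symbols, so eq? is enough.
  (eq? (tei-document-checksum a)
       (tei-document-checksum b)))
```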

5.1 Reading & Writing TEI Documents

procedure

(file->tei-document file) → tei-document?

  file : (and/c path-string-immutable/c
                file-exists?)
Produces a TEI document value from the TEI XML document file.

High-level clients should validate file with valid-xml-file? or directory-validate-xml before calling file->tei-document, because the validation currently performed by any-tei-xexpr/c is limited.

This function uses read-xexpr/standardized to parse the raw XML into x-expressions consistently and without information loss.
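The validate-then-read pattern recommended above might look like the following sketch. The module path ricoeur/tei is an assumption, as is the calling convention of valid-xml-file? (taken here to accept a single path and return a boolean; its exact signature is documented elsewhere). load-validated is a hypothetical helper.

```racket
#lang racket/base
;; Assumption: ricoeur/tei exports both functions used below.
(require ricoeur/tei)

;; load-validated : hypothetical helper.
;; pth must be a path or an immutable string naming an existing file,
;; per the contract on file->tei-document.
(define (load-validated pth)
  ;; Assumption: valid-xml-file? takes one path and returns a boolean.
  (if (valid-xml-file? pth)
      (file->tei-document pth)
      (error 'load-validated "not a valid TEI XML file: ~a" pth)))
```

String literals are immutable in Racket, so a call such as (load-validated "example.xml") would satisfy path-string-immutable/c, whereas a string built with string-append would first need string->immutable-string.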

procedure

(read-tei-document [in]) → tei-document?

  in : input-port? = (current-input-port)
Produces a TEI document value representing the TEI XML document read from in.

Currently, file->tei-document should usually be used instead of read-tei-document, as it cooperates more easily with the validation needs documented under file->tei-document and any-tei-xexpr/c.

This function uses read-xexpr/standardized to parse the raw XML into x-expressions consistently and without information loss.

procedure

(write-tei-document doc [out]) → any

  doc : tei-document?
  out : output-port? = (current-output-port)
Writes the XML representation of doc to out, prettyprinted using call/prettyprint-xml-out.

Use write-tei-document rather than other methods for writing XML: write-tei-document uses write-xexpr/standardized to generate consistent output and includes an appropriate prelude.
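For example, a document could be saved to a file by supplying a file output port as out. This is a sketch only: the module path ricoeur/tei is an assumption, and save-tei-document is a hypothetical helper.

```racket
#lang racket/base
;; Assumption: ricoeur/tei is the module that exports write-tei-document.
(require ricoeur/tei)

;; save-tei-document : hypothetical helper.
;; Writes doc to the file at pth, replacing any existing contents.
(define (save-tei-document doc pth)
  (call-with-output-file pth
    #:exists 'truncate/replace
    (lambda (out)
      (write-tei-document doc out))))
```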

procedure

(tei-document->plain-text doc
                          [#:include-header? include-header?])
 → string-immutable/c

  doc : tei-document?
  include-header? : any/c = #t
Converts the TEI document doc to a plain-text string.

The resulting string is not the XML representation of doc: it is formatted for uses that expect unstructured plain text.

When include-header? is non-false (the default), the resulting string begins with a header that includes, for example, the title and other information about the corresponding instance. When include-header? is #false, only the content is included, which is sometimes preferable if the plain-text form is intended for further processing by a program.
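As a sketch of the further-processing case, the following counts whitespace-separated words in the content of a document, omitting the header so that bibliographic information is not counted. The module path ricoeur/tei is an assumption, and tei-document-word-count is a hypothetical helper.

```racket
#lang racket/base
;; Assumption: ricoeur/tei exports tei-document->plain-text.
(require racket/string
         ricoeur/tei)

;; tei-document-word-count : hypothetical helper.
;; Counts whitespace-separated tokens in the body text only.
(define (tei-document-word-count doc)
  (length
   (string-split
    (tei-document->plain-text doc #:include-header? #f))))
```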

5.2 Paragraph Inference

procedure

(tei-document-paragraphs-status doc) → guess-paragraphs-status/c

  doc : tei-document?

value

guess-paragraphs-status/c : flat-contract?

 = (or/c 'todo
         'line-breaks
         'blank-lines
         'done
         'skip)
Returns a symbol indicating whether paragraph-guessing has been performed for the TEI document represented by doc.

A value of 'todo means that paragraph-guessing has not been performed and should be done as soon as possible. A value of 'skip means that paragraph-guessing has been intentionally postponed, perhaps because the current strategies have not proven effective for doc.

The values 'line-breaks, 'blank-lines, and 'done all mean that paragraph-guessing has been completed successfully: 'line-breaks and 'blank-lines indicate the strategy by which paragraphs were inferred, whereas 'done is a legacy value indicating that paragraph-guessing was performed before this library began recording which strategy was used.
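The five status values can be handled with an ordinary case dispatch, as in the sketch below. The module path ricoeur/tei is an assumption, and describe-paragraphs-status is a hypothetical helper whose messages merely restate the meanings given above.

```racket
#lang racket/base
;; Assumption: ricoeur/tei exports tei-document-paragraphs-status.
(require ricoeur/tei)

;; describe-paragraphs-status : hypothetical helper.
;; Maps each status symbol to a short human-readable description.
(define (describe-paragraphs-status doc)
  (case (tei-document-paragraphs-status doc)
    [(todo) "paragraph-guessing still needs to be performed"]
    [(skip) "paragraph-guessing has been intentionally postponed"]
    [(line-breaks) "paragraphs were inferred from line breaks"]
    [(blank-lines) "paragraphs were inferred from blank lines"]
    [(done) "paragraphs were inferred (legacy; strategy not recorded)"]))
```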

procedure

(tei-document/paragraphs-status/c status/c) → flat-contract?

  status/c : flat-contract?
Produces a contract recognizing TEI document values (i.e. those recognized by tei-document?) for which the result of tei-document-paragraphs-status would satisfy the contract status/c.
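Because the result is a flat contract, it can also be turned into a predicate with flat-contract-predicate from racket/contract. The sketch below builds a contract for documents that still await paragraph-guessing; the module path ricoeur/tei is an assumption, and needs-guessing/c and needs-guessing? are hypothetical names.

```racket
#lang racket/base
;; Assumption: ricoeur/tei exports tei-document/paragraphs-status/c.
(require racket/contract
         ricoeur/tei)

;; Hypothetical contract: documents whose status is 'todo or 'skip.
(define needs-guessing/c
  (tei-document/paragraphs-status/c (or/c 'todo 'skip)))

;; Hypothetical predicate derived from the flat contract above.
(define needs-guessing?
  (flat-contract-predicate needs-guessing/c))
```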

tei-document-skip-guess-paragraphs returns a new TEI document value like doc, but annotated to indicate that paragraph-guessing has been intentionally skipped for this document.

tei-document-unskip-guess-paragraphs returns a new TEI document value like doc, but annotated to indicate that paragraph-guessing should be performed as soon as possible.

procedure

(tei-document-guess-paragraphs doc
                               [#:mode mode])
 → (tei-document/paragraphs-status/c mode)

  doc : (tei-document/paragraphs-status/c
         (or/c 'todo 'skip))
  mode : (or/c 'line-breaks 'blank-lines) = 'blank-lines
Returns a new TEI document value like doc, but with paragraphs inferred based on the strategy mode.
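Since the contract on doc rejects documents whose paragraphs have already been inferred, callers may want to check the status first. The following sketch guesses paragraphs only for documents marked 'todo and returns every other document unchanged. The module path ricoeur/tei is an assumption, and ensure-paragraphs is a hypothetical helper.

```racket
#lang racket/base
;; Assumption: ricoeur/tei exports the two functions used below.
(require ricoeur/tei)

;; ensure-paragraphs : hypothetical helper.
;; Runs paragraph-guessing only when the status is 'todo.
(define (ensure-paragraphs doc #:mode [mode 'blank-lines])
  (case (tei-document-paragraphs-status doc)
    [(todo) (tei-document-guess-paragraphs doc #:mode mode)]
    ;; 'skip documents are accepted by tei-document-guess-paragraphs,
    ;; but are left alone here to respect the intentional postponement.
    [else doc]))
```

Leaving 'skip documents untouched is a design choice of this sketch, not a requirement of the library.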