8 X-Expression and XML Operations
procedure
pth : path-string?
procedure
(attributes-ref attrs key) → (or/c #f string?)
attrs : (listof (list/c symbol? string?)) key : symbol?
procedure
(read-xexpr/standardized [in]) → xexpr/c
in : input-port? = (current-input-port)
A redundant UTF-8 byte-order mark at the beginning of the input will be ignored, unlike with the functions from xml, which do not ignore it.
XML comments will be preserved in the resulting x-expression as comment data structures.
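The two behaviors described above can be approximated with the standard xml library alone. The following is a minimal sketch, not the implementation of read-xexpr/standardized: the name sketch-read-xexpr is hypothetical, and the sketch only skips a leading UTF-8 byte-order mark and enables comment preservation through the read-comments parameter.

```racket
#lang racket/base
;; Sketch only: approximates the behavior described above with the
;; standard xml library. `sketch-read-xexpr` is a hypothetical name.
(require xml)

;; The three bytes of a UTF-8 byte-order mark.
(define utf-8-bom (bytes #xEF #xBB #xBF))

(define (sketch-read-xexpr [in (current-input-port)])
  ;; Consume a leading byte-order mark, if one is present.
  (when (equal? (peek-bytes 3 0 in) utf-8-bom)
    (read-bytes 3 in))
  ;; Ask the xml reader to keep comments instead of discarding them.
  (parameterize ([read-comments #t])
    (xml->xexpr (document-element (read-xml in)))))
```

Calling (sketch-read-xexpr (open-input-file pth)) on a file that begins with a byte-order mark should then behave as if the mark were absent.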
procedure
(write-xexpr/standardized xs [out]) → any
xs : xexpr/c out : output-port? = (current-output-port)
8.1 Raw X-Expressions
value
=
(or/c raw-xexpr-atom/c raw-xexpr-element/c)
value
=
(or/c normalized-xexpr-atom/c entity-symbol/c valid-char? cdata?)
value
The contract raw-xexpr-element/c recognizes specifically those raw x-expressions which represent XML elements (all of which would also satisfy list?).
Note that normalized x-expressions are a subset of raw x-expressions.
value
A value satisfying entity-symbol/c is always a raw x-expression, but never a normalized x-expression.
procedure
(raw-xexpr? v) → any/c
v : any/c
procedure
(raw-xexpr-element? v) → any/c
v : any/c
procedure
(check-raw-xexpr-element blame val neg-party) → (and/c α raw-xexpr-element/c)
blame : blame? val : α neg-party : any/c
Returns val directly if it satisfies raw-xexpr-element/c; otherwise, calls raise-blame-error to report the details of the violation, supplying neg-party as the missing party of the blame object.
8.1.1 Plain-Text Conversion
procedure
xs : raw-xexpr-atom/c
Because of the restriction that xs must not represent an XML element, (non-element-xexpr->plain-text xs) is very similar to (normalize-xexpr xs), except that comment and p-i (processing instruction) structures are replaced by "".
procedure
body : (listof raw-xexpr-atom/c)
8.2 Normalized X-Expressions
value
=
(or/c normalized-xexpr-atom/c normalized-xexpr-element/c)
value
=
(or/c string-immutable/c normalized-comment/c normalized-p-i/c)
value
- CDATA content and entities (both numeric and symbolic) are normalized to strings;
- All element normalized x-expressions, which are recognized by the contract normalized-xexpr-element/c, must have an attribute list, even if it is empty;
- The body of an element normalized x-expression may only contain normalized x-expressions; and
- All strings anywhere in a normalized x-expression must be immutable. This includes:
  - Strings representing textual data in the body of element normalized x-expressions.
  - Attribute value strings of element normalized x-expressions.
  - Strings contained in comment and processing-instruction data structures (see normalized-comment/c and normalized-p-i/c).
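The normalization rules listed above can be illustrated with a simplified sketch built on the standard xml library. This is not the library's normalize-xexpr: the name sketch-normalize is hypothetical, only the five predefined XML entities are resolved (an assumption), comment and p-i structures are passed through unchanged rather than rebuilt with immutable strings, and adjacent strings are not merged.

```racket
#lang racket/base
;; Simplified, hypothetical sketch of the normalization rules.
;; It is NOT the library's normalize-xexpr.
(require racket/match
         xml)

;; Assumption: only the five predefined XML entities are resolved.
(define predefined-entities
  (hasheq 'amp "&" 'lt "<" 'gt ">" 'apos "'" 'quot "\""))

(define (imm s) (string->immutable-string s))

(define (sketch-normalize x)
  (match x
    ;; Textual data becomes an immutable string.
    [(? string? s) (imm s)]
    ;; Symbolic entities become strings.
    [(? symbol? sym)
     (imm (hash-ref predefined-entities sym
                    (λ () (error 'sketch-normalize
                                 "unknown entity: ~a" sym))))]
    ;; Numeric entities become one-character strings.
    [(? valid-char? n) (imm (string (integer->char n)))]
    ;; Assumes the cdata string has the form <![CDATA[content]]>.
    [(? cdata? c)
     (define s (cdata-string c))
     (imm (substring s 9 (- (string-length s) 3)))]
    ;; Simplification: comments and p-i structures are left untouched.
    [(or (? comment?) (? p-i?)) x]
    ;; Element with an explicit attribute list.
    [(list* (? symbol? name)
            (list (list (? symbol? ks) (? string? vs)) ...)
            body)
     (list* name
            (for/list ([k (in-list ks)] [v (in-list vs)])
              (list k (imm v)))
            (map sketch-normalize body))]
    ;; Element without an attribute list: insert an empty one.
    [(cons (? symbol? name) body)
     (list* name '() (map sketch-normalize body))]))
```

For instance, (sketch-normalize '(p "a" amp (em "b"))) produces '(p () "a" "&" (em () "b")): every element gains an attribute list and the entity symbol is replaced by a string.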
procedure
(normalize-xexpr raw) → normalized-xexpr/c
raw : raw-xexpr/c
procedure
(normalized-xexpr? v) → any/c
v : any/c
procedure
v : any/c
value
procedure
(normalized-comment? v) → any/c
v : any/c
value
procedure
(normalized-p-i? v) → any/c
v : any/c
procedure
(check-normalized-xexpr-element blame val neg-party) → (and/c α normalized-xexpr-element/c)
blame : blame? val : α neg-party : any/c
8.3 xmllint-based Operations
The functions documented in this section depend on the external command-line utility xmllint (which is part of libxml2) to work as their names indicate. If xmllint cannot be found (see xmllint-available?), a warning is logged to (current-logger) at startup and these functions fall back to the alternate behavior specified in each function’s documentation, which is typically a no-op.
procedure
procedure
(valid-xml-file? [#:quiet? quiet?] pth ...+) → boolean?
quiet? : any/c = #t pth : path-string?
When quiet? is #false, writes any validation error messages (from xmllint) to current-error-port.
If xmllint is not available, always returns #true.
Changed in version 0.5.1 of package ricoeur-tei-utils: Changed to always use Digital Ricœur’s DTD. Previously, valid-xml-file? was not specific to Digital Ricœur. It had the same meaning as passing --valid to xmllint, and it depended on a system DOCTYPE declaration to find the DTD file via a relative path reference. Instead, valid-xml-file? now uses a DTD file distributed with this library, which improves reliability at the cost of rejecting XML files that might be valid according to other arbitrary DTDs.
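As a rough illustration of validating against a fixed DTD rather than a DOCTYPE declaration, the sketch below shells out to xmllint with its --noout and --dtdvalid options. It is not the implementation of valid-xml-file?: the function name, the explicit dtd argument, and the #:xmllint keyword are all hypothetical, and it checks a single file instead of several.

```racket
#lang racket/base
;; Hypothetical sketch: validate one file against an explicit DTD.
;; It is NOT the library's valid-xml-file?.
(require racket/port
         racket/system)

(define (sketch-valid-xml-file? dtd pth
                                #:quiet? [quiet? #t]
                                #:xmllint [xmllint (find-executable-path "xmllint")])
  (cond
    ;; Mirrors the documented fallback: without xmllint, report success.
    [(not xmllint) #t]
    [else
     ;; When quiet, discard xmllint's error messages; otherwise let
     ;; them reach the current error port.
     (parameterize ([current-error-port
                     (if quiet?
                         (open-output-nowhere)
                         (current-error-port))])
       ;; system* returns #t only when xmllint exits with status 0.
       (system* xmllint "--noout" "--dtdvalid" dtd pth))]))
```

Passing the DTD path explicitly is what removes the dependence on a relative system DOCTYPE reference inside each XML file.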
procedure
(directory-validate-xml dir [#:quiet? quiet?]) → boolean?
dir : (and/c path-string? directory-exists?) quiet? : any/c = #f
When quiet? is #false, writes any validation error messages (from xmllint) to current-error-port.
If xmllint is not available, always returns #true.
Changed in version 0.5.1 of package ricoeur-tei-utils: Changed to be specific to Digital Ricœur’s DTD, consistent with the change to valid-xml-file?.
procedure
(call/prettyprint-xml-out thunk) → any/c
thunk : (-> any/c)
When xmllint is available, thunk is called in a context where everything written to the current-output-port is piped through xmllint’s prettyprint function before being written to the original current-output-port. When prettyprinting succeeds, the result of call/prettyprint-xml-out is the result of thunk.
If prettyprinting fails (perhaps because the output of thunk was not well-formed XML), xmllint may still write to the original current-output-port, but call/prettyprint-xml-out raises an exception rather than returning a value. See with-output-to-file/unless-exn for an alternative when this behavior is undesirable.
If thunk raises an exception and xmllint is available, xmllint is never invoked and nothing is written to the original current-output-port.
Note: While xmllint’s prettyprint function natively uses platform-specific line endings, call/prettyprint-xml-out arranges to replace these with "\n" (Racket’s internal line ending) on all platforms for the purposes of writing to the current-output-port.
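The control flow described for call/prettyprint-xml-out can be sketched as follows. This is a simplified approximation, not the library's implementation: the name sketch-call/prettyprint-xml-out is hypothetical, the whole output is buffered in memory, line endings are not rewritten, and calling thunk directly when xmllint is missing is an assumption about the fallback behavior.

```racket
#lang racket/base
;; Hypothetical sketch of the described control flow.
;; It is NOT the library's call/prettyprint-xml-out.
(require racket/system)

(define (sketch-call/prettyprint-xml-out thunk)
  (define xmllint (find-executable-path "xmllint"))
  (cond
    ;; Assumption: without xmllint, run thunk with no prettyprinting.
    [(not xmllint) (thunk)]
    [else
     ;; Capture everything thunk writes. If thunk raises, control
     ;; leaves here, so xmllint is never invoked and nothing reaches
     ;; the original output port.
     (define buffer (open-output-string))
     (define result
       (parameterize ([current-output-port buffer])
         (thunk)))
     ;; Feed the captured text to `xmllint --format -`, which reads
     ;; standard input and writes to the current output port.
     (define ok?
       (parameterize ([current-input-port
                       (open-input-string (get-output-string buffer))])
         (system* xmllint "--format" "-")))
     (unless ok?
       (error 'sketch-call/prettyprint-xml-out "prettyprinting failed"))
     result]))
```

Buffering the output before invoking xmllint is one simple way to obtain the guarantee that an exception in thunk leaves the original port untouched; a streaming design would need extra care to achieve the same.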