Extension: RAN-DOM (and RAN Infoset, XDM)

(This is not current)

RAN-DOM is a DOM for RAN documents.  The RAN-XML infoset is a description of how a RAN document may appear when it is processed by XML systems.

RAN-DOM Document Object Model

Loosely, a RAN-DOM is quite similar to an XML DOM, XDM or PSVI, with the following nodes:

  • Nodes
    • text   (ignorable-whitespace | RCDATA)
    • element (stream | fragment | scoped | branch)
    • attribute (list | binary | ellipsis)
    • attribute value  (lexically typed)
    • comment
    • IP/PI
    • link ?
    • (no namespace nodes)

Loosely, a RAN-DOM is an XML DOM with the following differences:

  • There are four subtypes of elements:  stream, fragment, scoped, or branch (i.e. XML).
  • A tag can be:
    • known (start and length known)
    • lexed (into tokens)

Required extensions:

  • PIs have attribute start-tag syntax
  • All names and attribute values may be lexically typed: string, name, number, date-time-range, boolean, path.
  • An element may have an empty name "" or be anonymous, in which case the name is "[" and "]".
  • Element and fragment end-tags allow arbitrary data after the generic identifier, in the manner of a comment. 

The information in the preamble is common to all.  Namespaces are implemented so that a prefix on an element or attribute name can be used to look up the corresponding link.
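
As a rough illustration of the prefix lookup, the following Python sketch keeps the preamble links in a dictionary keyed by prefix. The Link class, its attributes field, the ":" separator and the sample data are all assumptions made for the example; none of them is defined by RAN.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Link:
    """Hypothetical stand-in for a preamble link tag."""
    prefix: str
    attributes: dict = field(default_factory=dict)


def split_prefix(name: str, sep: str = ":"):
    """Split a name into (prefix, local part); prefix is None if absent.

    The ':' separator is an assumption borrowed from XML namespaces.
    """
    prefix, found, local = name.partition(sep)
    return (prefix, local) if found else (None, name)


def lookup_link(name: str, links: dict) -> Optional[Link]:
    """Return the link whose prefix matches the prefix of `name`, if any."""
    prefix, _ = split_prefix(name)
    return links.get(prefix) if prefix is not None else None


# Assumed sample preamble containing a single link.
links = {"doc": Link("doc", {"href": "http://example.com/doc"})}
```

With this sample data, lookup_link("doc:title", links) returns the "doc" link, while an unprefixed name or an unknown prefix returns None.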

There is no equivalent of declarations, DOCTYPE, external entities, CDATA sections, namespace redeclaration, namespace defaults and so on in RAN.

XPath Data Model

An XML document loaded into a RAN-DOM document has the same nodes as in a conventional DOM. Similarly, it can have the same typed XDM behaviour as a document from an XML DOM.

Some aspects of RAN are already handled by the XDM (and XQuery), such as multiple top-level elements (fragments can be treated as elements).  As names in RAN may be strings as well as tokens, an XPath would have to use *[local-name()="some name"] in paths for that case.
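
The choice between a plain name test and the local-name() form can be sketched as a small helper that builds one path step. This is only an illustration: the function names are hypothetical, and the regular expression is a simplified ASCII approximation of an XML name, not the full production.

```python
import re

# Simplified ASCII approximation of an XML NCName (assumption: the real
# production allows many more characters).
_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def xpath_literal(text: str) -> str:
    """Quote text as an XPath 1.0 string literal."""
    if '"' not in text:
        return '"' + text + '"'
    if "'" not in text:
        return "'" + text + "'"
    # XPath 1.0 has no escape mechanism, so mixed quotes need concat().
    parts = text.split('"')
    return "concat(" + ", '\"', ".join('"' + p + '"' for p in parts) + ")"


def name_step(name: str) -> str:
    """Return a path step that selects child elements with this name."""
    if _NCNAME.match(name):
        return name  # token names can be used directly
    return "*[local-name()=" + xpath_literal(name) + "]"
```

For example, name_step("title") gives title, while name_step("some name") gives *[local-name()="some name"].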

The definition of the value of an element changes: it is not the concatenation of all descendant nodes, but the concatenation of all descendant nodes of the same scope. It excludes the contents of descendant scoped elements.
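
One possible reading of this rule is sketched below in Python. The Element class and its kind field are hypothetical, and the sketch assumes that a scoped element opens a new scope whose whole content is skipped when the value of an ancestor is computed.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    """Hypothetical minimal element node for this example."""
    name: str
    kind: str = "branch"  # one of: stream, fragment, scoped, branch
    children: List[Union["Element", str]] = field(default_factory=list)


def string_value(element: Element) -> str:
    """Concatenate the text of descendants that share the element's scope."""
    parts = []
    for child in element.children:
        if isinstance(child, str):
            parts.append(child)
        elif child.kind != "scoped":
            parts.append(string_value(child))
        # A scoped child starts a new scope, so its content is left out.
    return "".join(parts)


doc = Element("a", children=[
    "one ",
    Element("b", "scoped", ["hidden"]),
    Element("c", children=["two"]),
])
```

Here string_value(doc) is "one two": the text inside the scoped element b does not contribute, although b itself still has the value "hidden" when queried directly.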

As with RAN-DOM, the preamble is available to all documents. A link tag's attributes are attached to the top-level elements and fragments with the same prefix.

The date-time-range and path datatypes do not have an equivalent in XDM. Consequently these should be treated as strings.
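
A minimal sketch of such a mapping, in Python, might look like the table below. Only the two xs:string fallbacks follow from the text; the XDM types chosen for name, number and boolean are assumed nearest equivalents, and the function name is hypothetical.

```python
# RAN lexical types mapped to XDM atomic types.
# Only the date-time-range and path entries come from the text;
# the other entries are assumptions.
XDM_TYPES = {
    "string": "xs:string",
    "name": "xs:Name",        # assumed
    "number": "xs:decimal",   # assumed
    "boolean": "xs:boolean",  # assumed
    "date-time-range": "xs:string",
    "path": "xs:string",
}


def xdm_type(lexical_type: str) -> str:
    """Return the XDM type for a RAN lexical type, defaulting to a string."""
    return XDM_TYPES.get(lexical_type, "xs:string")
```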

RAN Infoset

(This information needs to be updated to cope with top-level finite-stream element tags (four <).)

We start with the basic grammar productions:

stream    ::= STAGO preamble? body* ETAGO
preamble  ::= (RAN-IP | link | fluff)*
body      ::= (element fluff*) | (scoped fluff)+ | (fragment fluff*)+

Each top-level fragment is treated as a virtual XML document, according to the following rules:

  • The link declaration is applied to each of them.
    • RAN links provide XML namespace declarations and defaulted attribute values
  • Fragments are treated as XML elements
    • Element and attribute names that are literals must be replaced by, e.g., Base64 versions of the literal (using the variant that allows - and _ in place of + and /).
    • Attribute values that are not literals are represented as string literals and, where a type is available, typed by the nearest PSVI equivalent.
  • Top-level fluff, i.e. comments and PIs, are not visible as part of the infoset of any fragment.
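
The name replacement rule can be sketched in Python with the standard URL-safe Base64 alphabet, which uses - and _ in place of + and /. Two details are assumptions added here and not stated in the rules: the "=" padding is stripped because it is not allowed in XML names, and a leading "_" is added because Base64 output may begin with a digit or "-", which cannot start an XML name. The function names are hypothetical.

```python
import base64


def encode_literal_name(literal: str, lead: str = "_") -> str:
    """Turn a literal name into a string usable as an XML name."""
    raw = literal.encode("utf-8")
    # URL-safe alphabet: '-' and '_' replace '+' and '/'.
    encoded = base64.urlsafe_b64encode(raw).decode("ascii")
    # Assumption: drop '=' padding and add a safe leading character.
    return lead + encoded.rstrip("=")


def decode_literal_name(name: str, lead: str = "_") -> str:
    """Reverse encode_literal_name, restoring the stripped padding."""
    encoded = name[len(lead):]
    padding = "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(encoded + padding).decode("utf-8")
```

With these assumptions the literal name "some name" becomes _c29tZSBuYW1l, and decoding that value returns the original literal.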