<<<Rapid Access Notation>>>

Rick Jelliffe (C) 2021-2024

Rapid Access Notation (RAN) is a possible document format, currently under design, intended to allow fast and efficient access to elements in fragments on the raw text [1]. It is designed to be lighter-weight than XML by being lexically richer (rather than relying on schemas), with much richer datatypes and tuples, more expressive delimiters, and better relation/graph support. It is designed to allow richer infosets or object bindings without requiring any schema loading or processing.

A RAN document is a sequence of independent fragments, with lexically-determined datatyped values and relational links. Viewed structurally, it is a series of trees of nodes, overlaid with multi-stage addressed links.

  • RAN is designed to support streaming, parallel, vectorized, speculative, lazy parsing and random access.
  • It is designed with a layered, “Russian doll” approach so that systems that only need sparse access in large documents can have it efficiently, and so that the syntax can be augmented (as distinct from extended) to support specialist notations.
  • It has very comprehensive data-typing of attribute values and support for authoring features.
  • There is also a related proposed validation method, Apatak, and a CSV-embedding convention.
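
The sparse-access idea can be illustrated with a minimal sketch in Python. It assumes only what this page states: fragments are independent and are delimited with "<<<" and ">>>", with the closing tag written as "<<</name ...>>>". The function name index_fragments and the two regular expressions are hypothetical, not part of any RAN specification, and the sketch ignores delimiters that might occur inside quoted values. It records the offsets of each fragment without parsing the contents, so each span could later be parsed lazily or handed to a separate worker.

```python
import re

# Assumption: a fragment opens with "<<<name" and closes with "<<</name ... >>>".
# Delimiters appearing inside quoted attribute values are NOT handled here.
OPEN = re.compile(r"<<<(?!/)\s*([^\s>]+)")
CLOSE = re.compile(r"<<</\s*([^\s>]+)[^>]*>>>")


def index_fragments(text):
    """Return a list of (name, start, end) offsets, one per fragment.

    Only the fragment delimiters are scanned; the content between them
    is left untouched so it can be parsed later, or not at all.
    """
    spans = []
    pos = 0
    while True:
        opening = OPEN.search(text, pos)
        if opening is None:
            break
        name = opening.group(1)
        closing = None
        # Find the first closing tag that carries the same fragment name.
        for candidate in CLOSE.finditer(text, opening.end()):
            if candidate.group(1) == name:
                closing = candidate
                break
        if closing is None:
            break  # unterminated fragment: stop indexing
        spans.append((name, opening.start(), closing.end()))
        pos = closing.end()
    return spans
```

Given the offsets, a caller can slice out a single fragment with text[start:end] and parse only that slice, which is one way the random access described above might be approached.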

Here is a very simple RAN document: it has two fragments (specified using "<<<" and ">>>"). The parser determines the datatypes of the attribute values (string, date, number) and whether they are serving as internal link targets (specified with "=:") or link anchors (specified with ":=").

<<<my-document fragid:="f1" 
    x="y"  date=2022-02-22    "A 53 Code" = "1257B" >>>
   <p belongs_with=:abc124>Hello </p>
<<</my-document fragid:="f1">>>

<<<my-document id:=abc124 
     date=2022-02-22  "A 53 Code" = "1257A" >>>
     <p >World</>
<<</my-document id:=abc124  >>>
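
To make the idea of lexically-determined datatypes concrete, here is a minimal Python sketch that classifies the attributes in the example above purely from their spelling. It is an illustration under assumptions, not the RAN lexical grammar: the pattern ATTR, the functions classify_value and classify_attributes, the role labels, and the rule that a bare YYYY-MM-DD token is a date while other bare tokens are left untyped are all hypothetical. The ":=" and "=:" roles follow the description given above.

```python
import re
from datetime import date

# Hypothetical tokenizer for the attribute part of a tag (not the RAN grammar).
ATTR = re.compile(r'''
    (?P<name>"[^"]*"|[^\s=:"<>]+)   # quoted or bare attribute name
    \s*(?P<op>:=|=:|=)\s*           # ":=" anchor, "=:" target, "=" plain value
    (?P<value>"[^"]*"|[^\s<>]+)     # quoted or bare value
''', re.VERBOSE)

# Role labels follow the prose above: ":=" link anchor, "=:" link target.
ROLES = {":=": "link anchor", "=:": "link target", "=": "value"}


def classify_value(raw):
    """Guess a datatype from the lexical form alone (no schema involved)."""
    if raw.startswith('"'):
        return "string", raw[1:-1]
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", raw):
        try:
            return "date", date.fromisoformat(raw)
        except ValueError:
            return "token", raw  # looks like a date but is not a valid one
    if re.fullmatch(r"[+-]?\d+(\.\d+)?", raw):
        return "number", (float(raw) if "." in raw else int(raw))
    return "token", raw  # bare name, e.g. an identifier used in a link


def classify_attributes(attr_text):
    """Return (name, role, datatype, value) for each attribute found."""
    result = []
    for match in ATTR.finditer(attr_text):
        name = match.group("name").strip('"')
        kind, value = classify_value(match.group("value"))
        result.append((name, ROLES[match.group("op")], kind, value))
    return result
```

Applied to the attribute text of the first fragment, this sketch would report fragid as a link anchor with a string value, x and "A 53 Code" as plain strings, and date as a date, all without consulting a schema.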

RAN, considered as a syntax, has been influenced by XML, SGML, XBRL, CSV, SQLite, XPath, DSSSL and NLJSON. An important design goal has been that it is not merely another syntax for these existing notations, but goes well beyond them.

[1] "Raw text" here means that the stream can be lexed, parsed and transduced (i.e. data-type-annotated) in place, without allocating extra buffer space or requiring schemas. For information on the kinds of parsing and application that motivate the design of RAN, see the papers “Mison: A Fast JSON Parser for Data Analytics” (Li et al., 2017) and “Parsing Gigabytes of JSON per Second” (Langdale and Lemire, 2020).