Identifiers and References

RAN provides delimiter-level support for specifying internal links between elements without a schema, and for declaring attributes in bulk for a name prefix.

The usual form of the attribute value indicator delimiter is a single "=".

There are four kinds of reference indicators:

  • “==” for a non-unique key
  • “===” for a document-unique primary key (aka specific identifier)
  • “====” for a universally unique identifier (UUID)
  • “=}” for a reference to a primary key or UUID
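
The delimiter forms listed above can be told apart by trying the longest candidate first. The following Python is a minimal sketch under that assumption, not a RAN parser: the function name `split_attribute` and the role labels are hypothetical, and quoting and whitespace rules are ignored.

```python
# Hypothetical role labels for the delimiters listed above.
# Longer delimiters come first so "===" is not read as "==" plus "=".
DELIMITERS = [
    ("====", "uuid"),
    ("===", "primary_key"),
    ("==", "key"),
    ("=}", "reference"),
    ("=", "plain"),
]


def split_attribute(text):
    """Split 'name<delimiter>value' into (name, role, value)."""
    pos = text.find("=")
    if pos < 0:
        raise ValueError("no attribute value delimiter found")
    name = text[:pos].strip()
    rest = text[pos:]
    for delimiter, role in DELIMITERS:
        if rest.startswith(delimiter):
            return name, role, rest[len(delimiter):].strip()
    raise ValueError("unreachable: rest always starts with '='")
```

For instance, `split_attribute("child =}[Charles_III Ann]")` yields the name `child`, the role `reference` and the raw value `[Charles_III Ann]`.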

RAN gives a lexical means of representing graph structures in a document without requiring a schema. A RAN document is therefore a sequence of trees overlaid with directed links between and within them. One-to-one, one-to-many, many-to-one and many-to-many links are supported. An element may have multiple links.

A user should reasonably expect that anchors are indexed, that some kind of caching of links is performed, and that multi-factor matching against names in anchors is supported.  Similarly, they should reasonably expect that attributes that are not anchors or targets do not have any special indexing. 

Identifiers

An identifier is a special kind of attribute that identifies an element, for linking. 

  • The special delimiters ==, === and ==== are used.
  • The attribute value may be
    • a literal or name, which are treated equivalently as string values
    • a tuple of names and literals
      • tuples of identifiers are provided specifically to allow users to change the identifier or key system gracefully, not to provide composite keys or identifiers.

Attributes should be ordered in the tag so that UUIDs come before primary identifiers, which come before keys, which come before usual attributes. This limits linear searching for identifiers, e.g. during lazy parsing.
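
The recommended ordering can be expressed as a stable sort on a rank per delimiter. This is a sketch under stated assumptions: the `RANK` table and `order_attributes` are hypothetical names, the id value `pet7` is invented for illustration, and references (=}) are ranked with usual attributes because the text does not say where they belong.

```python
# Assumed ranking: UUIDs, primary identifiers, keys, then usual attributes.
# References (=}) are not mentioned in the ordering rule, so this sketch
# treats them like usual attributes.
RANK = {"====": 0, "===": 1, "==": 2, "=}": 3, "=": 3}


def order_attributes(attributes):
    """Return (name, delimiter, value) triples in the recommended order.

    sorted() is stable, so attributes of equal rank keep their
    original relative order.
    """
    return sorted(attributes, key=lambda attr: RANK[attr[1]])


attrs = [
    ("name", "=", "atom"),
    ("breed", "==", "retriever"),
    ("id", "===", "pet7"),  # hypothetical id value
]
ordered = order_attributes(attrs)
```

With identifiers at the front, a lazy parser can stop scanning a tag as soon as it reaches the first usual attribute.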

Primary Identifier

Stream, Fragment and Scoped start- and end-tags must have one or more values as a UUID or primary identifier in the first position. An element may have a UUID or primary identifier in the first position.

Each UUID or primary identifier for a fragment must be unique within its document; each primary identifier for a scope must be unique within its fragment; each primary identifier for an element must be unique within its scope or fragment. An implementation may create an index using these, for faster access. For example, the following has two primary identifiers, the first of which has an older and a newer format value:

<section id===[sectionV section5] humanId==="The Society of Red Heads" >...
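
An index over primary identifiers for one scope might be built as below. This is only a sketch of one possible approach: `build_primary_index` and `DuplicateIdentifierError` are hypothetical names, elements are represented by plain strings, and every value of an identifier tuple is indexed so that old and new values resolve to the same element.

```python
class DuplicateIdentifierError(ValueError):
    """Raised when a primary identifier value is declared twice in one scope."""


def build_primary_index(elements):
    """Map every primary identifier value in one scope to its element.

    `elements` is an iterable of (element, values) pairs, where `values`
    lists all primary identifier values declared on that element.
    """
    index = {}
    for element, values in elements:
        for value in values:
            if value in index:
                raise DuplicateIdentifierError(value)
            index[value] = element
    return index


# The <section> example: old and new id values plus a human-readable id.
index = build_primary_index([
    ("section", ["sectionV", "section5", "The Society of Red Heads"]),
])
```

Looking up either `sectionV` or `section5` returns the same element, which is the graceful migration that identifier tuples are meant to allow.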

Other attributes may use ==. These are secondary identifiers or keys. There is no requirement for uniqueness. For example, the following has no specific identifier but two secondary identifiers: @characteristics has three tokens as the value, and @breed has one. An implementation may use these values as keys to select elements.

<pet characteristics==[orange friendly agreeable] breed==retriever name="atom" > ...
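
Because keys need not be unique, an index over them maps each value to a list of elements rather than to a single one. The sketch below assumes a hypothetical `build_key_index` helper and represents elements by name. The second pet is invented purely to show a shared key value.

```python
from collections import defaultdict


def build_key_index(elements):
    """Map (attribute name, token) to the list of elements declaring it."""
    index = defaultdict(list)
    for element, keys in elements:
        for attr_name, tokens in keys.items():
            for token in tokens:
                index[(attr_name, token)].append(element)
    return index


pets = [
    ("atom", {"characteristics": ["orange", "friendly", "agreeable"],
              "breed": ["retriever"]}),
    # Hypothetical second pet, added only to show a non-unique key.
    ("biscuit", {"characteristics": ["friendly"], "breed": ["poodle"]}),
]
key_index = build_key_index(pets)
friendly = key_index[("characteristics", "friendly")]
```

Here `friendly` holds both pets, while a lookup on the breed `retriever` returns only the first.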

If an element has a key but no specific identifier, and no other attributes, then an empty specific identifier can be used. In this case no lookup or reference is possible using the empty primary identifier. This would look like this:

<pet name=="" type="dog"> ...

References

A link is a special kind of attribute that identifies one or more referenced elements.

  • The special delimiter =} is used for a normal forward link
  • The attribute value must be
    • a literal, in which case it is the id of element(s) in the current fragment
    • a reference-path, which identifies the fragment key and perhaps other ancestor references
    • a URI
    • a relative path (to a file in the file system)
    • a tuple of literals and/or reference-paths, in which case the link is to or from all the specified elements.
  • The link can be named using the attribute name.

Because reference-paths are used, the link may be a 1:many link, and may have wildcards.

A forward link mainly indicates a semantic connection or, for example, a clickable link.

Example

First we have a fragment x with various anchors specified.

<<<x id==="f1">>>
  <a>
    <b me===b23>
      <c ima==="c34" ></c>
    </b>
  </a>
<<</x id==="f1">>>

Next we have a fragment y which has a link with a reference-path to identify the reference.

<<<y ... >>> 
   <look   overthere=}f1+b23+c34  />   
<<</y ...>>>

Reference Paths

Reference Paths are a new concept. They are like a simple XPath on the identifier names of the current RAN stream, but made with identifier values not element names.

  • A reference-path is a typed value (i.e. not in double quotes) that contains "+" or "~", but not as the first character.
  • A path of x+y+z means "the element with an identifier of 'z', which is any descendant of the element whose identifier is 'y', which is a descendant of the fragment whose identifier is 'x'".
    • + selects the lexically first path. If there are multiple matches, only the first is used.
    • + establishes a 1:1 relation, or link. The name of the relation is the attribute name.
  • A path of x~y~z means "all elements with an identifier of 'z', which are descendants of all elements whose identifier is 'y', which are descendants of the fragment whose identifier is 'x'".
    • ~ establishes a relation to all. If there are multiple matches, all are used.
    • ~ establishes a 1:many relation. The name of the relation is the attribute name.
  • * can be used as a wildcard
    • * in the first position means "all fragments"
    • * in the last position means "all elements with any identifier" (descendant of the preceding path)
  • Identifiers are referenced as tokens.
    • If the identifier was a string literal containing a single token only, that is taken as a token. This is the only instance in RAN where the distinction between tokens and literals is blurred, and it is discouraged. If the identifier was a string literal containing multiple tokens, the first token may be used, and a warning may be issued.
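
The rules above can be sketched as a small resolver over an in-memory tree. This is an illustration under stated assumptions, not a reference implementation: the `Element` class and the `descendants` and `resolve` functions are hypothetical, a path is assumed to use only one kind of separator, "+" is read as "the first complete match in document order", and "*" is handled only in the first and last positions.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    name: str
    ids: tuple = ()  # identifier values declared on the element
    children: list = field(default_factory=list)


def descendants(node):
    """Yield all descendants of node in document (pre-order) order."""
    for child in node.children:
        yield child
        yield from descendants(child)


def resolve(path, fragments):
    """Resolve a reference-path against a list of fragment root Elements."""
    if "+" in path and "~" in path:
        raise ValueError("mixed separators are not handled in this sketch")
    sep = "~" if "~" in path else "+"
    steps = path.split(sep)
    # The first step selects fragments; "*" means all fragments.
    current = [f for f in fragments if steps[0] == "*" or steps[0] in f.ids]
    for i, step in enumerate(steps[1:], start=1):
        last = i == len(steps) - 1
        found, seen = [], set()
        for node in current:
            for d in descendants(node):
                # A trailing "*" matches any element that has an identifier.
                wildcard = step == "*" and last and bool(d.ids)
                if (wildcard or step in d.ids) and id(d) not in seen:
                    seen.add(id(d))
                    found.append(d)
        current = found
    # "+" is 1:1: keep only the lexically first match. "~" keeps all.
    return current[:1] if sep == "+" else current


# Fragment x from the earlier example: f1 contains a > b (b23) > c (c34).
c = Element("c", ("c34",))
b = Element("b", ("b23",), [c])
x = Element("x", ("f1",), [Element("a", (), [b])])
target = resolve("f1+b23+c34", [x])
```

In this sketch `target` is a one-element list holding `c`, and `resolve("f1~*", [x])` returns `b` and `c`, the two descendants that carry an identifier.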

Example

If I have two element types, Woman and Man, in a Fragment with fragment key "Royals", then the following are equivalent links (assuming the identifiers are unique ids):

<Woman id===Elizabeth_II child =}[Charles_III Ann Andrew Edward ]>...</Woman>
<Man   id==="Charles_III">...</Man>
<Woman id===Ann>...</Woman>
<Man   id===Andrew>...</Man>
<Man   id===Edward>...</Man>

and with links from the children back to the parent

<Woman id==="Elizabeth_II" child =} [Charles_III Ann Andrew Edward ] >...</Woman>
<Man   id===Charles_III   parent =}"Elizabeth_II" >...</Man>
<Woman id===Ann           parent =}"Elizabeth_II" >...</Woman>
<Man   id===Andrew        parent =} Elizabeth_II >...</Man>
<Man   id===Edward        parent =} Elizabeth_II >...</Man>

and

<<<Family who==Royals decade=195X >>>
  <Woman id===Elizabeth_II    child =} [Charles_III Ann] >...</Woman>
  <Man   id===Charles_III >...</Man>
  <Woman id===Ann     >...</Woman>
<<</Family who==Royals>>>

<<<Family who==MoreRoyals decade=196X >>>
  <Man    id===Andrew      parent =}Royals+Elizabeth_II >...</Man>
  <Man    id===Edward      parent =}Royals+Elizabeth_II >...</Man>
<<</Family who==MoreRoyals>>>

To explain this last example:  we have a fragment for a Family (dated to the 1950s with an ISO 8601 wildcard) where Elizabeth_II has child links to Charles and Ann.  Then we append another fragment (dated to the 1960s with a wildcard)  for Andrew and Edward: these link to Elizabeth_II.

It is a dynamic reportable error if a link dangles.
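
Since further fragments can be appended to a stream, a dangling link can presumably only be detected when the link is resolved against the identifiers seen so far. The check below is a sketch under that assumption. `find_dangling` is a hypothetical helper, and links are modelled as (source, link name, target) triples.

```python
def find_dangling(links, known_ids):
    """Return the (source, link name, target) triples whose target is unknown."""
    return [link for link in links if link[2] not in known_ids]


# Identifiers declared so far, as in the first Family fragment.
known_ids = {"Elizabeth_II", "Charles_III", "Ann"}
links = [
    ("Elizabeth_II", "child", "Charles_III"),
    ("Elizabeth_II", "child", "Ann"),
    ("Elizabeth_II", "child", "Andrew"),  # Andrew is not declared yet
]
dangling = find_dangling(links, known_ids)
```

Here only the link to Andrew is reported. Once a later fragment declares that identifier, rerunning the check with the enlarged set reports nothing.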

Footnotes

18 Implementation Note: Represent XML namespaces using an IP in the document head with name ns and attributes @prefix and @iri. Represent links to schemas, code objects and stylesheets using IPs. Put version and tracing metadata in an IP tag. Metadata such as MIME metadata or Dublin Core may be exposed as IPs.