Converting XML Schemas to Schematron: (#6) Validating IDREFs

This article first appeared in a blog on O'Reilly on November 2, 2007

There are three rules concerning documents with xs:ID and xs:IDREF.

First, their values must be tokens that follow the XML naming rules. We already check this as part of the simple type checking. (The empty string is not allowed.)

Second, no attribute of type ID can have the same value as another attribute of type ID.

Third, for every attribute of type IDREF there must be an attribute of type ID with the same value. (There can be multiple IDREFs with the same value, but only one ID with that value.) That is what this entry is about.

Here is how to check IDREFs. First, we make a variable collecting the names of all attribute declarations of type IDREF. Then we make a second variable containing the names of all attribute declarations of type ID. Finally, we make a third variable with just the distinct ID attribute names, to make life easier. (There are other ways to do this, of course.)

<!-- Names of all attribute declarations typed xs:IDREF, sorted by name -->
<xsl:variable name="idref-list">
  <root>
    <xsl:for-each select="//xs:attribute[@name][@type='xs:IDREF']">
      <xsl:sort select="@name"/>
      <idref><xsl:value-of select="@name"/></idref>
    </xsl:for-each>
  </root>
</xsl:variable>

<!-- Names of all attribute declarations typed xs:ID, sorted by name -->
<xsl:variable name="id-list">
  <root>
    <xsl:for-each select="//xs:attribute[@name][@type='xs:ID']">
      <xsl:sort select="@name"/>
      <id><xsl:value-of select="@name"/></id>
    </xsl:for-each>
  </root>
</xsl:variable>

<!-- id-list is sorted, so duplicates are adjacent: keep the first of each run -->
<xsl:variable name="id-distinct-list">
  <root>
    <xsl:for-each select="$id-list/root/id">
      <xsl:if test="position() = 1 or . != preceding-sibling::id[1]">
        <id><xsl:value-of select="."/></id>
      </xsl:if>
    </xsl:for-each>
  </root>
</xsl:variable>
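To make the contents of these variables concrete, here is a small worked example. The schema fragment is hypothetical: the element names (item, note, section) and attribute names (id, key, ref) are made up for illustration and do not come from any real schema.

```xml
<!-- Hypothetical input schema; all names are illustrative assumptions -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="item">
    <xs:complexType>
      <xs:attribute name="id" type="xs:ID"/>
      <xs:attribute name="ref" type="xs:IDREF"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="note">
    <xs:complexType>
      <xs:attribute name="id" type="xs:ID"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="section">
    <xs:complexType>
      <xs:attribute name="key" type="xs:ID"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

For that input, the three variables should hold the following trees (shown as fragments, ignoring whitespace):

```xml
<!-- $idref-list -->
<root><idref>ref</idref></root>

<!-- $id-list: "id" appears twice because two declarations use that name -->
<root><id>id</id><id>id</id><id>key</id></root>

<!-- $id-distinct-list: adjacent duplicates removed -->
<root><id>id</id><id>key</id></root>
```

Note that the lists hold attribute names only, so the check that follows works per attribute name rather than per element declaration.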

Now that we have all our input data nicely available in variables, generating IDREF rules is easy. For each attribute that can contain an IDREF, we check its value against every attribute that can contain an ID. (This would be better factored out into an abstract rule, but this version is easier to read.)

<!-- One rule per distinct IDREF attribute name (idref-list is sorted, so duplicates are adjacent) -->
<xsl:for-each select="$idref-list/root/idref">
  <xsl:if test="position() = 1 or . != preceding-sibling::idref[1]">
    <sch:rule context="*/@{.}">
      <sch:assert>
        <!-- Build the test as "//@a = . or //@b = . ...", one term per ID attribute name -->
        <xsl:attribute name="test">
          <xsl:for-each select="$id-distinct-list/root/id">
            <xsl:text>//@</xsl:text>
            <xsl:value-of select="."/>
            <xsl:text> = . </xsl:text>
            <xsl:if test="position() != last()"> or </xsl:if>
          </xsl:for-each>
        </xsl:attribute>
        Element <sch:name/> 's IDRef hasn't been found. IDRef: <sch:value-of select="."/>.
      </sch:assert>
    </sch:rule>
  </xsl:if>
</xsl:for-each>
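As an illustration of what this template emits, suppose the schema declares ID attributes named id and key and a single IDREF attribute named ref (all three names are hypothetical). The generated Schematron should then look roughly like the sketch below. The enclosing sch:pattern and the ISO Schematron namespace URI are assumptions, since the listing above does not show where the rules are placed, and whitespace inside the test attribute has been tidied.

```xml
<!-- Sketch of the generated output; wrapper element and namespace are assumptions -->
<sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:rule context="*/@ref">
    <!-- One "//@name = ." term per distinct ID attribute name, joined by "or" -->
    <sch:assert test="//@id = . or //@key = .">
      Element <sch:name/> 's IDRef hasn't been found. IDRef: <sch:value-of select="."/>.
    </sch:assert>
  </sch:rule>
</sch:pattern>
```

Under this rule, an instance containing ref="x9" passes only if some attribute named id or key anywhere in the document has the value x9.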

You can get an idea from this how the ID uniqueness checking could be generated. The xs:key/xs:keyref and xs:unique constraints in XSD already use XPath, and don’t depend on types, so they should also be straightforward to integrate.
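As a rough sketch of the ID uniqueness check mentioned above, a generator could emit one rule covering every distinct ID attribute name and assert that each value occurs exactly once among all of them. This is only one possible shape, not necessarily what the converter produces. The attribute names id and key are hypothetical, the namespace URI assumes ISO Schematron, and the use of current() assumes the XSLT query binding.

```xml
<!-- Hypothetical generated output for ID attributes named "id" and "key" -->
<sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:rule context="*/@id | */@key">
    <!-- current() is the attribute being checked; it must be the only ID with that value -->
    <sch:assert test="count((//@id | //@key)[. = current()]) = 1">
      ID value <sch:value-of select="."/> is used more than once.
    </sch:assert>
  </sch:rule>
</sch:pattern>
```

The context and the union inside count() can both be built from the same distinct ID list used for the IDREF rules, joining the names with " | " instead of " or ".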