Can I assert patterns in Java Objects with Schematron?

Posted on August 30, 2017 by Rick Jelliffe

Schematron has been useful over the years for detecting patterns in documents.  It is a simple expert system (multiple if-then-else chains): you capture the constraints in human language, then implement the tests using XPaths that provide a context and assert or report on things that should be true at that context.  Other language frameworks have nothing really like it.

But if Schematron is useful for XML documents, why mightn’t it be useful for programming languages?  Let’s take Java.  One approach would be to make a Query Language Binding for Java: that means using the Schematron elements, but allowing Java statements where you usually find XPaths, and running it on some class in JUnit fashion.  The new Java REPL sub-language (JShell) might be appropriate.

But another approach, perhaps less work, would be to use JAXB, the Java Architecture for XML Binding.  You annotate your Java code with the appropriate annotations, such as @XmlElement or @XmlAttribute, on classes and fields. You can wrap sequences, exclude some fields, or convert lists into space-separated attribute values.

JAXB has been part of Java for about a decade and a half now. One of its major developers is one of the great hero software developers, Kohsuke Kawaguchi, who was everywhere in Sun’s early XML implementations, including implementing RELAX NG and Schematron! A very smart cookie, he is perhaps best known now for the Jenkins (Hudson) Continuous Integration Server, among other Open Source work.

Update: JAXB is now a deprecated API and will be removed from Java SE at some point.  It is still available in Java 10.

Once the class is annotated, you can press a button and generate an XML Schema, which you can then throw away in righteous indignation. Instead, you press another button and “marshal” the data into an XML file, which you can then validate with Schematron.

So why might you want to do this? Because XPaths are so great for navigating around large and complex semi-structured documents. If you try to do this in Java, you potentially need to work out how to iterate over each class, which is a headache since, for example, dot notation is not used to access properties in JavaBeans. You also need to be able to travel along the upward or reverse axes, which is just not possible: when a field contains an object, you cannot ask the object which fields and structures it has been assigned to, even with the Reflection API (as I understand it).
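To make the point about reverse axes concrete, here is a minimal sketch that uses only the XPath API shipped with the JDK (javax.xml.xpath). The order/customer/city document and the class name ReverseAxisSketch are made up for illustration; the document simply stands in for whatever your objects marshal to.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class ReverseAxisSketch {

    // Hypothetical document standing in for a marshalled object graph.
    static final String XML =
          "<order id='o1'>"
        + "<customer name='Fred'>"
        + "<address><city>Sydney</city></address>"
        + "</customer>"
        + "</order>";

    /** Evaluates an XPath against XML and returns the result's string value. */
    static String eval(String xpath) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            return xp.evaluate(xpath, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Bad XPath: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Start at the city and walk upward to the objects that contain it.
        System.out.println(eval("//city/ancestor::customer/@name")); // prints "Fred"
        System.out.println(eval("//city/ancestor::order/@id"));      // prints "o1"
    }
}
```

In plain Java, the object behind the city would hold no reference back to the customer or the order that contains it, so the same question would need extra bookkeeping.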

Let’s steal an example of the JAXB annotations to give you a feel for the work involved: the JAXB annotations are the lines beginning with @. Here is a Java class:

package example;

import java.util.List;
import javax.xml.bind.annotation.*;

@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
public class Customer {

   // Wraps the repeated <email-address> elements in one <email-addresses> element
   @XmlElementWrapper(name="email-addresses")
   @XmlElement(name="email-address")
   private List<String> emailAddresses;

   ...
}

And here is the kind of XML that would be generated. You should not rely on elements being in particular positions unless your annotations specify them: check the documentation (or the XSD which you have just thrown away).

<customer>
   <email-addresses>
      <email-address>fred@somewhere.com</email-address>
      <email-address>freda@somewhereelse.com</email-address>
   </email-addresses>
</customer>
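Before bringing in a real Schematron implementation, it can help to see the shape of a rule in plain Java. The following is only a sketch using the JDK's own DOM and XPath APIs, not a Schematron processor: one XPath selects the context nodes, and a second XPath is asserted at each of them. The class name EmailRuleSketch, the "must contain @" constraint and the message text are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EmailRuleSketch {

    // XML of the kind shown above.
    static final String XML =
          "<customer><email-addresses>"
        + "<email-address>fred@somewhere.com</email-address>"
        + "<email-address>freda@somewhereelse.com</email-address>"
        + "</email-addresses></customer>";

    /** Returns one message per context node that fails the assertion. */
    static List<String> check(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xp = XPathFactory.newInstance().newXPath();

            // The rule context: every email-address inside the wrapper element.
            NodeList contexts = (NodeList) xp.evaluate(
                "//email-addresses/email-address", doc, XPathConstants.NODESET);

            List<String> failures = new ArrayList<>();
            for (int i = 0; i < contexts.getLength(); i++) {
                // The assertion, evaluated relative to the context node.
                Boolean holds = (Boolean) xp.evaluate(
                    "contains(., '@')", contexts.item(i), XPathConstants.BOOLEAN);
                if (!holds) {
                    failures.add("An email address should contain '@': "
                        + contexts.item(i).getTextContent());
                }
            }
            return failures;
        } catch (Exception e) {
            throw new IllegalStateException("Could not check document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(check(XML)); // prints "[]": both addresses pass
    }
}
```

A Schematron schema expresses the same context-plus-assertion pair declaratively, with the human-language message attached to the assertion.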