Alexander Schwartzman has written a good article summarizing the lessons learned from using Schematron and DTDs together over multiple years for a non-trivial DTD.
JATS Subset and Schematron: Achieving the Right Balance, from the Journal Article Tag Suite Conference 2017, is now online.
Alexander is mainly concerned about whether you should subset a standard DTD or instead use Schematron rules to point out deprecated elements, as a second layer. His thoughts would apply just as much to RELAX NG and XSD, I think.
He gives many examples where Schematron is clearly the better approach, and otherwise comes down in favour of using DTDs (grammars) for quasi-static constraints and Schematron for quasi-dynamic constraints: you upgrade the DTD rarely and with attention to ramifications, and you upgrade the Schematron as often as you find something new to check. This seems a very workable approach, and is probably at heart an application of Conway’s Law. (But if his quasi-static versus quasi-dynamic demarcation holds water, does that mean that XSD 1.1 style assertions miss the mark, since they are appropriate for quasi-static constraints only?)
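To make the layering concrete, here is a rough Python sketch of the idea. It is not Schematron and not Alexander's implementation, just a stand-in for the two layers. The element names are taken from JATS, but the subset itself, the deprecation of underline, the unknown blink element in the sample, and the validate function are all assumptions made up for illustration. The set plays the role of the rarely-changed grammar subset, and the dictionary plays the role of the frequently-updated rule layer.

```python
import xml.etree.ElementTree as ET

# Layer 1 (quasi-static): elements the hypothetical grammar subset permits.
# In practice this would be the DTD; it changes rarely.
ALLOWED_ELEMENTS = {"article", "front", "body", "sec", "title", "p",
                    "bold", "italic", "underline"}

# Layer 2 (quasi-dynamic): house rules that are updated often.
# Here: elements the grammar still permits but the house style
# discourages (an invented example, not a real JATS deprecation).
DEPRECATED = {"underline": "house style prefers italic"}


def validate(xml_text):
    """Return (errors, warnings) for one document.

    Errors come from the grammar layer, warnings from the rule layer.
    """
    errors, warnings = [], []
    for element in ET.fromstring(xml_text).iter():
        if element.tag not in ALLOWED_ELEMENTS:
            errors.append(f"<{element.tag}> is not permitted by the subset")
        elif element.tag in DEPRECATED:
            warnings.append(
                f"<{element.tag}> is deprecated: {DEPRECATED[element.tag]}")
    return errors, warnings


if __name__ == "__main__":
    sample = ("<article><body><p>a <underline>b</underline> "
              "<blink>c</blink></p></body></article>")
    errors, warnings = validate(sample)
    print("errors:", errors)
    print("warnings:", warnings)
```

Adding a new house rule only touches the dictionary, while the set of allowed elements stays put; that is the division of labour the article argues for, with the second layer reporting problems rather than redefining the grammar.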
Alexander also makes a strong point that subsetting the DTD to only the elements that you actually need can reduce the number and complexity of the Schematron rules too.
What is perhaps the most interesting aspect of the article is that it is, in a sense, a follow-up to a paper presented seven years earlier at the same conference, in 2010: Superset Me—Not: Why the Journal Publishing Tag Set Is Sufficient if You Use Appropriate Layer Validation, which has the abstract:
This paper relates the experience of a publisher who chose to create a superset of the NLM Journal Publishing Tag Set in order to enforce business rules, data types, and house style and, having done just that, realized that a subset could have been sufficient to meet the publisher’s needs if it were used in conjunction with the appropriate layer validation technology, such as Schematron.
So we have a good history of the publisher’s experience: first the issues they found by extending a DTD, and the realization that a validation layer would have been better; then the finding that the validation layer works better still when the DTD is subsetted further, which avoids maintenance effort.
Of course, every project has a unique story. But we ignore lessons learned at our (projects’) peril.