Where is there an end of it? | All posts tagged 'conformance'

Notes on Document Conformance and Portability #4

In my last post I wrote about the reaction to Microsoft's ODF support in the recent service pack released for their Office 2007 product, and in particular how claims of its "non-conformance" seemed ill-founded. Now, to look a little deeper at the conformance question, I will use an XML pipeline to validate some would-be ODF documents, to get a clear-sighted and spin-free look at what the state of ODF conformance really is.

XML Pipelines: The Next Big Thing

For many years pipelines have been recognised as something the XML community badly needed. Eager markup geeks would seek out Sean McGrath or Uche Ogbuji to hear miraculous tales of how XML pipelines could be put to work; some bold experimenters would try to coerce technologies like Apache Ant into action, and some pioneers would even specify and implement their own pipelining languages – witness, for example, Eric van der Vlist's xvif, or maybe XPL, which happily sits at the heart of the awesome Orbeon Forms framework.

Now however, the W3C is on the cusp of finalising its XProc language and this looks set to bring pipelines into the mainstream. I am convinced that XProc is the most significant specification from the W3C since XSL, and fully expect it to become as pervasive in all XML shops.

So what are pipelines? Well, as we know, XML processing models can be described as conforming to the model: "in; out; shake it all about". The "in" bit is catered for by XML storage technologies (eXist maybe), and the "out" bit is catered for by web servers; XProc is for the "shake it all about" bit, where, with XSLT, it will become the engine of many an XML process. XSLT is great for transforms but less convenient for a number of day-to-day things we routinely want to do with XML: validating, stripping elements, renaming attributes, glomming together, splitting up ... Essentially, pipelines are for doing stuff to XML in a step-by-step way, but without the overhead of a full-on programming language, since XProc pipelines are written using nice, declarative XML.

Pipelines and Office Documents

One of these typical "day to day" tasks is validating XML inside ZIPs. Both ODF and OOXML resources are not simply XML documents, but "packages" (ZIP archives) of content which include several XML documents. So to perform a full validation, we need to visit the XML resources in the package and validate them all against their governing schemas to get an overall validation result. This is exactly the sort of scenario where XML pipelines can help.
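To make the "XML inside ZIPs" idea concrete before turning to XProc, here is a minimal sketch in Python (standard library only). It is not part of the pipeline described below, and it only checks that each XML member of a package is well-formed; real validation against a RELAX NG schema needs a schema-aware processor. The function names and the demo package contents are my own inventions for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def check_package_xml(package_bytes):
    """Map each .xml member of a ZIP package to True if it is well-formed XML."""
    results = {}
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for name in package.namelist():
            if not name.endswith(".xml"):
                continue  # skip mimetype, images and other non-XML members
            try:
                ET.fromstring(package.read(name))
                results[name] = True
            except ET.ParseError:
                results[name] = False
    return results


def make_demo_package():
    """Build a tiny, hypothetical package in memory (not a real ODF file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.spreadsheet")
        package.writestr("content.xml", "<doc><p>ok</p></doc>")
        package.writestr("styles.xml", "<doc><p>unclosed</doc>")
    return buffer.getvalue()


if __name__ == "__main__":
    for member, ok in check_package_xml(make_demo_package()).items():
        print(member, "well-formed" if ok else "NOT well-formed")
```

The shape is the same as the pipeline's: open the archive, visit each XML resource, and collect a per-resource verdict into an overall result.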

A Walk Through

I am going to describe an XML pipeline for performing ODF validation using Calabash, a FOSS (GPL v2) implementation of XProc for the JVM written by Norm Walsh (the XProc WG chair). I'm not going to cover the absolute basics; for those (and more) consult some of the excellent material on XProc already appearing on the web.

We start, immediately after the root element, with a couple of "option" elements. These allow values to be passed in from the outside. In our case, we need the name of the package we want to validate ...

<?xml version="1.0"?>
<pipeline name="validate-odf" xmlns="http://www.w3.org/ns/xproc"
  xmlns:cx="http://xmlcalabash.com/ns/extensions"
  xmlns:mf="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
  xmlns:o="urn:oasis:names:tc:opendocument:xmlns:office:1.0">

  <!-- the URL of the package to be validated must be supplied by the caller -->
  <option name="package-url" required="true"/>

  <!-- whether to enforce use of the ISO/IEC 26300 schema -->
  <option name="force-26300-validation" select="'false'"/>

Next we import some extensions. Like XSLT, XProc is designed to be extensible, and additional libraries of steps are already becoming available. Calabash ships with a handy extension step for ZIP extraction, which we are going to need.

  <!-- we use the Calabash extension in this library for looking inside ZIP files -->
  <import href="extensions.xpl"/>

Now we start the processing proper. This next step uses the ZIP extraction mechanism to pull the "manifest.xml" document out of the archive and outputs that XML for onward processing.

  <!-- emits the package manifest -->
  <cx:unzip file="META-INF/manifest.xml">
    <with-option name="href" select="$package-url"/>
  </cx:unzip>

As a sanity check, we are going to make sure that this manifest actually conforms to the ODF manifest schema. I made this schema by manually extracting it from the ODF 1.1 specification (here referred to as "odf-manifest.rng"). As you can see, XProc makes this kind of document validation a cinch:

  <!-- validate the manifest against the manifest schema -->
  <cx:message message="Validating manifest ..."/>
  <validate-with-relax-ng assert-valid="false">    
    <input port="schema">
      <document href="odf-manifest.rng"/>
    </input>
  </validate-with-relax-ng>

[Update: I have added an @assert-valid="false" attribute here, as this is just a 'sanity check']

Now we start to visit the individual documents in the package referenced by the manifest. This is done here using the viewport step, which offers a kind of "keyhole surgery" option allowing us to isolate bits of a document. Here we're interested in all the <file-entry> elements in the manifest which (1) have a media type of "text/xml" and (2) aren't residing in the "META-INF" folder itself.

  <!-- visit each file entry in the manifest which targets an XML resource -->
  <viewport name="handle"
    match="mf:file-entry[@mf:media-type='text/xml'
    and not(starts-with(@mf:full-path,'META-INF'))]">
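For readers who find the match pattern above hard to read, the same selection can be sketched in Python with the standard library. This is only an illustration of what the XPath predicate selects, not something the pipeline uses; the function name and the demo manifest are assumptions of mine.

```python
import xml.etree.ElementTree as ET

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def xml_entries(manifest_xml):
    """Return full-path values of text/xml entries that live outside META-INF."""
    root = ET.fromstring(manifest_xml)
    paths = []
    for entry in root.iter("{%s}file-entry" % MANIFEST_NS):
        media_type = entry.get("{%s}media-type" % MANIFEST_NS)
        full_path = entry.get("{%s}full-path" % MANIFEST_NS, "")
        # mirrors: @mf:media-type='text/xml' and not(starts-with(@mf:full-path,'META-INF'))
        if media_type == "text/xml" and not full_path.startswith("META-INF"):
            paths.append(full_path)
    return paths


# A hypothetical manifest, made up for this example.
DEMO_MANIFEST = """<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">
  <manifest:file-entry manifest:media-type="application/vnd.oasis.opendocument.spreadsheet" manifest:full-path="/"/>
  <manifest:file-entry manifest:media-type="text/xml" manifest:full-path="content.xml"/>
  <manifest:file-entry manifest:media-type="text/xml" manifest:full-path="styles.xml"/>
  <manifest:file-entry manifest:media-type="text/xml" manifest:full-path="META-INF/extra.xml"/>
  <manifest:file-entry manifest:media-type="image/png" manifest:full-path="Pictures/a.png"/>
</manifest:manifest>"""

if __name__ == "__main__":
    print(xml_entries(DEMO_MANIFEST))
```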

For each of these <file-entry> elements, a @full-path attribute specifies the name of an XML resource in the ZIP; again we use the unzip step to pull each of these XML documents from the archive:

    <!-- assume paths are relative to package base, and extract the XML resource -->
    <cx:unzip name="get-validation-candidate">
      <with-option name="href" select="$package-url"/>
      <with-option name="file" select="/*/@mf:full-path"/>
    </cx:unzip>

Once we've grabbed an XML resource, we need to work out which schema to use to validate it. Generally this can be done by looking at a @version attribute on the root element. However, ODF does not make this mandatory and so implementations are free to omit it. ODF specifies no fall-back rules, so we need to invent our own. What I've done here is to use the version specified, but fall back to the most recent published standard (1.1) when it is not specified.

    <!-- emits the RELAX NG schema that corresponds to the ODF version -->
    <choose name="get-relax-ng-schema">
      <when test="$force-26300-validation='true' or /*/@o:version='1.0'">
        <cx:message message="Validating with v1.0 schema ..."/>
        <load href="OpenDocument-schema-v1.0.rng"/>
      </when>
      <when test="/*/@o:version='1.2'">
        <cx:message message="Validating with draft v1.2 schema ..."/>
        <load href="OpenDocument-schema-v1.2-cd01-rev05.rng"/>
      </when>
      <otherwise>
        <cx:message message="Validating with v1.1 schema ..."/>        
        <load href="OpenDocument-schema-v1.1.rng"/>        
      </otherwise>
    </choose>
    <identity name="the-schema"/>
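The fall-back rule expressed by the choose step can be restated as a small Python function. This is a sketch of the decision logic only, under the assumption that the schema file names are the ones used in the pipeline; the function name and the `force_26300` parameter name are mine.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

SCHEMA_1_0 = "OpenDocument-schema-v1.0.rng"
SCHEMA_1_1 = "OpenDocument-schema-v1.1.rng"
SCHEMA_1_2_DRAFT = "OpenDocument-schema-v1.2-cd01-rev05.rng"


def choose_schema(document_xml, force_26300=False):
    """Pick a schema file name from the office:version attribute on the root element."""
    root = ET.fromstring(document_xml)
    version = root.get("{%s}version" % OFFICE_NS)
    if force_26300 or version == "1.0":
        return SCHEMA_1_0
    if version == "1.2":
        return SCHEMA_1_2_DRAFT
    # version absent or unrecognised: fall back to the latest published standard
    return SCHEMA_1_1


def demo_document(version_attribute=""):
    """Build a minimal root element, optionally carrying an office:version attribute."""
    return '<office:document-content xmlns:office="%s"%s/>' % (OFFICE_NS, version_attribute)


if __name__ == "__main__":
    print(choose_schema(demo_document()))
    print(choose_schema(demo_document(' office:version="1.2"')))
```

Note that the fall-back to 1.1 is this post's own invention, since ODF itself specifies no rule for documents that omit the version.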

So now we have the document to validate, and the schema to use. We simply need to apply one to the other:

    <!-- and: validates the candidate against the schema -->
    <validate-with-relax-ng>
      <input port="schema">
        <pipe step="the-schema" port="result"/>
      </input>
      <input port="source">
        <pipe step="get-validation-candidate" port="result"/>
      </input>
    </validate-with-relax-ng>

  </viewport>
</pipeline>

Et voilà, a complete pipeline for validating ODF instances. Running it against packages which contain invalid XML will cause the pipeline processor to halt and report a dynamic error, for that is the default behaviour of the validate-with-relax-ng step.

Since ODF is clear that invalid XML signals non-conformance to the spec, we know that any package which fails this pipeline is, beyond argument, non-conformant.

Running It

Rob Weir helpfully provided a ZIP of the spreadsheets used for his Maya's Wedding Planner piece. Consult his blog entry for details of how these documents were produced. Putting these 7 test files through our pipeline we get this result:

Producer                   FAIL    PASS
---------------------------------------
Google                      X
KSpread                              X
Symphony                    X
OpenOffice                  ? *
Sun Plugin                  ? *
CleverAge                            X
MS Office 2007 SP2                   X
---------------------------------------
* See update below

So, Why the Failures?

  • Google failed because for some bizarre reason the manifest.xml document in its package specified a document type declaration referring to a non-existent "Manifest.dtd"; the processor cannot find this DTD and aborts with an IO Exception.
  • Symphony failed because its styles.xml document contained a date-value of "0-00-00". This fails to match the datatyping rules the ODF 1.1 schema uses to police date values.
  • OpenOffice failed because its manifest was not valid to the 1.1 schema. Now, this is an odd result as the manifest claims to be valid to version "1.2" of the ODF schema, yet consulting the latest drafts of ODF 1.2 it appears the manifest schema is not defined there, but has been planned for being specified in a new "Part 3" of ODF. I cannot find Part 3 of ODF in draft – maybe the OOo code has been written, but the standards text not fitted to it yet. If somebody can point me to a public draft of this schema, I'd like to re-run this test. [Update: I have now been pointed at the draft of Part 3 of ODF 1.2, and it does indeed contain a new schema. This draft is unfinished and contains no conformance clause, so it is not really possible to know for sure whether a package conforms to it. However, the OOo package here is invalid to the schema. I am going to assume that Part 3 will mirror the draft of Part 1 of ODF 1.2, and so will require schema validity. On that (reasonable) basis this OOo package is non-conformant; but of course the draft might change tomorrow. We do not know quite what version of the spec is being targeted here ...]
  • The Sun Plugin also failed because its manifest uses a @manifest:version attribute which the 1.1 schema does not declare. Again, maybe this is valid to some draft schema I have not seen, but it certainly does not conform to any published version of ODF. As above, if I can get a new schema I can re-run the test. [Update: see bullet above, it's the same here]
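To see why a value like "0-00-00" is rejected, here is a rough Python approximation of a date check. It is a deliberately simplified sketch and not the schema's actual datatype implementation: I am assuming a plain four-digit-year calendar date of the form YYYY-MM-DD, and ignoring time zones, negative years and years with more than four digits.

```python
import datetime
import re

# Simplified lexical form: four-digit year, two-digit month, two-digit day.
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def looks_like_date(value):
    """Return True if value has the YYYY-MM-DD shape and names a real calendar day."""
    match = DATE_PATTERN.fullmatch(value)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        datetime.date(year, month, day)
    except ValueError:
        return False  # e.g. month 00, day 00, or 30 February
    return True


if __name__ == "__main__":
    for candidate in ("0-00-00", "2009-05-12", "2009-02-30"):
        print(candidate, looks_like_date(candidate))
```

The Symphony value fails on two counts under this sketch: the year is too short for the lexical form, and there is no month or day numbered zero.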

Conclusions

There has been a lot of spin in the blogosphere about who is, and who is not, supporting ODF at the moment. This validation test focusses on a small but important area of that discussion: conformance. One of the reasons it is important is that it is testable. From the test above we have the hard fact that most of the mainstream ODF applications are failing to emit standards-conformant ODF, even for a case as simple as "Maya's Wedding Planner". Surprisingly, when assessing conformance, it appears KOffice, Microsoft and CleverAge are leading the conformance pack, while Sun, Google and IBM have fallen behind.

To me this merely goes to confirm one of the fundamental dynamics of standardisation: done right, standards wrench "ownership" from those who thought they owned them, and distribute that ownership through the community at large. We, as users, should be applauding the widening adoption of ODF - and should be keeping the pressure on those vendors that seem to have been left behind, to raise their games.

Notes on Document Conformance and Portability #3

Now that the furore about Microsoft’s implementation of ODF spreadsheet formulas in Office SP2 has died down a little, it is perhaps worth taking a little time to have a calm look at some of the issues involved.

Clearly, this is an area where strong commercial interests are in play, not to mention an element of sometimes irrational zeal from those who consider themselves pro or anti (mostly anti) Microsoft.

One question is whether Microsoft did “The Right Thing” by users in choosing to implement formulas the way they did. This is certainly a fair question and one over which we can expect there to be some argument.

The fact is that Microsoft’s implementation decision means that, on the face of it, they have produced an implementation of ODF which does not interoperate with other available implementations. Thus IBM blogger Rob Weir can produce a simple (possibly simplistic) spreadsheet, “Maya’s Wedding Planner”, and use it to illustrate, with helpful red boxes for the slow-witted, that Microsoft’s implementation is a “FAIL” attributable to “malice or incompetence”. For good measure he also takes a side-swipe at Sun for their non-interoperable implementation. In this view, interoperability aligning with IBM’s Symphony implementation is – unsurprisingly – presented as optimal (in fact, you can hear the sales pitch from IBM now: “well, Mr government procurement officer, looks like Sun and MS are not interoperable, you won’t want these other small-fry implementations, and Google’s web-based approach isn’t suitable – so looks like Symphony is the only choice …”)

Microsoft have argued back, of course, most strikingly in Doug Mahugh’s 1 + 2 = 1? blog posting, which appears to present some real problems with basic spreadsheet interoperability among ODF products using undocumented extensions. The MS argument is that practical ODF interoperability is a myth anyway, and so supporting it meaningfully is not possible (in fact, you can hear the sales pitch from MS now: “well, Mr government procurement officer, looks like ODF is dangerously non-interoperable: here, let me show you how IBM and Sun can’t even agree on basic features; but look, we’ve implemented ISO standard formulas, so we alone avoid that – and you can assess whether we’re doing what we claim – looks like MS Office is the only choice …”)

Personally, I think MS have been disappointingly petty in abandoning the “convention” that the other suites more or less use. I accept that these ODF implementations have limited interoperability and are unsafe for any mission-critical data, but for the benefit of the “Maya’s Wedding Planner” type of scenario, where ODF implementations can actually cut it, I think MS should have included this legacy support as an option, even if they did have to qualify that support with warning dialogs about data loss and interoperability issues.

But - vendors are vendors; it is their very purpose to compete in order to maximise their long-term profits. Users don’t always benefit from this. We really shouldn’t be surprised that we have IBM, Sun and Microsoft in disagreement at this point.

What we should be surprised about is how this interoperability fiasco has been allowed to happen within the context of a standard. To borrow Rick Jelliffe’s colourfully reported words, the whole purpose of shoving an international standard up a vendor’s backside is to get them to behave better in the interests of the users. What has gone wrong here is in the nature of the standard itself. ODF offers an extremely weak promise of interoperability, and the omission of a spreadsheet formula specification in ODF 1.1 is merely one of the more glaring facets of this problem. As XML guru James Clark wrote in 2005:

I really hope I'm missing something, because, frankly, I'm speechless. You cannot be serious. You have virtually zero interoperability for spreadsheet documents.

To put this spec out as is would be a bit like putting out the XSLT spec without doing the XPath spec. How useful would that be?

It is essential that in all contexts that allow expressions the spec precisely define the syntax and semantics of the allowed expressions.

These words were prophetic, for we do indeed now face a reality of zero interoperability.

The good news is that work is underway to fix this problem: ODF 1.2 promises, when it eventually appears, to specify formulas using the new OpenFormula specification. When that is published vendors will cease to have an excuse to create non-interoperable implementations, at least in this area.

Is SP2 conformant?

Whether Microsoft’s approach to ODF was the wisest is something over which people may disagree in good faith. Whether their approach conforms to ODF should be a neutral fact we can determine with certainty.

In a follow-up posting to his initial blast, Rob Weir sets out to show that Microsoft’s approach is non-conformant, subsequent to his previous statement that “SP2's implementation of ODF spreadsheets does not, in fact, conform to the requirements of the ODF standard”. After quoting a few selected extracts from the standard, a list is presented showing how various implementations represent a formula:

  • Symphony 1.3: =[.E12]+[.C13]-[.D13]
  • Microsoft/CleverAge 3.0: =[.E12]+[.C13]-[.D13]
  • KSpread 1.6.3: =[.E12]+[.C13]-[.D13]
  • Google Spreadsheets: =[.E12]+[.C13]-[.D13]
  • OpenOffice 3.01: =[.E12]+[.C13]-[.D13]
  • Sun Plugin 3.0: [.E12]+[.C13]-[.D13]
  • Excel 2007 SP2: =E12+C13-D13

Rob writes, “I'll leave it as an exercise to the reader to determine which one of these seven is wrong and does not conform to the ODF 1.1 standard.”

Again, this is clearly aimed at the slow-witted. One can imagine even the most hesitant pupil raising their hand, “please Mr Weir, is it Excel 2007 SP2?” Rob, however, is too smart to answer the question himself, and anybody who knows anything of ODF will know that, in fact, this is a tricky question.

Accordingly, Dennis Hamilton (ODF TC member and secretary of the ODF Interoperability and Conformance TC) soon chipped in among the blog comments to point out that ODF’s description of formulas is governed by the word “Typically”, rendering it arguably just a guideline. And, as I pointed out in my last post, it is certainly possible to read ODF as a whole as nothing more than a guideline.

(I am glad to be able to report that the word “typically” has been stripped from the draft of ODF 1.2, indicating its existence was problematic.)

Curious readers might like to look for themselves at the (normative) schema for further guidance. Here, we find the formal schema definition for formulas, with a telling comment:

<define name="formula">
  <!-- A formula should start with a namespace prefix, -->
  <!-- but has no restrictions-->
  <data type="string"/>
</define>

Which is yet another confirmation that there are no certain rules about formulas in ODF.
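The gap between what the schema checks and what the "convention" expects can be illustrated with a few lines of Python. This is only a sketch: the formula strings are the seven listed above exactly as quoted, `schema_accepts` stands in for the schema's "any string" datatype, and the two convention checks (a leading "=" and bracketed cell references) are my own rough characterisation, not rules taken from the specification.

```python
import re

FORMULAS = {
    "Symphony 1.3": "=[.E12]+[.C13]-[.D13]",
    "Microsoft/CleverAge 3.0": "=[.E12]+[.C13]-[.D13]",
    "KSpread 1.6.3": "=[.E12]+[.C13]-[.D13]",
    "Google Spreadsheets": "=[.E12]+[.C13]-[.D13]",
    "OpenOffice 3.01": "=[.E12]+[.C13]-[.D13]",
    "Sun Plugin 3.0": "[.E12]+[.C13]-[.D13]",
    "Excel 2007 SP2": "=E12+C13-D13",
}


def schema_accepts(formula):
    """Stand-in for <data type="string"/>: any string at all is acceptable."""
    return isinstance(formula, str)


def starts_with_equals(formula):
    return formula.startswith("=")


def uses_bracketed_references(formula):
    """True if the formula contains at least one reference of the form [.E12]."""
    return re.search(r"\[\.[A-Z]+[0-9]+\]", formula) is not None


if __name__ == "__main__":
    for producer, formula in FORMULAS.items():
        print(producer, schema_accepts(formula),
              starts_with_equals(formula), uses_bracketed_references(formula))
```

On the schema-level check all seven pass; the producers differ only on the informal conventions, which is precisely the part the schema leaves unpoliced.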

So I believe Rob’s statement that “SP2's implementation of ODF spreadsheets does not, in fact, conform to the requirements of the ODF standard” is mistaken on this point. This might be his personal interpretation of the standard, but it is based on an ingenious reading (argued around the meaning of comma placement, and privileging certain statements over others), and should certainly give no grounds for complacency about the sufficiency of the ODF specification.

As an ODF supporter I am keen to see defects, such as conformance loopholes, fixed in the next published ODF standard. I urge all other true supporters to read the drafts and give feedback to make ODF better for the benefit of everyone, next time around.

Notes on Document Conformance and Portability #1

Richard Gillam’s handy book, Unicode Demystified: A Practical Programmer's Guide to the Encoding Standard, contains an example of right-to-left text appearing in a prevailing left-to-right writing direction:

Avram said “מזל טוב.‏” and smiled.

Whether you see here what you are meant to see here will depend on your browser's Unicode support, and whether you have Hebrew fonts installed. Properly rendered, it will look something like this:

In reading order, the first character after “said” is the “מ” character to the left of the closing quotation mark. The text then runs from right to left until the full-stop, and then resumes with “and smiled”. In Unicode, this text is not represented in rendering order, but reading order – it is up to the renderer to make space and reverse direction at the correct points. Here is the text represented as XML in a paragraph in an ODF document (get the document here):

<text:p>Avram said “&#x5de;&#x5d6;&#x5dc; &#x5d8;&#x5d5;&#x5d1;.&#x200f;” and smiled.</text:p>

One of the great things about XML is its solid basis in Unicode and therefore its use of the Universal Character Set (ISO/IEC 10646). XML defines a number of encodings for this character set, and in the XML above the numeric character reference mechanism is used for the Hebrew characters. Notice, just to the left of the full stop the use of U+200F 'RIGHT-TO-LEFT MARK' which specifies that the full stop is part of the right-to-left character sequence.
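The stored (reading) order of the characters can be inspected with a short Python sketch using the standard library's Unicode database. The paragraph is the one shown above, with the text namespace declared inline so that it parses on its own; the `describe` helper is my own name for illustration.

```python
import unicodedata
import xml.etree.ElementTree as ET

PARAGRAPH = (
    '<text:p xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    "Avram said \u201c&#x5de;&#x5d6;&#x5dc; &#x5d8;&#x5d5;&#x5d1;.&#x200f;\u201d and smiled."
    "</text:p>"
)


def describe(text):
    """List (code point, bidirectional class) pairs in stored order."""
    return [("U+%04X" % ord(ch), unicodedata.bidirectional(ch)) for ch in text]


# The parser resolves the numeric character references into real characters.
text = ET.fromstring(PARAGRAPH).text

# Everything between the opening and closing curly quotation marks.
hebrew_run = text[text.index("\u201c") + 1:text.index("\u201d")]

if __name__ == "__main__":
    for code_point, bidi_class in describe(hebrew_run):
        print(code_point, bidi_class)
    print(unicodedata.name("\u200f"))
```

The Hebrew letters and the mark report class "R" (strong right-to-left), while the full stop reports a weak class with no strong direction of its own; sitting between two strong right-to-left characters, it is drawn into the right-to-left run.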

Viewing this document in three ODF applications (OpenOffice 3, Google Docs with Firefox, and the new MS Office 2007 SP2) gives the correct result every time. That is good news.

And if, for an ODF application, the character sequence did not appear correctly (if, say, the full stop was out-of-place) we would be able to say unequivocally that it was faulty; and we would be able to point to the Unicode specification where the correct behaviour was described. We (the user) would be able to bang the table and demand the bug was fixed.

This kind of process is one of the pillars of conformance testing: application conformance testing, to be exact. Where we have a solid spec and observable behaviour we can compare the two and make a judgement.

Where we don't have a solid spec, things get trickier. From the standardiser's viewpoint, and if it's not too highfalutin (and anyway, I claim Cambridge resident's special rights), we might want to quote Wittgenstein on such occasions: "Whereof one cannot speak, thereof one must be silent".

Real Conformance for ODF?

There has been quite a lot of hubbub recently about ODF conformance, in particular about how conformance to the forthcoming ODF 1.2 specification should be defined.

A New Conformance Clause

Earlier versions of ODF (including ISO/IEC 26300) already defined conformance - it was simply a question of obeying the schema. So in ODF 1.1, for example, we had this text:

Conforming applications [...] shall read documents that are valid against the OpenDocument schema if all foreign elements and attributes are removed before validation takes place [...] (1.5)

and that was the simple essence of ODF conformance.
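The "remove foreign elements and attributes, then validate" rule can be sketched in Python as a pre-processing pass. This is a rough illustration only: the namespace allowlist is a deliberately tiny assumption (the real schema declares many more namespaces), the example.com extension namespace is hypothetical, and the sketch ignores the finer points of how the specification treats the content of foreign elements.

```python
import xml.etree.ElementTree as ET

# Assumption: a tiny stand-in for "namespaces declared by the ODF schema".
KNOWN_NAMESPACES = {
    "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def namespace_of(qualified_name):
    """Extract the namespace URI from an ElementTree '{uri}local' name."""
    if qualified_name.startswith("{"):
        return qualified_name[1:].split("}", 1)[0]
    return ""


def strip_foreign(element):
    """Remove, in place, child elements and attributes from unknown namespaces."""
    for attribute in list(element.attrib):
        namespace = namespace_of(attribute)
        if namespace and namespace not in KNOWN_NAMESPACES:
            del element.attrib[attribute]
    for child in list(element):
        if namespace_of(child.tag) not in KNOWN_NAMESPACES:
            element.remove(child)  # note: this also drops any text trailing the child
        else:
            strip_foreign(child)
    return element


DEMO = """<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:x="http://example.com/hypothetical-extension"
    office:version="1.1" x:flag="yes">
  <office:body><office:text>
    <text:p>Hello</text:p>
    <x:note>vendor data</x:note>
  </office:text></office:body>
</office:document-content>"""

if __name__ == "__main__":
    cleaned = strip_foreign(ET.fromstring(DEMO))
    print([element.tag for element in cleaned.iter()])
```

Whatever survives this pass is what would then be handed to the schema validator.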

This is now up for reconsideration. The impetus for altering the existing conformance criteria appears to have come from a change in OASIS's procedures, which now require that specifications have “a set of numbered conformance clauses”, a requirement which seems sensible enough.

However, the freshly-drafted proposal which the OASIS TC has been considering goes further than just introducing numbered clauses: it now defines two categories of conformance:

  1. “Conforming OpenDocument Document” conformance
  2. “Conforming OpenDocument Extended Document” conformance

as shorthand, we might like to characterise these as the “pure” and “buggered-up” versions of ODF respectively.

The difference is that the “pure” version now forbids the use of foreign elements and attributes (i.e. those not declared by the ODF schema), while the “buggered-up” version permits them.

Ructions

The proposal caused much debate. In support of the new conformance clause, IBM's Rob Weir described foreign elements (formerly so welcome in ODF) as proprietary extensions that are “evil” and as a “nuclear death ray gun”. Questioning the proposal, KOffice's Thomas Zander wrote that he was “worried that we are trying to remove a core feature that I depend on in both KOffice and Qt”. Meanwhile Microsoft's Doug Mahugh made a counter-proposal suggesting that ODF might adopt the Markup Compatibility and Extensibility mechanisms from ISO/IEC 29500 (OOXML).

Things came to a head in a 9-2-2 split vote last week which saw the new conformance text adopted in the new ODF committee specification by will of the majority. Following this there was some traffic in the blogosphere with IBM's Rob Weir commenting and Microsoft's Doug Mahugh counter-commenting on the vote and the circumstances surrounding it.

Shadow Play

What is to be made of all this? Maybe Sun, whose corporate memory still smarts from Microsoft's “extend and embrace” Java attempts, thinks this is a way to prevent a repeat of similar stunts for ODF. Or perhaps this is a way to carve out a niche for OpenOffice to enjoy “pure” status while competitor applications are relegated to the “buggered-up” bin. Maybe it is envisaged that governments might be encouraged to procure only systems that deal in “pure” ODF. Maybe foreign elements really are the harbinger of nuclear death.

Who knows?

Whatever the reasons behind the reasons, there is clearly an “absent presence” in all these discussions: Microsoft Office. And in particular the forthcoming Microsoft Office 2007 SP2 with its ODF support. It is never mentioned, except in an occasional nudge-nudge wink-wink sort of way.

This controversy is most bemusing. This is in part because the “Microsoft factor” appears not to be a factor anyway, since MS Office will (we are told) not use foreign elements for its ODF 1.1 support. But the main reason why this is bemusing is that this discussion (whether or not to permit foreign elements) is completely unreal. There seems to be an assumption that it matters – that conformance as defined in the ODF spec means something important when it comes to real users, real procurement, real development or real interoperability.

It doesn't mean anything real - and here's why...

Making an ODF-conformant Office Application

Let us consider the procurement rules of an imaginary country (Vulgaria, say). Let us further imagine that Vulgaria's government wants to standardize on using ODF for all its many departments. After many hours of meetings, and the expenditure of many Vulgarian Dollars on consultancy fees, the decision is finally made and an official draws up procurement rules to stipulate this:

Any office application software procured by the Government of Vulgaria must support ODF (ISO/IEC 26300), and must conform to the 'pure' conformance class defined in clause x.y of that Standard, reading and emitting only ODF documents that are so conformant.

Sorted, they think.

Now imagine a software company that has its eye on making a big sale of software licenses to Vulgaria. Unfortunately, its office application does not meet the ODF conformance criterion set out by the procurement officer. The marketing department is duly sad. But one day a bright young developer gets to hear of the problem and proposes a solution. He boldly proclaims “I can make our format ODF-conformant today!”, and proceeds to show how.

First he gets a template ODF document, like this:

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
office:version="1.0">
<office:body>
<office:text>
<text:p></text:p>
</office:text>
</office:body>
</office:document-content>

This document (he points out) meets the “pure” conformance criteria. Our young hacker then does a curious thing: he takes an existing (non-ODF) file from their office software, BASE-64 encodes it, and inserts the resulting text string into the <text:p> element in the template document:

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
office:version="1.0">
<office:body>
<office:text>
<text:p><!-- several MBs of BASE-64 encoded content here --></text:p>
</office:text>
</office:body>
</office:document-content>

“There,” he proudly proclaims. “All we need to do is to wrap our current documents with the ODF wrapper when we save, and unwrap when we load – I can have a fresh build for you tomorrow.”
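Just how little work the young hacker's trick involves can be seen in a short Python sketch. It is purely illustrative: the `wrap` and `unwrap` names are mine, the payload stands for any proprietary binary file, and the output is a bare content document rather than a complete ODF package.

```python
import base64
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Ask the serialiser to use the customary prefixes rather than generated ones.
ET.register_namespace("office", OFFICE_NS)
ET.register_namespace("text", TEXT_NS)


def wrap(payload):
    """Bury arbitrary bytes, BASE-64 encoded, in a single text:p element."""
    root = ET.Element("{%s}document-content" % OFFICE_NS,
                      {"{%s}version" % OFFICE_NS: "1.0"})
    body = ET.SubElement(root, "{%s}body" % OFFICE_NS)
    office_text = ET.SubElement(body, "{%s}text" % OFFICE_NS)
    paragraph = ET.SubElement(office_text, "{%s}p" % TEXT_NS)
    paragraph.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(root, encoding="utf-8")


def unwrap(document):
    """Recover the original bytes from a document produced by wrap()."""
    paragraph = ET.fromstring(document).find(".//{%s}p" % TEXT_NS)
    return base64.b64decode(paragraph.text)


if __name__ == "__main__":
    original = b"\x00\x01 some proprietary binary format \xff"
    wrapped = wrap(original)
    print(wrapped.decode("utf-8"))
    print(unwrap(wrapped) == original)
```

Nothing in the wrapped document is foreign to the schema, and yet no other ODF application could make any sense of the paragraph's content.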

The rest of the story is not so happy: the software company makes the sale and the government of Vulgaria finds after installation that none of the files from it will interoperate with any other ODF files from other sources, despite the software company having met its procurement rules to the letter.

Far fetched?

Okay, that story makes an extreme example – but it nevertheless illustrates the point. It is possible for a smart developer to represent pretty much anything as a “pure” ODF document; any differences and incompatibilities can ever-so-easily be shoehorned into conformant ODF documents. That some software deals only in such pure ODF means precisely zero in the real world of interoperability.

The central consideration here is that ODF conformance only ever was (and is only projected to be) stated in terms of XML, and XML is (in)famously “all syntax and no semantics”. The semantics of an ODF document (broadly, all the narrative text in the specification) play no part in conformance and can remain unimplemented in a conformant processor. An ODF developer can safely use just the schema and never read much else. All those descriptions of element behaviour can be ignored for the purposes of achieving ODF conformance. [N.B. mistakes in this para corrected following comment from Rob Weir, below]

So my question is: what is the current debate on ODF conformance really about? It looks to me like mis-directed effort.

What ODF might usefully do is to look at the “application description” feature introduced into OOXML. This describes several types of applications, including a type called “full”. Such applications have “a semantic understanding of every feature within [their] conformance class”, and

“Semantic understanding” is to be interpreted that an application shall treat the information in Office Open XML documents in a manner consistent with the semantic definitions given in this Specification.

In other words, it is possible to specify in OOXML procurement that the processor should heed the narrative description within that Standard (not just the XML grammar). ODF currently lacks this. In my view if there is to be any connection between a definition of ODF conformance and the experience of users in the real world, then something like OOXML's “application description” feature is urgently needed. And it might be better done now, than hastily inserted during a JTC 1 BRM ...