ODF XML Reference HTML: Using some short notation for relationship of node children #184

Draft
wants to merge 4 commits into
base: master

svanteschubert
Contributor

Problem: Making "child relations" of ODF grammar evident for ODF Reference

I have wanted to do this for more than 10 years. Even the latest ODF 1.3 specification still omits the relationships between child nodes (sequence, choice, optional) in its listings; see for instance:

[image: screenshot of the specification's listing for form:property]
from https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part3-schema/OpenDocument-v1.3-os-part3-schema.html#element-form_property

You still need to look into the RNG (at least now in HTML since ODF 1.3) to understand the constraints!

Instead, in the future, the relations between the child nodes should become more evident. Here is a screenshot of the generated text (no HTML layout yet):
[image: screenshot of the generated child relations for form:property]
taken from https://tdf.github.io/odftoolkit/odf1.3/OpenDocument-v1.3-reference__TestChildRelations.html#element_form:property_1 (text manually formatted)
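
To make the idea of a short notation concrete, here is a minimal sketch in Java. The notation that the branch actually generates is not reproduced in this text, so the DTD-like style used here (`,` for sequence, `|` for choice, and the suffixes `?`, `*`, `+`) is an assumption. All class names and the `ex:` element names are hypothetical and not taken from the ODF Toolkit or from ODF.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical content-model node; not a class from the ODF Toolkit or MSV. */
abstract class Content {
    /** Renders this node in the assumed DTD-like short notation. */
    abstract String notation();
}

/** A single child element, identified by its qualified name. */
final class Child extends Content {
    final String qname;
    Child(String qname) { this.qname = qname; }
    @Override String notation() { return qname; }
}

/** A sequence or a choice; the separator decides which one is rendered. */
final class Group extends Content {
    final String separator;
    final List<Content> parts;

    private Group(String separator, Content... parts) {
        this.separator = separator;
        this.parts = Arrays.asList(parts);
    }

    static Group sequence(Content... parts) { return new Group(", ", parts); }
    static Group choice(Content... parts) { return new Group(" | ", parts); }

    @Override String notation() {
        List<String> rendered = new ArrayList<>();
        for (Content part : parts) {
            rendered.add(part.notation());
        }
        // Brackets keep nested groups unambiguous.
        return "(" + String.join(separator, rendered) + ")";
    }
}

/** Wraps a node with an occurrence suffix: '?', '*' or '+'. */
final class Repeat extends Content {
    final Content inner;
    final char suffix;
    Repeat(Content inner, char suffix) { this.inner = inner; this.suffix = suffix; }
    @Override String notation() { return inner.notation() + suffix; }
}

final class ShortNotationDemo {
    public static void main(String[] args) {
        Content model = Group.sequence(
                new Child("ex:a"),
                new Repeat(new Child("ex:b"), '?'),
                new Repeat(Group.choice(new Child("ex:c"), new Child("ex:d")), '*'));
        System.out.println(model.notation()); // prints "(ex:a, ex:b?, (ex:c | ex:d)*)"
    }
}
```

A single line like this tells the reader that `ex:a` is required and comes first, `ex:b` is optional, and any number of `ex:c` or `ex:d` elements may follow, which a flat list of child names cannot express.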

Solution Overview

Status

The work on this ODF Toolkit branch is still in progress; this is an intermediate pull request to gather feedback!

Building/Testing the Solution

  1. First of all, MSV (the Multi-Schema Validator) was created 25 years ago to validate XML, not to generate code.
    Namespace prefixes were therefore completely irrelevant to it; they are added by the namespace-prefix2 branch, which you need to clone and build:
    git clone -b namespace-prefix2 https://github.com/xmlark/msv
    cd msv
    mvn install
    Afterwards the JAR is installed in your local Maven repository (the .m2 directory) and the generator build will work.
  2. Second, you need to work on the branch of this pull request, "odf-reference":
    git clone -b odf-reference https://github.com/tdf/odftoolkit
    There you can quickly test the output for a single RNG expression via a test (there is no regression testing yet)!

Before I explain the solution, just a quick description of RNG and MSV in a nutshell.

RelaxNG in a Nutshell - compared to XSD

The RNG specification is about 21 DIN A4 pages long and easy to read.
The XSD specification is about 380 pages long, and I find only the Primer readable.

The W3C Schema Grammar 1.0 2nd Ed. (28 October 2004) - 364 DIN A4 pages:

  1. https://www.w3.org/TR/xmlschema-0/ - Primer - 79 pages
  2. https://www.w3.org/TR/xmlschema-1/ - Structures - 122 pages
  3. https://www.w3.org/TR/xmlschema-2/ - Datatypes - 163 pages

The W3C Schema Grammar 1.1 (5 April 2012) - 387 DIN A4 pages:

  1. https://www.w3.org/TR/xmlschema11-1/ - Structures - 208 pages
  2. https://www.w3.org/TR/xmlschema11-2/ - Datatypes - 177 pages

In addition, there is a paper stating that RNG is a more expressive language than XSD and DTD. Others state something similar.

Note: There is an RNG tutorial (22 pages), and I have merged the errata into the RNG spec (still a pull request).

What I do not yet fully understand about RNG is the advantage of simplifying a grammar into a normalized run-time model that becomes a binary tree.

The RNG specification describes how a choice such as
<choice> p1 p2 p3 </choice>
is transformed into a binary-tree run-time representation:
<choice> <choice> p1 p2 </choice> p3 </choice>
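
The transformation above can be sketched in a few lines of Java. This is only an illustration of the left-nested folding, not code from MSV or any RNG library; the classes `Pattern`, `Leaf`, `BinaryChoice` and `ChoiceBinarizer` are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical pattern node; not a class from MSV or any RNG library. */
abstract class Pattern {}

/** A leaf pattern identified by a name such as "p1". */
final class Leaf extends Pattern {
    final String name;
    Leaf(String name) { this.name = name; }
    @Override public String toString() { return name; }
}

/** A choice with exactly two branches, as in the simplified form. */
final class BinaryChoice extends Pattern {
    final Pattern left;
    final Pattern right;
    BinaryChoice(Pattern left, Pattern right) {
        this.left = left;
        this.right = right;
    }
    @Override public String toString() {
        return "<choice> " + left + " " + right + " </choice>";
    }
}

final class ChoiceBinarizer {
    /** Folds the operands from left to right: ((p1 | p2) | p3) | ... */
    static Pattern binarize(List<Pattern> operands) {
        if (operands.isEmpty()) {
            throw new IllegalArgumentException("a choice needs at least one operand");
        }
        Pattern result = operands.get(0);
        for (int i = 1; i < operands.size(); i++) {
            // Wrap everything folded so far together with the next operand.
            result = new BinaryChoice(result, operands.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        Pattern p = binarize(Arrays.<Pattern>asList(
                new Leaf("p1"), new Leaf("p2"), new Leaf("p3")));
        System.out.println(p); // prints "<choice> <choice> p1 p2 </choice> p3 </choice>"
    }
}
```

After this step every choice node has exactly two operands, so code working on the run-time model never has to deal with a variable number of children.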

MSV in a Nutshell

MSV is a multi-schema validator: it reads various grammars into a single "ueber" model, the Abstract Grammar Model (AGM), which is based on RNG expressions that are all stored in a hash map, the ExpressionPool.

Similar to Java's Object superclass, there is the Expression class, from which all expressions (the atoms a grammar file is made of) are derived.

All possible expression types are listed in an expression visitor.
I personally do not find visitors very intuitive, so here it is explained in one sentence:
"Expression.java", the base class of all expressions, defines an abstract method for each visitor type (the types differ in their return type); for example, the visitor returning void is called ExpressionVisitorVoid.java.

The bad side: I find it unintuitive. The good side: all functionality is written in the visitor.
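
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a visitor with one interface per return type. It only mirrors the description above; the names `Node`, `VisitorVoid`, `VisitorBoolean` and the `on...` methods are assumptions made for this example and should not be read as MSV's actual API.

```java
/** Visitor variant that returns nothing (hypothetical, modelled on the description above). */
interface VisitorVoid {
    void onElement(ElementNode e);
    void onChoice(ChoiceNode c);
}

/** Visitor variant that returns a boolean. */
interface VisitorBoolean {
    boolean onElement(ElementNode e);
    boolean onChoice(ChoiceNode c);
}

/** Base class: one abstract visit method per visitor type. */
abstract class Node {
    abstract void visit(VisitorVoid v);
    abstract boolean visit(VisitorBoolean v);
}

final class ElementNode extends Node {
    final String name;
    ElementNode(String name) { this.name = name; }
    @Override void visit(VisitorVoid v) { v.onElement(this); }
    @Override boolean visit(VisitorBoolean v) { return v.onElement(this); }
}

final class ChoiceNode extends Node {
    final Node left;
    final Node right;
    ChoiceNode(Node left, Node right) { this.left = left; this.right = right; }
    @Override void visit(VisitorVoid v) { v.onChoice(this); }
    @Override boolean visit(VisitorBoolean v) { return v.onChoice(this); }
}

/** All of the "collect names" functionality lives in this visitor. */
final class NameCollector implements VisitorVoid {
    final StringBuilder out = new StringBuilder();
    @Override public void onElement(ElementNode e) { out.append(e.name).append(' '); }
    @Override public void onChoice(ChoiceNode c) {
        c.left.visit(this);
        c.right.visit(this);
    }
}

/** All of the "does this name occur" functionality lives in this visitor. */
final class ContainsName implements VisitorBoolean {
    final String wanted;
    ContainsName(String wanted) { this.wanted = wanted; }
    @Override public boolean onElement(ElementNode e) { return e.name.equals(wanted); }
    @Override public boolean onChoice(ChoiceNode c) {
        return c.left.visit(this) || c.right.visit(this);
    }
}

final class VisitorDemo {
    public static void main(String[] args) {
        Node tree = new ChoiceNode(new ElementNode("a"),
                new ChoiceNode(new ElementNode("b"), new ElementNode("c")));
        NameCollector collector = new NameCollector();
        tree.visit(collector);
        System.out.println(collector.out.toString().trim()); // prints "a b c"
        System.out.println(tree.visit(new ContainsName("c"))); // prints "true"
    }
}
```

The node classes contain no logic beyond dispatching to the matching `on...` method, which is why new functionality can be added as a new visitor without touching them.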

The Solution in more Detail

I am using this visitor pattern to start with an expression and traverse its descendants. The functionality is triggered from the XMLModel class, so its grammar is parsed once (and cached) to find the "heads of islands": parts of the grammar graph that are used multiple times and might lead to infinite loops during traversal.
In the end, this functionality is called from the Velocity HTML template for OdfReference.html.
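
One way to picture the search for "heads of islands" is a single pass that counts how often each node of the grammar graph is referenced. The sketch below is an interpretation of the description above, not the code of the branch: it assumes that a head is simply any node reached via more than one reference (which includes nodes closing a cycle), and `GNode` and `IslandFinder` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical grammar graph node; children may be shared or form cycles. */
final class GNode {
    final String label;
    final List<GNode> children = new ArrayList<>();
    GNode(String label, GNode... kids) {
        this.label = label;
        children.addAll(Arrays.asList(kids));
    }
}

final class IslandFinder {
    /**
     * Returns every node that is referenced more than once.
     * Each node is expanded only on its first visit, so cycles cannot loop forever.
     */
    static Set<GNode> findHeads(GNode root) {
        // Identity-based map: two distinct nodes with equal labels stay distinct.
        Map<GNode, Integer> incoming = new IdentityHashMap<>();
        Deque<GNode> todo = new ArrayDeque<>();
        incoming.put(root, 1); // the entry point counts as one reference
        todo.push(root);
        while (!todo.isEmpty()) {
            GNode current = todo.pop();
            for (GNode child : current.children) {
                int references = incoming.merge(child, 1, Integer::sum);
                if (references == 1) {
                    todo.push(child); // first time seen: descend exactly once
                }
            }
        }
        Set<GNode> heads = Collections.newSetFromMap(new IdentityHashMap<GNode, Boolean>());
        for (Map.Entry<GNode, Integer> entry : incoming.entrySet()) {
            if (entry.getValue() > 1) {
                heads.add(entry.getKey());
            }
        }
        return heads;
    }

    public static void main(String[] args) {
        GNode shared = new GNode("shared");
        GNode a = new GNode("a", shared);
        GNode b = new GNode("b", shared);
        GNode root = new GNode("root", a, b);
        shared.children.add(root); // cycle back to the entry point
        for (GNode head : findHeads(root)) {
            System.out.println(head.label);
        }
    }
}
```

With the heads known in advance, a later traversal can stop at each head and emit a reference to it instead of descending again, which avoids both duplicated output and infinite recursion.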

What's missing - IMHO

  1. The MSV namespace-prefix2 branch still requires some new "evil tests" (grammars changing their prefixes, etc.) before it can be integrated.
  2. The HTML layout of OdfReference.html: currently the output is just unformatted text in HTML, with no tables or lists for positioning the brackets and/or nodes.
  3. Test some complex cases, for instance table:table, to see whether the complexity can be reduced
    (e.g. an indent for each child node level, perhaps CSS colors for matching brackets?).
  4. It is uncertain whether we should give the RNG compact syntax a try (or at least identify in detail why it was not used for this task in the first place).
  5. RNG patterns for ODF are not working yet.

@svanteschubert svanteschubert self-assigned this Sep 27, 2022
@svanteschubert svanteschubert changed the title ODF XML Reference HTML: Create RegEx similar short notation for relationship of node children ODF XML Reference HTML: Using some short notation for relationship of node children Sep 27, 2022
@svanteschubert svanteschubert marked this pull request as draft September 27, 2022 18:47
…e in HTML - requires MSV prefixes from branch namespace-prefix2 of github.com/xmlark/msv
…h requires the MSV namespace-prefix2 branch in order to create the prefixes in the OdfReference.html