add section about 'unstar' mapping
pchampin committed Nov 13, 2024
1 parent cc47f33 commit 8e07ca3
Showing 4 changed files with 221 additions and 2 deletions.
10 changes: 10 additions & 0 deletions spec/ex-unstar-input.trig
@@ -0,0 +1,10 @@
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://example.org/>

<< ex:s ex:p ex:o >> ex:q "some value".

GRAPH ex:g {
ex:s ex:p ex:o.
ex:s ex:p ex:o2.
}

10 changes: 10 additions & 0 deletions spec/ex-unstar-input2.trig
@@ -0,0 +1,10 @@
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://example.org/>

_:r1 rdf:reifies <<( ex:s ex:p ex:o )>>.
_:r1 ex:q "some value".

GRAPH ex:g {
ex:s ex:p ex:o.
ex:s ex:p ex:o2.
}
19 changes: 19 additions & 0 deletions spec/ex-unstar-output.trig
@@ -0,0 +1,19 @@
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://example.org/>

_:r1 rdf:reifies _:gen1.
_:r1 ex:q "some value".

GRAPH ex:g {
ex:s ex:p ex:o.
ex:s ex:p ex:o2.
}

GRAPH _:gen1 {
ex:s ex:p ex:o.
}

GRAPH rdf:unstarMetadata {
_:gen1 rdf:type rdf:TripleTerm.
}

184 changes: 182 additions & 2 deletions spec/index.html
@@ -49,6 +49,19 @@
.grammar-literal { color: gray;}
code {color: #ff4500;} /* Old W3C Style */
div.abnf {margin-left: 1em;}

.algorithm ol {
counter-reset: numsection;
list-style-type: none;
}
.algorithm ol>li {
margin: 0.5em 0;
}
.algorithm ol>li:before {
font-weight: bold;
counter-increment: numsection;
content: counters(numsection, ".") ") ";
}
</style>
</head>

@@ -498,12 +511,12 @@ <h3>RDF Documents and Syntaxes</h3>
<p>This specification establishes two conformance levels:</p>

<ul>
<li><dfn class="no-export lint-ignore">Full conformance</dfn>
<li><dfn class="no-export lint-ignore" data-lt="full">Full conformance</dfn>
supports <a data-lt="RDF graph">graphs</a> and <a data-lt="RDF dataset">datasets</a>
with <a>triples</a> that contain <a>triple terms</a>.
Concrete syntaxes in which such graphs and datasets can be expressed include
[[RDF12-N-TRIPLES]], [[RDF12-N-QUADS]], [[RDF12-TURTLE]], and [[RDF12-TRIG]].</li>
<li><dfn class="no-export lint-ignore">Classic conformance</dfn>
<li><dfn class="no-export lint-ignore" data-lt="classic">Classic conformance</dfn>
only supports <a data-lt="RDF graph">graphs</a> or <a data-lt="RDF dataset">datasets</a>
with <a>triples</a> that do not contain <a>triple terms</a>.</li>
</ul>
@@ -1450,6 +1463,173 @@ <h2>Generalizations of RDF Triples, Graphs, and Datasets</h2>
</section>


<section id="section-classic-full-interop" class="informative">
<h2>Interoperability between RDF [=Classic=] and RDF [=Full=]</h2>

<p>This section provides transformations between RDF [=Classic=] datasets and RDF [=Full=] datasets,
in order to offer some level of interoperability between the different classes of <a href="#conformance">Conformance</a>.</p>

<p class=issue>Should we go even further and aim to provide interoperability between <em>RDF 1.1</em> and RDF 1.2 [=Full=]?</p>

<p>It defines the <a href="#section-unstar-algo">`unstar`</a> algorithm, which transforms an RDF [=Full=] dataset into an RDF [=Classic=] dataset by encoding all <a>triple terms</a> into dedicated named graphs.
This algorithm is designed to be:
</p>

<dl>
<dt>Information preserving</dt>
<dd>It must be possible to reconstruct the input dataset from the output dataset.
Note that, on the other hand, the algorithm is not designed to be semantics preserving:
the graphs in the produced dataset are not semantically <a>equivalent</a> to their corresponding graphs in the input dataset.
</dd>
<dt>Idempotent</dt>
<dd>Transforming a dataset that already complies with RDF [=Classic=] (i.e. that contains no <a>triple term</a>) must result in the same dataset.
</dd>
<dt>Universal</dt>
<dd>It should be possible to transform any RDF [=Full=] dataset using this method.
There is actually <a href="#section-unstar-caveat">a minor caveat</a> to this property.
</dd>
</dl>


<p>The general principle of the <a href="#section-unstar-algo">`unstar`</a> algorithm is to replace each <a>triple term</a> `S P O` in the input dataset with a fresh blank node.
This blank node is then used as the name of a new named graph containing the single triple `S P O`.
In order to distinguish those new named graphs from the original named graphs of the input dataset,
dedicated triples are added to a special `rdf:unstarMetadata` named graph.</p>

<p class=note>The blank nodes generated by `unstar` to replace <a>triple terms</a> should not be confused with the <a>reifiers</a> that are typically associated with these <a>triple terms</a>.</p>

<p class=note>Despite having the same name, the `unstar` mapping defined in this specification is different from the `unstar` mapping defined in [[RDF-STAR-CG]].</p>


<section id="section-unstar-algo" class="algorithm">
<h2>The `unstar` algorithm</h2>

<p>The algorithm expects one input variable <var>Dᵢ</var> which is an <a>RDF dataset</a>. It returns a [=Classic=] <a>RDF dataset</a>.
In the algorithm, we adopt the view presented in <a href="#section-dataset-quad"></a>.
</p>

<ol>
<li>Let <var>Dₒ</var> be an empty <a>RDF dataset</a>.</li>
<li>Let <var>M</var> be an empty map from <a>triple terms</a> to <a>blank nodes</a>.</li>
<li>Let <var>inputKind</var> be `null`.</li>
<li>For each quad (<var>s</var>, <var>p</var>, <var>o</var>, <var>g</var>) in <var>Dᵢ</var>:<ol>
<li>If <var>g</var> is not `null` and is the IRI `rdf:unstarMetadata`, then:<ol>
<li id="unstar-error1">If <var>inputKind</var> is `"full"` then exit with an error.</li>
<li>Otherwise, set <var>inputKind</var> to `"classic"`.</li>
</ol></li>
<li>If <var>o</var> is a <a>triple term</a>, then:<ol>
<li id="unstar-error2">If <var>inputKind</var> is `"classic"` then exit with an error.</li>
<li>Otherwise, set <var>inputKind</var> to `"full"`.</li>
<li>Let <var>b</var>, <var>M'</var> and <var>D'</var> be the result of invoking <a href="#section-qtt-algo">`quote-triple-term`</a> passing <var>o</var> as <var>t</var> and <var>M</var> as <var>Mᵢ</var>.</li>
<li>Merge <var>M'</var> into <var>M</var>.</li>
<li>Merge <var>D'</var> into <var>Dₒ</var>.</li>
<li>Set <var>o</var> to <var>b</var>.</li>
</ol></li>
<li>Add the quad (<var>s</var>, <var>p</var>, <var>o</var>, <var>g</var>) to <var>Dₒ</var>.</li>
</ol></li>
<li>Return <var>Dₒ</var>.</li>
</ol>
</section>

<section id="section-qtt-algo" class="algorithm">
<h2>The `quote-triple-term` algorithm</h2>

<p>This algorithm is responsible for incrementally populating the mapping <var>M</var> and the dataset <var>D</var> used internally by the <a href="#section-unstar-algo">`unstar`</a> algorithm. It receives a <a>triple term</a> as input and processes it recursively (in case its object is itself a <a>triple term</a>). It returns, among other things, the <a>blank node</a> minted to replace the <a>triple term</a> in the transformed [=Classic=] <a>RDF dataset</a>.</p>

<p>This algorithm expects two input variables:
a <a>triple term</a> <var>t</var>,
and a map <var>Mᵢ</var> from <a>triple terms</a> to <a>blank nodes</a>.
It returns a <a>blank node</a> <var>b</var>,
a map <var>Mₒ</var> from <a>triple terms</a> to <a>blank nodes</a>,
and a [=Classic=] <a>RDF dataset</a> <var>D</var>.
In the algorithm, we adopt the view presented in <a href="#section-dataset-quad"></a>.
</p>

<ol>
<li>Let <var>Mₒ</var> be an empty map.</li>
<li>Let <var>D</var> be an empty <a>RDF dataset</a>.</li>
<li>If <var>Mᵢ</var> contains a <a>blank node</a> <var>b</var> associated with <var>t</var>, then return <var>b</var>, <var>Mₒ</var> and <var>D</var>.</li>
<li>Otherwise:<ol>
<li>Let <var>s</var>, <var>p</var> and <var>o</var> be the subject, predicate and object of <var>t</var>, respectively.</li>
<li>If <var>o</var> is a <a>triple term</a>, then:<ol>
<li>Let <var>b'</var>, <var>M'</var> and <var>D'</var> be the result of invoking <a href="#section-qtt-algo">`quote-triple-term`</a> passing <var>o</var> as <var>t</var> and <var>Mᵢ</var> as <var>Mᵢ</var>.</li>
<li>Set <var>o</var> to <var>b'</var>.</li>
<li>Merge <var>M'</var> into <var>Mₒ</var>.</li>
<li>Merge <var>D'</var> into <var>D</var>.</li>
</ol></li>
<li id="qtt-fresh-bnode">Let <var>b</var> be a fresh blank node.</li>
<li>Add the association (<var>t</var>, <var>b</var>) to <var>Mₒ</var>.</li>
<li>Add the quads (<var>s</var>, <var>p</var>, <var>o</var>, <var>b</var>) and (<var>b</var>, `rdf:type`, `rdf:TripleTerm`, `rdf:unstarMetadata`) to <var>D</var>.</li>
<li>Return <var>b</var>, <var>Mₒ</var> and <var>D</var>.</li>
</ol></li>
</ol>

<p class=note>
In <a href="#qtt-fresh-bnode">step 4.3</a> of this algorithm,
it is assumed that the blank node <var>b</var> is distinct from any blank node already in use in any dataset at hand,
in particular the dataset being processed in the invoking <a href="#section-unstar-algo">`unstar`</a> algorithm.
Some implementations may require that `unstar` passes that dataset to `quote-triple-term` to ensure this constraint.
</p>

</section>
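
<p class=note>The two algorithms above can be sketched in Python as follows.
This is only an illustrative sketch, not a normative implementation.
It assumes a hypothetical in-memory representation that is not defined by this specification:
IRIs and literals are plain strings, `BNode` and `TripleTerm` are ad hoc classes,
a dataset is a set of `(s, p, o, g)` tuples, and `None` stands for the default graph.
The helper `fresh_bnode` is also hypothetical, and assumes that the input dataset contains no blank node labelled `genN`.</p>

```python
from dataclasses import dataclass
from itertools import count
from typing import NamedTuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"
RDF_TYPE = RDF + "type"
TRIPLE_TERM = RDF + "TripleTerm"
UNSTAR_METADATA = RDF + "unstarMetadata"


@dataclass(frozen=True)
class BNode:
    """Hypothetical blank node, identified by its label."""
    label: str


class TripleTerm(NamedTuple):
    """Hypothetical triple term; its object may itself be a TripleTerm."""
    s: object
    p: object
    o: object


_counter = count(1)


def fresh_bnode():
    # Assumption: no blank node in the input is labelled "genN".
    return BNode(f"gen{next(_counter)}")


def quote_triple_term(t, m_in):
    """Return (b, m_out, d) for triple term t, following the steps above."""
    m_out, d = {}, set()
    if t in m_in:                          # step 3: t was already replaced
        return m_in[t], m_out, d
    s, p, o = t                            # step 4.1
    if isinstance(o, TripleTerm):          # step 4.2: nested triple term
        o, m_nested, d_nested = quote_triple_term(o, m_in)
        m_out.update(m_nested)
        d |= d_nested
    b = fresh_bnode()                      # step 4.3
    m_out[t] = b
    d.add((s, p, o, b))
    d.add((b, RDF_TYPE, TRIPLE_TERM, UNSTAR_METADATA))
    return b, m_out, d


def unstar(d_in):
    """Transform a set of quads into a set of quads without triple terms."""
    d_out, m, input_kind = set(), {}, None
    for s, p, o, g in d_in:
        if g == UNSTAR_METADATA:           # step 4.1
            if input_kind == "full":
                raise ValueError("triple terms mixed with rdf:unstarMetadata")
            input_kind = "classic"
        if isinstance(o, TripleTerm):      # step 4.2
            if input_kind == "classic":
                raise ValueError("triple terms mixed with rdf:unstarMetadata")
            input_kind = "full"
            b, m_new, d_new = quote_triple_term(o, m)
            m.update(m_new)
            d_out |= d_new
            o = b
        d_out.add((s, p, o, g))            # step 4.3
    return d_out


# Usage: a dataset with one triple term, one reifier and one named graph.
r1 = BNode("r1")
d_in = {
    (r1, RDF + "reifies", TripleTerm(EX + "s", EX + "p", EX + "o"), None),
    (r1, EX + "q", '"some value"', None),
    (EX + "s", EX + "p", EX + "o", EX + "g"),
    (EX + "s", EX + "p", EX + "o2", EX + "g"),
}
d_out = unstar(d_in)
```

<p class=note>In this sketch, `d_out` contains the four input quads (with the triple term replaced by a generated blank node),
plus one quad in the generated named graph and one quad in the `rdf:unstarMetadata` graph.
Calling `unstar(d_out)` again returns an equal set, which illustrates the idempotence property.</p>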

<section id="section-unstar-caveat">
<h2>Limitations of the `unstar` algorithm</h2>

<p>Steps <a href="#unstar-error1">4.1.1</a> and <a href="#unstar-error2">4.2.1</a> of the <a href="#section-unstar-algo">`unstar`</a> algorithm exit with an error.
This will occur if the input dataset contains both <a>triple terms</a> and a graph named `rdf:unstarMetadata`.
The `unstar` algorithm is therefore not strictly universal, as it cannot transform this particular kind of dataset.
</p>

<p>This limitation should not be an issue in practice.
The special graph name `rdf:unstarMetadata` is unlikely to be in use in any published dataset,
as it was not defined in the RDF namespace prior to this specification.
For this reason, using it would actually have been bad practice.
As for future datasets, their authors should consider the graph name `rdf:unstarMetadata` to be reserved, in order to prevent interference with the `unstar` algorithm.
</p>

<p>Another consequence of this restriction is that users should be careful when merging datasets in an application that makes use of the `unstar` algorithm.
More precisely, merging a [=Full=] <a>RDF dataset</a> (containing at least one <a>triple term</a>)
with a [=Classic=] <a>RDF dataset</a> resulting from the application of `unstar` (and therefore containing an `rdf:unstarMetadata` named graph)
would result in a "hybrid" dataset that `unstar` cannot process.
Such applications should make sure to apply `unstar` to every dataset prior to merging them.
Since `unstar` is idempotent, there is no harm in applying it more than necessary.</p>

</section>

<p class=issue>We should probably provide the `restar` algorithm, the opposite of `unstar`.</p>

<section id="section-unstar-example">
<h2>Example of applying the `unstar` algorithm</h2>

<p>The examples in this section use the TriG concrete syntax [[RDF12-TRIG]].</p>

<pre id="ex-unstar-input"
class="example nohighlight"
title="An input Full RDF dataset"
data-include="./ex-unstar-input.trig"
data-include-format="text"
></pre>

<pre id="ex-unstar-input2"
class="example nohighlight"
title="The same dataset as above, with reifiers made explicit"
data-include="./ex-unstar-input2.trig"
data-include-format="text"
></pre>

<pre id="ex-unstar-output"
class="example nohighlight"
title="The result of applying the `unstar` algorithm to the dataset above"
data-include="./ex-unstar-output.trig"
data-include-format="text"
></pre>

</section>


</section>

<section id="section-additional-datatypes" class="appendix">
<h2>Additional Datatypes</h2>
<p>This section defines additional <a>datatypes</a> that RDF processors MAY support.</p>