Pruning branches
pietercolpaert committed Sep 1, 2024
1 parent 6497c7e commit e6f558e
Showing 2 changed files with 26 additions and 21 deletions.
45 changes: 26 additions & 19 deletions 01-tree-specification.bs
@@ -141,40 +141,48 @@ The full algorithm is specified in the [shape topologies](https://w3id.org/tre
After dereferencing a `tree:Node`, a client MUST extract all (zero or more) `tree:Relation` descriptions from the page.
This can be done by searching for `<> tree:relation ?R` triples.
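
The extraction step can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes the page has already been parsed into plain `(subject, predicate, object)` tuples, that `<>` resolves to the page's own IRI, and that the TREE namespace is `https://w3id.org/tree#`. The function name `extract_relations` is hypothetical.

```python
# Assumed namespace of the TREE vocabulary.
TREE = "https://w3id.org/tree#"


def extract_relations(page_iri, triples):
    """Collect every relation linked from the page through tree:relation.

    `triples` is a hypothetical in-memory list of (subject, predicate, object)
    tuples. Returns one dict per relation, mapping each predicate to the list
    of objects found for it.
    """
    relation_ids = [
        o for s, p, o in triples if s == page_iri and p == TREE + "relation"
    ]
    relations = []
    for relation_id in relation_ids:
        properties = {}
        for s, p, o in triples:
            if s == relation_id:
                properties.setdefault(p, []).append(o)
        relations.append({"id": relation_id, "properties": properties})
    return relations
```

A page without any `tree:relation` triple simply yields an empty list, which matches the "zero or more" wording.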

A client MUST follow the object of the relation’s <code>?R tree:node ?object</code> triple, unless the client is able to prune them based on the type of the relation, detailled further on.
A client MUST follow the object of the relation’s <code>?R tree:node ?object</code> triple, unless the client is able to prune the branch reachable from that node (see further).

A client MAY also extract the <code>tree:Relation</code>’s <code>tree:remainingItems</code> if it exists.
If it does, it will be an integer indicating the remaining items to be found after dereferencing the node.

When traversing, a client SHOULD keep also try to detect faulty search trees by keeping a list of already visited pages.
When traversing, a client SHOULD detect faulty search trees by keeping a list of already visited pages.

When dereferencing the object of a <code>tree:node</code> triple, the client MUST follow redirects.

Note: Allowing redirects allows servers to rebalance their search trees over time.

A client MAY assume completeness of members intended by the search tree once it has dereferenced all node links.
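
The traversal rules above can be sketched as a breadth-first walk. This is only an illustration under stated assumptions: `fetch` is a hypothetical callable that dereferences an IRI, follows redirects, and returns the final IRI together with the extracted relations (each a dict with a `"node"` key), and `can_prune` is a hypothetical predicate supplied by the client.

```python
from collections import deque


def traverse(root, fetch, can_prune=lambda relation: False):
    """Visit all reachable nodes once, skipping branches the client prunes.

    `fetch(iri)` (hypothetical) returns (final_iri, relations) after
    following redirects. The visited set guards against faulty search
    trees that contain cycles.
    """
    visited = set()
    queue = deque([root])
    order = []
    while queue:
        iri = queue.popleft()
        if iri in visited:
            continue
        final_iri, relations = fetch(iri)
        already_seen = final_iri in visited
        visited.update((iri, final_iri))
        if already_seen:
            # A redirect led to a page that was processed before.
            continue
        order.append(final_iri)
        for relation in relations:
            target = relation["node"]
            if target not in visited and not can_prune(relation):
                queue.append(target)
    return order
```

Once the queue is empty and nothing was pruned, every node link has been dereferenced.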

## Pruning relations ## {#relationsubclasses}
# Pruning branches # {#relationsubclasses}

The object of <code>?R tree:value ?object</code> MUST be accompanied by a data type when it is a literal value.
In search trees, `tree:Relation` itself is rarely used as the type; more commonly the type is one of its subclasses.
For partial string matching, `tree:PrefixRelation`, `tree:SubstringRelation`, and `tree:SuffixRelation` exist.
For comparing various datatypes, `tree:GreaterThanRelation`, `tree:GreaterThanOrEqualToRelation`, `tree:LessThanRelation`, `tree:LessThanOrEqualToRelation`, `tree:EqualToRelation`, and `tree:NotEqualToRelation` exist.
Finally, for geospatial trees, `tree:GeospatiallyContainsRelation` exists.

More specific <code>tree:Relation</code> will have a <code>tree:path</code>, indicating the path from the member to the object on which the <code>tree:Relation</code> applies. For the different ways to express or handle a <code>tree:path</code>, we refer to [2.3.1 in the shacl specification](https://www.w3.org/TR/shacl/#x2.3.1-shacl-property-paths). All possible combinations of e.g., <code>shacl:alternativePath</code>, <code>shacl:inversePath</code> or <code>shacl:inLanguage</code> in the SHACL spec can be used. When <code>shacl:alternativePath</code> is used, the order in the list will define the importance of the order when evaluating the <code>tree:Relation</code>. A wildcard in the path is limited to the <code>tree:shape</code> of the <code>tree:Collection</code>.
A client decides, based on its own tasks, which relations are important to implement.
Each relation is a comparator function that helps decide whether the subtree reachable from the `tree:node` link can be pruned.
All relation comparator functions to the same `tree:node` need to be combined using a logical AND.
As arguments, this function takes:
1. The left-hand: what the members on the linked node will contain w.r.t. the literals reachable from the `tree:path`
2. The operator: decided by the type of the relation and the datatype or node type of the `tree:value` triple’s object.
3. The right-hand: the `tree:value` triple’s object.

If the client comes across a relation subclass it did not code against, it MUST return `true`.

Subclasses of <code>tree:Relation</code> commonly will use the <code>tree:path</code> to indicate the path from the member to the object on which the <code>tree:Relation</code> applies. For the different ways to express or handle a <code>tree:path</code>, we refer to [2.3.1 in the shacl specification](https://www.w3.org/TR/shacl/#x2.3.1-shacl-property-paths). All possible combinations of e.g., <code>shacl:alternativePath</code>, <code>shacl:inversePath</code> or <code>shacl:inLanguage</code> in the SHACL spec can be used. When <code>shacl:alternativePath</code> is used, the order in the list will define the importance of the order when evaluating the <code>tree:Relation</code>.
The result of evaluating the <code>tree:path</code> is the value that must be compared to the <code>tree:value</code>.
When multiple results from the path are found, they need to be interpreted in the function using a logical OR.
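
One possible reading of the AND/OR rules, sketched for the comparison relations only. The relation dicts, the `OPERATORS` table, and both function names are hypothetical, and the values are assumed to be already parsed into comparable Python values. An unknown relation type returns `True`, which this sketch interprets as "the branch cannot be pruned".

```python
import operator

# Hypothetical mapping from relation type (local name) to a comparison.
OPERATORS = {
    "GreaterThanRelation": operator.gt,
    "GreaterThanOrEqualToRelation": operator.ge,
    "LessThanRelation": operator.lt,
    "LessThanOrEqualToRelation": operator.le,
    "EqualToRelation": operator.eq,
    "NotEqualToRelation": operator.ne,
}


def relation_holds(relation, path_values):
    """Evaluate one relation; multiple path results are combined with OR."""
    compare = OPERATORS.get(relation["type"])
    if compare is None:
        # Relation subclass the client did not code against.
        return True
    return any(compare(value, relation["value"]) for value in path_values)


def node_may_match(relations, path_values):
    """Relations pointing to the same tree:node are combined with AND."""
    return all(relation_holds(r, path_values) for r in relations)
```

With two relations to the same node, `> 10` and `< 20`, the values `[5, 15]` keep the branch because `15` satisfies both, while `[25]` allows pruning it.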

The target object of a <code>tree:path</code> SHOULD be materialized in the current Node document, but when it is not, the object MAY be considered implicit on the condition both <code>tree:path</code> and <code>tree:member</code> are defined.
In contrast to <code>sh:path</code>, a <code>tree:path</code> MAY refer to an implicit property that is not materialized in the current response. This may break SPARQL processors that have not yet come across the object in their query plan. However, the tree may still be useful for query processors that, for example, prioritize queries according to the user’s location and first download nodes that are near the user. In that case, the materialized location of the object is not needed. While not recommended, possible heuristics could try to infer the data, fetch it through another <code>tree:Collection</code>, or retrieve it using URI dereferencing.
Wildcards in the paths (i.e. `sh:zeroOrMorePath`) however do not trigger any further look-ups.

TODO: Note: Need a note on wildcards here

Deprecated: A <code>tree:import</code> MAY be defined in the <code>tree:Relation</code> instance. When there is a <code>tree:path</code> defined, and when the relation is flagged as interesting to follow, the import link needs to be downloaded in order to find the necessary literals to be compared (it is thus already a <code>tree:ConditionalImport</code>).

Note: Deprecated: an example of a <code>tree:import</code> is given [in the repository](https://github.com/TREEcg/specification/blob/master/examples/geospatially-ordered-public-transport/first.ttl#L27).
When the type given for a certain Relation is <code>tree:Relation</code>, then the client MUST dereference the node if the client needs more results.
While this may seem useless, it has roughly the same semantics as a next-page link.

When the *only* type given for a certain Relation is <code>tree:Relation</code>, then the client must dereference all of the nodes. While this may seem useless, it can be used for the same use case as a <code>hydra:PartialCollectionView</code>.

For other types check the chapter on relation types in the vocabulary [](#Relation).

### Comparing strings ### {#strings}
## Comparing strings ## {#strings}

String values have three specific types of relations: the <code>tree:PrefixRelation</code>, the <code>tree:SubstringRelation</code> and the <code>tree:SuffixRelation</code>.

@@ -189,21 +197,20 @@ When no language is set, all strings are compared.

Note: If you want to have one resource containing both <code>e</code> and <code>é</code> as a prefix, you will have to create multiple relations to the same <code>tree:Node</code>.
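
As a rough illustration of the three string relations, the sketch below maps each one to a plain Python string test. It is an assumption-laden simplification: it compares code points exactly, does no Unicode normalization or accent folding, and ignores language tags. The names `STRING_CHECKS` and `string_relation_holds` are hypothetical.

```python
# Hypothetical mapping from relation type (local name) to a string test.
STRING_CHECKS = {
    "PrefixRelation": lambda text, value: text.startswith(value),
    "SubstringRelation": lambda text, value: value in text,
    "SuffixRelation": lambda text, value: text.endswith(value),
}


def string_relation_holds(relation_type, member_string, relation_value):
    """Check a member's string against a string relation's tree:value."""
    check = STRING_CHECKS.get(relation_type)
    if check is None:
        raise ValueError(f"not a string relation: {relation_type}")
    return check(member_string, relation_value)
```

Under this exact comparison, a string starting with <code>é</code> does not match the prefix <code>e</code>, which is why a node holding both would need a relation for each prefix.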

### Comparing named nodes ### {#named-nodes}
## Comparing named nodes ## {#named-nodes}

When using comparator relations such as <code>tree:GreaterThanRelation</code>, named nodes must be compared as defined in the [ORDER BY section of the SPARQL specification](https://www.w3.org/TR/sparql11-query/#modOrderBy).

### Comparing geospatial features ### {#geospatial}
## Comparing geospatial features ## {#geospatial}

The <code>tree:GeospatiallyContainsRelation</code> is the relation that can be used to express that all further members will be contained within a geospatial region defined by the WKT string in the <code>tree:value</code>.

When using <code>tree:GeospatiallyContainsRelation</code>, the <code>tree:path</code> MUST refer to a literal containing a WKT string, such as <code>geosparql:asWKT</code>.

### Comparing time literals ### {#time}
## Comparing time literals ## {#time}

When using relations such as <code>tree:LessThanRelation</code> or <code>tree:GreaterThanRelation</code>, the time literals need to be compared according to these 3 possible data types: <code>xsd:date</code>, <code>xsd:dateTime</code> or <code>xsd:dateTimeStamp</code>.
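
A minimal sketch of making the three datatypes comparable in Python. Several choices here are assumptions, not part of the text above: an `xsd:date` is treated as midnight UTC of that day, a timezone-less `xsd:dateTime` is treated as UTC, and only lexical forms accepted by the standard library's `fromisoformat` are handled (no timezoned dates, and fractional seconds only in forms the running Python version accepts). The function name `parse_time_literal` is hypothetical.

```python
from datetime import date, datetime, timezone

XSD = "http://www.w3.org/2001/XMLSchema#"


def parse_time_literal(lexical, datatype):
    """Turn a time literal into a timezone-aware datetime for comparison."""
    if datatype == XSD + "date":
        day = date.fromisoformat(lexical)
        # Assumption: a date is compared as the start of that day in UTC.
        return datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    if datatype in (XSD + "dateTime", XSD + "dateTimeStamp"):
        # Older Python versions do not accept a trailing "Z".
        parsed = datetime.fromisoformat(lexical.replace("Z", "+00:00"))
        if parsed.tzinfo is None:
            # Assumption: a missing timezone is interpreted as UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed
    raise ValueError(f"unsupported datatype: {datatype}")
```

After parsing, the ordinary comparison operators can be applied, for example to test a value against the `tree:value` of a `tree:GreaterThanRelation`.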


# Search forms # {#searching}

Searching through a TREE will allow you to immediately jump to the right <code>tree:Node</code>.
2 changes: 0 additions & 2 deletions vocabulary.md
@@ -10,8 +10,6 @@ A <code>tree:Relation</code> is a function denoting a conditional link to anothe

A <code>tree:Node</code>, apart from the root node, has exactly one other <code>tree:Node</code> linking into it through one or more relations.

Note: The condition of multiple <code>tree:Relation</code>s to the same <code>tree:Node</code> MUST be combined with a logical AND.

A SearchTree is a specific set of interlinked <code>tree:Node</code>s, that together contain all members in a collection. A specific view will adhere to a certain growth or tree balancing strategy. In one SearchTree, completeness MUST be guaranteed, unless the SearchTree has a retention policy cfr. LDES.

A <code>tree:search</code> form is an IRI template, that when filled out with the right parameters becomes a <code>tree:Node</code> IRI, or when dereferenced will redirect to a <code>tree:Node</code> from which all members in the collection that adhere to the described comparator can be found.