what properties can or should link to triple terms? #127

Open
afs opened this issue Sep 13, 2024 · 44 comments

Comments

@afs
Contributor

afs commented Sep 13, 2024

Wiki page:
Notes from the Semantic Task Force meeting 2024‐09‐13

The wiki page will be revised based on any discussion on this issue.

Minutes:
https://www.w3.org/2024/09/13-rdf-star-minutes.html

@william-vw

william-vw commented Sep 13, 2024 via email

@afs
Contributor Author

afs commented Sep 15, 2024

Hi @william-vw,

The well-formedness approach is available via the sidebox on the wiki page but I've added links to a few relevant pages at the bottom of the "notes" wiki page to make it more obvious. The "alternative baseline" and rdf:ReificationProperty is an alternative to see if such a well-formedness condition is needed.

The range constraint could go into RDFS but, as mentioned in the meeting, we are dealing with base RDF. The text note quoted from RDF Concepts contains the current way to express it at the RDF level. Presumably this could be reflected in RDFS and in the RDF namespace document. The WG may wish to upgrade the "note" to more formal text in RDF Concepts or RDF Semantics.

@william-vw

@afs note that I wasn't saying that the well-formedness approach was not visible in the wiki, or that I am necessarily advocating for an RDFS range (or even starting this discussion). This was just my recollection - which may be faulty - of 2 "sub-options" proposed in the meeting :-)

@rat10
Contributor

rat10 commented Sep 16, 2024

How would sub-properties of rdf:reifies play out? Would they meet the requirement of 2.ii, i.e. "rdf:reifies must be used when the object is a triple term"?

@afs
Contributor Author

afs commented Sep 16, 2024

In 2.ii, the class rdf:ReificationProperty is fixed by the definition of RDF.

rdf:reifies is special. The RDF semantics would specifically mention rdf:reifies as the only member of rdf:ReificationProperty.

Having other properties, which may be subproperties of rdf:reifies in RDFS, is where the discussion brought up having a fixed set of properties, not just one. The set is a fixed set defined by specs.

It is a way of having the effect of the condition in "minimal-baseline" within the idea of metamodelling with rdf:ReificationProperty.

@pchampin
Contributor

This was discussed during the rdf-star meeting on 26 September 2024.


Material about rdf:ReificationProperty 3

<AndyS> https://github.com/w3c/rdf-star-wg/wiki/Notes-from-the-Semantic-Task-Force-meeting-2024%E2%80%9009%E2%80%9013

AndyS: there is a note of the Semantics TF meeting, Friday 13 September.
… These are my notes for that, they are not formally agreed.

AndyS: I think the key point in there is that there are really only two options. Intermediate options do not work out.

pfps: I believe this discussion should be deferred until we have several reifying properties.

<Zakim> gkellogg_, you wanted to point out that rdf:ReificationProperty is not a property, but a class. Perhaps there's a better name to use.

gkellogg_: noting that ReificationProperty is not a property but a class. Naming it that way can be confusing.
… Defining rdf:reifies without class defining its behaviours makes it a bit magical.
… Defining its behaviour in a class makes more sense to me.

<AndyS> +1 to renaming -- e.g. "rdf:ReificationPropertyClass"

tl: I'm promoting that other reification properties exists (e.g. rdfs:states), but I would also let anyone define their own reification properties.

<niklasl> The name does follow the naming pattern of rdf:Property, owl:ObjectProperty, owl:DatatypeProperty though?

<pfps> Oops, in trying to set up my headphone, I cut off all sound.

<pfps> I note that rdf:type is not a member of a class.

ora: about the name: yes it is a class, but we have other classes whose name ends with "Property" (rdf:Property, owl:TransitiveProperty)
… so this is not an issue.

<doerthe> I think pfps should go first

AndyS: I agree that there are others

<niklasl> rdf:type a rdf:Property # ?

pfps: I don't see why we need to do this. rdf:type is special, it does not belong to a class.

<doerthe> isn't rdf:type an rdf:Property?

<tl> @pfps the reason is that we want to somehow describe how to use triple terms

doerthe: isn't rdf:type an rdf:Property, not a special one, but still a property.

<AndyS> rdf:ReificationProperty -- https://github.com/w3c/rdf-star-wg/wiki/RDF-star-%22alternative-baseline%22#metamodelling-entailment-patterns-and-axiomatic-triples

doerthe: The goal of ReificationProperty was to help people find out that a property was used with a triple term as object. Not more, not less.
… I don't see why we would add "Class" to the name of ReificationProperty.
… I agree that we should discuss whether we want other reification properties.

<william_vw> thanks everyone - I need to leave to attend a school assembly here

pchampin: -1 to rename rdf:ReificationProperty to rdf:ReificationPropertyClass. Names matter.
… +1 to what doerthe said; the goal of the ReificationProperty class was to flag properties based on.
… I'm not a big fan of ReificationProperty either. But, I don't see there's any special behavior. It's just a flag.
… Another way to flag properties that are used with TripleTerms is to make them sub-properties of rdf:reifies, but that would cross over to RDFS.
… Instead of saying that any property of type ReificationProperty has behavior, make them sub-properties of rdf:reifies.

<tl> +1 to that

ora: what is the hypothetical effect that this has on implementers?
… If we say that rdf:reifies is a special property without putting it in a special class, does it give implementers more latitude?
… I don't have the answer but I think yes.

pchampin: In my opinion it doesn't give more or less latitude to implementors.

AndyS: IIRC ReificationProperty was introduced by Enrico to replace the well-formed-ness condition.
… Previous properties were "reserving" rdf:reifies to be used in precise contexts, which make it a very special property.

ora: if rdf:reifies is a special property, similar to rdf:type, do we have examples of subproperties of rdf:type?

gkellogg_: Schema.org defines a subclass of rdf:type.

<niklasl> Yes, schema:additionalType is explicitly defined as rdfs:subPropertyOf rdf:type

tl: I like the idea of using rdfs:subPropertyOf
… Also, I concur with AndyS about the origin of Reification Property.

niklasl: I am not sure what I think about the subProperty solution. I also concur with AndyS.
… These properties have a special range, I think this was Enrico's point.

doerthe: I don't have an opinion yet about the subProperty solution. I don't know what it would do.
… Do we want to have a range for rdf:reifies?

<AndyS> It's not range but RDF concepts (editors working draft) has https://www.w3.org/TR/rdf12-concepts/#h-note-2

gkellogg_: I think we have discussed having a range for rdf:reifies of rdf:TripleTerm.
… This also came up in the discussion about unstar.

<doerthe> I am just afraid that we add a triple term class and make it the range of rdf:reifies

gkellogg_: the type rdf:TripleTerm is used to indicate that a blank node is representing a triple term.

niklasl: I agree. There was a discussion about naming it rdf:Triple vs. rdf:TripleTerm, but that's a separate discussion.

gtw: I mentioned SPIN before, where you don't necessarily want to set a range.
… We don't necessarily want to force a range on all reifying properties.

doerthe: the moment we set a range on rdf:reifies, I would disagree with the subProperty solution.

<MacTed> rangeIncludes ?

pchampin: I agree with doerthe and would vote against my own proposal if we have a range for rdf:reifies.
… I can see a range argument, and if we want it to be part of RDF Semantics, it's not good to mix. I see many drawbacks.

niklasl: we haven't talked about what Reification Property would mean.
… If it is just a marker, there is not much impact.

TallTed: it occurs to me that this may be a place where schema.org's rangeIncludes and domainIncludes can be useful.

<tl> I had that discussion with Greg Williams and thought that when mixing ways to represent triple terms it's not surprising that issues occur. Apart from that, maybe I'm just ignorant so I would like to see examples from e.g. Dörthe

TallTed: These are not enforced schemata.

ora: I feel we should hear Enrico on this. Let's encourage him to read the minutes, and pick this up next Thursday.

pchampin: +1

<niklasl> +1

<tl> +1

<TallTed> we should perhaps close the items we've addressed (e.g., un-star)?


@rat10
Contributor

rat10 commented Sep 27, 2024

Define an interpretation of Triple Terms #49 gives an example where:

This interpretation entails a blank node which denotes the same resource as the triple term (would be _:nnn owl:sameAs <<( sss ppp ooo )>> in OWL).

So owl:sameAs might be entailed to be an rdf:ReificationProperty, or a subproperty of rdf:reifies, according to the approaches discussed so far. Both sound wrong, even if the entailment is just meant to "raise a flag", i.e. make authors aware that they might have made a mistake and may want to check.

I'm again leaning towards the position that we should just say something to the effect of:

  • rdf:reifies has been defined for the purpose of creating reifiers,
  • you may create subproperties of rdf:reifies with more specific semantics
  • [proposal:] rdfs:states is such a subproperty of rdf:reifies that implies the statement it reifies and has special support through the Turtle-star annotation syntax
  • if you refer to triple terms by any other properties you should be aware that maybe you are in semantically undefined territory
  • or maybe your use is semantically sound, as in the owl:sameAs example above (but it's still your responsibility to check)

@gkellogg gkellogg removed the discuss-f2f Proposed for discussion during the next face-to-face meeting label Oct 24, 2024
@pfps
Contributor

pfps commented Oct 25, 2024

This issue is basically about what properties are allowed to link to triple terms.

The major decision is whether there is only one property that connects to triple terms (rdf:reifies), there can be many properties (any instance of rdf:ReificationProperty, final name to be determined), or there are only a few properties (rdf:reifies and something like rdfs:states and maybe a small number of others). The first alternative is defined in RDF-star-"minimal-baseline". The second alternative is defined in RDF-star-"alternative-baseline". The decision is about what is allowed in RDF graphs and is not directly about any surface syntax or expansion of syntactic shorthands.

The first alternative would require that the only links to triple terms are for the rdf:reifies property.

:e rdf:reifies <<( :dick :married :liz )>> .  
:e :statedBy :nyt .
:e :statedOn "2024-10-08"^^xsd:date .

The second alternative would allow other properties as well, like

:supports rdf:type rdf:ReificationProperty .
:e :supports <<( :dick :married :liz )>> .  
:e :statedBy :nyt .
:e :statedOn "2024-10-08"^^xsd:date .

:augments rdf:type rdf:ReificationProperty .
:f :augments <<( :dick :married :liz )>> .  
:f :date "1977-07-08"^^xsd:date .
:f :location :lasvegas .

:lpg_property_group rdf:type rdf:ReificationProperty .
_:g :lpg_property_group <<( :dick :married :liz )>> .
_:g :lpg_date "1977-07-08" .
_:g :lpg_location "Las Vegas" .

:wikidata_qualifier_group rdf:type rdf:ReificationProperty .
_:h :wikidata_qualifier_group <<( :dick :married :liz )>> .
_:h :wikidata_qualifier_group "1977-07-08"^^xsd:date .
_:h :wikidata_qualifier_group :lasvegas .
_:h :wikidata_qualifier_group :deprecated .

Note that the details of the syntax are not important in the examples above. All that matters is the resultant RDF 1.2 graph.
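The difference between the two alternatives can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a definition from any of the specs: terms are plain strings, a triple term is modelled as a 3-tuple, and the helper names `violations_minimal` and `flagged_reification_properties` are hypothetical. The second helper follows the reading given in the minutes above, where rdf:ReificationProperty merely flags properties that are used with a triple term as object.

```python
# Hypothetical model: an RDF term is a string, a triple term is a 3-tuple of terms.
RDF_REIFIES = "rdf:reifies"


def is_triple_term(term):
    """Return True when the term is modelled as a triple term (a 3-tuple)."""
    return isinstance(term, tuple) and len(term) == 3


def violations_minimal(graph):
    """First alternative: only rdf:reifies may have a triple term as object.
    Returns the triples that break that condition."""
    return [t for t in graph if is_triple_term(t[2]) and t[1] != RDF_REIFIES]


def flagged_reification_properties(graph):
    """Second alternative, read as a flag: every property used with a triple
    term as object would be an instance of rdf:ReificationProperty."""
    return {p for (s, p, o) in graph if is_triple_term(o)}


marriage = (":dick", ":married", ":liz")
graph = [
    (":e", ":supports", marriage),
    (":e", ":statedBy", ":nyt"),
    (":f", ":augments", marriage),
]

print(violations_minimal(graph))              # the :supports and :augments triples
print(flagged_reification_properties(graph))  # the set {':supports', ':augments'}
```

Under the first alternative both links would be rejected; under the second they are allowed and the two properties are merely flagged.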

(markup edited for clarity by @TallTed)

@rat10
Contributor

rat10 commented Oct 28, 2024

IMO the first alternative is too restrictive, and that restriction is also easy to work around (probably too easy if one subscribes to the restriction). The minutes of the 25.10.24 Semantics TF meeting discuss some interesting aspects. What strikes me in particular is that even if the specification allows referring to triple terms only via the rdf:reifies property, one can still type the resulting reifier as a :Claim, a :Stated, etc., e.g.:

_:r rdf:reifies <<( :s :p :o )>> ;
    a tl:Stated .

A tl:states property, defined as rdfs:subPropertyOf rdf:reifies and with rdfs:range tl:Stated, would have the same effect. So restricting references of triple terms to rdf:reifies seems to not get us much, but adds verbosity.

Reification has a very general semantics, and it seems that all useful references to a triple term can be defined as specializations of reification. What properties could there be that would be impossible to define as subproperties of rdf:reifies? Well, owl:sameAs doesn't fit that bill, e.g.:

_:t owl:sameAs <<( :s :p :o )>> ;
    rdfs:label "my first triple term" .

Here, _:t is not a reification, but stands in for the triple term itself (b.t.w, this is a way to sneak triple terms in subject position). So it seems that there is no sensible way to restrict usage of triple terms.

If it's not even possible to sensibly assume that all references to triple terms reify the triple term, then what is there left to define? Maybe just this:

  • the spec defines rdf:reifies and its subproperty rdfs:states with clear and normative semantics
  • everybody is free to refer to triple terms in any other way, but should consider defining those references as subproperties of rdf:reifies (or even rdfs:states).
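The subproperty idea can be illustrated with a small, non-authoritative Python sketch. It assumes terms are strings and triple terms are 3-tuples; `subproperties_of` and `unflagged_links` are hypothetical helper names, and rdfs:states is only a proposal in this thread. The sketch computes the rdfs:subPropertyOf closure below rdf:reifies and lists links to triple terms that fall outside it, which is where owl:sameAs ends up.

```python
RDF_REIFIES = "rdf:reifies"
SUBPROP = "rdfs:subPropertyOf"


def is_triple_term(term):
    """A triple term is modelled as a 3-tuple of terms."""
    return isinstance(term, tuple) and len(term) == 3


def subproperties_of(graph, root):
    """Return root plus every property that reaches it through a chain of
    rdfs:subPropertyOf triples stated in the graph."""
    found = {root}
    changed = True
    while changed:
        changed = False
        for s, p, o in graph:
            if p == SUBPROP and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def unflagged_links(graph):
    """Triples whose object is a triple term but whose predicate is neither
    rdf:reifies nor one of its stated subproperties."""
    reifying = subproperties_of(graph, RDF_REIFIES)
    return [t for t in graph if is_triple_term(t[2]) and t[1] not in reifying]


spo = (":s", ":p", ":o")
graph = [
    ("rdfs:states", SUBPROP, RDF_REIFIES),  # proposed above, not agreed
    ("tl:states", SUBPROP, "rdfs:states"),  # illustrative chain
    ("_:r", "tl:states", spo),
    ("_:t", "owl:sameAs", spo),
]

print(unflagged_links(graph))  # only the owl:sameAs triple
```

Such a check can only raise a flag; whether the owl:sameAs use is sound remains the author's responsibility, as argued above.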

@pfps pfps changed the title Material about rdf:ReificationProperty what properties can link to triple terms Oct 30, 2024
@pfps pfps changed the title what properties can link to triple terms what properties can or should link to triple terms Oct 30, 2024
@william-vw

william-vw commented Nov 1, 2024

EDIT: To avoid confusion, I updated all subsequent posts to use <<( and )>> (including those by others, if they also meant to use the former syntax).

This is a general comment on how nesting triple terms may be a more verbose (but possibly simpler) alternative for reifiers. I wasn't being very specific with the particular notation; and it abstracts from current translations using reifiers.

From today's meeting, I understood that reifiers are quite useful to group metadata that follows a different "interpretation" or "context". E.g.,

_:x :reifies <<( :richard :marriedTo :liz )>> .
_:y :reifies <<( :richard :marriedTo :liz )>> .
_:x :belief true ; :by :bob , :alice .
_:y :belief false ; :by :dave , :megan .

This way, we still know which people found the belief to be true or false.

Making a parallel with conceptual modeling, where you essentially draw circles around stuff to reify them (<<( and )>> here being the circles); we can really do the same by nesting triple terms:

<<( <<( :richard :marriedTo :liz )>> :belief true )>> :by :bob , :alice .
<<( <<( :richard :marriedTo :liz )>> :belief false )>> :by :dave , :megan .

I.e., we draw an extra circle around the extra "context" (:belief true/false) and the original triple term; the outer "circle" or triple term then groups metadata that follows this context. (It is likely I'm not the first one to raise this. Apologies in that case.)

The ordering will be very relevant here ; it will depend on the desired context. If the desired context is :by :bob/dave/... (for instance, to indicate time when it was stated etc) we can instead write:

<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :belief true ; :time 123 .
<<( <<( :richard :marriedTo :liz )>> :by :dave )>> :belief false ; :time 456 .
<<( <<( :richard :marriedTo :liz )>> :by :alice )>> :belief true ; :time 789 .
<<( <<( :richard :marriedTo :liz )>> :by :megan )>> :belief false ; :time 135 .

(This will also require extra reifiers to distinguish between bob/alice, etc. I am just pointing this out to say that ordering will have an impact.)

I can also imagine cases where multiple contexts will require multiple levels of nesting:

_:x :at :Mariott ; :belief true ; :by :bob , :alice .
_:y :at :Waldorf ; :belief false ; :by :dave , :megan .

Becomes (as a triple term cannot include property and object lists :-)):

<<( <<( <<( :richard :marriedTo :liz )>> :at :Mariott )>> :belief true )>> :by :bob , :alice .
<<( <<( <<( :richard :marriedTo :liz )>> :at :Waldorf )>> :belief false )>> :by :dave , :megan .

There are clear practical limitations to this approach. I am posting this to find out whether there are any other non-practical issues with this.
If not - putting it bluntly - are we simply enforcing a design pattern (reifiers) that we happen to like (and certainly have their benefits, but also complexity)? By disallowing triple terms as subjects (plus the whole reification property vs. well-formedness discussion), the above design pattern with nested triple terms, would, at the very least, be made more difficult. I can see folks finding at least the non-nested (and perhaps even 1-level nested) case easy to use ...

@TallTed TallTed changed the title what properties can or should link to triple terms what properties can or should link to triple terms? Nov 1, 2024
@afs
Contributor Author

afs commented Nov 1, 2024

@william-vw,

Triple terms are <<( :s :p :o )>>.

<< :s :p :o >> is shorthand for _:b rdf:reifies <<( :s :p :o )>> .

The first example is triple terms <<( ... )>>.

I don't know about the rest. Did you mean <<( ... )>> everywhere?

(as a triple term cannot include property and object lists :-)):

This is different from #132 where the discussion is more about compound form in occurrences << ... >>.

A triple term can't have compound components because it is for a triple - 3 RDF term components.

@william-vw

@afs Thanks. I wasn't specifically targeting Turtle or N-Triples. This is a general comment on how nesting triple terms may be a more verbose (but possibly simpler) alternative for reifiers. I wasn't being very specific with the particular notation; and it abstracts from current translations using reifiers.

Re:

A triple term can't have compound components because it is for a triple - 3 RDF term components.

I thought I had seen examples with nested triple terms before, but I double checked the grammar:
In N-Triples, a triple term object can again be a triple term.
In Turtle, a triple term object can be a triple term or reified triple.

Note that my post also draws into question this restriction of triple terms to object position.

@afs
Contributor Author

afs commented Nov 2, 2024

Then some questions for clarification:

What does :time or :by mean? A triple just "is" - all triples exist. It is using the triple (occurrence in a graph) that matters - the choice of triples is the model.

Does <<( <<( :richard :marriedTo :liz )>> :by :bob )>> equal <<( <<( :richard :marriedTo :liz )>> :by :bob )>> as RDF terms in

<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 123 .
<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 456 .

The simplification we have so far is that triple terms are expected to be largely unused except for use with rdf:reifies. It is the usage of a triple that matters.

We have also tried to only have named occurrences - but then how can you ask "what is the subject of this occurrence?" That is a function/map from occurrences to the elements of the triple. In that sense rdf:reifies in the "baseline" version (where it is special) can be seen as the mapping function.

@william-vw

william-vw commented Nov 2, 2024

Then some questions for clarification:

What does :time or :by mean? A triple just "is" - all triples exist. It is using the triple (occurrence in a graph) that matters - the choice of triples is the model.

I did not put a lot of thought into the predicates. :time simply records the time; :by records who made the statement.

Does <<( <<( :richard :marriedTo :liz )>> :by :bob )>> equal <<( <<( :richard :marriedTo :liz )>> :by :bob )>> as RDF terms in

<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 123 .
<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 456 .

Yes

The simplification we have so far is that triple terms are expected to be largely unused except for use with rdf:reifies. It is the usage of a triple that matters.

We have also tried to only have named occurrences - but then how can you ask "what is the subject of this occurrence?" That is a function/map from occurrences to the elements of the triple. In that sense rdf:reifies in the "baseline" version (where it is special) can be seen as the mapping function.

Just so we are on the same page: a triple term allows describing a triple (that may be member of the graph, or not). E.g.,
<<( :richard :marriedTo :liz )>> :by :bob .

Describes the triple :richard :marriedTo :liz with some metadata. It could be interpreted as "bob said that richard was married to liz".

The following:
<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 123 .

Describes the triple <<( :richard :marriedTo :liz )>> :by :bob with some metadata. It could be interpreted as "at time 123, bob said that richard was married to liz". (Borrowing the interpretation from before, assuming it didn't change.)

The nesting of the triple term :richard :marriedTo :liz allows adding extra context to it, and then describing the original statement under this new context.

I understand reifiers here as a mechanism to group metadata under a particular interpretation / context (see my original message); and this mechanism involves creating a "token" or "occurrence" of the triple term.

I'm simply wondering whether there are other / simpler mechanisms to do the same. If so, then it's perhaps not a good idea to impose restrictions in line with only one of several ways of grouping metadata under a particular context.

@afs
Contributor Author

afs commented Nov 3, 2024

Does <<( <<( :richard :marriedTo :liz )>> :by :bob )>> equal <<( <<( :richard :marriedTo :liz )>> :by :bob )>> as RDF terms in

<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 123 .
<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 456 .

Yes

Then it is the same as:

<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 123 , 456 .

and it seems to me that the complexity is still there, it's been moved into the definition of the predicates because they say time of use (context), and it's not a direct description of the triple itself.
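A minimal Python sketch of this point, assuming triple terms behave like value-compared nested tuples (a modelling assumption made for illustration only): writing the same nested triple term twice yields one RDF term, so both :time values land on a single subject.

```python
from collections import defaultdict

# Triple terms modelled as nested tuples; tuples compare by value.
inner = (":richard", ":marriedTo", ":liz")
outer_a = (inner, ":by", ":bob")
outer_b = (inner, ":by", ":bob")  # the same term written a second time

# Index the two asserted triples by subject, as a graph store might.
by_subject = defaultdict(set)
for s, p, o in [(outer_a, ":time", 123), (outer_b, ":time", 456)]:
    by_subject[s].add((p, o))

print(outer_a == outer_b)  # True
print(len(by_subject))     # 1
```

Nothing in the terms themselves keeps 123 and 456 apart, which is why the distinction has to come from how the predicates are defined.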

I'm not sure trees are the only shape of context.

I'm simply wondering whether there are other / simpler mechanisms to do the same.

Understood!

@william-vw

william-vw commented Nov 3, 2024

Does <<( <<( :richard :marriedTo :liz )>> :by :bob )>> equal <<( <<( :richard :marriedTo :liz )>> :by :bob )>> as RDF terms in

<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 123 .
<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 456 .

Yes

Then it is the same as:

<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 123 , 456 .

Unsure what point you are making, as this was not one of my examples :-) In the prior example, I believe all the outer triple terms were unique:

<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :belief true ; :time 123 .
<<( <<( :richard :marriedTo :liz )>> :by :dave )>> :belief false ; :time 456 .
<<( <<( :richard :marriedTo :liz )>> :by :alice )>> :belief true ; :time 789 .
<<( <<( :richard :marriedTo :liz )>> :by :megan )>> :belief false ; :time 135 .

But, if you simply want to say that the outer triple was recorded at time 123 and 456, then your example would work.

If you want to say something about the outer triple being recorded at time 123, then:
<<( <<( <<( :richard :marriedTo :liz )>> :by :bob )>> :time 123 )>> :beforeItHappened true .

You can simply put an extra <<( )>> that includes the desired context (time 123) and attach metadata to that.

I'm unsure what you mean by:

it's not a direct description of the triple itself.

Neither are reifiers, I think; they are simply associated with / tokens of (depending on who you're talking to :-) triple terms, which I am directly using above.

I'm not sure trees are the only shape of context.

Hmm - since you can simply put << and >> around any set of triples, I think you'd be able to group on any type of context, and then attach metadata to it.

@gkellogg
Member

gkellogg commented Nov 3, 2024

Does << << :richard :marriedTo :liz >> :by :bob >> equal << << :richard :marriedTo :liz >> :by :bob >> as RDF terms in

<< << :richard :marriedTo :liz >> :by :bob >> :time 123 .
<< << :richard :marriedTo :liz >> :by :bob >> :time 456 .

Yes

I would say no, they're not equal. The << ... >> form is syntactic sugar and it creates a blank node reifier. Each use of this form creates a fresh blank node, so the two reifying triples are not equal.

Then it is the same as:

<< << :richard :marriedTo :liz >> :by :bob >> :time 123 , 456 .

For this to be the case, you'd need to use explicit reifiers:

<< << :richard :marriedTo :liz ~ :r1 >> :by :bob ~ :r2 >> :time 123 .
<< << :richard :marriedTo :liz ~ :r1 >> :by :bob ~ :r2 >> :time 456 .

Then you could reduce this to:

<< << :richard :marriedTo :liz ~ :r1 >> :by :bob ~ :r2 >> :time 123, 456 .
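The effect of fresh blank nodes versus explicit reifiers can be sketched in Python. This is a rough illustration, not a parser: `fresh_bnode`, `reified` and `time_subjects` are hypothetical helpers, graphs are sets of tuples, and the expansion of `<< s p o ~ r >>` into `r rdf:reifies <<( s p o )>>` is assumed to work as described above.

```python
import itertools

_counter = itertools.count(1)


def fresh_bnode():
    """Mint a new blank node label on every call."""
    return f"_:b{next(_counter)}"


def reified(triple, graph, reifier=None):
    """Expand a reified triple: add `reifier rdf:reifies <<( triple )>>` to the
    graph and return the reifier. Without an explicit reifier, mint a fresh one."""
    r = reifier if reifier is not None else fresh_bnode()
    graph.add((r, "rdf:reifies", triple))
    return r


def time_subjects(graph):
    """Subjects that carry a :time annotation."""
    return {s for s, p, o in graph if p == ":time"}


implicit = set()  # << << ... >> :by :bob >> :time t, written twice
for t in (123, 456):
    r_inner = reified((":richard", ":marriedTo", ":liz"), implicit)
    r_outer = reified((r_inner, ":by", ":bob"), implicit)
    implicit.add((r_outer, ":time", t))

explicit = set()  # the same with ~ :r1 and ~ :r2
for t in (123, 456):
    r_inner = reified((":richard", ":marriedTo", ":liz"), explicit, ":r1")
    r_outer = reified((r_inner, ":by", ":bob"), explicit, ":r2")
    explicit.add((r_outer, ":time", t))

print(len(time_subjects(implicit)))  # 2
print(len(time_subjects(explicit)))  # 1
```

With implicit reifiers the two lines describe two different subjects; only the explicit form collapses to a single subject with two :time values.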

@william-vw

william-vw commented Nov 3, 2024

For instance, consider @niklasl's Elizabeth Taylor example, where she married Richard Burton twice.

In contrast to reifiers, when using nested triple terms, you have to choose what context to use for grouping metadata. We need two groups of metadata, i.e., one per instance of marriage between the two. We need a separate outer triple term per group of metadata, i.e., they need to be unique. We could choose sdo:startTime as context to ensure this uniqueness (assuming that she is not marrying multiple people at the same exact time).

<<( 
	<<( <http://www.wikidata.org/entity/Q34851> sdo:spouse <http://www.wikidata.org/entity/Q151973> )>>
	sdo:startTime "+1964-03-15T00:00:00Z"^^xsd:dateTime 
)>>
	a :Circumstance ;
    sdo:position "5" ;
    sdo:startTime "+1964-03-15T00:00:00Z"^^xsd:dateTime ;
    :placeOfMarriage_P2842 <http://www.wikidata.org/entity/Q3433362> ;
    sdo:endDate "+1974-06-26T00:00:00Z"^^xsd:dateTime ;
    :endCause_P1534 <http://www.wikidata.org/entity/Q93190> ;
    :reference [ :thePeeragePersonId_P4638 "p33432.htm#i334317" ;
        :retrieved_P813 "+2020-08-07T00:00:00Z"^^xsd:dateTime ] .

<<( 
	<<( <http://www.wikidata.org/entity/Q34851> sdo:spouse <http://www.wikidata.org/entity/Q151973> )>>
	sdo:startTime "+1975-10-10T00:00:00Z"^^xsd:dateTime
)>>
	a :Circumstance ;
    sdo:position "5" ;
    sdo:startTime "+1975-10-10T00:00:00Z"^^xsd:dateTime ;
    :placeOfMarriage_P2842 <http://www.wikidata.org/entity/Q859389> ;
    sdo:endDate "+1976-07-29T00:00:00Z"^^xsd:dateTime ;
    :endCause_P1534 <http://www.wikidata.org/entity/Q93190> ;
    :reference [ :thePeeragePersonId_P4638 "p33432.htm#i334317" ;
        :retrieved_P813 "+2020-08-07T00:00:00Z"^^xsd:dateTime ] .

This is clearly much more of a hassle compared to reifiers. You can simply create a new reifier for a separate group of metadata, without having to worry about uniqueness as above.

This is really reminding me of primary keys in database modeling exercises ... Either create an artificial (e.g., auto-increment) primary key - which seems to be reifiers here - or select a primary key from existing attributes, which seems to be the nested triple solution.
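The primary-key analogy can be made concrete with a short Python sketch. Everything here is a simplification for illustration: the `wd:` prefix abbreviates the Wikidata IRIs above, dates are shortened, dictionaries stand in for groups of metadata, and the `_:r` labels are hypothetical reifiers.

```python
import itertools

spouse = ("wd:Q34851", "sdo:spouse", "wd:Q151973")
marriages = [
    {"sdo:startTime": "1964-03-15", "sdo:endDate": "1974-06-26"},
    {"sdo:startTime": "1975-10-10", "sdo:endDate": "1976-07-29"},
]

# Keyed by the bare triple term: both marriages share one key. In this dict the
# second entry replaces the first; in a graph the two groups would merge.
by_triple = {}
for m in marriages:
    by_triple[spouse] = m

# "Natural key": nest the triple term with a distinguishing attribute.
by_nested = {}
for m in marriages:
    key = (spouse, "sdo:startTime", m["sdo:startTime"])
    by_nested[key] = m

# "Surrogate key": mint a new reifier per group of metadata.
ids = itertools.count(1)
by_reifier = {}
for m in marriages:
    by_reifier[f"_:r{next(ids)}"] = {"rdf:reifies": spouse, **m}

print(len(by_triple), len(by_nested), len(by_reifier))  # 1 2 2
```

The nested key only works as long as the chosen attribute stays unique, whereas the reifier needs no such choice but adds one extra association per group.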

Does anyone see any other issues?

@william-vw

william-vw commented Nov 3, 2024

I would say no, they're not equal. The << ... >> form is syntactic sugar and it creates a blank node reifier. Each use of this form creates a fresh blank node, so the two reifying triples are not equal.

Then it is the same as:

<< << :richard :marriedTo :liz >> :by :bob >> :time 123 , 456 .

For this to be the case, you'd need to use explicit reifiers:

<< << :richard :marriedTo :liz ~ :r1 >> :by :bob ~ :r2 >> :time 123 .
<< << :richard :marriedTo :liz ~ :r1 >> :by :bob ~ :r2 >> :time 456 .

Then you could reduce this to:

<< << :richard :marriedTo :liz ~ :r1 >> :by :bob ~ :r2 >> :time 123, 456 .

Apologies - I should have said from the beginning that I am abstracting from particular formats. I said the following in a prior response:

I wasn't specifically targeting Turtle or N-Triples. This is a general comment on how nesting triple terms may be a more verbose (but possibly simpler) alternative for reifiers. I wasn't being very specific with the particular notation; and it abstracts from current translations using reifiers.

To avoid confusion, I took the liberty of updating all relevant posts with the appropriate notation.

@william-vw

william-vw commented Nov 3, 2024 via email

@rat10
Contributor

rat10 commented Nov 4, 2024

@william-vw, your example shows how you need to add additional levels of nesting annotations to make the (outermost) annotated triple term sufficiently unique.

<<( <<( :richard :marriedTo :liz )>> :by :bob )>> :belief true ; :time 123 .
<<( <<( :richard :marriedTo :liz )>> :by :dave )>> :belief false ; :time 456 .
<<( <<( :richard :marriedTo :liz )>> :by :alice )>> :belief true ; :time 789 .
<<( <<( :richard :marriedTo :liz )>> :by :megan )>> :belief false ; :time 135 .

IMO this is unnatural and counterintuitive. There is a statement, and a set of annotations on it. Some of those annotations have more disambiguating power than others - the source's name in this case - but can and should an author be expected to think about such technicalities? To me it seems much more straightforward to just write

<< :richard :marriedTo :liz >> :by :bob;  :belief true ; :time 123 .
<< :richard :marriedTo :liz >> :by :dave;  :belief false ; :time 456 .
[...]

and let the system add the necessary disambiguation (minting a blank node to name the reifier).

It is actually pretty hard to come up with use cases that reliably and always only need to refer to the triple term itself: counting occurrences of triple terms is the only one that I find convincing. Idealized examples using triple terms either have only one annotation, or expect the triple term to occur only once, i.e. be unique in the graph. You have now found a way to stretch that uniqueness constraint - congratulations! ;-) But as I said above, I don't think this is a natural way of modelling. Also it requires re-modelling when what was first thought to be unique needs disambiguation later on. Early disambiguation as in the occurrence-centered approach is the safer approach. In the CG we had a similar discussion w.r.t. the proposed :occurrenceOf property: one would have been tempted to naively annotate a triple term, but on arrival of more attributes be required to re-model, adding an occurrence-describing intermediate node. That makes navigating and querying data very tedious. Nesting triple terms suffers from the same problems.

@william-vw

@rat10

your example shows how you need to add additional levels of nesting annotations to make the (outermost) annotated triple term sufficiently unique.

Indeed and that is one of the points I was making.

You have now found a way to stretch that uniqueness constraint - congratulations! ;-)

You should congratulate @niklasl as this is from his example :-) I am sure that there are many, many other examples from database exercises, where students are asked to identify primary keys from existing attributes.

But as I said above, I don't think this is a natural way of modelling. Also it requires re-modelling when what was first thought to be unique needs disambiguation later on. Early disambiguation as in the occurrence-centered approach is the safer approach.

Congratulations - you just found one criterion for choosing primary key attributes :-)

I was careful not to suggest this in lieu of reifiers, but rather as an alternative to achieve the same goal, i.e., grouping sets of metadata under the same context. Whether or not it is a natural way of modelling is up for discussion; the same for database modeling, where both approaches keep existing [1]:

This is really reminding me of primary keys in database modeling exercises ... Either create an artificial (e.g., auto-increment) primary key - which seems to be reifiers here - or select a primary key from existing attributes, which seems to be the nested triple solution. (Well, reifiers could be better compared to primary keys if they were one to one.)

Possible drawbacks of reifiers (greatly depending on the implementation) are that they require an extra lookup/join to find their associated triple term, and extra triples are needed to keep the association.

If there are no other issues - aside from those already raised - then I would be apprehensive to impose restrictions that enforce reifiers (e.g., only allowing them in the object position).

Footnotes

  1. At least, both are talked about in textbooks ...

@rat10
Contributor

rat10 commented Nov 4, 2024

@rat10

But as I said above, I don't think this is a natural way of modelling. Also it requires re-modelling when what was first thought to be unique needs disambiguation later on. Early disambiguation as in the occurrence-centered approach is the safer approach.

Congratulations - you just found one criterion for choosing primary key attributes :-)

Or rather against them? Following that link I learn as "Criteria for Choosing a Primary Key":

  • Uniqueness: The primary key must uniquely identify each record in the table.
  • Stability: It should remain constant over time, unaffected by changes in the dataset.
  • Simplicity: A simple key, often numeric, enhances performance and reduces complexity.

...and stability is exactly the thing that we cannot rely on, and that RDF is designed to not rely on.

Also, and here comes the many-to-many design of reifiers into play, if one really wants to model in a relational style, wouldn't one rather create the key right away, make it the subject of a number of statements and then annotate it, e.g.

_:m a :marriage ;
    :groom :Richard ;
    :bride :Liz ;
    :by :Bob .
_:mr rdf:reifies <<( _:m a :marriage )>> ,
        <<( _:m :groom :Richard )>> ,
        <<( _:m :bride :Liz )>> ,
        <<( _:m :by :Bob )>> ;
    :belief true ; 
    :time 123 .

That sure looks ugly ;-) but not uglier than repeated nesting.

I was careful not to suggest this in lieu of reifiers, but rather as an alternative to achieve the same goal, i.e., grouping sets of metadata under the same context. Whether or not it is a natural way of modelling is up for discussion; the same for database modeling, where both approaches keep existing [1]:

My counter argument is still: RDF is a graph formalism. It can represent relational structures, but that's not where its strengths are. Regarding our task, standardizing a mechanism to easily make statements about statements, I consider it more important to foolproof the basic mechanism of statement annotation than to make it more amenable to E/R style modelling.

This is really reminding me of primary keys in database modeling exercises ... Either create an artificial (e.g., auto-increment) primary key - which seems to be reifiers here - or select a primary key from existing attributes, which seems to be the nested triple solution. (Well, reifiers could be better compared to primary keys if they were one to one.)

An issue of your approach is that order becomes important. Slightly changing your example:

<<( <<( <<( :richard :marriedTo :liz )>> :by :bob )>> a :belief )>> :value true ; :time 123 .
<<( <<( <<( :richard :marriedTo :liz )>> a :belief )>> :by :bob )>> :value true ; :time 456 .

Although I just switched the order of columns, so to say, two different keys have been generated. That doesn't seem desirable. Maybe it's okay in a relational system, but it seems quite surprising to RDF. It can even be read as two different things: a belief that Bob reported said marriage, or Bob claiming that said marriage is just a belief.
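To make the ordering point concrete, here is a rough Python sketch that models triple terms as nested tuples (an assumption for illustration, not how any RDF store represents them). The `qualifiers` helper is a hypothetical name; it collects the predicate/object pairs wrapped around the innermost triple.

```python
marriage = (":richard", ":marriedTo", ":liz")

# <<( <<( <<( marriage )>> :by :bob )>> a :belief )>>
key_by_then_belief = ((marriage, ":by", ":bob"), "a", ":belief")
# <<( <<( <<( marriage )>> a :belief )>> :by :bob )>>
key_belief_then_by = ((marriage, "a", ":belief"), ":by", ":bob")

def qualifiers(term):
    """Return the (predicate, object) pairs wrapped around the innermost triple."""
    pairs = set()
    while isinstance(term[0], tuple):  # subject is itself a triple term
        pairs.add((term[1], term[2]))
        term = term[0]
    return frozenset(pairs)

# Annotations keyed by the nested term end up in two separate buckets.
annotations = {}
annotations.setdefault(key_by_then_belief, {})[":time"] = 123
annotations.setdefault(key_belief_then_by, {})[":time"] = 456
```

The two keys carry the same set of qualifiers but compare as different terms, so `annotations` holds two entries instead of one.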

Possible drawbacks of reifiers (greatly depending on the implementation) are that they require an extra lookup/join to find their associated triple term, and extra triples are needed to keep the association.

Souri, who has a strong RDBMS background, has been strongly arguing in favor of an occurrence based approach as it helps keeping queries valid even when data is updated. Implementation issues are not to be neglected, but usability can have a pretty decisive effect on overall performance.

If there are no other issues - aside from those already raised - then I would be apprehensive to impose restrictions that enforce reifiers (e.g., only allowing them in the object position).

Well, we had a years-long discussion about this. The fact that the seminal example (see the CG report) had the very defect that the type based design of triple terms is so prone to should tell you something. Granted, you are the first to convince me that there could be a more-than-unpermissible-simplistic application for triple terms in subject position. I'm still very much in favor of ruling it out.

@william-vw

@rat10

Congratulations - you just found one criterion for choosing primary key attributes :-)

Or rather against them? Following that link I learn as "Criteria for Choosing a Primary Key":

* Uniqueness: The primary key must uniquely identify each record in the table.

* Stability: It should remain constant over time, unaffected by changes in the dataset.

* Simplicity: A simple key, often numeric, enhances performance and reduces complexity.

...and stability is exactly the thing that we cannot rely on, and that RDF is designed to not rely on.

Note that relational databases are even less stable - one can remove & update prior rows, whereas RDF is monotonic, technically - and yet choosing primary keys from existing attributes has, at least, persisted there.

I was careful not to suggest this in lieu of reifiers, but rather as an alternative to achieve the same goal, i.e., grouping sets of metadata under the same context. Whether or not it is a natural way of modelling is up for discussion; the same for database modeling, where both approaches keep existing [1]:

My counter argument is still: RDF is a graph formalism. It can represent relational structures, but that's not where its strengths are.

Honestly, I am a bit unsure what the impact is of the graph formalism. I was getting flashbacks from my days as a DB teaching assistant; this is why I made the analogy. I am not trying to impose a relational structure on top of RDF.

Regarding our task, standardizing a mechanism to easily make statements about statements, I consider it more important to foolproof the basic mechanism of statement annotation than to make it more amenable to E/R style modelling.

I feel that we are enforcing a particular design pattern, whereas RDF should be about flexibility (or so I keep hearing :-). For more complex use cases, reifiers can be used; for simpler ones, a triple term (or nested one) could be used.

It is true that one can easily shoot themselves in the foot this way, as the Taylor + Burton example shows. But, people can already shoot themselves in the foot with RDF, very easily, as they can with databases and any other formalism. Should we enforce certain design choices?

An issue of your approach is that order becomes important. Slightly changing your example:

<<( <<( <<( :richard :marriedTo :liz )>> :by :bob )>> a :belief )>> :value true ; :time 123 .
<<( <<( <<( :richard :marriedTo :liz )>> a :belief )>> :by :bob )>> :value true ; :time 456 .

Although I just switched the order of columns, so to say, two different keys have been generated. That doesn't seem desirable. Maybe it's okay in a relational system, but it seems quite surprising to RDF. It can even be read as two different things: a belief that Bob reported said marriage, or Bob claiming that said marriage is just a belief.

Yes, this is a point I had made initially; ordering will be very relevant and will depend on the desired context. But, one could instead think about this in choices of reification (i.e., which statements you are reifying).

Note that nesting triple terms is already possible beyond their use to group metadata. I am simply wondering whether this can serve as an alternative (not a replacement) to reifiers.

Possible drawbacks of reifiers (greatly depending on the implementation) are that they require an extra lookup/join to find their associated triple term, and extra triples are needed to keep the association.

Souri, who has a strong RDBMS background, has been strongly arguing in favor of an occurrence based approach as it helps keeping queries valid even when data is updated. Implementation issues are not to be neglected, but usability can have a pretty decisive effect on overall performance.

Yes, issues of stability remain. E.g., adding the extra marriage between Elizabeth Taylor and Richard Burton can easily gum up the works. E.g., given

<<( <http://www.wikidata.org/entity/Q34851> sdo:spouse <http://www.wikidata.org/entity/Q151973> )>>
    sdo:startTime "+1964-03-15T00:00:00Z" ;  sdo:endDate "+1974-06-26T00:00:00Z" .

And then adding

<<( <http://www.wikidata.org/entity/Q34851> sdo:spouse <http://www.wikidata.org/entity/Q151973> )>>
    sdo:startTime "+1975-10-10T00:00:00Z" ;  sdo:endDate "+1976-07-29T00:00:00Z" .

Will mean that we are no longer able to distinguish the metadata of the two marriages. While that is an issue with primary keys in general, one could indeed argue it is a bigger problem in an open world.
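A small Python sketch of this merge, with the graph as a plain set of tuples (an illustrative assumption; `wd:` is shorthand for the Wikidata entity IRIs above, and `properties_of` and the `_:m1`/`_:m2` labels are hypothetical). Because a graph is a set and both blocks share the same subject term, the two date pairs collapse into one bag of values, whereas separate reifiers keep them apart.

```python
from collections import defaultdict

marriage = ("wd:Q34851", "sdo:spouse", "wd:Q151973")

def properties_of(graph, subject):
    """Group all objects in `graph` by predicate for the given subject."""
    props = defaultdict(set)
    for s, p, o in graph:
        if s == subject:
            props[p].add(o)
    return dict(props)

# Triple term used directly as subject: both marriages share one subject.
graph = set()
graph |= {(marriage, "sdo:startTime", "+1964-03-15T00:00:00Z"),
          (marriage, "sdo:endDate", "+1974-06-26T00:00:00Z")}
graph |= {(marriage, "sdo:startTime", "+1975-10-10T00:00:00Z"),
          (marriage, "sdo:endDate", "+1976-07-29T00:00:00Z")}
merged = properties_of(graph, marriage)

# One reifier per marriage: each date pair stays attached to its own node.
reified = set()
for reifier, start, end in [
    ("_:m1", "+1964-03-15T00:00:00Z", "+1974-06-26T00:00:00Z"),
    ("_:m2", "+1975-10-10T00:00:00Z", "+1976-07-29T00:00:00Z"),
]:
    reified |= {(reifier, "rdf:reifies", marriage),
                (reifier, "sdo:startTime", start),
                (reifier, "sdo:endDate", end)}
```

In `merged`, nothing records which start time belongs to which end date; in `reified`, that pairing survives.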

If there are no other issues - aside from those already raised - then I would be apprehensive to impose restrictions that enforce reifiers (e.g., only allowing them in the object position).

Well, we had a years-long discussion about this. The fact that the seminal example (see the CG report) had the very defect that the type based design of triple terms is so prone to should tell you something. Granted, you are the first to convince me that there could be a more-than-unpermissible-simplistic application for triple terms in subject position. I'm still very much in favor of ruling it out.

Note that this method uses existing constructs, namely triple terms - in a syntactically legal way - to create groups of metadata. If people strongly feel that it shouldn't be used, then perhaps a better solution is to not allow nested triple terms, unless they have other use cases.

@rat10
Contributor

rat10 commented Nov 4, 2024

Note that nesting triple terms is already possible beyond their use to group metadata. I am simply wondering whether this can serve as an alternative (not a replacement) to reifiers.

And I'm saying that it leads to a lot of trouble down the road, without tangible benefit.

Note that this method uses existing constructs, namely triple terms - in a syntactically legal way - to create groups of metadata.

Am I missing something? We're arguing about if triple terms should be allowed in subject position, right? If they are not, this is not syntactically legal.

If people strongly feel that it shouldn't be used, then perhaps a better solution is to not allow nested triple terms, unless they have other use cases.

That seems like an unnecessarily drastic step to me. What about chains of provenance, or annotations from different domains like qualification and administration? I wouldn't rule that out just because the design is focused on occurrences, not types.

@william-vw

Note that nesting triple terms is already possible beyond their use to group metadata. I am simply wondering whether this can serve as an alternative (not a replacement) to reifiers.

And I'm saying that it leads to a lot of trouble down the road, without tangible benefit.

Ok, that is a fair opinion.

Note that this method uses existing constructs, namely triple terms - in a syntactically legal way - to create groups of metadata.

Am I missing something? We're arguing about if triple terms should be allowed in subject position, right? If they are not, this is not syntactically legal.

Yes, you are - I was referring to the fact that it is perfectly legal to nest triple terms.

If the discussion is about restricting triple terms to the object position, examples to the contrary will have to assume that they are not yet restricted this way :-)

If people strongly feel that it shouldn't be used, then perhaps a better solution is to not allow nested triple terms, unless they have other use cases.

That seems like an unnecessarily drastic step to me. What about chains of provenance, or annotations from different domains like qualification and administration? I wouldn't rule that out just because the design is focused on occurrences, not types.

I feel this is making it even worse. In some use cases, it should be allowed to nest triple terms, but in other cases, it should be discouraged?

And that is done by disallowing triple terms in a particular position, because we think that the latter case would mostly rely on that position?

@rat10
Contributor

rat10 commented Nov 4, 2024

Note that this method uses existing constructs, namely triple terms - in a syntactically legal way - to create groups of metadata.

Am I missing something? We're arguing about if triple terms should be allowed in subject position, right? If they are not, this is not syntactically legal.

Yes, you are - I was referring to the fact that it is perfectly legal to nest triple terms.

If the discussion is about restricting triple terms to the object position, examples to the contrary will have to assume that they are not yet restricted this way :-)

Are we still talking about the same thing? It seems to me like you mix triple terms, e.g. <<( :s :p :o )>>, and reifications, e.g. << :s :p :o >> (<< :s :p :o ~_:id >> with optional identifier). IIUC both can be nested, the former however always only in object position according to what seems to be the majority opinion of what should be considered syntactically legal. Nesting reifications is not the issue here, right? Since that is perfectly possible in subject and object position.

If people strongly feel that it shouldn't be used, then perhaps a better solution is to not allow nested triple terms, unless they have other use cases.

That seems like an unnecessarily drastic step to me. What about chains of provenance, or annotations from different domains like qualification and administration? I wouldn't rule that out just because the design is focused on occurrences, not types.

I feel this is making it even worse. In some use cases, it should be allowed to nest triple terms, but in other cases, it should be discouraged?

What do you mean by "some cases"? I'm talking about nesting triple terms in object position but not in subject position, nothing else.

And that is done by disallowing triple terms in a particular position, because we think that the latter case would mostly rely on that position?

Again, the argument is that almost all use cases talk about occurrences of triples, not about the type itself (very similar to RDF standard reification). The syntactic constraint is meant to prevent users from expressing something they don't intend to express, just because it's shorter, or seems natural, or unproblematic.

@william-vw

Are we still talking about the same thing? It seems to me like you mix triple terms, e.g. <<( :s :p :o )>>, and reifications, e.g. << :s :p :o >> (<< :s :p :o ~_:id >> with optional identifier). IIUC both can be nested, the former however always only in object position according to what seems to be the majority opinion of what should be considered syntactically legal. Nesting reifications is not the issue here, right? Since that is perfectly possible in subject and object position.

Yes we are - no mixing up of triple terms with reifications. I clarified this in my initial post to avoid this kind of confusion as I had been a bit fast and loose before -

EDIT: To avoid confusion, I updated all subsequent posts to use <<( and )>> (including those by others, if they also meant to use the former syntax).

That seems like an unnecessarily drastic step to me. What about chains of provenance, or annotations from different domains like qualification and administration? I wouldn't rule that out just because the design is focused on occurrences, not types.

I feel this is making it even worse. In some use cases, it should be allowed to nest triple terms, but in other cases, it should be discouraged?

What do you mean by "some cases"? I'm talking about nesting triple terms in object position but not in subject position, nothing else.

Why, the ones you had raised before :-)

In some use cases, it seems it should be allowed to nest triple terms (your aforementioned cases); but in other cases, it seems it should be discouraged (grouping metadata). To achieve this, the proposal is to disallow triple terms in a particular position (subject); because we think that the to-be-discouraged case (grouping metadata) would mostly rely on that position.

To clarify, the discussion is about the use of nested triple terms to group metadata. The subject position restriction makes that particular way of grouping metadata more of a hassle.

It would still be perfectly possible to use nested triple terms in the object position to group metadata. It's just more of a hassle, since you need to use an inverse property. E.g.

[ rdf:value true ] :beliefAbout <<( <<( :richard :marriedTo :liz )>> :by :bob )>>.
[ rdf:value 456 ] :timeOfStatement <<( <<( :richard :marriedTo :liz )>> :by :bob )>>.

Instead of

<<( <<( :richard :marriedTo :liz )>> :by :dave )>> :belief false ; :time 456 .

Again, the argument is that almost all use cases talk about occurrences of triples, not about the type itself (very similar to RDF standard reification). The syntactic constraint is meant to prevent users from expressing something they don't intend to express, just because it's shorter, or seems natural, or unproblematic.

IIUC, a reifier does not necessarily refer to an occurrence of a triple in some surface syntax; it is possible that the triple does not occur at all. It is my understanding that it is a mechanism to group metadata that follows a specific context - see my prior post.

@rat10
Contributor

rat10 commented Nov 4, 2024

In some use cases, it seems it should be allowed to nest triple terms (your aforementioned cases);

There I meant occurrences, not triple terms - sorry if that wasn't clear.

but in other cases, it seems it should be discouraged (grouping metadata). To achieve this, the proposal is to disallow triple terms in a particular position (subject); because we think that the to-be-discouraged case (grouping metadata) would mostly rely on that position.

Nesting and an occurrence focused design are orthogonal aspects. Reifications can be used in subject and object position, and in both positions they can be nested. Triple terms can only be used in object position, but IIUC they can be nested too.

To clarify, the discussion is about the use of nested triple terms to group metadata. The subject position restriction makes that particular way of grouping metadata more of a hassle.

It would still be perfectly possible to use nested triple terms in the object position to group metadata. It's just more of a hassle, since you need to use an inverse property. E.g.

[ rdf:value true ] :beliefAbout <<( <<( :richard :marriedTo :liz )>> :by :bob )>>.
[ rdf:value 456 ] :timeOfStatement <<( <<( :richard :marriedTo :liz )>> :by :bob )>>.

What you are doing is essentially to construct a singleton type, a type that is so specific that it exists only once. But what's the purpose, besides maybe following an E/R design pattern? You are trying to annotate a specific instance, not all instances of that type. You can do that perfectly well with the occurrence oriented design of reifiers.

So, yes, you can shoot yourself in the foot if you absolutely insist. The way your example is modeled is exactly what we try to prevent (also see EDIT below). However, I still wouldn't want to disallow any property but rdf:reifies from referring to triple terms, because there are sensible alternatives to rdf:reifies, e.g. the proposed rdfs:states.

Again, the argument is that almost all use cases talk about occurrences of triples, not about the type itself (very similar to RDF standard reification). The syntactic constraint is meant to prevent users from expressing something they don't intend to express, just because it's shorter, or seems natural, or unproblematic.

IIUC, a reifier does not necessarily refer to an occurrence of a triple in some surface syntax; it is possible that the triple does not occur at all.

Right, but that is another aspect. To avoid that connotation I could also use the term reification, or token, or instance. It doesn't change the fact that even for non-occurring reifications the use cases are predominantly referring to instances, not to the type itself.


[EDIT The latest stab at this is to declare the subject of such a triple a reifier: a hint that `[ rdf:value true ]` is now of type rdf:Reifier should warn users that they are doing something unrecommended]

@william-vw

@rat10 I think we are on the same page (correct me if I'm wrong) - it is possible to use nested triple terms to group metadata, but there can be good reasons not to, and reifiers can be better choices in some cases.

What you are doing is essentially to construct a singleton type, a type that is so specific that it exists only once. But what's the purpose, besides maybe following an E/R design pattern? You are trying to annotate a specific instance, not all instances of that type. You can do that perfectly well with the occurrence oriented design of reifiers.

So, yes, you can shoot yourself in the foot if you absolutely insist.

Note that I can easily shoot myself in the foot with reifiers as well, e.g., when accidentally choosing an existing one. But, perhaps it's not quite as easy to blow your foot off with reifiers, that is true.

The way your example is modeled is exactly what we try to prevent (also see EDIT below).

I suppose this is what is bugging me a bit. RDF is getting into the business of enforcing particular modeling patterns. But, at the same time, it is considered (by some) a sort of low-level assembly language for KR. There seems to be a contradiction here. Even when granting this "mandatory" design pattern; the way it is being enforced seems rather artificial, i.e., by disallowing triple terms in the subject position.

@rat10
Contributor

rat10 commented Nov 5, 2024

@rat10 I think we are on the same page (correct me if I'm wrong)

I'd put it more cautiously as "there might be overlaps" ;-)

  • it is possible to use nested triple terms to group metadata,

Well, IIUC it's not even possible to have multiple metadata annotations on the same level of nesting in triple terms (*) - so the mechanism is really very impoverished and may lead to undesirable expressivity (see example above).

but there can be good reasons not to, and reifiers can be better choices in some cases.

IMO that's almost all cases, as I repeatedly stressed.

What you are doing is essentially to construct a singleton type, a type that is so specific that it exists only once. But what's the purpose, besides maybe following an E/R design pattern? You are trying to annotate a specific instance, not all instances of that type. You can do that perfectly well with the occurrence oriented design of reifiers.

So, yes, you can shoot yourself in the foot if you absolutely insist.

Note that I can easily shoot myself in the foot with reifiers as well, e.g., when accidentally choosing an existing one.

That seems like a pretty contrived concern.

But, perhaps it's not quite as easy to blow your foot off with reifiers, that is true.

The way your example is modeled is exactly what we try to prevent (also see EDIT below).

I suppose this is what is bugging me a bit. RDF is getting into the business of enforcing particular modeling patterns. But, at the same time, it is considered (by some) a sort of low-level assembly language for KR. There seems to be a contradiction here. Even when granting this "mandatory" design pattern; the way it is being enforced seems rather artificial, i.e., by disallowing triple terms in the subject position.

IMO it's not artificial at all, because it tries to enforce exactly one design pattern: using the type as a blueprint to define a token, and only use that token thereafter. How to enforce that design pattern is not trivial, and your example above shows that the mechanism can be misused if it is not enforced overly rigidly. But to frame the mechanism as a whole as "artificial" doesn't do it justice. To the contrary it is as focused as possible. And it doesn't only concern subject vs object position: it concerns the use of triple terms as blueprints to nothing but instantiating tokens from.

All that said: I would find it easier to lift this restriction if the rdfs:states proposal had been accepted, because then I wouldn't need a way to define subproperties of rdf:reifies. Instead I could with quite some confidence claim that the Turtle-star shorthand syntaxes provide everything necessary to accommodate 99% of use cases. A daringly expressive N-triples-star would then not be much of a problem. But currently I pretty much doubt we'll get there.


(*) Wait a moment, is that concern valid for the reified tokens as well? Oh well, that of course asks for graphs on the syntax level as well. There may be issues when going for graphs, but there sure are downsides to not doing so.

@william-vw

  • it is possible to use nested triple terms to group metadata,

Well, IIUC it's not even possible to have multiple metadata annotations on the same level of nesting in triple terms (*) - so the mechanism is really very impoverished and may lead to undesirable expressivity (see example above).

Sorry - I have no idea what you mean here.

but there can be good reasons not to, and reifiers can be better choices in some cases.

IMO that's almost all cases, as I repeatedly stressed.

With triple terms not allowed as subjects, I cannot just write the following in core RDF:

<<( :william :hates :pizza )>> a :Lie ; :perpetratedBy :doerthe .

Instead, I need a reifier (that I will not use in any other way):

_:r1 rdf:reifies <<( :william :hates :pizza )>> ; a :Lie ; :perpetratedBy :doerthe .
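The rewrite from the first form to the second is mechanical, which a short Python sketch can show. Triples are plain tuples and `lift_subject_triple_terms` is a hypothetical helper, not part of any RDF library. It assumes one blank-node reifier per distinct triple term, which is only one possible policy.

```python
def lift_subject_triple_terms(triples):
    """Replace each triple-term subject with a blank-node reifier.

    The first time a triple term is seen as a subject, a reifier is minted
    and linked to it via rdf:reifies; later uses reuse the same reifier.
    """
    reifiers = {}
    out = []
    for s, p, o in triples:
        if isinstance(s, tuple):  # subject is a triple term
            if s not in reifiers:
                reifiers[s] = f"_:r{len(reifiers) + 1}"
                out.append((reifiers[s], "rdf:reifies", s))
            s = reifiers[s]
        out.append((s, p, o))
    return out

lie = (":william", ":hates", ":pizza")
source = [
    (lie, "a", ":Lie"),
    (lie, ":perpetratedBy", ":doerthe"),
]
result = lift_subject_triple_terms(source)
```

Here `result` holds the three triples of the reifier form, with the triple term appearing only in object position.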

But that is even beside the point. We are making the particular design choice for everyone.

Note that I can easily shoot myself in the foot with reifiers as well, e.g., when accidentally choosing an existing one.

That seems like a pretty contrived concern.

If I choose human-readable reifiers:

:_richardLizMarriage rdf:reifies <<( :richard :marriedTo :liz )>> ; # perfectly reasonable, in "99.8% of the cases"
    sdo:startTime "+1964-03-15T00:00:00Z"^^xsd:dateTime ;
    sdo:endDate "+1974-06-26T00:00:00Z"^^xsd:dateTime .

:_richardLizMarriage rdf:reifies <<( :richard :marriedTo :liz )>> ; # oops ...
    sdo:startTime "+1975-10-10T00:00:00Z"^^xsd:dateTime ;
    sdo:endDate "+1976-07-29T00:00:00Z"^^xsd:dateTime .

But, as I said,

But, perhaps it's not quite as easy to blow your foot off with reifiers, that is true.

IMO it's not artificial at all, because it tries to enforce exactly one design pattern: using the type as a blueprint to define a token, and only use that token thereafter. How to enforce that design pattern is not trivial, and your example above shows that the mechanism can be misused if it is not enforced overly rigidly.

Unsure why you are saying "if not enforced overly rigidly". The subject position restriction does not apply there at all.

All that said: I would find it easier to lift this restriction if the rdfs:states proposal had been accepted, because then I wouldn't need a way to define subproperties of rdf:reifies. Instead I could with quite some confidence claim that the Turtle-star shorthand syntaxes provide everything necessary to accommodate 99% of use cases.

Both could co-exist. But, only if ReificationProperty/TripleTermProperty instances are not the only allowed predicates for a triple term. Using an instance of this class would indicate a special relation between the subject and object; i.e., the subject can be used as a token for the object.

(*) Wait a moment, is that concern valid for the reified tokens as well? Oh well, that of course asks for graphs on the syntax level as well. There may be issues when going for graphs, but there sure are downsides to not doing so.

Again, no idea what you mean.

@doerthe

doerthe commented Nov 5, 2024

I agree with @william-vw here that we are defining our restrictions here around a particular design pattern and the question keeps being: is the fact that we like people to use reifiers enough to restrict the syntax of RDF?

I dislike the restriction, because I would like to be able to directly express

<<( :william :hates :pizza )>>  a rdf:TripleTerm.

if we decide to introduce such a class. Or maybe

<<( :william :hates :pizza )>>  a rdf:Resource.

But I can already see how @rat10 will dislike my first example, because he will wonder whether something could/should be a triple term and a lie at the same time. I get the point and yet, we might want to leave the freedom to the RDF users to declare things as they want. We cannot predict the use of these constructs and we can still provide guidance through the syntactic sugar.

@rat10
Contributor

rat10 commented Nov 5, 2024

@doerthe What other use cases do you have besides expressing RDF meta information?

@doerthe

doerthe commented Nov 6, 2024

I am not sure whether I would call my use case only meta information. I dislike that we cannot express things which naturally follow from our entailments. I think that is more than expressing meta information.

In addition to that, all your use cases could also be expressed using a triple term in subject position. I can easily write
<<( :william :hates :pizza )>> rdf:isReifiedBy _:x .
The choice that I need an object position here feels random. In our meetings we discussed that and I learned that the reason for that random choice was to limit the usage of triple terms in general. But if I follow that reasoning, why is it not enough to only provide syntactic sugar for the object position? I just think that we are making restrictions where they are not necessary.

@rat10
Contributor

rat10 commented Nov 6, 2024

I am not sure whether I would call my use case only meta information. I dislike that we cannot express things which naturally follow from our entailments. I think that is more than expressing meta information.

But what would those things be that you'd like to entail, besides RDF meta information (which entailing that something is of type rdf:Resource clearly is)?

In addition to that, all your use cases could also be expressed using a triple term in subject position. I can easily write <<( :william :hates :pizza )>> rdf:isReifiedBy _:x . The choice that I need an object position here feels random.

I could say the same of your desire to have it in subject position. The choice of position is not really random (it follows the usual modelling pattern that the most important thing is in subject position, which according to the current design is the reifier itself), but the main issue is that there is a restriction on one position, right? That however follows from the clear purpose of the mechanism. Likewise I'm not aware of any demands for a set of inverse properties w.r.t. the RDF 1.0/1 reification vocabulary.

In our meetings we discussed that and I learned that the reason for that random choice was to limit the usage of triple terms in general. But if I follow that reasoning, why is it not enough to only provide syntactic sugar for the object position?

Because that syntactic sugar only takes us so far. Two examples:

  • there is no syntactic sugar that clarifies that an annotation refers to a statement actually stated (discussed at length by the rdfs:states proposal)
  • there is no syntactic sugar for reifications of multiple triple terms

So clearly the syntactic sugar is quite limited.

I just think that we are making restrictions where they are not necessary.

Another perspective is that we are replacing the RDF standard reification quad with a more concise mechanism, the triple term. The only purpose of that term is to act as a blueprint to instantiate reifications. It's not meant to be used on its own, just as there never was an RDF reification mechanism for triple types. This was easy to enforce with the standard reification vocabulary: it's what it is defined to be. That the enforcement is not so easy to achieve with triple terms doesn't mean that we shouldn't try. Instead you say that that special mechanism should become fully part of standard RDF, and I disagree. I think it should remain special. This is not arbitrary; there is a necessity for it, namely to ensure that statements about statements follow a predictable and coherent model.

The obvious other argument is: the kind of entailments that you seem to be interested in seem to include literals in subject position anyway, so why isn't it satisfactory to you to just use generalized RDF?

And one more thing: rdf:value was introduced to work around the restrictions on literals. What about _:tt owl:sameAs <<( :s :p :o )>>; a rdf:Resource.?

@rat10
Contributor

rat10 commented Nov 6, 2024

[...]

Well, IIUC it's not even possible to have multiple metadata annotations on the same level of nesting in triple terms (*) - so the mechanism is really very impoverished and may lead to undesirable expressivity (see example above)

Sorry - I have no idea what you mean here.

See below. The result is that, as discussed above, those ER-style primary keys have a peculiar reliance on ordering that the original data doesn't have (as there is no order of triples in a graph).

but there can be good reasons not to, and reifiers can be better choices in some cases.

IMO that's almost all cases, as I repeatedly stressed.

With triple terms not allowed as subjects, I cannot just write the following in core RDF:

<<( :william :hates :pizza )>> a :Lie ; :perpetratedBy :doerthe .

Yes, and you shouldn't as you are talking about a specific token, the lie perpetrated by Doerthe, not the <<( :william :hates :pizza )>> type.
This example also has the problem that one triple declares another triple to be untrue, thereby violating the open world assumption. Tokens can't have that problem.

Instead, I need a reifier (that I will not use in any other way):

That may well be the case, but it is a hint that you really aren't talking about something in general but about something very specific, i.e. an instance.

[...]

IMO it's not artificial at all, because it tries to enforce exactly one design pattern: using the type as a blueprint to define a token, and only use that token thereafter. How to enforce that design pattern is not trivial, and your example above shows that the mechanism can be misused if it is not enforced overly rigidly.

Unsure why you are saying "if not enforced overly rigidly". The subject position restriction does not apply there at all.

What makes you think that I'm arguing from a perspective where the subject position restriction does not apply? I'm arguing against your proposal, and not always within the confines of your example.

All that said: I would find it easier to lift this restriction if the rdfs:states proposal had been accepted, because then I wouldn't need a way to define subproperties of rdf:reifies. Instead I could with quite some confidence claim that the Turtle-star shorthand syntaxes provide everything necessary to accommodate 99% of use cases.

I made a mistake here: the other severe restriction of Turtle-star is that it provides no support for multi-statement triple terms and reifications, see below.

[...]

(*) Wait a moment, is that concern valid for the reified tokens as well? Oh well, that of course asks for graphs on the syntax level as well. There may be issues when going for graphs, but there sure are downsides to not doing so.

Again, no idea what you mean.

That it is impossible to express this

:x :y <<( <<( :s :p :o )>>  :a :b ; :c :d )>>

or this

<< << :s :p :o >>  :a :b ; :c :d >> :y :z .

or this

{ :s :p :o 
  :a :b :c } {| :x :y |}

i.e. neither triple terms nor reified terms nor the annotation syntax provide support for multi value reifications.

@william-vw

william-vw commented Nov 6, 2024

See below. The result is that, as discussed above, those ER-style primary keys have a peculiar reliance on ordering that the original data doesn't have (as there is no order of triples in a graph).

You are simply choosing which statements to reify; this has an impact on the meaning of what is being reified.
It is true this can be seen as implying an ordering. But, this is a general issue with nested triple terms.

With triple terms not allowed as subjects, I cannot just write the following in core RDF:

<<( :william :hates :pizza )>> a :Lie ; :perpetratedBy :doerthe .

Yes, and you shouldn't as you are talking about a specific token, the lie perpetrated by Doerthe, not the <<( :william :hates :pizza )>> type. This example also has the problem that one triple declares another triple to be untrue, thereby violating the open world assumption. Tokens can't have that problem.

That is a very strange thing to say!

:Lie has no meaning under the RDF semantics - it is a class I made up. If there would be negation - which there isn't - it would not be negation as failure (disallowed in open world), as I am explicitly calling it a lie.

Assuming I understood you correctly - it is my understanding that the reifier (or token) is being used as a proxy for the triple term. Anything you say about the token, you are really saying about the triple term. Why else use reifiers?
If you say:

_:r1 rdf:reifies <<( :william :hates :pizza )>> a :SomethingThatWasSaid ; :perpetratedBy :doerthe

You do not mean to describe the blank node _:r1, but the associated triple term. Do you see that differently? Do reifiers have another purpose or meaning aside from that?

Even if I did mean to describe the blank node, any negation - which there isn't - would be a problem there as well. Calling something a token does not somehow allow negation in RDF.

Finally, the example isn't about the :Lie part; but rather that I am not easily able to annotate a simple triple term, without always needing an extra triple for a reifier.

(*) Wait a moment, is that concern valid for the reified tokens as well? Oh well, that of course asks for graphs on the syntax level as well. There may be issues when going for graphs, but there sure are downsides to not doing so.

Again, no idea what you mean.

That it is impossible to express this

I have been quite careful from the beginning to not suggest this as a replacement of reifiers. There will be times that reifiers are more useful to represent a particular example. That said -

:x :y <<( <<( :s :p :o )>>  :a :b ; :c :d )>>
:x :y <<( <<( <<( :s :p :o )>>  :a :b )>> :c :d )>> .

Or

:x :y <<( <<( <<( :s :p :o )>>  :c :d )>> :a :b )>> .

Depending on what you want to describe; whether :a and :b is metadata that describes <<( :s :p :o )>> :c :d, or vice-versa. In that case, you are reifying different statements, so the meaning will be different. As I mentioned, I am not guaranteeing that it is suitable for all cases.
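To make the difference between the two nestings concrete, here is a rough Python sketch. It is not taken from any spec: modelling a triple term as a plain (subject, predicate, object) tuple is purely an assumption for illustration.

```python
# Hypothetical modelling: a triple term is a (subject, predicate, object) tuple,
# so nesting triple terms means nesting tuples.
spo = (":s", ":p", ":o")

# <<( <<( <<( :s :p :o )>> :a :b )>> :c :d )>>
ab_inner = ((spo, ":a", ":b"), ":c", ":d")

# <<( <<( <<( :s :p :o )>> :c :d )>> :a :b )>>
cd_inner = ((spo, ":c", ":d"), ":a", ":b")

graph_1 = {(":x", ":y", ab_inner)}
graph_2 = {(":x", ":y", cd_inner)}

# The two nestings are different terms, hence the two graphs differ.
same_term = ab_inner == cd_inner
same_graph = graph_1 == graph_2
```

Both comparisons come out false in this model, which is the sense in which the choice of nesting changes what is being reified.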

or this

<< << :s :p :o >>  :a :b ; :c :d >> :y :z .

Here, you are using the reification syntax, which currently translates into reifiers.

Assuming you meant the triple term syntax:

<<( <<( <<( :s :p :o )>>  :a :b )>> :c :d )>> :y :z .

We've already had the discussion about switching :a :b and :c :d.

or this

{ :s :p :o 
  :a :b :c } {| :x :y |}

Note that { :s :p :o . :a :b :c } is a single term and not a triple, so this would not be a legal triple.

Assuming something like

{ :s :p :o . :a :b :c } :k :l {| :x :y |} .

Then

{ :s :p :o . :a :b :c }  :k :l .
<<( { :s :p :o . :a :b :c }  :k :l )>> :x :y .

@william-vw

I am finding that the same argument comes up again and again (but not quite as explicitly as here). Namely, there is a strong preference towards this particular usage of triple terms -

Another perspective is that we are replacing the RDF standard reification quad with a more concise mechanism, the triple term. The only purpose of that term is to act as a blueprint to instantiate reifications.

And

This is not arbitrary but there is a necessity for that, namely to ensure that statements about statements follow a predictable and coherent model.

(bold added for emphasis)

I haven't heard a compelling argument against using triple terms differently; aside from stopping users, in some cases, from shooting themselves in the foot. If that is the singular reason, then we should be clear about it.

The reason is that, IMO, there are drawbacks to the restriction, which I tried to point out in my (and Doerthe's) prior posts. It reduces the expressivity of reification in core RDF, in that it requires more triples - one extra each time, to declare a reifier - to describe the same thing. There is no such thing as a free lunch.
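A small Python sketch of the triple count, using the :Lie example from earlier in the thread. Triples are modelled as tuples, which is an assumption for illustration only; the "direct" variant puts a triple term in subject position, which the current design does not allow.

```python
# Hypothetical modelling: triples and triple terms are plain tuples.
triple_term = (":william", ":hates", ":pizza")

# Direct annotation (triple term as subject; not allowed in the current design).
direct = {
    (triple_term, "rdf:type", ":Lie"),
    (triple_term, ":perpetratedBy", ":doerthe"),
}

# The same annotations routed through a reifier.
with_reifier = {
    ("_:r", "rdf:reifies", triple_term),
    ("_:r", "rdf:type", ":Lie"),
    ("_:r", ":perpetratedBy", ":doerthe"),
}

# The extra triple is the one that declares the reifier.
extra_triples = len(with_reifier) - len(direct)
```

Whether that one extra triple per annotated triple term matters is of course the point under discussion.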

Because that syntactic sugar only takes us so far. Two examples:

* there is no syntactic sugar that clarifies that an annotation refers to a statement actually stated (discussed at length by the `rdfs:states` proposal)

* there is no syntactic sugar for reifications of multiple triple terms

So clearly the syntactic sugar is quite limited.

Just because syntactic sugar cannot express what you want it to, does not take away the fact that syntactic sugar is specifically geared towards the preferred use case. (It's a different discussion if this sugar is so limited or misleading that people simply won't use it, which I don't think is the case.)

@rat10
Contributor

rat10 commented Nov 6, 2024

See below. The result is that, as discussed above, those ER-style primary keys have a peculiar reliance on ordering that the original data doesn't have (as there is no order of triples in a graph).

You are simply choosing which statements to reify; this has an impact on the meaning of what is being reified. It is true this can be seen as implying an ordering. But, this is a general issue with nested triple terms.

That may well be, but it still limits the usefulness of triple terms for creating (surrogate) composite primary keys.

With triple terms not allowed as subjects, I cannot just write the following in core RDF:

<<( :william :hates :pizza )>> a :Lie ; :perpetratedBy :doerthe .

Yes, and you shouldn't as you are talking about a specific token, the lie perpetrated by Doerthe, not the <<( :william :hates :pizza )>> type. This example also has the problem that one triple declares another triple to be untrue, thereby violating the open world assumption. Tokens can't have that problem.

That is a very strange thing to say!

:Lie has no meaning under the RDF semantics - it is a class I made up.

Well, calling it a :Lie is not much different from calling it false. We are pushing around examples, without establishing the context that a proper vocabulary would require. It is however easy to imagine a vocabulary that gives the term ex:Lie a proper semantics to the effect that it calls something false (in the Boolean sense of the word). I had discussions about the dangers of statement annotation with Pat Hayes on the [email protected] mailing list a few years ago, and he was very wary of this issue.

If there would be negation - which there isn't - it would not be negation as failure (disallowed in open world), as I am explicitly calling it a lie.

I don't follow, but let's not get deeper into this right now. It was rather intended to be a remark on the side.

Assuming I understood you correctly - it is my understanding that the reifier (or token) is being used as a proxy for the triple term. Anything you say about the token, you are really saying about the triple term.

No, most definitely not. In short, a reification refers to an occurrence (no matter if it actually happened/occurred or not) of a statement as a token, not to the abstract type. This difference is really essential, and if you haven't understood that right, that might at least in part explain why we don't come to an agreement.
Maybe look up the definition of standard reification in the RDF spec if it hasn't become clear what was said before. Or maybe let's discuss this next Friday in the Semantics TF.

Why else use reifiers? If you say:

_:r1 rdf:reifies <<( :william :hates :pizza )>> a :SomethingThatWasSaid ; :perpetratedBy :doerthe

This example is syntactically broken. I'm not sure what you mean.

You do not mean to describe the blank node _:r1, but the associated triple term. Do you see that differently? Do reifiers have another purpose or meaning aside from that?

Even if I did mean to describe the blank node, any negation - which there isn't - would be a problem there as well. Calling something a token does not somehow allow negation in RDF.

Finally, the example isn't about the :Lie part; but rather that I am not easily able to annotate a simple triple term, without always needing an extra triple for a reifier.

(*) Wait a moment, is that concern valid for the reified tokens as well? Oh well, that of course asks for graphs on the syntax level as well. There may be issues when going for graphs, but there sure are downsides to not doing so.

Again, no idea what you mean.

That it is impossible to express this

I have been quite careful from the beginning to not suggest this as a replacement of reifiers. There will be times that reifiers are more useful to represent a particular example. That said -

:x :y <<( <<( :s :p :o )>>  :a :b ; :c :d )>>
:x :y <<( <<( <<( :s :p :o )>>  :a :b )>> :c :d )>> .

Or

:x :y <<( <<( <<( :s :p :o )>>  :c :d )>> :a :b )>> .

Depending on what you want to describe; whether :a and :b is metadata that describes <<( :s :p :o )>> :c :d, or vice-versa. In that case, you are reifying different statements, so the meaning will be different. As I mentioned, I am not guaranteeing that it is suitable for all cases.

or this

<< << :s :p :o >>  :a :b ; :c :d >> :y :z .

Here, you are using the reification syntax, which currently translates into reifiers.

Assuming you meant the triple term syntax:

No, I didn't. I wanted to point out that annotations on a reification composed of multiple triples are not supported in any of the syntactic variants we have. One has to compose it "by hand", using the rdf:reifies property and multiple triple terms. So there are indeed cases where the syntactic sugar provided by Turtle-star is not sufficient. So saying that the syntactic sugar is good enough to keep people from shooting themselves in the foot is IMO not a convincing argument.
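A rough Python sketch of what composing it "by hand" amounts to. Triples and triple terms are modelled as plain tuples, and the reifier name `_:r` is a hypothetical blank node label; none of this is spec text.

```python
# Hypothetical modelling: triples and triple terms are plain tuples.
reifier = "_:r"

graph = {
    # One reifier linked to two triple terms via rdf:reifies ...
    (reifier, "rdf:reifies", (":s", ":p", ":o")),
    (reifier, "rdf:reifies", (":a", ":b", ":c")),
    # ... and a single annotation on that reifier.
    (reifier, ":x", ":y"),
}

# Collect everything the reifier reifies.
reified = {o for s, p, o in graph if s == reifier and p == "rdf:reifies"}
```

The annotation `:x :y` then hangs off a reifier that covers both triple terms, which none of the shorthand syntaxes can write directly.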

<<( <<( <<( :s :p :o )>>  :a :b )>> :c :d )>> :y :z .

We've already had the discussion about switching :a :b and :c :d.

or this

{ :s :p :o 
  :a :b :c } {| :x :y |}

Note that { :s :p :o . :a :b :c } is a single term and not a triple, so this would not be a legal triple.

Well, that was my point.

Assuming something like

{ :s :p :o . :a :b :c } :k :l {| :x :y |} .

Then

{ :s :p :o . :a :b :c }  :k :l .
<<( { :s :p :o . :a :b :c }  :k :l )>> :x :y .

@william-vw

william-vw commented Nov 6, 2024

Well, calling it a :Lie is not much different from calling it false. We are pushing around examples, without establishing the context that a proper vocabulary would require. It is however easy to imagine a vocabulary that gives the term ex:Lie a proper semantics to the effect that it calls something false (in the Boolean sense of the word). I had discussions about the dangers of statement annotation with Pat Hayes on the [email protected] mailing list a few years ago, and he was very wary of this issue.

Not to belabour the point - since it is tangential to this discussion - but there is a huge difference. I can call anything just about anything using terms I make up; it has no impact whatsoever on the open world assumption. You cannot construct a vocabulary in OWL that will result in negation as failure, as OWL follows the open world assumption (often to my chagrin).

Assuming I understood you correctly - it is my understanding that the reifier (or token) is being used as a proxy for the triple term. Anything you say about the token, you are really saying about the triple term.

No, most definitely not. In short, a reification refers to an occurrence (no matter if it actually happened/occurred or not) of a statement as a token, not to the abstract type. This difference is really essential, and if you haven't understood that right, that might at least in part explain why we don't come to an agreement. Maybe look up the definition of standard reification in the RDF spec if it hasn't become clear what was said before. Or maybe let's discuss this next Friday in the Semantics TF.

There indeed seems to be a misunderstanding here. I didn't follow many of the discussions, so I am basing myself on existing documentation, meetings, and talks I've had since I joined.

IIRC, the standard reification assumed the reification describes an existing triple in some RDF document; a reification could be called an occurrence in that way.

It is my understanding (disclaimer) that a reifier is associated with a triple term using the regular IEXT: _:r rdf:reifies <<( :s :p :o )>> . is true iff: <[I+A]( _:r ), [I+A]( <<( :s :p :o )>> )> ∈ IEXT([I+A]( rdf:reifies )). Nothing else is going on here. The reifier can be any kind of resource; a problematic triple will still be problematic when you use a reifier as subject. The recommended usage of rdf:reifies is to indicate a subject reifier to describe the triple term. Just like it is recommended to use rdf:type to indicate a resource's type; rdf:type has no special semantics in core RDF.
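A toy Python sketch of that truth condition, under heavy simplification. The resource names (`res_r`, `res_spo`, `res_reifies`) are made up, and the triple term is simply mapped to a resource by I instead of being interpreted from its components, so this only shows the shape of the condition.

```python
# Toy model of the truth condition above. All resource names are made up.

def denote(term, I, A):
    """[I+A](term): blank nodes are mapped by A, all other terms by I."""
    if isinstance(term, str) and term.startswith("_:"):
        return A[term]
    return I[term]


def is_true(triple, I, A, IEXT):
    """A triple is true iff <[I+A](s), [I+A](o)> is in IEXT([I+A](p))."""
    s, p, o = triple
    return (denote(s, I, A), denote(o, I, A)) in IEXT.get(denote(p, I, A), set())


triple_term = (":s", ":p", ":o")
I = {"rdf:reifies": "res_reifies", triple_term: "res_spo"}
IEXT = {"res_reifies": {("res_r", "res_spo")}}
statement = ("_:r", "rdf:reifies", triple_term)

# Same statement, two different blank node assignments A.
holds = is_true(statement, I, {"_:r": "res_r"}, IEXT)
fails = is_true(statement, I, {"_:r": "res_other"}, IEXT)
```

Nothing in `is_true` treats rdf:reifies differently from any other property, which is the point being made.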

Here, you are using the reification syntax, which currently translates into reifiers.
Assuming you meant the triple term syntax:

No, I didn't. I wanted to point out that annotations on a reification composed of multiple triples are not supported in any of the syntactic variants we have. One has to compose it "by hand", using the rdf:reifies property and multiple triple terms. So there are indeed cases where the syntactic sugar provided by Turtle-star is not sufficient. So saying that the syntactic sugar is good enough to keep people from shooting themselves in the foot is IMO not a convincing argument.

I don't recall saying that. The only point was that syntactic sugar already translates into reifiers, i.e., we're already trying to push people to use triple terms in the object position. This brings into question the need to further restrict triple terms to only the object position.

Note that { :s :p :o . :a :b :c } is a single term and not a triple, so this would not be a legal triple.

Well, that was my point.

Then I don't understand the point you were trying to make.

@TallTed
Member

TallTed commented Nov 7, 2024

IIRC, the standard reification assumed the reification describes an existing triple in some RDF document

... where that RDF document (and so, that "existing" triple) might be unmaterialized, or at least external to the RDF graph under immediate consideration.

In other words, standard reification does not depend on, and does not itself materialize, a triple comprised of the rdf:subject, rdf:predicate, and rdf:object values corresponding to a single subject entity. SPARQL queries for the RDF terms that comprise a triple described by standard reification will not reveal such a triple; the only results will be the triples with rdf:subject, rdf:predicate, and rdf:object predicates.
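As an illustration only, here is a Python sketch with a graph modelled as a set of tuples and a hypothetical `match` helper standing in for a basic triple-pattern query. It is not SPARQL, and the blank node label `_:stmt` is made up.

```python
# Hypothetical modelling: a graph is a set of (subject, predicate, object) tuples.
graph = {
    ("_:stmt", "rdf:type", "rdf:Statement"),
    ("_:stmt", "rdf:subject", ":william"),
    ("_:stmt", "rdf:predicate", ":hates"),
    ("_:stmt", "rdf:object", ":pizza"),
}


def match(graph, pattern):
    """Return the triples matching pattern; None acts as a wildcard."""
    return [
        triple
        for triple in graph
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]


# The described triple itself is not in the graph ...
described = match(graph, (":william", ":hates", ":pizza"))

# ... only the four triples that describe it.
reification = match(graph, ("_:stmt", None, None))
```

Here `described` is empty while `reification` holds the four describing triples, matching the behaviour described above.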

@william-vw

@TallTed I'm basing myself on the following paragraph:

For one thing, it is important to note that in the conventional use of reification, the subject of the reification triples is assumed to identify a particular instance of a triple in a particular RDF document, rather than some arbitrary triple having the same subject, predicate, and object. This particular convention is used because reification is intended for expressing properties such as dates of composition and source information, as in the examples given already, and these properties need to be applied to specific instances of triples. There could be several triples that have the same subject, predicate, and object and, although a graph is defined as a set of triples, several instances with the same triple structure might occur in different documents. Thus, to fully support this convention, there needs to be some means of associating the subject of the reification triples with an individual triple in some document. However, RDF provides no way to do this.

It is true that the reification does not imply the statement:

Note that asserting the reification is not the same as asserting the original statement, and neither implies the other.

@afs
Contributor Author

afs commented Dec 11, 2024

what properties can or should link to triple terms?

See https://github.com/w3c/rdf-star-wg/wiki/RDF-star-%22liberal-baseline%22
