RDF graphs with value-space literals #136
SPARQL Query does not say anything about how the dataset came into existence - a dataset is given to the query execution. That dataset may be "produced" from another: for example, by converting to canonicalized lexical form as part of the parsing process. This is not covered by SPARQL (this is intentional). SPARQL does have to deal with expression results which are values; some cases are not simple:
Allowing value-centric interpretation of RDF (including syntax to indicate values, for example |
I have recently encountered the same issue: the standard reserves the term "graph" for the RDF abstract syntax, but does not foresee any way of interpreting an RDF document as a (semantic) graph. However, the abstract syntax RDF graph is almost semantic as it is: IRIs do not get interpreted anyway (RDF assumes simple string equality), and bnodes in the abstract syntax are already ID-free. The only syntactic bits left are the syntactically encoded literals. A graph structure where these literals are replaced by the data values they stand for would be very easy to define. RDF already has the necessary interpretation definitions ready, even for cases where a type is unsupported -- someone just needs to give that graph-based semantics a name so people can refer to it when describing what they already do. Indeed, practitioners often want to use RDF to represent graphs semantically rather than viewing graphs only as a syntactic abstraction on the way to more complex model theories. Most people in practice already seem to think RDF is a standard for exchanging graphs. Such a fully semantic view would also be in harmony with the proposal in rdf concepts issue #60. The semantic view would also be helpful for training and teaching. I have seen major confusion in students when trying to explain that RDF graphs, while being abstract in some sense, still contain concrete syntactic elements that are further interpreted only later on. It would be easier to say that RDF is a standard for representing graphs that make connections between IRIs, bnodes, and concrete data values, instead of having to introduce datatypes and literal syntax first. In particular, datatypes have a lot of technical baggage that is not essential to understanding what RDF data means, e.g., all the subtypes of |
@mkroetzsch I sympathize with the notion that literals are "overly syntactic", and with the proposal of this issue in general. However, I'm not comfortable with considering that "graphs that make connections between IRIs, bnodes, and concrete data values" would be "semantic graphs", and ultimately more "homogeneous" (on the syntax-semantics spectrum) than RDF graphs are... The triple |
given that we are among the implementations that do this, i sympathize with the intent.
This would also require a value-to-lexical description for datatypes, which would probably be their canonical representation, but it would be much more challenging to define for the rdf:HTML, rdf:XMLLiteral, and rdf:JSON datatypes. Right now, datatype descriptions describe the lexical-to-value mapping, but not the inverse.
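To illustrate the asymmetry, here is a minimal Python sketch covering only xsd:integer and xsd:boolean, two datatypes whose canonical forms are straightforward. The function names `lexical_to_value` and `canonical_lexical` are hypothetical and not taken from any RDF library; rdf:HTML, rdf:XMLLiteral, and rdf:JSON are deliberately left out because no comparably simple inverse is assumed for them.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
XSD_INTEGER = XSD + "integer"
XSD_BOOLEAN = XSD + "boolean"

# The four lexical forms of xsd:boolean and the values they map to.
BOOLEAN_LEXICAL = {"true": True, "1": True, "false": False, "0": False}


def lexical_to_value(lexical, datatype):
    """Lexical-to-value mapping (hypothetical helper, two datatypes only)."""
    if datatype == XSD_INTEGER:
        # Python's int() accepts more strings than the XSD lexical space;
        # a real implementation would validate the lexical form first.
        return int(lexical)
    if datatype == XSD_BOOLEAN:
        return BOOLEAN_LEXICAL[lexical]
    raise ValueError("datatype not covered by this sketch: " + datatype)


def canonical_lexical(value):
    """Value-to-lexical mapping: returns (canonical lexical form, datatype)."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return ("true" if value else "false", XSD_BOOLEAN)
    if isinstance(value, int):
        # No leading zeros and no "+" sign, as in the canonical integer form.
        return (str(value), XSD_INTEGER)
    raise TypeError("value type not covered by this sketch")


if __name__ == "__main__":
    print(canonical_lexical(lexical_to_value("+007", XSD_INTEGER)))
    print(canonical_lexical(lexical_to_value("1", XSD_BOOLEAN)))
```

Note that the round trip does not recover the original lexical form, and the sketch has to pick a single datatype for every Python integer. Both choices would need to be specified for each datatype if the inverse mapping were standardized.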
@pchampin I completely agree with your view that "semantic" seems to be the wrong term here. There are many stages of interpretation to get from a sequence of bytes to some open-world model theory; trying to make do with just two adjectives "syntactic" and "semantic" is bound to be confusing ;-) The distinction between IRI and resource is clear to me. Having a graph view that avoids literal syntax while using IRIs should not be interpreted as an attempt to identify resources with IRIs (thus introducing a kind of unique name assumption). The graph structure is really just the structure that tools with datatype support "see", before applying whatever RDF semantics they want to use further on. So what we are discussing here is essentially "abstract syntax with datatype support". The representation depends on which types are recognised. The current abstract syntax is what you get if no datatype is recognized (using the set of RDF literals as the fallback value space for unknown types, as usual). If further datatypes are supported, then tools can just replace them by their values already during parsing (which is the next thing simple D-entailment would do anyway). Allowing graph representations that use values for some literals could also remove possible confusion in typical RDF applications. For example, Turtle syntax supports expressions like |
@lisp @gkellogg As I understand the proposal, this is not meant to provide a new representation of RDF that can somehow be turned back into literals with lexical values. Tools that use an internal representation that merely represents values would still be free to syntactically return RDF in any (not necessarily canonical) form that denotes the same values. How they represent values internally is not regulated by the standard. This is similar to the view taken in D-entailment.

@lisp Your treatment of timezones may not be fully compatible with the value space defined for

Re "mutual dependency between the concepts recommendation and the entailment recommendation". If the concepts document explained how to map literals to values, then the entailment recommendation would not need to do the same again. So this editorial issue would be solved by moving content rather than by mutual references between specs. The only special case to handle is inconsistency due to ill-typed literals (it is technically not a problem to flag such inconsistency during literal parsing, but it would introduce the idea of inconsistency into RDF concepts). Maybe this needs a separate discussion, since it is also strange that the object |
I am somewhat worried about what I read here. Is this issue saying that some implementations consider that the graph written in Turtle like this:
is the same as what is written in Turtle like that?
And am I correct in saying that the suggestion is to make this choice explicitly allowed in the spec?
The Turtle spec is quite clear to me:
Also, be careful with the use of the word "denote", which has a normative meaning in RDF Semantics. |
This was discussed during the #rdf-star meeting on 05 December 2024.

Transcript: RDF graphs with value-space literals

AndyS: I think we can't do it. It needs careful thoughts and may have many visible impacts.
AndyS: don't do it in RDF 1.2 - need significant preparation
gkellogg: similarly - we should postpone.
ora: call for support to work on it
(silence)
pchampin: pfps away next week as well
gkellogg: we have a pause label
ora: leave as-is
It appears that some RDF implementations build RDF graphs where literals with recognized datatypes are represented as if they were members of the value space instead of in their lexical form. This does not appear to be sanctioned by the RDF recommendations. So "1.99999999999999999999999999999"^^xsd:float is stored as the IEEE floating point number 2, and "2"^^xsd:byte, "2"^^xsd:short, "2"^^xsd:int, and "2"^^xsd:long are all stored as the integer 2.
Would it be possible to liberalize the treatment of literals with recognized datatypes in RDF to support this? SPARQL entailment regimes already legitimize something along these lines for SPARQL.
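The two storage examples above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a description of how any particular implementation works: `to_value` and `INTEGER_RANGES` are hypothetical names, only the five datatypes mentioned above are treated as recognized, and literals with any other datatype are kept as (lexical form, datatype) pairs.

```python
import struct

XSD = "http://www.w3.org/2001/XMLSchema#"

# Inclusive value ranges of the bounded integer datatypes mentioned above.
INTEGER_RANGES = {
    XSD + "byte": (-2**7, 2**7 - 1),
    XSD + "short": (-2**15, 2**15 - 1),
    XSD + "int": (-2**31, 2**31 - 1),
    XSD + "long": (-2**63, 2**63 - 1),
}


def to_value(lexical, datatype):
    """Map a literal to a value if its datatype is recognized (hypothetical)."""
    if datatype in INTEGER_RANGES:
        low, high = INTEGER_RANGES[datatype]
        # Python's int() is more permissive than the XSD lexical space;
        # a real implementation would validate the lexical form first.
        value = int(lexical)
        if not low <= value <= high:
            raise ValueError("ill-typed literal: " + lexical)
        return value
    if datatype == XSD + "float":
        # Parse as a double, then round to IEEE single precision by packing
        # into 4 bytes. This double rounding is an approximation of a direct
        # decimal-to-single conversion.
        return struct.unpack("<f", struct.pack("<f", float(lexical)))[0]
    # Unrecognized datatype: keep the literal itself as the stored term.
    return (lexical, datatype)


if __name__ == "__main__":
    print(to_value("1.99999999999999999999999999999", XSD + "float"))
    stored = {to_value("2", XSD + name) for name in ("byte", "short", "int", "long")}
    print(len(stored))  # the four literals collapse to one stored value
```

Once literals are stored this way, the datatype that was originally written is no longer recoverable from the stored value alone, which is the behavior the question above asks the recommendations to permit.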