TSV: state how to handle special characters in strings #10
Comments
This preference is commonly dictated by the data. If my data has lots of Putting some guidance like yours into a distinct "notes on canonicalization" section would probably be OK. |
I think it is a bad idea to change or overwrite the basic spec that is used. TSV has no official spec, but CSV does, and in that spec there is no information about |
TSV does have an official spec! |
In TSV, the quotes and escapes are from the RDF term writing. https://w3c.github.io/sparql-results-csv-tsv/spec/index.html#tsv-terms From what I see on the web, in Turtle, Each needs escaping checking ( Some advice-text would be useful - less than formal, single-choice canonicalization. |
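To make the escaping question concrete, here is a minimal Python sketch of writing one string literal as a TSV term. It assumes the Turtle-style short escapes for backslash, double quote, tab, newline and carriage return, and it falls back to `\uXXXX` for the remaining ASCII control characters such as `\0`. The function name `tsv_literal`, the always-double-quote policy and the `\uXXXX` fallback are illustrative choices for this thread, not wording from the spec.

```python
# Hypothetical helper: the name, the always-double-quote policy and the
# \uXXXX fallback are assumptions for illustration, not spec text.
_SHORT_ESCAPES = {
    "\\": "\\\\",  # backslash must be escaped first-class, or it is ambiguous
    '"': '\\"',    # the delimiter we chose for the literal
    "\t": "\\t",   # a raw tab would break the column structure
    "\n": "\\n",   # a raw newline would break the row structure
    "\r": "\\r",
}


def tsv_literal(value: str) -> str:
    """Serialize a plain string as a double-quoted, Turtle-style literal."""
    out = []
    for ch in value:
        if ch in _SHORT_ESCAPES:
            out.append(_SHORT_ESCAPES[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            # Other ASCII control characters (e.g. NUL) as numeric escapes.
            out.append(f"\\u{ord(ch):04X}")
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'


if __name__ == "__main__":
    for sample in ["a\tb", 'say "hi"', "it's", "nul:\x00"]:
        print(tsv_literal(sample))
```

With a single quote delimiter fixed in this way, a single quote inside the value needs no escaping, which is one argument for picking one delimiter in any canonicalization advice.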
Yes and no. It's documentation for a media type rather than an official spec (that is, an RFC or STD). Regardless of the naming, there is nothing about |
This issue (#10) is specific to TSV. For CSV, we should, of course, use |
This "needs discussion" issue was discussed during the telecon of 2023-11-30. From the issue thread above, are we agreed that:
Anything else? |
Related to handling characters: the TSV Media Type does not specify the character set. Nowadays, the "default" for "text/" is UTF-8, a change from the original ASCII. We can mention this and suggest ("SHOULD") that a missing character set be treated as UTF-8. |
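A small Python sketch of what that suggestion could look like on the consuming side: use the `charset` parameter of the Content-Type header when one is given, and otherwise decode as UTF-8. The function name `decode_tsv_body` is hypothetical, and the UTF-8 fallback is the behaviour proposed in this comment rather than something the media type registration states.

```python
from email.message import Message


def decode_tsv_body(body: bytes, content_type: str) -> str:
    """Decode a TSV response body, treating a missing charset as UTF-8.

    Hypothetical helper for illustration only.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lower-cased charset parameter,
    # or None when the header carries no charset.
    charset = msg.get_content_charset() or "utf-8"
    return body.decode(charset)


if __name__ == "__main__":
    print(decode_tsv_body("café".encode("utf-8"), "text/tab-separated-values"))
    print(decode_tsv_body(
        "café".encode("latin-1"),
        "text/tab-separated-values; charset=ISO-8859-1",
    ))
```

An unknown charset label raises `LookupError` from `bytes.decode`; whether to fall back to UTF-8 in that case as well is left open here.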
I think this one could use just a bit of nuance. There's no need for raw TABs in RDF term text, but SPARQL and Turtle do allow raw tabs in their literal syntax. The SPARQL TSV spec already has language about this, though:
That seems clear enough to me. Agree that inline examples would be an improvement. |
The specification does not explicitly state how quotes and ASCII control characters (\0 ...) should be escaped. It might be nice to add some sentences about it.
A note stating that the " quote should be preferred to the ' quote might also be nice, to get some kind of "canonical" TSV serialization.