Request for review | USGS thesauri --> SWEET concept match candidates #268
Replies: 6 comments 5 replies
-
@brandonnodnarb I'll review for CMECS, and I'd like to join the next ESIP working session for this, but I'm not sure which one it is. I just started following the sweetontology Slack. Do you typically post the information there? Otherwise, please add me to the email list, whatever works! - Kate
-
@r0sek I understand you do not work for the USGS, but perhaps you can help. In the USGS copy of CMECS, the thesaurus file gives the URI for Abyssal Plain as https://apps.usgs.gov/thesaurus/CMECS/Abyssal%20Plain, but that URI returns a 404. If one searches the landing page interface for 'Abyssal Plain', the following page is returned, with the definition: https://apps.usgs.gov/thesaurus/term-simple.php?thcode=62&code=GC-030 I'd like to include the resolvable URL in the reference information for the definition. Do you happen to know if there is a pattern for mapping the CMECS URIs to their URLs? Or do you know if CMECS is hosted anywhere else?
-
Hi, sorry, I only just saw this. EcoPortal isn't really related to OBO, and
you should only use OBO URIs if you intend to submit to OBO. I can come
up with other suggestions, e.g. w3ids, later...
…On Thu, Sep 28, 2023 at 8:31 AM Kate Rose ***@***.***> wrote:
Thanks, and yes, I do need some help! The big picture re: CMECS is that we
have the current version that was published in 2012 (the USGS Thesaurus
contains much of those units but not all); we are in the process of making
community-recommended changes to CMECS (including adding, deprecating, and
redefining units, and moving them up or down in hierarchical levels); and
our user community wants a web application where they can view all the
eventual versions of CMECS and search for units, and we may add in
reference images or maps later.
I have all the 2012 units (terms, definitions, and a few annotations) in
xlsx. I have imported them into Protégé using the Cellfie plugin with the
units/entities arranged hierarchically in the CMECS
Components/Settings/Modifiers framework. I intend to add that OWL file into
a Git repo to manage versions while editing branches in Protégé.
When I imported into Protégé, I had the preferences set to generate IRIs
using the rdfs:label term name for the ID, so I have IRIs like
"http://purl.obolibrary.org/obo/LacustrineLimneticSubsystem". However,
I've since learned that opaque IRIs are preferable for CMECS, given the
types of changes that we anticipate making (e.g., we won't always need to
change a term name and thus create a new entity) and all the other reasons
articulated in OBO and the literature. I was hoping that I could have
Protégé autogenerate numerical IDs for the IRIs instead of using the
rdfs:label term name, so that I don't have to do it manually and can avoid
making errors over time. I've checked the Protégé email list and the GitHub
issues, and there doesn't seem to be a way to convert the rdfs:label IRIs to
autogenerated numerical IDs. I've also tried starting over with the Cellfie
import with Protégé set to autogenerate IDs (attached), but that
results in entities/classes that are just IDs with no other
information, and none of the class:subclass hierarchical relationships
between the terms are imported. So, I'm wondering if there's a different
way to accomplish this (as a non-coder who is open to learning my way
around this) or if I'm overthinking the IRIs and should just continue with
what I have?
@cmungall <https://github.com/cmungall> I did look at EcoPortal, and
agree that thematically it would be a better fit for CMECS. I assume it
follows OBO principles as well? I most definitely want to establish
PURLs/IRIs so that we can include them in our CMECS web application, as
well as the SWEET work and some ENVO work in the future.
I'd really appreciate any advice. I'm happy to send you a copy of my CMECS
test OWL file if you're interested. Thanks!
[image: image]
<https://user-images.githubusercontent.com/81778204/271335559-0538cfb6-3c4b-4dc1-a43e-d7ea662f4084.png>
-
Ok, progress! The ID issue has been resolved and the CMECS ontology has been uploaded to the MMI ORR/ESIP COR (not yet public tho), and the IRIs follow that pattern (e.g. https://mmisw.org/ont/~katerose/CMECS/CMECS_00000001). I'm now going to attempt to set up w3id redirect URLs for them. @cmungall I did understand that about the OBO/EcoPortal URIs and intended to use whatever the EcoPortal solution is, but much of it still seems to be in development, so I went back to the COR.
Beta Was this translation helpful? Give feedback.
-
great!
I assume the URIs in the RDF will actually look like this
https://w3id.org/cmecs/00000001
and will redirect to web URLs like this
https://mmisw.org/ont/~katerose/CMECS/CMECS_00000001
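To make the mapping above concrete, here is a minimal Python sketch of how a w3id-style IRI could be translated to its COR landing URL. It only illustrates the pattern in the two example URLs; the function name `redirect_target` and the 8-digit ID check are assumptions for illustration, and an actual redirect would be configured on the w3id side rather than in Python.

```python
# Hypothetical illustration of the IRI -> URL pattern in the two examples above.
W3ID_BASE = "https://w3id.org/cmecs/"
COR_BASE = "https://mmisw.org/ont/~katerose/CMECS/CMECS_"


def redirect_target(w3id_iri: str) -> str:
    """Return the COR URL that a w3id-style CMECS IRI would redirect to."""
    if not w3id_iri.startswith(W3ID_BASE):
        raise ValueError(f"not a CMECS w3id IRI: {w3id_iri}")
    local_id = w3id_iri[len(W3ID_BASE):]
    # Assumption: IDs are 8 ASCII digits, as in the example 00000001.
    if not (local_id.isascii() and local_id.isdigit() and len(local_id) == 8):
        raise ValueError(f"unexpected local ID: {local_id!r}")
    return COR_BASE + local_id


if __name__ == "__main__":
    print(redirect_target("https://w3id.org/cmecs/00000001"))
```

Keeping the local ID identical on both sides means the redirect stays a simple prefix swap, so the opaque IRIs do not need to change if the hosting location does.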
…On Wed, Nov 1, 2023 at 8:12 AM Kate Rose ***@***.***> wrote:
Ok, progress! The id issue has been resolved and the CMECS ontology has
been uploaded to the MMI ORR/ESIP COR (not yet public tho) and the IRIs
follow that pattern (eg.
https://mmisw.org/ont/~katerose/CMECS/CMECS_00000001). I'm now going to
attempt to set up w3id redirect URLs for them.
@cmungall <https://github.com/cmungall> I did understand that about the
OBO/Ecoportal URIs and intended to use whatever the Ecoportal solution is
but there is much that seems to be still in development, so I went back to
the COR.
-
Re-opening; not all of the mappings have been completed.
-
request
The SWEET community are seeking reviews of candidate matches between SWEET and USGS thesauri. This review is ultimately dual purpose as it will facilitate adding existing USGS thesauri definitions to SWEET concepts (per #211) as well as initiate the process of mapping between SWEET and each USGS published thesaurus.
background
The SequenceMatcher class from the Python package difflib was used to compute a pairwise similarity score between each SWEET concept label (rdfs:label) and every USGS thesaurus concept label (skos:prefLabel). Only results with an arbitrarily determined similarity score of 0.90 or better were returned and included in the spreadsheets.
The results are split up by individual thesaurus. An initial cursory scan for false positive matches (e.g. adsorption != absorption) has been completed. These predominantly false positive results are in a second tab in each spreadsheet titled 'CONCEPT SCHEME'_removed.
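For reviewers curious how such candidates can be produced, the following is a minimal sketch of the pairwise comparison described above. It is not the script that generated the spreadsheets: the function names, the lowercasing step, and the sample labels are assumptions for illustration. Only `difflib.SequenceMatcher` and the 0.90 threshold come from the description.

```python
from difflib import SequenceMatcher
from itertools import product

THRESHOLD = 0.90  # cutoff stated in the background section


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two labels.

    Lowercasing is an assumption of this sketch, not a documented step.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_matches(sweet_labels, usgs_labels, threshold=THRESHOLD):
    """Compare every SWEET label with every USGS label; keep pairs at or above threshold."""
    matches = []
    for sweet, usgs in product(sweet_labels, usgs_labels):
        score = similarity(sweet, usgs)
        if score >= threshold:
            matches.append((sweet, usgs, round(score, 3)))
    # Highest-scoring candidates first
    return sorted(matches, key=lambda m: m[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample labels, not taken from the actual thesauri
    sweet = ["adsorption", "abyssal plain"]
    usgs = ["absorption", "Abyssal Plain", "aquifers"]
    for row in candidate_matches(sweet, usgs):
        print(row)
```

With this scoring, "adsorption" and "absorption" come out at 0.9, so the pair just clears the threshold. That is why purely string-based candidates like these still need the manual review requested here.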
reviewing
For those willing to review concepts, please use the comments field to capture anything specific to a record. If you have reviewed a concept pairing and:
please then add your name (or ORCID) to one of the reviewer cells for that concept. If the concept is determined to be a match you are done. If the concept is determined to be NOT a match, please cut and paste the entire row to the spreadsheet's '_removed' tab.
When two reviewers have completed their assessment and agreed that it is a match (or NOT), that concept row can be highlighted and considered completed. If a potential match has conflicting reviews, or needs discussion for any reason, please use this thread or the sweetontology ESIP Slack channel for discussion. Any concept needing further deliberation will be added as an agenda item for the STC monthly meeting.
There are currently two reviewer columns. Please do add more reviewer columns if needed.
results for review
The following table lists the USGS thesauri, the total number of candidate SWEET matches for each thesaurus, and initial observations (comments). Concept schemes are linked to their respective Google spreadsheets, which will be used to facilitate the review.
'matches'