The GeoLink Cruise Instance Matching Track contains ontologies populated with instance data collected from two different data sources in the GeoLink project. The goal of the track is to find instances in different ontologies that describe the same real-world cruise. The ontologies and alignments were generated and evaluated jointly by ontologists and domain experts from different organizations to ensure high quality. Evaluation of the GeoLink Cruise Instance Matching Track will be supported by the SEALS platform and the MELT framework.
The GeoLink Cruise Instance Matching Track contains four ontologies. You can either download them here for local analysis and processing, or use the SEALS platform directly. The parameters for executing the task (repository, suite-id, version-id) are listed below.
The GeoLink dataset has two ontologies: the GeoLink Base Ontology (gbo) and the GeoLink Modular Ontology (gmo). Data providers from different organizations populate these two ontologies with their own data. In this track, we use instances from two data providers, the Biological and Chemical Oceanography Data Management Office (bco-dmo) and the Rolling Deck to Repository (r2r), and populate all triples related to Cruise into both ontologies. Domain experts have labeled 491 Cruise pairs between the two datasets as equivalent. The track comprises four tasks.
In Task one, the instance data from bco-dmo and r2r are populated into the gbo ontology independently. Instance matching systems are asked to find the matching cruises between the two gbo ontologies.
Task two is analogous: the instance data from bco-dmo and r2r are populated into the gmo ontology independently, and systems are asked to find the matches between the two gmo ontologies.
The first two tasks match instances under the same schema. In Task three, the instance data from bco-dmo and r2r are populated into the gbo and gmo ontologies, respectively. Systems are required to find the matches between the gbo and gmo ontologies.
Task four reverses this arrangement: the instance data from bco-dmo and r2r are populated into the gmo and gbo ontologies, respectively. Systems are required to find the matches between the gmo and gbo ontologies.
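The four task setups can be summarized as a small lookup table. The following is a minimal sketch, not part of the track's tooling: the ontology names are taken from the statistics table, while the `TASKS` dictionary, the `same_schema` helper, and the choice of bco-dmo as the source side are assumptions made for illustration.

```python
# Hypothetical summary of the four tasks as (source, target) ontology pairs.
# Names follow the statistics table; treating bco-dmo as the source side
# is an assumption, since the track description does not fix a direction.
TASKS = {
    1: ("gbo_bco-dmo", "gbo_r2r"),  # same schema: gbo vs. gbo
    2: ("gmo_bco-dmo", "gmo_r2r"),  # same schema: gmo vs. gmo
    3: ("gbo_bco-dmo", "gmo_r2r"),  # cross schema: gbo vs. gmo
    4: ("gmo_bco-dmo", "gbo_r2r"),  # cross schema: gmo vs. gbo
}


def same_schema(task_id):
    """Return True if both sides of the task use the same ontology (gbo or gmo)."""
    source, target = TASKS[task_id]
    # The schema name is the part before the first underscore.
    return source.split("_")[0] == target.split("_")[0]


for task_id, (source, target) in TASKS.items():
    kind = "same schema" if same_schema(task_id) else "cross schema"
    print(f"Task {task_id}: {source} -> {target} ({kind})")
```

Tasks one and two come out as same-schema matching, and tasks three and four as cross-schema matching.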
Statistics for the ontologies are listed in the table below.
Ontology | #Class | #Object Property | #Data Property | #Individual | #Triple |
---|---|---|---|---|---|
gbo_bco-dmo | 40 | 149 | 49 | 1061 | 13055 |
gbo_r2r | 40 | 149 | 49 | 5320 | 27992 |
gmo_bco-dmo | 79 | 79 | 37 | 1052 | 16303 |
gmo_r2r | 79 | 79 | 37 | 2025 | 24798 |
More details on the GeoLink Cruise dataset can be found in [1].
The alignments will be evaluated by precision, recall, and F-measure, together with running time.
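As an illustration of these measures, the sketch below scores a system alignment against a reference alignment, each treated as a set of (source instance, target instance) pairs. It is not the evaluation code used by the track: the `evaluate` function and the example identifiers are hypothetical, and the balanced F1 variant of the F-measure is an assumption, since the weighting is not stated here.

```python
def evaluate(system, reference):
    """Return (precision, recall, f1) for two alignments given as pair collections."""
    system, reference = set(system), set(reference)
    correct = len(system & reference)  # pairs found by the system that are in the reference
    precision = correct / len(system) if system else 0.0
    recall = correct / len(reference) if reference else 0.0
    # Balanced F-measure (F1): harmonic mean of precision and recall (assumed weighting).
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical cruise identifiers, for illustration only.
reference = {("bco:c1", "r2r:c1"), ("bco:c2", "r2r:c2"),
             ("bco:c3", "r2r:c3"), ("bco:c4", "r2r:c4")}
system = {("bco:c1", "r2r:c1"), ("bco:c2", "r2r:c2"), ("bco:c3", "r2r:c9")}

precision, recall, f1 = evaluate(system, reference)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here the system returns three pairs, two of which appear in the four-pair reference, so precision is 2/3, recall is 1/2, and F1 is 4/7.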
[1] Reihaneh Amini, Lu Zhou, Pascal Hitzler. GeoLink Cruises: A Non-Synthetic Benchmark for Co-Reference Resolution on Knowledge Graphs. In: Conference on Information and Knowledge Management, ACM, 2020.