Complex track - Evaluation

General description

The complex track evaluates systems that can generate both simple and complex correspondences.

This track contains 7 datasets from 5 different domains: Conference, Populated Conference, Hydrography, GeoLink, Populated GeoLink, Populated Enslaved and Taxon.

The detailed description of each dataset can be found on the OAEI Complex track page.

The table below presents the results for the 7 datasets. Only AMLC, AROA, and CANARD were able to generate complex correspondences. The results for the other systems are reported in terms of simple alignments.

Matcher	Conference			Populated Conference		Hydrography			GeoLink			Populated GeoLink			Populated Enslaved			Taxon
Matcher	Precision	F-measure	Recall	Precision (classical - not disjoint)	Coverage (classical - query Fmeas.)	relaxed Precision	relaxed F-measure	relaxed Recall	relaxed Precision	relaxed F-measure	relaxed Recall	relaxed Precision	relaxed F-measure	relaxed Recall	relaxed Precision	relaxed F-measure	relaxed Recall	Precision (classical - overlap)	Coverage (classical - overlap)
ALIN	-	-	-	0.68-0.98	0.20-0.28	-	-	-	-	-	-	-	-	-	-	-	-	-	-
ALOD2Vec	-	-	-	0.39-0.78	0.24-0.33	-	-	-	-	-	-	-	-	-	-	-	-	0.79-0.96	0.08-0.14
AML	-	-	-	0.59-0.93	0.31-0.37	-	-	-	-	-	-	-	-	-	-	-	-	-	-
AMLC	0.31	0.34	0.37	0.23-0.51	0.26-0.31	0.45	0.10	0.05	0.50	0.32	0.23	0.50	0.32	0.23	0.73	0.40	0.28	0.19-0.40	0
AROA	-	-	-	-	-	-	-	-	-	-	-	0.87	0.60	0.46	0.80	0.51	0.38	-	-
ATBox	-	-	-	0.39-0.81	0.27-0.36	-	-	-	-	-	-	-	-	-	-	-	-	0.56-0.71	0.06-0.11
CANARD	-	-	-	0.25-0.88	0.40-0.50	-	-	-	-	-	-	0.89	0.54	0.39	0.42	0.19	0.13	0.16-0.57	0.17-0.36
LogMap	-	-	-	0.56-0.96	0.26-0.33	0.67	0.10	0.05	0.85	0.29	0.18	0.85	0.29	0.18	-	-	-	0.54-0.77	0.08-0.14
LogMapBio	-	-	-	-	-	0.70	0.10	0.05	-	-	-	-	-	-	-	-	-	0.50-0.73	0.06-0.08
LogMapKG	-	-	-	0.56-0.96	0.26-0.33	0.67	0.10	0.05	0.85	0.29	0.18	0.85	0.29	0.18	-	-	-	0.54-0.77	0.08-0.11
LogMapLt	-	-	-	0.50-0.87	0.23-0.31	0.66	0.10	0.06	0.69	0.36	0.25	0.69	0.36	0.25	-	-	-	0.25-0.35	0.08-0.11
Wiktionary	-	-	-	0.49-0.88	0.26-0.35	-	-	-	-	-	-	-	-	-	-	-	-	0.89-0.96	0.08-0.11
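
The F-measure columns appear consistent with the usual harmonic mean of precision and recall, although this page does not state the formula. The following Python sketch, assuming that definition, recomputes the F-measure for a few rows copied from the table above. The helper name f_measure is illustrative and not part of any OAEI tool.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure), copied from the table above.
rows = {
    "AMLC / Conference": (0.31, 0.37, 0.34),
    "AMLC / GeoLink": (0.50, 0.23, 0.32),
    "AROA / Populated GeoLink": (0.87, 0.46, 0.60),
    "CANARD / Populated GeoLink": (0.89, 0.39, 0.54),
}

for name, (p, r, reported) in rows.items():
    recomputed = f_measure(p, r)
    # The inputs are rounded to two decimals, so allow a small tolerance.
    status = "ok" if abs(recomputed - reported) < 0.01 else "differs"
    print(f"{name}: recomputed {recomputed:.3f}, reported {reported:.2f} ({status})")
```

Because the table reports rounded values, a recomputed score can differ from the reported one in the last digit.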

Results per dataset

Conference dataset

The complex correspondences generated by the systems were manually compared to the ones of the provided consensus alignment.

For this evaluation, only equivalence correspondences were considered and the confidence of the correspondences was not taken into account.

The detailed results for this track are accessible in Conference results

Populated Conference dataset

In this subtrack, the alignments are automatically evaluated over a populated version of the Conference dataset.

The dataset as well as the evaluation systems are available at https://framagit.org/IRIT_UT2J/conference-dataset-population.

Two metrics are computed: a Coverage score and a Precision score.

The systems were run:

on the Original Conference dataset
on a small Populated version of the Conference dataset (with only common instances)
on a big Populated version of the Conference dataset (with only common instances)

The best set of alignments output by each system then gives its final score for this track. For example, AML performed better on the Original Conference dataset than on the Populated versions, so the alignments from the first run are kept.
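
The selection step can be sketched as follows. This page does not say how the "best" set is determined, so the ranking used here (harmonic mean of Precision and Coverage) is purely an assumption, and the scores are made-up placeholders rather than results from the evaluation. The names best_run and harmonic_mean are hypothetical.

```python
from typing import Dict, Tuple

Scores = Tuple[float, float]  # (precision, coverage)


def harmonic_mean(scores: Scores) -> float:
    """Assumed ranking criterion: harmonic mean of precision and coverage."""
    precision, coverage = scores
    if precision + coverage == 0:
        return 0.0
    return 2 * precision * coverage / (precision + coverage)


def best_run(runs: Dict[str, Scores]) -> str:
    """Return the name of the run with the highest assumed ranking score."""
    return max(runs, key=lambda name: harmonic_mean(runs[name]))


# Placeholder scores for one hypothetical system, one entry per run.
runs = {
    "original": (0.80, 0.35),
    "small_populated": (0.75, 0.30),
    "big_populated": (0.70, 0.32),
}

selected = best_run(runs)
print(selected)
```

Any other ranking criterion can be substituted by replacing harmonic_mean.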

The detailed results for this track are accessible in Populated Conference results

Hydrography dataset

In this subtrack, in order to explain the performance of alignment systems, we break the evaluation down into three subtasks: Entity Identification, Relationship Identification, and Full Complex Alignment Identification. The alignments generated for the final results have been evaluated using relaxed precision and recall.
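
To illustrate the idea behind relaxed precision and recall, the sketch below replaces the exact-match test of classical precision and recall with a graded similarity between correspondences. It is a simplification under stated assumptions, not the evaluation code used for this track: each correspondence is reduced to a pair of entity names, the similarity function toy_similarity is invented for the example, and each correspondence is credited with its best match on the other side rather than through a one-to-one pairing.

```python
from typing import Callable, Sequence, Tuple

Correspondence = Tuple[str, str]  # (source entity, target entity)


def toy_similarity(a: Correspondence, r: Correspondence) -> float:
    """Invented similarity: 1.0 if identical, 0.5 if one side agrees, else 0.0."""
    if a == r:
        return 1.0
    if a[0] == r[0] or a[1] == r[1]:
        return 0.5
    return 0.0


def relaxed_scores(
    found: Sequence[Correspondence],
    reference: Sequence[Correspondence],
    sim: Callable[[Correspondence, Correspondence], float],
) -> Tuple[float, float, float]:
    """Return (relaxed precision, relaxed recall, F-measure)."""
    # Credit each found correspondence with its closest reference correspondence.
    credit_found = sum(max((sim(a, r) for r in reference), default=0.0) for a in found)
    # Credit each reference correspondence with its closest found correspondence.
    credit_ref = sum(max((sim(a, r) for a in found), default=0.0) for r in reference)
    precision = credit_found / len(found) if found else 0.0
    recall = credit_ref / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Made-up correspondences, for illustration only.
found = [("a:River", "b:Stream"), ("a:Lake", "b:Pond")]
reference = [("a:River", "b:Stream"), ("a:Lake", "b:Lake"), ("a:Sea", "b:Ocean")]

precision, recall, f_measure = relaxed_scores(found, reference, toy_similarity)
print(precision, recall, f_measure)
```

In this example the partially correct correspondence for a:Lake earns half credit instead of none, which is the effect the relaxed measures are meant to capture.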

The detailed results for this track are accessible in Hydrography results

GeoLink dataset

The GeoLink benchmark is evaluated with the same methods as the Hydrography benchmark. The systems are evaluated by computing relaxed precision and recall for the final results.

The detailed results for this track are accessible in GeoLink results

Populated GeoLink dataset

The Populated GeoLink benchmark is evaluated with the same methods as the Hydrography benchmark. The systems are evaluated by computing relaxed precision and recall for the final results.

The detailed results for this track are accessible in Populated GeoLink results

Populated Enslaved dataset

The Populated Enslaved benchmark is evaluated with the same methods as the Hydrography benchmark. The systems are evaluated by computing relaxed precision and recall for the final results.

The detailed results for this track are accessible in Populated Enslaved results

Taxon dataset

Even though the ontologies of the Taxon dataset have a common scope (plant taxonomy), they are unevenly populated. For this reason, the automatic evaluation system cannot be applied to this dataset.

First, the alignments have been filtered to remove the correspondences which align the same URIs and the correspondences which align instances. The filtered correspondences have then been manually classified as equivalent, more general, more specific, overlapping or disjoint.
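
The filtering step described above can be sketched as follows. This is an illustrative simplification, not the script used by the organizers: each correspondence is assumed to be a pair of URIs, the set of instance URIs is assumed to be known in advance, and all URIs below are hypothetical examples. The manual classification that follows is not modelled.

```python
from typing import Iterable, List, Set, Tuple

Correspondence = Tuple[str, str]  # (source URI, target URI)


def filter_correspondences(
    correspondences: Iterable[Correspondence],
    instance_uris: Set[str],
) -> List[Correspondence]:
    """Drop correspondences that align identical URIs or that align instances."""
    kept = []
    for source, target in correspondences:
        if source == target:
            continue  # aligns a URI with itself
        if source in instance_uris or target in instance_uris:
            continue  # aligns instances rather than schema entities
        kept.append((source, target))
    return kept


EX = "http://example.org/"  # hypothetical namespace
candidates = [
    (EX + "a#Taxon", EX + "a#Taxon"),      # same URI on both sides
    (EX + "a#Taxon", EX + "b#TaxonName"),  # schema-level correspondence
    (EX + "a#plant42", EX + "b#plant42"),  # instance-level correspondence
]
instances = {EX + "a#plant42", EX + "b#plant42"}

filtered = filter_correspondences(candidates, instances)
print(filtered)
```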

Six reference SPARQL queries are used to compute the Coverage.

The detailed results for this track are accessible in Taxon results

Organizers

Elodie Thiéblin (IRIT, Toulouse, France), elodie [.] thieblin [at] irit [.] fr
Cassia Trojahn (IRIT, Toulouse, France), cassia [.] trojahn [at] irit [.] fr
Ondřej Zamazal (University of Economics, Prague), ondrej [.] zamazal [at] vse [.] cz
Lu Zhou (Kansas State University, USA), kbzhoulu [at] gmail [.] com

References

[1] Ondřej Zamazal, Vojtěch Svátek. The Ten-Year OntoFarm and its Fertilization within the Onto-Sphere. Web Semantics: Science, Services and Agents on the World Wide Web, 43, 46-53. 2017.

[2] Élodie Thiéblin, Ollivier Haemmerlé, Nathalie Hernandez, Cassia Trojahn. Task-Oriented Complex Ontology Alignment: Two Alignment Evaluation Sets. In: European Semantic Web Conference. Springer, Cham, 655-670, 2020.

[3] Élodie Thiéblin, Fabien Amarger, Nathalie Hernandez, Catherine Roussey, Cassia Trojahn. Cross-querying LOD datasets using complex alignments: an application to agronomic taxa. In: Research Conference on Metadata and Semantics Research. Springer, Cham, 25-37, 2017.

[4] Lu Zhou, Michelle Cheatham, Adila Krisnadhi, Pascal Hitzler. A Complex Alignment Benchmark: GeoLink Dataset. In: International Semantic Web Conference. Springer, 2020.

[5] Marc Ehrig, Jérôme Euzenat. Relaxed precision and recall for ontology matching. In: K-CAP 2005 Workshop on Integrating Ontologies, Banff, Canada, 2005.

[6] Lu Zhou, Michelle Cheatham, Adila Krisnadhi, Pascal Hitzler. GeoLink DataSet: A Complex Alignment Benchmark from Real-world Ontology. In: Data Intelligence. Volume 2, Issue 3, Pages 353-378, MIT Press, 2020.

[7] Lu Zhou, Cogan Shimizu, Pascal Hitzler, Alicia M. Sheill, Seila Gonzalez Estrecha, Catherine Foley, Duncan Tarr, Dean Rehberger. The Enslaved Dataset: A Real-world Complex Ontology Alignment Benchmark using Wikibase. In: Conference on Information and Knowledge Management, ACM, 2020.