Ontology Alignment Evaluation Initiative

2024 Campaign

Since 2004, OAEI organises evaluation campaigns aiming at evaluating ontology matching technologies. This year we are adopting the MELT platform for the evaluation. The use of MELT facilitates the participation of non Java systems and, at the same time, it provides compatibility with systems wrapped according to the SEALS specification. The Interactive, SPIMBENCH and Link discovery tracks are exceptions and the only tracks relying on the SEALS client and the HOBBIT platform, respectively. For systems participating in tracks using the HOBBIT platform (SPIMBENCH and Link discovery) and the Interactive track (SEALS), MELT also facilitates the system wrapping.

Please check the organizing committee and main contacts of the OAEI 2024 campaign.

Relevant material

Public OAEI systems for the latest campaigns.

Previous OAEI-related special issues.

Participation: forum and registration

We have a discussion group for the campaign where we share the latest news with the participants and we discuss issues risen during the evaluation.

Detailed instructions about system wrapping and submission are given below.

Problems

The OAEI 2024 campaign will once again confront ontology matchers to ontology and data sources to be matched. This year, the following test sets are available:

anatomy: The anatomy real world case is about matching the Adult Mouse Anatomy (2744 classes) and the NCI Thesaurus (3304 classes) describing the human anatomy.
conference: The goal of the track is to find alignments within a collection of ontologies describing the domain of organising conferences. Additionally, 'complex correspondences' are also very welcome. Alignments will be evaluated automatically against reference alignments also considering its uncertain version presented at ISWC 2014. Summary results along with detail performance results for each ontology pair (test case) and comparison with tools' performance from last years will be provided.
Multifarm: This dataset is composed of a subset of the Conference dataset, translated in nine different languages (Arabic, Chinese, Czech, Dutch, French, German, Portuguese, Russian, and Spanish) and the corresponding alignments between these ontologies. Based on these test cases, it is possible to evaluate and compare the performance of matching approaches with a special focus on multilingualism.
Complex: This track evaluates the detection of complex correspondences between ontologies of the conference domain.
Food Nutritional Composition: This track consists of finding alignments between food concepts from CIQUAL, the French food nutritional composition database, and food concepts from SIREN. Food concepts from both databases are described in LanguaL, a well-known multilingual thesaurus using faceted classification.
Interactive matching evaluation (interactive): This track offers the possibility to compare different interactive matching tools which require user interaction. The goal is to show if user interaction can improve the matching results, which methods are most promising and how many interactions are necessary. All participating systems are evaluated using an oracle which bases on the reference alignment. Using the SEALS client, the matching system only needs to be slightly adapted to participate to this track.
Bio-ML: The Bio-ML track is a Machine Learning (ML) friendly Biomedical track for Equivalence and Subsumption Matching. This track presents an unified evaluation framework suitable for both ML-based and non-ML-based OM systems. The datasets of this track are based on Mondo and UMLS Metathesarus. This track supersedes the previous largebio and phenotype tracks.
Biodiversity and Ecology (biodiv): The goal of the track is to find pairwise alignments between ontologies and thesauri that are particularly useful for biodiversity and ecology research and are being used in various projects. They have been developed in parallel and are very overlapping. They are semantically rich and contain tens of thousands of classes.
Material Sciences and Engineering (mse): The Material Sciences and Engineering (MSE) track contains the first benchmark for the evaluation of (semi-)automatic ontology matching techniques. In this emerging ontological domain, small to mid-sized upper and domain level ontologies are used that contain concepts described in natural language and are implemented by heterogeneous design principles with only partial overlap to each other.
Crosswalks Data Schema Matching: This track aims at evaluating the ability of systems to deal with the schema metadata matching. In particular, with crosswalks from fifteen research data schemas to Schema.org.
Digital Humanities (dh): The goal of the Digital Humanities track is to evaluate matching system performance when dealing with small datasets in different languages and specialist terms from archaeology, cultural history and the interlink of DH and computer science. The track offers manually compiled gold standard reference alignments for all the test cases, ensuring semantic, lexical and part-of-speech similartiy.
Archaeology multiling (arch-multiling): The Archaeology multilingual track aims to evaluate if matching systems are capable of finding alingments in archaeological monolingual datasets in different languages and specialist terms. The track is based on a test case of the Digital Humanities track.
Knowledge graph: The Knowledge Graph Track contains nine isolated knowledge graphs with instance and schema data. The goal of the task is to match both the instances and the schema.
SPIMBENCH (spimbench): The goal of this track is to determine when two OWL instances describe the same Creative Work. The datasets are generated and transformed using SPIMBENCH by altering a set of original data through value-based, structure-based, and semantics-aware transformations (simple combination of transformations).
Link Discovery (link): This track proposes a benchmark generator to deal with link discovery for spatial data where spatial data are represented as trajectories (i.e., sequences of longitude, latitude pairs).
Pharmacogenomics (pgx): The Pharmacogenomics Track involves n-ary tuples representing so-called "pharmacogenomic relationships" and their components of three distinct types: drugs, genetic factors, and phenotypes..
TD→KG (special track): Tabular data to Knowledge Graph (KG) matching is the process of assigning semantic tags from Knowledge Graphs (e.g., Wikidata or DBpedia) to the elements of a table (e.g., a web table or an arbitrary csv file). Ontology alignment and link discovery systems are welcome to participate. We plan to create input data in OWL/RDF format to facilitate their participation. There will be prizes sponsored by SIRIUS and IBM Research.

Ontology Alignment Evaluation Initiative

2024 Campaign

Relevant material

Participation: forum and registration

Problems

T-Box/Schema matching

Instance and schema matching

Instance matching (link discovery)

Tabular data to Knowledge Graph matching

Evaluation

Preparation phase

Execution phase

Evaluation phase

OAEI rules

Schedule (tentative)

Presentation