Ontology Alignment Evaluation Initiative
Since 2004, OAEI has organised evaluation campaigns aimed at evaluating ontology matching technologies.
This year we are adopting the MELT platform for the evaluation. The use of MELT facilitates
the participation of non-Java systems and, at the same time, it provides compatibility with systems wrapped according to the SEALS specification.
The Interactive track and the SPIMBENCH and Link discovery tracks are exceptions: they are the only tracks relying on the SEALS client and the HOBBIT platform, respectively.
For systems participating in tracks using the HOBBIT platform (SPIMBENCH and Link discovery) and the Interactive track (SEALS),
MELT also facilitates the system wrapping.
Please check the organizing committee and main contacts of the OAEI 2022 campaign.
Public OAEI systems for the latest campaigns.
Relevant JoWS special issue on Automating Knowledge Graph Construction. Previous OAEI-related special issues.
Participation: forum and registration
We have a discussion group for the campaign where we share the latest news with the participants and discuss issues raised during the evaluation.
Please register your system using this form.
Detailed instructions about system wrapping and submission are given below.
The OAEI 2022 campaign will once again confront ontology matchers with ontologies and data sources to be matched.
This year, the following test sets are available:
- The anatomy
real-world case is about matching the Adult Mouse Anatomy (2744 classes) and the NCI Thesaurus (3304 classes) describing the human anatomy.
The goal of the track is to find alignments within a collection of ontologies describing the domain of organising conferences. Additionally, 'complex correspondences' are also very welcome. Alignments will be evaluated automatically against reference alignments, also considering their uncertain version presented at ISWC 2014. Summary results, along with detailed performance results for each ontology pair (test case) and a comparison with tools' performance from previous years, will be provided.
This dataset is composed of a subset of the Conference dataset, translated into nine different languages (Arabic, Chinese, Czech, Dutch, French, German, Portuguese, Russian, and Spanish) and the corresponding alignments between these ontologies. Based on these test cases, it is possible to evaluate and compare the performance of matching approaches with a special focus on multilingualism.
This track evaluates the detection of complex correspondences between ontologies of four different domains: conference, hydrography, geography and species taxonomy. Each dataset has its particularities and evaluation modalities.
- Food Nutritional Composition
This track consists of finding alignments between food concepts from CIQUAL, the French food nutritional composition database, and food concepts from SIREN. Food concepts from both databases are described in LanguaL, a well-known multilingual thesaurus using faceted classification.
- Interactive matching evaluation
This track offers the possibility to compare different interactive matching tools which require user interaction.
The goal is to show if user interaction can improve the matching results, which methods are most promising and how many
interactions are necessary. All participating systems are evaluated using an oracle based on the reference alignment.
Using the SEALS client, the matching system only needs to be slightly adapted to participate in this track.
The Bio-ML track is a Machine Learning (ML) friendly Biomedical track for Equivalence and Subsumption Matching.
This track presents a unified evaluation framework suitable for both ML-based and non-ML-based OM systems.
The datasets of this track are based on Mondo and UMLS Metathesaurus. This track
supersedes the previous largebio and
- Biodiversity and Ecology (biodiv)
The goal of the track is to find pairwise alignments between ontologies and thesauri that are particularly useful for biodiversity and ecology research and are being used in various projects. They have been developed in parallel and overlap considerably. They are semantically rich and contain tens of thousands of classes.
- Material Sciences and Engineering (mse)
The Material Sciences and Engineering (MSE) track contains the first benchmark for the evaluation of (semi-)automatic
ontology matching techniques. In this emerging ontological domain, small to mid-sized upper and domain level ontologies
are used that contain concepts described in natural language and are implemented by heterogeneous design principles
with only partial overlap to each other.
- Metadata Schema Matching
This track aims at evaluating the ability of systems to deal with the heterogeneities of schema metadata.
- Common Knowledge Graphs
This track consists of a task aimed at matching the schemas of two common and highly influential knowledge graphs: DBpedia and the Never Ending Language Learner (NELL).
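As a rough illustration of the interactive matching track described above, the following sketch simulates an oracle that answers a matcher's questions by looking them up in a reference alignment, and counts how many interactions were needed. The `Oracle` class, the pair-based representation of correspondences and the sample entity names are assumptions made for this example; this is not the SEALS or MELT oracle API.

```python
from dataclasses import dataclass

# A correspondence is modelled as a (source_entity, target_entity) pair.
# This representation is an assumption made for illustration only.
Correspondence = tuple[str, str]


@dataclass
class Oracle:
    """Hypothetical oracle that validates candidates against a reference alignment."""

    reference: frozenset[Correspondence]
    questions_asked: int = 0

    def is_correct(self, candidate: Correspondence) -> bool:
        # Every call counts as one simulated user interaction.
        self.questions_asked += 1
        return candidate in self.reference


def validate_with_oracle(candidates, oracle):
    """Keep only the candidate correspondences that the oracle confirms."""
    return [c for c in candidates if oracle.is_correct(c)]


# Made-up sample data.
reference = frozenset({
    ("mouse:Heart", "human:Heart"),
    ("mouse:Lung", "human:Lung"),
})
candidates = [
    ("mouse:Heart", "human:Heart"),
    ("mouse:Lung", "human:Liver"),
    ("mouse:Lung", "human:Lung"),
]

oracle = Oracle(reference)
confirmed = validate_with_oracle(candidates, oracle)
print(confirmed)
print(oracle.questions_asked)
```

Asking about every candidate is the simplest possible policy; an interactive matcher would normally query the oracle only for uncertain candidates to keep the number of interactions low.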
Instance and schema matching
- Knowledge graph
The Knowledge Graph Track contains nine isolated knowledge graphs with instance and schema data.
The goal of the task is to match both the instances and the schema.
Instance matching or link discovery (HOBBIT tracks)
- SPIMBENCH (spimbench)
The goal of this track is to determine when two OWL instances describe the same Creative Work.
The datasets are generated and transformed using SPIMBENCH by altering a set of original data through value-based, structure-based, and semantics-aware transformations (simple combination of transformations).
- Link Discovery (link)
In this track two benchmark generators are proposed to deal with link discovery for spatial data where spatial data are
represented as trajectories (i.e., sequences of (longitude, latitude) pairs).
- GeoLink Cruise (geolink cruise)
The goal of this track is to determine if two instances from different ontologies describe the same cruise.
The datasets are collected from the Geolink project, which was funded under the U.S. National Science Foundation's EarthCube initiative. The datasets and alignments are guaranteed to contain real-world use cases to solve the instance matching problem in practice.
Tabular data to Knowledge Graph matching
- TD→KG (special track)
Tabular data to Knowledge Graph (KG) matching is the process of assigning semantic tags from Knowledge Graphs
(e.g., Wikidata or DBpedia) to the elements of a table (e.g., a web table or an arbitrary csv file).
Ontology alignment and link discovery systems are welcome to participate.
We plan to create input data in OWL/RDF format to facilitate their participation.
There will be prizes sponsored by SIRIUS and IBM Research.
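To make the TD→KG task description above more concrete, here is a minimal sketch of cell-to-entity annotation based on exact label lookup. The label index, the `annotate_cells` function, the sample table and the `ex:` identifiers are all hypothetical; a real system would typically query a Knowledge Graph lookup service and deal with ambiguous or misspelled labels.

```python
import csv
import io

# Toy label-to-entity index standing in for a Knowledge Graph lookup service.
# The "ex:" identifiers are made up for this example.
LABEL_INDEX = {
    "paris": "ex:Paris",
    "berlin": "ex:Berlin",
    "france": "ex:France",
    "germany": "ex:Germany",
}


def annotate_cells(csv_text: str) -> dict[tuple[int, int], str]:
    """Map (row, column) positions of data cells to entity identifiers.

    The header row is skipped. Cells without an exact, case-insensitive
    label match are left unannotated.
    """
    annotations = {}
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    for row_idx, row in enumerate(reader, start=1):
        for col_idx, cell in enumerate(row):
            entity = LABEL_INDEX.get(cell.strip().lower())
            if entity is not None:
                annotations[(row_idx, col_idx)] = entity
    return annotations


table = "city,country\nParis,France\nBerlin,Germany\nAtlantis,Nowhere\n"
annotations = annotate_cells(table)
for position, entity in sorted(annotations.items()):
    print(position, entity)
```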
All public datasets should be available by the end of this phase.
MELT includes several built-in evaluation tracks: built-in datasets.
New organisers can perform the evaluation via a Local Track or by contacting the platform chairs to create a new built-in track.
OAEI participants should follow the MELT instructions (schema matching tracks) and/or the HOBBIT instructions (spimbench and link discovery tracks)
depending on the tracks they are willing to participate in. The TD→KG track is an exception as it has its own evaluation process.
We encourage system developers to test their systems in the early stages of this phase to avoid last-minute
problems with the evaluation infrastructure. Once the execution phase ends, there will be limited time to solve technical problems
with the evaluation platforms.
Evaluation will be run under MELT
or HOBBIT infrastructure.
The TD→KG track is an exception as it has its own evaluation process.
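Evaluation against a reference alignment is commonly summarised with precision, recall and F-measure. The sketch below computes these for alignments represented as sets of (source, target) pairs. The set-based representation, the `evaluate_alignment` function and the sample data are assumptions for illustration and do not reproduce the MELT or HOBBIT evaluators.

```python
def evaluate_alignment(system: set, reference: set) -> dict:
    """Compute precision, recall and F1 of a system alignment.

    Both alignments are sets of (source_entity, target_entity) pairs.
    Empty inputs yield 0.0 instead of raising a division error.
    """
    true_positives = len(system & reference)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up sample data: two of three system correspondences are correct,
# and two of four reference correspondences are found.
reference = {
    ("a:Paper", "b:Article"),
    ("a:Author", "b:Writer"),
    ("a:Review", "b:Review"),
    ("a:Chair", "b:Chairman"),
}
system = {
    ("a:Paper", "b:Article"),
    ("a:Author", "b:Writer"),
    ("a:Topic", "b:Review"),
}

scores = evaluate_alignment(system, reference)
print(scores)
```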
Participants will be evaluated with respect to
all of the OAEI tracks (when possible), even if the system
is specialized for a specific kind of matching problem.
We know that this can be a problem for some systems that have been developed specifically for, e.g., matching biomedical ontologies.
Such specialization can still be emphasized in the results paper about the system if the results obtained for some
track are poor.
The results will be reported at the International Workshop on Ontology Matching,
which will be co-located with the 21st International Semantic Web Conference (ISWC 2022).
Please note that a matcher may want to behave differently depending on the
ontologies it is given; however, this should not be based on features
specific to the tracks (e.g., there is a specific string in the URL, or a specific
class name) but on features of the ontologies (e.g., there are no instances or
labels are in German). Check the OAEI rules here.
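The rule above can be illustrated with a small sketch in which the matcher picks its components from properties observed in the input ontologies, never from track-specific hints such as URL strings or class names. The `OntologySummary` structure, its fields, the size threshold and the strategy names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologySummary:
    """Hypothetical summary of features observed in an input ontology."""

    num_classes: int
    num_instances: int
    label_languages: frozenset[str]


def choose_strategy(source: OntologySummary, target: OntologySummary) -> list[str]:
    """Select matching components from ontology features only."""
    strategy = ["lexical"]
    # Labels in different languages suggest that a translation step is needed.
    if (
        source.label_languages
        and target.label_languages
        and source.label_languages.isdisjoint(target.label_languages)
    ):
        strategy.append("translation")
    # Instance matching only makes sense when both sides contain instances.
    if source.num_instances > 0 and target.num_instances > 0:
        strategy.append("instance")
    # Large inputs may need partitioning (the threshold is an arbitrary example value).
    if max(source.num_classes, target.num_classes) > 10_000:
        strategy.append("partitioning")
    return strategy


german = OntologySummary(120, 0, frozenset({"de"}))
english = OntologySummary(95, 0, frozenset({"en"}))
print(choose_strategy(german, english))
```

Because the decision depends only on what is in the ontologies, the same matcher behaves identically on any dataset with the same characteristics, which is what the rule asks for.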
Systems that rely on or are derived from other ontology matching systems should:
(a) clearly state the system they rely on, and (b) what was changed from / added to the original system.
Withdrawal of systems is possible up to one week after submission. After this period you accept that
your systems will be evaluated and the results will be made publicly available within the OAEI pages and the
OAEI evaluation report in accordance with the
OAEI data policy.
- June 30th
- preliminary datasets available.
- July 31st
- preparation phase ends and final datasets are available.
- July 31st
- participants register their tool (mandatory). Please use this form (requires a Google account and a valid email).
- August 31st
- execution phase ends and participants submit final versions of their tools. MELT tracks (zip or tar.gz file, e.g., LogMap.zip) using this form. HOBBIT tracks (via platform).
- September 30th
- evaluation phase ends and results are available.
- October 14th
- Preliminary version of system papers due. Submit a PDF paper (e.g., LogMap_prelim.pdf). Please use this form (requires a Google account and a valid email).
- October 25th
- The 17th Ontology Matching workshop
- October 24-28th
- The 21st International Semantic Web Conference.
- November 15th
- Final version of system papers due. Submit a PDF paper (e.g., LogMap_final.pdf). Please use this form (requires a Google account and a valid email).
From the results of the experiments, participants are expected
to provide the organisers with a paper to be published in the proceedings
of the Ontology Matching workshop.
The paper should be no more than 8 pages long and formatted using the
LNCS Style. Long-running systems can submit a 2-page summary
if there were no significant additions to the system. Please use this form for the submission (requires a Google account and a valid email).
These papers are not peer-reviewed, but they will be reviewed by 1-2 OAEI organisers. The main objective of these OAEI papers
is to keep track of the participants and the description of matchers which took part in the campaign.
To ensure easy comparability among the participants, it is recommended that the paper follows this outline:
- Presentation of the system
- State, purpose, general statement
- Specific techniques used
- Adaptations made for the evaluation
- Link to the system and parameters file
- Link to the set of provided alignments (in align format)
- 2.x) a comment for each dataset performed
- General comments
(not necessarily by putting the section below but preferably in
- Comments on the results (strengths and weaknesses)
- Discussions on the way to improve the proposed system
- Comments on the OAEI procedure (e.g., comments on the MELT evaluation, if relevant)
- Comments on the OAEI test cases
- Comments on the OAEI measures
- Proposed new measures