Evaluation campaign is over! Final results here

Ontology Alignment Evaluation Initiative

2011 Campaign

The increasing number of methods available for schema matching and ontology integration makes it necessary to establish a consensus on how to evaluate these methods. Since 2004, OAEI has organized evaluation campaigns aimed at evaluating ontology matching technologies.

The OAEI 2011 campaign is associated with the ISWC Ontology matching workshop to be held in Bonn, Germany, on October 24, 2011.

Problems

The 2011 campaign introduces a new evaluation modality in association with the SEALS project. Its goal is to bring more automation to the evaluation and more direct feedback to the participants. The datasets concerned are benchmark, conference and anatomy. Participants in this modality must follow the specific instructions for participation.

Comparison track: benchmark

As in previous campaigns, a systematic benchmark series has to be matched. The goal of this benchmark series is to identify the areas in which each alignment algorithm is strong or weak. The test is based on one particular ontology altered in a systematic way. In 2011, we will use automatically generated datasets built in the same way but from an undisclosed ontology.

Expressive ontologies

anatomy: The anatomy real world case is about matching the Adult Mouse Anatomy (2744 classes) and the NCI Thesaurus (3304 classes) describing the human anatomy.
conference: The goal of this track is to find all correct correspondences within a collection of ontologies describing the domain of organising conferences (a domain that every researcher understands well). Additionally, 'interesting correspondences' are also welcome. Results will be evaluated automatically against a reference alignment and by data-mining and logical reasoning techniques. A sample of correspondences and 'interesting correspondences' will be evaluated manually.

Oriented matching

The track provides two datasets of real ontologies taken from a) Academia (alterations of ontologies from the benchmark series corpus of the OAEI contest), b) Course catalogs (alterations of ontologies concerning courses in the universities of Cornell and Washington). The alterations aim to introduce additional subsumption mappings between classes that cannot be inferred via reasoning. For each pair of ontologies the dataset provides reference alignments for both equivalence and subsumption mappings. The aim is to evaluate tools for their ability to (a) compute class equivalences and subsumptions, (b) compute class subsumptions without computing any equivalences, and (c) compute class subsumptions and non-class equivalences.

Model matching

This dataset aims at comparing model matching tools from the Model-Driven Engineering (MDE) community on ontologies. In order to compare model matchers to ontology matchers, the test cases are available in two formats: OWL and Ecore. The models to be matched have been automatically derived from a model-based repository.

Instance matching

The instance data matching track aims at evaluating tools able to identify similar instances among different RDF and OWL datasets. It features Web datasets, as well as a generated benchmark. Instance matching at OAEI 2011 is focused on RDF and OWL data in the context of the Semantic Web. Participants will be asked to execute their algorithms against various datasets and their results will be evaluated by comparing them with a pre-defined reference alignment. Participating systems are free to use any combination of matching techniques and background knowledge. Results, in the alignment format, will be evaluated according to standard precision and recall metrics. This year there are two tasks:

Interlinking New York Times Data: Participants are requested to rebuild the links within the NYT dataset itself, and from it to the external data sources DBpedia, Geonames and Freebase.
Synthetic Freebase data: Participants should match synthetic data generated from Freebase, in the same style as for the benchmark task.

We summarize below the variation between the results expected by these tests (all results are given in the Alignment format):

test	SEALS	ontology language	relations	confidence	modalities	natural language	size (≈)
benchmarks	yes	OWL	=	[0 1]	open+blind	EN	(36+61)^2*49
anatomy	yes	OWL	=	[0 1]	open	EN	3k*3k
conference	yes	OWL-DL	=, <=	[0 1]	blind+open	EN	(20^2)*21
oriented		OWL	=,<,>	[0 1]	open	EN
model m.		OWL+Ecore	=	1	open	EN
nyt		RDF	=	[0 1]	open	EN
freebase		RDF	=	[0 1]	blind	EN

[0 1] in the 'confidence' column means that submissions with confidence values in the range [0 1] are preferred, but this does not exclude systems which do not distinguish between different confidence values.

Evaluation process

Each data set has a different evaluation process. They can be roughly divided into two groups:

benchmark, anatomy: open: tests are provided with the expected results;
conference, imei: blind: these are blind tests, i.e., participants do not know the results;

For the tracks included in the new modality, namely benchmark, conference and anatomy, the participants must run their tools on the SEALS platform, following the instructions. For the other tracks, the participants must return their results to the organisers.

However, the evaluation will be processed in the same three successive steps as before.

Preparatory Phase

Ontologies are described in OWL-DL and serialized in the RDF/XML format. The expected alignments are provided in the Alignment format expressed in RDF/XML.
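As an illustration of what reading such an alignment involves, here is a minimal Python sketch that extracts the correspondences from an RDF/XML alignment using only the standard library. The namespace URI, the element names (Cell, entity1, entity2, measure, relation) and the example.org entities in the sample are assumptions based on the commonly distributed form of the Alignment format; some files append '#' to the namespace, so check them against the files you actually receive. For real work, the Alignment API remains the reference tool.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces; verify them against your own alignment files.
ALIGN = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical one-cell alignment used only to exercise the parser.
SAMPLE = """<rdf:RDF xmlns='http://knowledgeweb.semanticweb.org/heterogeneity/alignment'
         xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
  <Alignment>
    <xml>yes</xml>
    <level>0</level>
    <type>11</type>
    <map>
      <Cell>
        <entity1 rdf:resource='http://example.org/onto1#Paper'/>
        <entity2 rdf:resource='http://example.org/onto2#Article'/>
        <measure rdf:datatype='http://www.w3.org/2001/XMLSchema#float'>0.85</measure>
        <relation>=</relation>
      </Cell>
    </map>
  </Alignment>
</rdf:RDF>
"""


def read_cells(xml_text):
    """Return a list of (entity1, entity2, relation, confidence) tuples."""
    root = ET.fromstring(xml_text)
    cells = []
    for cell in root.iter(f"{{{ALIGN}}}Cell"):
        entity1 = cell.find(f"{{{ALIGN}}}entity1").get(f"{{{RDF}}}resource")
        entity2 = cell.find(f"{{{ALIGN}}}entity2").get(f"{{{RDF}}}resource")
        # Missing relation or measure fall back to equivalence with confidence 1.
        relation = cell.findtext(f"{{{ALIGN}}}relation", default="=")
        confidence = float(cell.findtext(f"{{{ALIGN}}}measure", default="1.0"))
        cells.append((entity1, entity2, relation, confidence))
    return cells


if __name__ == "__main__":
    for cell in read_cells(SAMPLE):
        print(cell)
```

Representing each correspondence as a plain tuple keeps the sketch small; a real tool would more likely keep the ontology URIs and alignment metadata as well.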

The ontologies and alignments of the evaluation are provided in advance during the period between June 1st and June 21st. This gives potential participants the opportunity to send observations, bug corrections, remarks and other test cases to the organizers. The goal of this preliminary period is to make sure that the delivered tests make sense to the participants. The feedback is important, so participants should not hesitate to provide it. The tests will certainly change after this period, but only to ensure better participation in the tests. The final test bases will be released on July 5th.

Execution Phase

During the execution phase, the participants will use their algorithms to automatically match the ontologies. Participants should only use one algorithm and the same set of parameters for all tests in all tracks. Of course, it is fair to select the set of parameters that provides the best results (for the tests where results are known). Besides the parameters, the input of the algorithms must be the two provided ontologies to match and any general-purpose resource available to everyone (that is, no resource especially designed for the test). In particular, participants should not use the data (ontologies and results) from other test sets to help their algorithm. And cheating is not fair...

Furthermore, a tool that participates in one of the tracks conducted in the SEALS modality will be evaluated with respect to all of the other tracks in the SEALS modality, even though the tool might be specialized for some specific kind of matching problem.

The deadline for delivering final results is September 23rd, sharp. For old-style tracks, this means submitting the generated alignments. For tracks under the SEALS modality, it means that the final version of the tool has been uploaded to the SEALS platform. How to wrap and zip the tool is described in these instructions.

For old-style tracks, participants are strongly advised to send results to the organisers earlier (preferably by September 1st), so that the organisers can check that they will be able to evaluate the results smoothly and can provide some feedback to participants. For tracks under the SEALS modality, a similar check is requested: a preliminary version has to be uploaded to the SEALS portal to check technical compatibility. Again, details can be found here. Note that we recommend uploading a first version much earlier to avoid any risks.

Participants in old-style tracks will provide their alignments for each test in the Alignment format. The results will be provided in a zip file containing one directory per test (named after its number), each directory containing one result file in the RDF/XML Alignment format, always with the same name (e.g., participant.rdf, replacing "participant" with the name under which you want your system to appear in the results, limited to 6 alphanumeric characters). This should yield the following structure:

participant.zip
+- benchmarks
|  +- 101
|  |  +- participant.rdf
|  +- 103
|  |  +- participant.rdf
|  + ...
+- anatomy
|  +- 1
|  |  +- participant.rdf
|  +- 2
|  |  +- participant.rdf
|  +- ...
+- directory
|  +- 1
|  |  +- participant.rdf
|  + ...
+ ...
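The layout above can be produced mechanically. The following Python sketch, which is only an illustration and not an official packaging tool, builds such an archive in memory from a nested dictionary of results. The function name build_submission, the shape of the results argument and the sample system name "sys1" are all hypothetical; the 6-character limit on the system name is taken from the description above.

```python
import io
import re
import zipfile


def build_submission(name, results):
    """Pack alignments into an in-memory zip and return its bytes.

    name    -- system name, 1 to 6 alphanumeric characters
    results -- mapping {track: {test_id: alignment as RDF/XML text}}
    """
    if not re.fullmatch(r"[A-Za-z0-9]{1,6}", name):
        raise ValueError("system name must be 1 to 6 alphanumeric characters")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for track, tests in sorted(results.items()):
            for test_id, rdf_xml in sorted(tests.items()):
                # One directory per test, one identically named file in each.
                archive.writestr(f"{track}/{test_id}/{name}.rdf", rdf_xml)
    return buffer.getvalue()


if __name__ == "__main__":
    data = build_submission("sys1", {"benchmarks": {"101": "<rdf/>"}})
    with open("sys1.zip", "wb") as handle:
        handle.write(data)
```

Building the archive in memory makes the structure easy to check with zipfile.ZipFile(...).namelist() before anything is sent to the organisers.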

For the SEALS modality, participants must have uploaded their final tool version to the SEALS portal. Participants must guarantee that their tools generate alignments in the correct format (Alignment API). The test client available on the instructions page can be used to ensure full compatibility with the required interfaces and the generated output format.

All participants will also provide, by September 26th, a paper to be published in the proceedings.

All participants are required to provide a link to their program and parameter set. For participants of tracks in the SEALS modality this issue is already solved by uploading the tool to the SEALS portal. In the paper there should just be a note about the version that has been uploaded and used for OAEI 2011.

Apart from the instance matching track, the only interesting alignments are those involving classes and properties of the given ontologies. So these alignments should not align individuals, nor entities from the external ontologies.
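One way to enforce this rule before submitting is to filter the generated correspondences. The sketch below is a simplification under two stated assumptions: that every entity of a given ontology shares a common URI prefix, and that the set of individual URIs has already been extracted from the ontologies by some other means. The function name keep_in_scope and the tuple layout (entity1, entity2, relation, confidence) are hypothetical.

```python
def keep_in_scope(cells, namespace1, namespace2, individuals=frozenset()):
    """Keep only correspondences between non-individual entities of the two
    given ontologies.

    cells       -- iterable of (entity1, entity2, relation, confidence)
    namespace1  -- assumed URI prefix of the first ontology's entities
    namespace2  -- assumed URI prefix of the second ontology's entities
    individuals -- URIs known to denote individuals in either ontology
    """
    kept = []
    for entity1, entity2, relation, confidence in cells:
        if not (entity1.startswith(namespace1) and entity2.startswith(namespace2)):
            continue  # at least one entity belongs to an external ontology
        if entity1 in individuals or entity2 in individuals:
            continue  # individuals are only relevant for instance matching
        kept.append((entity1, entity2, relation, confidence))
    return kept
```

Ontologies that import or reuse entities under several namespaces will need a more careful test than a single prefix check.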

Evaluation Phase

The organizers will evaluate the results of the algorithms used by the participants and provide comparisons on the basis of the provided alignments.

To ensure that the provided results can be processed automatically, participants are requested to provide (preliminary) results by September 1st (old-style tracks). In the case of blind tests, only the organizers will do the evaluation with regard to the withheld alignments.

The standard evaluation measures will be precision and recall computed against the reference alignments. To aggregate the measures, we will use weighted harmonic means (the weight being the size of the reference alignment). Precision/recall graphs will also be computed, so participants are advised to provide their results with a weight for each correspondence they found (participants can provide two alignment results: <name>.rdf for the selected alignment and <name>-full.rdf for the alignment with weights). Additionally, with the help of the SEALS platform, we will be able to measure runtime and alignment coherence for the anatomy and conference tracks.
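For readers who want to reproduce these measures locally, here is a small Python sketch of precision, recall and a weighted harmonic mean. It is not the organizers' evaluation code: correspondences are modelled as plain hashable values, an empty alignment is given precision 0 by convention, and the handling of zero values in the harmonic mean is an assumption made so that the function is defined everywhere.

```python
def precision_recall(found, reference):
    """Return (precision, recall) of a found alignment against a reference."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


def weighted_harmonic_mean(values, weights):
    """Aggregate per-test values; each weight is e.g. a reference alignment size."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total == 0:
        return 0.0
    # A zero value with positive weight drives the harmonic mean to zero.
    if any(v == 0 and w > 0 for v, w in zip(values, weights)):
        return 0.0
    return total / sum(w / v for v, w in zip(values, weights) if w > 0)


if __name__ == "__main__":
    p, r = precision_recall({"a", "b", "c", "d"}, {"a", "b", "e"})
    print(p, r)
    print(weighted_harmonic_mean([0.5, 1.0], [2, 2]))
```

A precision/recall graph can be obtained from the same function by keeping, for each confidence threshold, only the correspondences whose weight reaches the threshold and recomputing both measures.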

Schedule

May 30th: ~~datasets are out~~
June 27th: ~~end of commenting period~~
July 6th: ~~tests are frozen~~
September 1st: participants send preliminary results for interoperability checking (old-style tracks), or ensure that the latest version of their tools is submitted to the SEALS platform (SEALS tracks, see further instructions in the Execution Phase section)
September 23rd: ~~participants send final results (old-style tracks) or submit the ultimate tool version to the SEALS platform (SEALS tracks)~~.
September 26th: ~~participants send their papers describing their systems and analysing their results.~~
October 24th: ~~final results ready and OM-2011 workshop.~~
November 15th: ~~participants send final versions of papers to Cassia Trojahn and Pavel Shvaiko (for publication in CEUR proceedings)~~.

Presentation

Based on the results of the experiments, the participants are expected to provide the organisers with a paper to be published in the proceedings of the Ontology matching workshop. The paper must be no more than 8 pages long and formatted using the LNCS Style. To ensure easy comparability among the participants, it has to follow the given outline. A package with LaTeX and Word templates is available here. The paper must be sent in PDF format before September 26th to Cassia . Trojahn (a) inrialpes . fr with a copy to pavel (a) dit . unitn . it.

Participants may also submit a longer version of their paper, with a length justified by its technical content, to be published online in the CEUR-WS collection and on the OAEI web site (this last paper will be due just before the workshop).

The outline of the paper is as below (see templates for more details):

Presentation of the system
1. State, purpose, general statement
2. Specific techniques used
3. Adaptations made for the evaluation
4. Link to the system and parameters file
5. Link to the set of provided alignments (in align format)
Results
- 2.x) a comment for each dataset performed
General comments (not necessarily by putting the sections below, but preferably in this order)
1. Comments on the results (strengths and weaknesses)
2. Discussions on the way to improve the proposed system
3. Comments on the OAEI procedure (including comments on the SEALS evaluation, if relevant)
4. Comments on the OAEI test cases
5. Comments on the OAEI measures
6. Proposed new measures
Conclusions
References

These papers are not peer-reviewed and are here to keep track of the participations and the description of matchers which took part in the campaign.

The results from both selected participants and organizers will be presented at the Ontology matching workshop at ISWC 2011, taking place in Bonn (DE) in October 2011. We hope to see you there.

Tools and material

Here are some tools that may help participants.

SEALS platform

The instructions for using the SEALS platform are available here.

Processing tools

Participants may use the Alignment API for generating and manipulating their alignments (in particular for computing evaluation of results).

SKOS conversion tools

The participants may use various options if they need to convert SKOS vocabularies into OWL.

OWL-N3 conversion tools

Vassilis Spiliopoulos pointed out the Altova transformer from OWL to N3 notation, which may be useful to some participants. It is a commercial tool with a 30-day free trial.