The evaluation campaign is over! Preliminary results are available here.
Ontology Alignment Evaluation Initiative
2011.5 Campaign
The increasing number of methods available for schema matching/ontology integration
makes it necessary to establish a consensus on how to evaluate these methods.
Since 2004, OAEI has organized evaluation campaigns aimed at evaluating ontology matching technologies.
This year, we will run an OAEI 2011.5 evaluation campaign, executed entirely on the SEALS platform and coordinated with the second SEALS evaluation campaign. The results will be reported at
the 2nd iWEST workshop of the 9th Extended Semantic Web Conference (ESWC 2012) and will be integrated with those of OAEI 2012.
Problems
The goal of this campaign is to provide more automation to the evaluation and more direct feedback to the participants. As in OAEI 2011, participants in this modality must follow the specific instructions for participation.
- Anatomy
- The anatomy real-world case is about matching the Adult Mouse Anatomy (2744 classes) and the NCI Thesaurus (3304 classes) describing the human anatomy. The dataset was already used in OAEI 2011. For this track, we will only run systems that did not participate in OAEI 2011 and systems that were modified between OAEI 2011 and OAEI 2011.5. The measured results will be integrated into an extended OAEI 2011 results presentation.
- Conference
- The goal of this track is to find all correct correspondences within a collection of ontologies describing the domain of organising conferences (a domain that every researcher can readily understand). Additionally, 'interesting correspondences' are also welcome. Results will be evaluated automatically against the reference alignment and by data-mining and logical reasoning techniques. A sample of correspondences and 'interesting correspondences' will be evaluated manually. The dataset was already used in OAEI 2011. For this track, we will only run systems that did not participate in OAEI 2011 and systems that were modified between OAEI 2011 and OAEI 2011.5. The measured results will be integrated into an extended OAEI 2011 results presentation.
- Multifarm (NEW DATASET)
- This dataset is composed of a subset of the Conference dataset, translated into eight different languages (Chinese, Czech, Dutch, French, German, Portuguese, Russian, and Spanish), and the corresponding alignments between these ontologies. Based on these test cases, it is possible to evaluate and compare the performance of matching approaches with a special focus on multilingualism.
- Matching Large Biomedical Ontologies (NEW DATASET)
- This track consists of finding alignments between the Foundational Model of Anatomy (FMA), SNOMED CT, and the National Cancer Institute Thesaurus (NCI). These ontologies are semantically rich and contain tens of thousands of classes. Note that for OAEI 2011.5 only the FMA-NCI case will be evaluated. The UMLS Metathesaurus has been selected as the basis for the track reference alignments.
- Scalability tests: newly generated benchmarks (NEW DATASETS)
- In order to test the ability of matchers to deal with datasets containing an increasing number of elements, we generated new benchmark datasets, following the same model as the previous benchmarks, from seed ontologies of different sizes.
Evaluation process
The evaluation process follows the same pattern:
- Participants wrap their tools as a SEALS platform package and
register them on the SEALS portal;
- Participants can test their tools with the SEALS client on the
datasets provided with reference alignments (until March 15th);
- Organisers run the evaluation on the SEALS platform using the
tools registered in the platform, with both blind and published
datasets;
- Results are (automatically) available on the SEALS portal.
The standard evaluation measures will be precision and recall
computed against the reference alignments. To aggregate the measures
across tests, we will use weighted harmonic means
(the weight being the size of the reference alignment).
Precision/recall
graphs (a.k.a. precision at n) will also be computed, so participants are advised to
attach a weight to each correspondence they
found.
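The measures described above can be illustrated with a minimal sketch. It is not the code used by the SEALS platform or the Alignment API: the function names and the representation of a correspondence as a pair of entity names are assumptions made for this example, and the aggregation uses the textbook weighted harmonic mean, sum(w) / sum(w / x), with the reference alignment size as the weight.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical representation: a correspondence is a pair of entity names
# (the relation is assumed to be equivalence).
Correspondence = Tuple[str, str]


def precision_recall(found: Set[Correspondence],
                     reference: Set[Correspondence]) -> Tuple[float, float]:
    """Precision and recall of a found alignment against a reference alignment."""
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


def weighted_harmonic_mean(values: Sequence[float],
                           weights: Sequence[float]) -> float:
    """Weighted harmonic mean: sum(w) / sum(w / x), ignoring zero-weight tests."""
    pairs = [(v, w) for v, w in zip(values, weights) if w > 0]
    # A harmonic mean is 0 as soon as one weighted value is 0.
    if not pairs or any(v == 0 for v, _ in pairs):
        return 0.0
    return sum(w for _, w in pairs) / sum(w / v for v, w in pairs)


def precision_recall_points(scored: Dict[Correspondence, float],
                            reference: Set[Correspondence]) -> List[Tuple[float, float]]:
    """(recall, precision) after each of the n best-weighted correspondences."""
    if not reference:
        return []
    ranked = sorted(scored, key=scored.get, reverse=True)
    points = []
    correct = 0
    for n, correspondence in enumerate(ranked, start=1):
        if correspondence in reference:
            correct += 1
        points.append((correct / len(reference), correct / n))
    return points
```

The last function shows why weights matter for the precision/recall graphs: correspondences are ranked by decreasing weight, so a matcher that returns every correspondence with the same weight gives the curve no meaningful ordering.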
January 15th: datasets available.
March 18th (deadline extended from March 15th): participants send final versions of their tools.
March - April: evaluation is executed and results are analyzed.
May 27-31: final results are reported at ESWC 2012.
Later on: results to be recomputed and integrated for OAEI 2012.
Tools and material
Here are some tools that may help participants.
SEALS platform
The instructions for using the SEALS platform are available here.
Processing tools
Participants may use the Alignment API for generating and manipulating their alignments (in particular for evaluating their results).