The OAEI-2006 results are available here.

Ontology Alignment Evaluation Initiative

2006 Campaign

The increasing number of methods available for schema matching/ontology integration suggests the need to establish a consensus for evaluation of these methods. Since 2004, OAEI has organized evaluation campaigns for ontology matching technologies (see http://oaei.ontologymatching.org/2004/Contest/ and http://oaei.ontologymatching.org/2005).

The OAEI 2006 campaign is associated with the ISWC Ontology matching workshop, to be held in Athens, GA, USA on November 5, 2006.

Problems

This year's campaign will consist of four tracks gathering six data sets and different evaluation modalities.

Comparison track: benchmark

As in previous campaigns, a systematic benchmark series has been produced. The goal of this benchmark series is to identify the areas in which each alignment algorithm is strong or weak. The test is based on one particular ontology dedicated to the very narrow domain of bibliography and a number of alternative ontologies of the same domain for which alignments are provided.

Expressive ontologies

anatomy: The anatomy real world case covers the domain of body anatomy and will consist of two ontologies with an approximate size of several tens of thousands of classes and several dozen relations.
jobs: The jobs use case is an industry evaluated real world business case. A company has a need to improve job portal functionality with semantic technologies. To enable higher precision in retrieval of relevant job offers or applicants, OWL ontologies from the employment sector are used to describe jobs and job seekers and semantic matching with regard to these ontologies provides the improved results. For confidentiality reasons, the test will be run by the participant with software provided by the company team. This will involve instrumenting the current company software for using the alignments and running the integration task against real results on this basis.

Directories and thesauri

directory: The directory real world case consists of aligning web site directories (like Open Directory or Yahoo's). It comprises more than 4 thousand elementary tests.
food: Two SKOS thesauri about food have to be aligned using relations from the SKOS Mapping vocabulary. All results are evaluated by domain experts. Each participant is asked to evaluate a small part of the results of the other participants.

Consensus workshop: conference

Participants will be asked to freely explore a collection of 'conference organisation' ontologies (the domain being well understandable for every researcher). This effort will materialise in (complete or sections of) submitted papers, containing e.g. interesting individual correspondences ('nuggets'), aggregated statistical observations and/or implicit design patterns. There is no a priori reference alignment. For a selected sample of correspondences, consensus will be sought at the workshop and the process of reaching it will be recorded.

We summarize below the variation between the results expected by these tests (all results are given in the Alignment format):

test	language	relations	confidence	modalities
benchmarks	OWL	=	[0 1]	open
anatomy	OWL	=	1	blind
jobs	OWL	=	[0 1]	external
directory	OWL	=	1	blind
food	SKOS?	exactMatch, narrowMatch, broadMatch	1	blind+consensual
conference	OWL-DL	=, <=	1	blind+consensual

Evaluation process

Each data set has a different evaluation process. They can be roughly divided into four groups:

benchmark: open: benchmark tests are provided with the expected results. Participants must return their results by September 15th;
anatomy, directory, food: blind: participants do not know the expected results and must return their results by September 15th;
anatomy, jobs: external: these have a "double blind" evaluation in which participants must submit their systems by September 4th, and the tests will be run and evaluated by the organizers;
conference, food: consensus: participants must also send their results by September 15th; the results are not pre-determined through reference alignments but computed and/or discussed as a consensus among the submitted results.

However, the evaluation will be processed in the same three successive steps as before.

Preparatory Phase

Ontologies are described in OWL-DL and serialized in the RDF/XML format. The expected alignments are provided in the Alignment format expressed in RDF/XML.
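As an illustration only, the following sketch reads the correspondences out of an RDF/XML alignment file using the Python standard library. The element names (Cell, entity1, entity2, measure, relation) and the sample namespace are assumptions about the Alignment format that this page does not spell out, and the example.org entities are invented. The reader matches on local element names so that it does not depend on the exact namespace URI.

```python
import xml.etree.ElementTree as ET

# The rdf:resource attribute in ElementTree's {namespace}name notation.
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_cells(rdf_xml):
    """Return a list of (entity1, entity2, relation, confidence) tuples.

    Assumes each correspondence is a 'Cell' element with 'entity1',
    'entity2', 'relation' and 'measure' children (an assumption, see above).
    """
    cells = []
    for element in ET.fromstring(rdf_xml).iter():
        if local_name(element.tag) != "Cell":
            continue
        fields = {local_name(child.tag): child for child in element}
        cells.append((
            fields["entity1"].get(RDF_RESOURCE),
            fields["entity2"].get(RDF_RESOURCE),
            (fields["relation"].text or "").strip(),
            float(fields["measure"].text),
        ))
    return cells


# Invented sample document, not taken from the test sets.
SAMPLE = """<rdf:RDF
    xmlns="http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Alignment>
    <map>
      <Cell>
        <entity1 rdf:resource="http://example.org/onto1#Book"/>
        <entity2 rdf:resource="http://example.org/onto2#Volume"/>
        <measure>0.8</measure>
        <relation>=</relation>
      </Cell>
    </map>
  </Alignment>
</rdf:RDF>
"""

cells = read_cells(SAMPLE)
```

Participants who use the Alignment API can rely on its own parser instead; this sketch is only meant to show the shape of the data.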

The ontologies and alignments of the evaluation are provided in advance during the period between June 1st and June 28th. This gives potential participants the occasion to send observations, bug corrections, remarks and other test cases to the organizers. The goal of this preliminary period is to be sure that the delivered tests make sense to the participants. Feedback is important, so participants should not hesitate to provide it. The tests will certainly change after this period, but only to ensure better participation in the tests. The final test base will be released on July 3rd.

Execution Phase

During the execution phase, the participants will use their algorithms to automatically align the ontologies. The participants should only use one algorithm and the same set of parameters for all tests in all tracks. Of course, it is fair to select the set of parameters that provides the best results (for the tests where results are known). Besides the parameters, the input of the algorithms must be the two provided ontologies to align and any general-purpose resource available to everyone (that is, no resource especially designed for the test). In particular, the participants should not use the data (ontologies and results) from other test sets to help their algorithm. And cheating is not fair...

The participants will provide their alignment for each test in the Alignment format. The results will be provided in a zip file containing one directory per test (named after its number), each directory containing one result file in the RDF/XML Alignment format, always with the same name (e.g., participant.rdf). This should yield the following structure:

participant.zip
+- benchmark
|  +- 101
|  |  +- participant.rdf
|  +- 103
|  |  +- participant.rdf
|  + ...
+- anatomy
|  +- participant.rdf
+- directory
|  +- 1
|  |  +- participant.rdf
|  + ...
+ ...
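The layout above can be produced with a few lines of Python. This is a minimal sketch, not an official packaging tool: the helper names (result_path, write_results_zip) are hypothetical, the file contents are a placeholder, and it assumes that a track with a single test (like anatomy above) has no numbered subdirectory.

```python
import io
import zipfile


def result_path(track, test=None, filename="participant.rdf"):
    """Build the archive path <track>/<test>/<filename>.

    'test' is omitted for tracks that hold a single result file.
    """
    parts = [track] if test is None else [track, str(test)]
    return "/".join(parts + [filename])


def write_results_zip(target, results, filename="participant.rdf"):
    """Write every alignment into one zip archive.

    'target' is a file name or a binary file-like object.
    'results' maps (track, test) pairs to RDF/XML text.
    """
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for (track, test), rdf_xml in results.items():
            archive.writestr(result_path(track, test, filename), rdf_xml)


# Placeholder content; a real submission contains the RDF/XML alignments.
PLACEHOLDER = "<!-- RDF/XML alignment goes here -->"
results = {
    ("benchmark", 101): PLACEHOLDER,
    ("benchmark", 103): PLACEHOLDER,
    ("anatomy", None): PLACEHOLDER,
    ("directory", 1): PLACEHOLDER,
}

# Written to memory here; pass "participant.zip" to write a file instead.
buffer = io.BytesIO()
write_results_zip(buffer, results)
names = zipfile.ZipFile(buffer).namelist()
```

Listing the archive afterwards (names) is a quick way to check that every test has exactly one result file under the expected name.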

By September 15th, they will also provide a paper to be published in the proceedings and a link to their program and parameter set.

The only interesting alignments are those involving classes and properties of the given ontologies. So the alignments should not align individuals, nor entities from the external ontologies.

Evaluation Phase

The organizers will evaluate the results of the algorithms used by the participants and provide comparisons on the basis of the provided alignments.

In order to ensure that the provided results can be processed automatically, the participants are requested to provide (preliminary) results by September 4th. In the case of blind tests, only the organizers will do the evaluation with regard to the withheld alignments. In the case of double blind tests, the participants will provide a version of their system and the values of the parameters, if any.

An email with the location of the required zip files must be sent to the contact addresses below.

The standard evaluation measures will be precision and recall computed against the reference alignments. To aggregate the measures, we will use weighted harmonic means (the weight being the size of the reference alignment). Another improvement that might be used is the computation of precision/recall graphs, so participants are advised to provide their results with a weight for each correspondence they found (participants can provide two alignment results: <name>.rdf for the selected alignment and <name>-full.rdf for the alignment with weights).
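The measures above can be sketched as follows. This is an illustrative reading, not the organizers' evaluation code: correspondences are reduced to plain hashable values, the toy data is invented, a score of zero is assumed to force the harmonic mean to zero, and the threshold sweep is only one possible way to obtain points for a precision/recall graph from a weighted alignment.

```python
def precision_recall(found, reference):
    """Precision and recall of a found alignment against a reference."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


def weighted_harmonic_mean(values, weights):
    """Weighted harmonic mean: sum(w) / sum(w / v).

    Assumption: if any value with a positive weight is 0, the mean is 0.
    """
    pairs = [(v, w) for v, w in zip(values, weights) if w > 0]
    if not pairs or any(v == 0 for v, _ in pairs):
        return 0.0
    return sum(w for _, w in pairs) / sum(w / v for v, w in pairs)


def pr_points(weighted, reference, thresholds):
    """(threshold, precision, recall) for each confidence threshold.

    'weighted' maps each correspondence to its confidence.
    """
    points = []
    for threshold in thresholds:
        kept = {c for c, conf in weighted.items() if conf >= threshold}
        points.append((threshold,) + precision_recall(kept, reference))
    return points


# Invented toy data: (found, reference) for two tests.
runs = [
    ({"a", "b", "x"}, {"a", "b", "c", "d"}),
    ({"e", "f"}, {"e", "f"}),
]
scores = [precision_recall(found, ref) for found, ref in runs]
weights = [len(ref) for _, ref in runs]  # size of each reference alignment

overall_precision = weighted_harmonic_mean([p for p, _ in scores], weights)
overall_recall = weighted_harmonic_mean([r for _, r in scores], weights)

curve = pr_points({"a": 0.9, "b": 0.6, "x": 0.4}, {"a", "b", "c", "d"}, [0.5, 0.0])
```

With the reference size as weight, large tests dominate the aggregate, so a poor score on a large test cannot be hidden by good scores on small ones.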

Further, it is planned to introduce new measures addressing some limitations of precision and recall. These will be presented at the workshop discussion in order for the participants to provide feedback on the opportunity to use them in a further evaluation.

Schedule Overview

June 1st: tests are out
June 28th: end of commenting period
July 4th: tests are frozen
September 4th: participants send preliminary results (for interoperability checking)
September 15th: participants send final results and papers
October 9th: organizers publish results for comments
November 5th: final results ready and OM-2006 workshop.

Presentation

From the results of the experiments, the participants are expected to provide the organisers with a paper to be published in the proceedings of the workshop. The paper must be around 8 pages long and formatted using the LNCS Style. To ensure easy comparability among the participants, it has to follow the given outline. A package with LaTeX and Word templates will be made available here. The above-mentioned paper must be sent by September 15th to Pavel Shvaiko (pavel (à) dit dot unitn dot it) with copy to Jerome . Euzenat (à) inrialpes . fr.

This year, due to the space limit and the large number of evaluations, we will allow authors to submit a second version of their paper with no space limitation (i.e., one which can typically include results), to be published online in the CEUR-WS collection and on the OAEI web site (this second paper will be due just before the workshop).

The outline of the paper is as below:

1) Presentation of the system
- 1.1) State, purpose, general statement
  It is interesting to see here what the purpose of the algorithm was in terms of the following categories: ontology matching, schema matching, version matching, directory matching. The purpose of this information is to study the correlation between these purposes and the success in some particular tests.
- 1.2) Specific techniques used
- 1.3) Adaptations made for the evaluation
- 1.4) Link to the system and parameters file
- 1.5) Link to the set of provided alignments (in align format)
2) Results
- 2.x) A comment for each test
3) General comments
- 3.1) Comments on the results (strengths and weaknesses)
- 3.2) Discussions on the way to improve the proposed system
- 3.3) Comments on the OAEI procedure
- 3.4) Comments on the OAEI test cases
- 3.5) Comments on the OAEI measures
- 3.6) Proposed new measures
4) Conclusions
References
Appendix: Raw results
- Matrix format
  Of course, this applies to non-blind tests. If possible provide the run time values in hh.mn.ss.mms format.

The results from both the participants and the organizers will be presented at the Workshop on Ontology matching at ISWC 2006, taking place in Athens (Georgia, USA) on November 5th, 2006. We hope to see you there.

Tools and material

The material for evaluation will be available soon from this page.

Processing tools

The participants may use the Alignment API for generating and manipulating their alignments (in particular for computing evaluation of results).

Steering Committee

Jérôme Euzenat, INRIA Rhône-Alpes,
Lewis Hart, AT&T,
Tadashi Hoshiai, Fujitsu,
Todd Hughes, Lockheed Martin,
Yannis Kalfoglou, University of Southampton,
John Li, Teknowledge,
Natasha Noy, Stanford University,
Heiner Stuckenschmidt, Universität Mannheim,
York Sure, Universität Karlsruhe,
Raphaël Troncy, CWI, Amsterdam,
Petko Valtchev, University of Montréal,
Mikalaï Yatskevich, Università di Trento

Organizers

Pavel Shvaiko (University of Trento)
Jérôme Euzenat (INRIA Rhône-Alpes)
Heiner Stuckenschmidt (Universität Mannheim)
Mikalai Yatskevich (University of Trento)
Willem Robert van Hage (VU Amsterdam)
Malgorzata Mochol (FU Berlin and WWJobs)
Vojtech Svatek (University of Economics, Praha)

http://oaei.ontologymatching.org/2006

$Id: index.html,v 1.15 2007/05/21 04:44:49 euzenat Exp $