Instance Matching at OAEI 2010 (IM@OAEI2010)

General description

IM@OAEI2010 is an initiative for the evaluation of instance matching techniques and tools. IM@OAEI2010 is a track of the Ontology Alignment Evaluation Initiative (OAEI - http://oaei.ontologymatching.org/2010), held every year in collaboration with the Ontology Matching Workshop at ISWC (http://www.ontologymatching.org).

IM@OAEI2010 is focused on RDF and OWL data in the context of the Semantic Web. Participants will be asked to execute their algorithms against various datasets and their results will be evaluated by comparing them with a pre-defined reference alignment provided by IM@OAEI2010. Results will be evaluated according to standard precision and recall metrics.
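As an illustration of the evaluation described above, the following is a minimal sketch of how precision and recall can be computed when an alignment is treated as a set of instance pairs. The function name, the pair representation, and the sample URIs are assumptions made for this example; they are not the official evaluation code.

```python
from typing import Set, Tuple

# A correspondence is modelled here as (URI in first dataset, URI in second dataset).
Pair = Tuple[str, str]


def precision_recall(found: Set[Pair], reference: Set[Pair]) -> Tuple[float, float]:
    """Return (precision, recall) of `found` with respect to `reference`."""
    correct = len(found & reference)  # correspondences that are also in the reference
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data: 4 expected pairs, 3 returned, 2 of them correct.
reference = {("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:3"), ("a:4", "b:4")}
found = {("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:9")}
p, r = precision_recall(found, reference)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case two of the three returned pairs are correct, and two of the four expected pairs are found.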

Data sets

Use the following datasets as input for your matching system. You can already test with this data and report problems (send reports to imei2010@islab.dico.unimi.it). The datasets will be frozen on July 5th.

Data interlinking track (DI) - DOWNLOAD (~70MB gzipped text)
Participants are requested to rebuild the links among the available resources (provided in the N-Triples (nt) format). Reference alignments are provided for each resource as RDF alignments.
OWL data track
- IIMB - DOWNLOAD SMALL VERSION (~20MB gzipped text, 363 individuals)
  Updated on 30-08-2010: reference alignment files format fixed - test cases NOT changed
- IIMB - DOWNLOAD LARGE VERSION (~79MB gzipped text, 1416 individuals)
  Updated on 30-08-2010: reference alignment files format fixed - test cases NOT changed
  
  IIMB is divided into tasks, and reference alignments are automatically generated by introducing controlled modifications in an initial reference ontology instance. The files contain a subdirectory for each task. Participants are requested to match the reference ontology (the one in the 000 directory) against all the others (from 001 to 080). Each directory also contains the reference alignment. Data are provided as OWL individuals in the RDF/XML format, while reference alignments are provided as RDF alignments.
- PR - DOWNLOAD (~250KB gzipped text)
  Updated on 09-09-2010: bug in the restaurant2.rdf file fixed - reference alignments NOT changed
  
  Persons-Restaurants (PR) is a small real-data test case where participants are requested to run matching tools against two collections of data concerning persons (person1 and person2) and one collection about restaurants (restaurant1). A description of each dataset, together with the expected alignments, is provided within the file.
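For the IIMB benchmark described above, the matching runs pair the 000 directory with each of the directories 001 to 080. A minimal sketch of enumerating those pairs is given here; the root folder name "iimb" and the function name are hypothetical, and the names of the files inside each directory are deliberately not assumed.

```python
from pathlib import Path
from typing import List, Tuple


def iimb_task_pairs(root: Path) -> List[Tuple[Path, Path]]:
    """Pair the reference directory (000) with each task directory (001..080)."""
    reference = root / "000"
    return [(reference, root / f"{i:03d}") for i in range(1, 81)]


# "iimb" is a placeholder for wherever the archive was unpacked.
pairs = iimb_task_pairs(Path("iimb"))
for reference_dir, task_dir in pairs[:3]:
    print(reference_dir, "->", task_dir)
```

Each pair can then be handed to the matching system, and the output compared with the reference alignment found in the corresponding task directory.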

Modalities

Subtasks

IM@OAEI2010 is organized in two sub-tracks, namely:

Data interlinking track (DI). As the number of datasets published on the Web of data grows quickly, there is a need for tools that assist in interlinking the data they contain with other datasets. Many tools able to perform semi-automatic interlinking have recently been proposed. However, in order to scale, and in anticipation of the explosion in the quantity of published datasets, semi-automatic interlinking is hardly acceptable. In this track of the Ontology Alignment Evaluation Initiative we propose to evaluate systems able to *automatically* find interlinks between Web datasets. Participants in this track will be given a set of datasets to interlink. The interlinking will need to be performed with no a priori knowledge of the datasets' content, either of the data or of the schema behind them. Since the scalability of the tools with respect to dataset size is an important dimension of the problem, participants should expect to deal with datasets comparable in size to DBpedia. The datasets to interlink will be actual datasets published as linked data. Evaluation will be performed by comparing the tools' results to existing links.

OWL data track (IIMB & PR). The OWL data track is focused on two main goals: i) to provide an evaluation dataset for various kinds of data transformations, including value transformations, structural transformations, and logical transformations; ii) to cover a wide spectrum of possible techniques and tools. To this end, the IIMB benchmark is generated by starting from an initial OWL knowledge base that is transformed into a set of modified knowledge bases by applying several automatic transformations of data. Participants are requested to find the correct correspondences among individuals of the first knowledge base and individuals of the others. An important challenge here is that some of the transformations require reasoning in order to find the expected alignments.

Participation Conditions

Participating systems are free to use any combination of matching techniques and background knowledge.

Format of submission

For each track in which you participate, your submission should contain the following folders and files.

+- imei
|  +- [trackname]
|  |  +- participant.rdf

The files participant.rdf (replace 'participant' by the name of your system) contain the mapping generated by your system. These files have to follow the format described here (standard format for submissions to OAEI).
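As a rough illustration of reading such a file, the sketch below extracts instance pairs from an alignment document. It assumes the RDF/XML layout commonly used for OAEI alignments (Cell elements containing entity1 and entity2 with rdf:resource attributes); the element names, the namespace in the sample, and the example URIs are assumptions and should be checked against the format description referred to above.

```python
import xml.etree.ElementTree as ET
from typing import Set, Tuple

RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_correspondences(xml_text: str) -> Set[Tuple[str, str]]:
    """Collect (entity1, entity2) URI pairs from every Cell element."""
    pairs = set()
    for cell in ET.fromstring(xml_text).iter():
        if local_name(cell.tag) != "Cell":
            continue
        ends = {local_name(child.tag): child.get(RDF_RESOURCE) for child in cell}
        if ends.get("entity1") and ends.get("entity2"):
            pairs.add((ends["entity1"], ends["entity2"]))
    return pairs


# Hypothetical sample; namespace and URIs are illustrative assumptions.
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns="http://knowledgeweb.semanticweb.org/heterogeneity/alignment#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Alignment>
    <map>
      <Cell>
        <entity1 rdf:resource="http://example.org/a#item1"/>
        <entity2 rdf:resource="http://example.org/b#item7"/>
        <measure>1.0</measure>
        <relation>=</relation>
      </Cell>
    </map>
  </Alignment>
</rdf:RDF>"""

print(read_correspondences(SAMPLE))
```

Matching on local element names rather than on a fixed namespace keeps the sketch tolerant of small namespace differences between files.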

The reference mapping contains only correspondences between instances of the ontologies. No correspondences between concepts and properties (roles) are specified in the reference alignment.

Please submit the files (preliminary and final results) directly to the email address imei2010@islab.dico.unimi.it. Send the results compressed in a file named participant.zip or participant.tgz, and include the name of your matching system in the subject line of the email.
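One possible way to assemble the archive with the folder layout shown above is sketched here. The function name, the system name "mysystem", and the track folder name "pr" are hypothetical; the actual track folder names should follow the organisers' instructions.

```python
import tarfile
from pathlib import Path
from typing import Dict


def package_submission(system: str, alignments: Dict[str, Path], out_dir: Path) -> Path:
    """Write <system>.tgz containing imei/<track>/<system>.rdf for each track."""
    archive = out_dir / f"{system}.tgz"
    with tarfile.open(archive, "w:gz") as tar:
        for track, rdf_file in alignments.items():
            # arcname sets the path stored inside the archive.
            tar.add(rdf_file, arcname=f"imei/{track}/{system}.rdf")
    return archive


# Hypothetical usage: one alignment file produced for a track folder called "pr".
# package_submission("mysystem", {"pr": Path("output/pr.rdf")}, Path("."))
```

The resulting file can then be attached to the submission email.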

Schedule

June 7th datasets are out

June 21st end of commenting period

July 5th tests are frozen

August 30th participants send preliminary results (for interoperability-checking)

October 4th participants send final results

October 11th organisers publish results for comments

November 7th final results ready

Acknowledgements

We would like to thank all of the participants of the OAEI-09 instance matching track for their hints and discussions regarding the organization and evaluation of last year's track.

Contact

Alfio Ferrara, Università degli Studi di Milano, Italy

Andriy Nikolov, Knowledge Media Institute, The Open University, UK

Jan Noessner, University of Mannheim, Germany

Stefano Montanelli, Università degli Studi di Milano, Italy

François Scharffe, INRIA, France

Heiko Stoermer, Fondazione Bruno Kessler (FBK-irst)

Original page: http://www.instancematching.org/oaei/imei2010.html [cached: 08/03/2011]