IM@OAEI2010 is an initiative for the evaluation of instance matching techniques and tools. IM@OAEI2010 is a track of the Ontology Alignment Evaluation Initiative (OAEI - http://oaei.ontologymatching.org/2010), held every year in collaboration with the Ontology Matching Workshop at ISWC (http://www.ontologymatching.org).
IM@OAEI2010 is focused on RDF and OWL data in the context of the Semantic Web. Participants will be asked to execute their algorithms against various datasets and their results will be evaluated by comparing them with a pre-defined reference alignment provided by IM@OAEI2010. Results will be evaluated according to standard precision and recall metrics.
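As a rough illustration of the evaluation described above, the following minimal sketch computes precision and recall for a set of instance correspondences against a reference alignment. The function name precision_recall and the representation of a correspondence as a pair of instance URIs are assumptions made for this example; it is not the organisers' evaluation code.

```python
def precision_recall(found, reference):
    """Return (precision, recall) for two collections of instance pairs.

    Each correspondence is assumed to be a (source_uri, target_uri) tuple.
    """
    found = set(found)
    reference = set(reference)
    correct = len(found & reference)  # correspondences also in the reference
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    found = [("a1", "b1"), ("a2", "b2"), ("a3", "b9")]
    reference = [("a1", "b1"), ("a2", "b2"), ("a4", "b4"), ("a5", "b5")]
    print(precision_recall(found, reference))
```

Here two of the three proposed correspondences are correct, and two of the four reference correspondences are found.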
Use the following datasets as input for your matching system. You can already test with this data and report problems (send reports to imei2010@islab.dico.unimi.it). The datasets will be frozen on July 5th.
Participants are requested to rebuild the links among the available resources (available in the nt format). Reference alignments are provided for each resource as RDF alignments.
OWL data track
Updated on 30-08-2010: reference alignment files format fixed - test cases NOT changed
IIMB is divided into tasks, and reference alignments are automatically generated by introducing controlled modifications in an initial reference ontology instance. The files contain a subdirectory for each task. Participants are requested to match the reference ontology (the one in the 000 directory) against all the others (from 001 to 080). Each directory also contains the reference alignment. Data are provided as OWL individuals in the RDF/XML format, while reference alignments are provided as RDF alignments.
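The task layout described above could be enumerated as in the following minimal sketch. The function name iimb_tasks and the root directory name are assumptions; the names of the files inside each task directory are not stated here, so the sketch only builds the directory pairs.

```python
from pathlib import Path


def iimb_tasks(root, first=1, last=80):
    """Pair the reference directory 000 with each task directory 001..080."""
    root = Path(root)
    source = root / "000"  # the reference ontology lives here
    return [(source, root / f"{i:03d}") for i in range(first, last + 1)]


if __name__ == "__main__":
    # "iimb" is a hypothetical location of the unpacked benchmark.
    for source_dir, target_dir in iimb_tasks("iimb"):
        print(source_dir, "->", target_dir)
```

A matching system would then be run once per pair, writing one alignment per task.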
Updated on 09-09-2010: bug in the restaurant2.rdf file fixed - reference alignments NOT changed
Persons-Restaurants (PR) is a small real-data test case where participants are requested to run matching tools against two collections of data concerning persons (person1 and person2) and one collection about restaurants (restaurant1). A description of each dataset, together with the expected alignments, is provided within the file.
IM@OAEI2010 is organized in two sub-tracks, namely:
Data interlinking track (DI). As the number of datasets published on the Web of data grows quickly, there is a need for tools that assist in interlinking the data they contain with other datasets. Many tools that perform semi-automatic interlinking have recently been proposed. However, semi-automatic interlinking does not scale and, in anticipation of the explosion in the quantity of published datasets, is hardly acceptable. In this track of the Ontology Alignment Evaluation Initiative we therefore propose to evaluate systems able to *automatically* find interlinks between Web datasets. Participants in this track will be given a set of datasets to interlink. The interlinking will need to be performed with no a priori knowledge of the datasets' content, neither of the data nor of the schema behind them. Since an important dimension of the problem is the scalability of the tools with respect to dataset size, participants should expect to deal with datasets comparable in size to DBpedia. The datasets to interlink will be actual datasets published as linked data. Evaluation will be performed by comparing the tools' results to existing links.
OWL data track (IIMB & PR). The OWL data track is focused on two main goals: i) to provide an evaluation dataset for various kinds of data transformations, including value transformations, structural transformations, and logical transformations; ii) to cover a wide spectrum of possible techniques and tools. To this end, the IIMB benchmark is generated by starting from an initial OWL knowledge base that is transformed into a set of modified knowledge bases by applying several automatic transformations of data. Participants are requested to find the correct correspondences among individuals of the first knowledge base and individuals of the others. An important challenge here is that some of the transformations require reasoning to find the expected alignments.
Participating systems are free to use any combination of matching techniques and background knowledge.
For each track in which you participate, your submission should contain the following folders and files.
+- imei
|  +- [trackname]
|  |  +- participant.rdf
The files participant.rdf (replace 'participant' with the name of your system) contain the mappings generated by your system. These files have to follow the format described here (standard format for submissions to OAEI).
The reference mapping contains only correspondences between instances of the ontologies. No correspondences between concepts and properties (roles) are specified in the reference alignment.
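To check a submission file before sending it, the instance correspondences can be read back as in the following minimal sketch. It assumes that the standard OAEI format is an RDF/XML alignment in which each correspondence is a Cell element with entity1 and entity2 children (carrying rdf:resource attributes) and an optional measure; the element names, the sample document and its example.org URIs are assumptions of this sketch, so please check them against the format description referenced above. Elements are matched by local name so that the exact alignment namespace does not matter.

```python
import xml.etree.ElementTree as ET

RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def read_cells(xml_text):
    """Return a list of (entity1_uri, entity2_uri, confidence) tuples."""
    cells = []
    for elem in ET.fromstring(xml_text).iter():
        if local_name(elem.tag) != "Cell":
            continue
        fields = {local_name(child.tag): child for child in elem}
        if "entity1" not in fields or "entity2" not in fields:
            continue  # skip incomplete cells
        measure = fields.get("measure")
        has_value = measure is not None and measure.text
        confidence = float(measure.text) if has_value else 1.0
        cells.append((fields["entity1"].get(RDF_RESOURCE),
                      fields["entity2"].get(RDF_RESOURCE),
                      confidence))
    return cells


# Assumed sample document with hypothetical instance URIs.
SAMPLE = """<rdf:RDF
    xmlns="http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Alignment>
    <map>
      <Cell>
        <entity1 rdf:resource="http://example.org/a#item1"/>
        <entity2 rdf:resource="http://example.org/b#item1"/>
        <relation>=</relation>
        <measure>1.0</measure>
      </Cell>
    </map>
  </Alignment>
</rdf:RDF>"""

if __name__ == "__main__":
    print(read_cells(SAMPLE))
```

The resulting pairs can be compared directly with those read from a reference alignment file.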
Please submit the files (preliminary and final results) directly to the email address imei2010@islab.dico.unimi.it. Send the results (g)zipped in a file participant.zip or participant.tgz, and include the name of your matching system in the subject line of the mail.
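The requested archive layout could be produced as in the following minimal sketch, which writes each alignment file to imei/[trackname]/participant.rdf inside participant.zip. The function name package_submission, the system name "mysystem" and the track folder name "iimb" are hypothetical; the exact folder names expected for each track are not stated here.

```python
import zipfile
from pathlib import Path


def package_submission(system_name, alignments, out_dir="."):
    """Zip alignment files as imei/<track>/<system_name>.rdf.

    `alignments` maps a track folder name to the path of the alignment
    file produced for that track.
    """
    archive = Path(out_dir) / f"{system_name}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for track, path in alignments.items():
            zf.write(path, arcname=f"imei/{track}/{system_name}.rdf")
    return archive


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        result = Path(tmp) / "result.rdf"
        result.write_text("<rdf/>")  # placeholder content
        archive = package_submission("mysystem", {"iimb": result}, out_dir=tmp)
        print(zipfile.ZipFile(archive).namelist())
```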
June 7th datasets are out
June 21st end of commenting period
July 5th tests are frozen
August 30th participants send preliminary results (for interoperability-checking)
October 4th participants send final results
October 11th organisers publish results for comments
November 7th final results ready
We would like to thank all the participants of the OAEI-09 instance matching track for their hints and discussions regarding the realization and evaluation of last year's track.
Alfio Ferrara, Università degli Studi di Milano, Italy
Andriy Nikolov, Knowledge Media Institute, The Open University, UK
Jan Noessner, University of Mannheim, Germany
Stefano Montanelli, Università degli Studi di Milano, Italy
François Scharffe, INRIA, France
Heiko Stoermer, Fondazione Bruno Kessler (FBK-irst)