Ontology Alignment Evaluation Initiative - OAEI-2014 Campaign

Instance Matching Track

The Instance Matching Track evaluates the performance of matching tools whose goal is to detect the degree of similarity between pairs of instances expressed in the form of OWL ABoxes.

The track is organized in two independent tasks, namely Identity Recognition (id-rec task) and Similarity Recognition (sim-rec task).

For each task, participants receive two datasets, called source and target, respectively. The goal is to discover the matching pairs (i.e., mappings) between the instances in the source dataset and the instances in the target dataset. Both tasks are blind, meaning that the set of expected mappings (i.e., the reference alignment) is not given to the participants.

Access to the datasets for both tasks of the OAEI 2014 campaign is provided on the original page linked at the bottom of this document.

Identity Recognition Task

The goal of the id-rec task is to determine whether two OWL instances describe the same real-world entity.

The datasets of the id-rec task were produced by altering a set of original data in order to generate multiple descriptions of the same real-world entities, employing different languages and representation formats.

We provide two ABoxes: the source ABox contains 1330 instances described through 4 classes, 5 datatype properties, and 1 annotation property; the target ABox contains 2649 instances described through 4 classes, 4 datatype properties, 1 object property, and 1 annotation property.

What we expect from participants. Participants are requested to match the instances of the class http://www.instancematching.org/ontologies/oaei2014#Book in the source ABox against the instances of the corresponding class in the target ABox. The task goal is to produce a set of mappings between the pairs of instances found to refer to the same real-world entity. A book instance in the source ABox can have zero, one, or more matching counterparts in the target ABox.

Evaluation strategy. The mappings produced by the participants will be compared against a ground truth in which each instance i in the source dataset is associated with all the instances in the target dataset that represent an altered description of i. Evaluation will be performed through precision, recall, and F-measure.
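For reference, the following is a minimal Python sketch of this kind of evaluation, assuming mappings are modeled as sets of (source URI, target URI) pairs; the function name and toy data are illustrative only, not part of the campaign materials:

    # Minimal sketch of precision, recall, and F-measure over mapping sets.
    def evaluate(produced: set, reference: set):
        true_positives = len(produced & reference)
        precision = true_positives / len(produced) if produced else 0.0
        recall = true_positives / len(reference) if reference else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall > 0 else 0.0)
        return precision, recall, f_measure

    # Toy data: one correct mapping, one spurious, one missed.
    produced = {("src#book1", "tgt#bookA"), ("src#book2", "tgt#bookB")}
    reference = {("src#book1", "tgt#bookA"), ("src#book3", "tgt#bookC")}
    print(evaluate(produced, reference))  # (0.5, 0.5, 0.5)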

Submission procedure. The task evaluation will be executed with the support of SEALS. Participants are requested to adapt their matching tool so that it can be invoked on the SEALS platform (see the OAEI 2014 evaluation details).

Additionally, participants are required to provide the results in TSV format, i.e.,
uri_of_source_instance\turi_of_target_instance\tsimilarity_value\n
and to send a text file containing the mappings to Alfio Ferrara (alfio.ferrara@unimi.it).
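As a concrete illustration, the Python sketch below writes a list of mappings in this TSV format; the example mappings and URIs are invented for illustration and are not taken from the campaign datasets:

    # Sketch: serialize mappings in the required TSV format
    # (source URI, tab, target URI, tab, similarity degree, newline).
    mappings = [
        ("http://example.org/source#book1", "http://example.org/target#bookA", 0.92),
        ("http://example.org/source#book2", "http://example.org/target#bookB", 0.41),
    ]

    with open("mappings.tsv", "w", encoding="utf-8") as out:
        for source_uri, target_uri, similarity in mappings:
            out.write(f"{source_uri}\t{target_uri}\t{similarity}\n")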

Similarity Recognition Task

The goal of the sim-rec task is to evaluate the degree of similarity between two OWL instances, even when the two instances describe different real-world entities.

The datasets of the sim-rec task were produced through crowdsourcing with the Argo system (Italian language). More than 250 workers were involved in the crowdsourcing process to evaluate the degree of similarity between pairs of instances describing real books. Crowdsourcing activities were organized into a set of HITs (Human Intelligence Tasks) assigned to workers for execution. A HIT is a question in which the worker is asked to evaluate the degree of similarity of two given instances. The worker examines the instances (i.e., book descriptions) at a glance and specifies his or her perceived similarity by assigning a degree in the range [0,1].

We provide two ABoxes: the source ABox contains 173 book instances, and the target ABox contains 172 book instances.

What we expect from participants. Participants are requested to match the instances of the class http://www.instancematching.org/ontologies/oaei2014#Book in the source ABox against the instances of the corresponding class in the target ABox. The task goal is to produce a complete set of mappings between every pair of instances: with 173 source and 172 target book instances, we expect participants to provide 173 × 172 = 29756 mappings, each featuring a similarity degree in the range [0,1].
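Since a mapping is expected for every source/target pair, a participant's output amounts to a Cartesian product of the two instance sets. In the sketch below, similarity() is merely a placeholder for the participant's own measure, and the URIs are invented:

    import itertools

    # Sketch: the sim-rec task expects one mapping per source/target pair.
    def similarity(source_uri: str, target_uri: str) -> float:
        return 0.0  # placeholder for the participant's own measure in [0,1]

    source_books = [f"src#book{i}" for i in range(173)]  # illustrative URIs
    target_books = [f"tgt#book{j}" for j in range(172)]

    mappings = [(s, t, similarity(s, t))
                for s, t in itertools.product(source_books, target_books)]
    assert len(mappings) == 173 * 172  # 29756 mappings expected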

Evaluation strategy. The mappings produced by the participants will be compared against the mappings obtained through crowdsourcing. Given a mapping m, we will compare the similarity degree assigned to m by the matching tool against the corresponding similarity degree assigned by the crowdsourcing workers. The evaluation will be performed using the Euclidean distance.
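One way to read this, assuming the tool-assigned and crowd-assigned degrees are paired mapping by mapping into two vectors (an assumption on our part, with invented numbers), is sketched below:

    import math

    # Sketch: Euclidean distance between tool-assigned and crowd-assigned
    # similarity degrees over the same ordered list of mappings.
    tool_degrees = [0.9, 0.2, 0.5]   # illustrative values
    crowd_degrees = [1.0, 0.1, 0.4]

    distance = math.sqrt(sum((t - c) ** 2
                             for t, c in zip(tool_degrees, crowd_degrees)))
    print(distance)  # ~0.173; lower means closer to the crowd judgments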

Submission procedure. The task evaluation will be executed with the support of SEALS. Participants are requested to adapt their matching tool so that it can be invoked on the SEALS platform (see the OAEI 2014 evaluation details).

Additionally, participants are required to provide the results in TSV format, i.e.,
uri_of_source_instance\turi_of_target_instance\tsimilarity_value\n
and to send a text file containing the mappings to Alfio Ferrara (alfio.ferrara@unimi.it).

Contact: alfio.ferrara@unimi.it

Original page: http://islab.di.unimi.it/im_oaei_2014/index.html [cached: 13/05/2016]