The Instance Matching Track evaluates the performance of matching tools at detecting the degree of similarity between pairs of instances expressed in the form of OWL Aboxes.
The track is organized into the following five independent tasks:
To participate in the Instance Matching Track, submit results for one or more of the tasks, up to all five.
Each task comprises two tests at different scales (i.e., numbers of instances to match):
The goal of the author-dis task is to link OWL instances referring to the same person (i.e., author) based on their publications.
Dataset description. The sandbox scale is around 1k instances. The mainbox scale is around 10k instances.
What we expect from participants.
Participants are requested to match **ONLY** the instances of the class http://islab.di.unimi.it/imoaei2015#Person in the source dataset (i.e., ontoA.owl) against the instances of the class http://islab.di.unimi.it/imoaei2015#Person in the target dataset (i.e., ontoB.owl). In both datasets, author publications are given as instances of the class http://islab.di.unimi.it/imoaei2015#Publication and are associated with the corresponding person instance through the property http://islab.di.unimi.it/imoaei2015#author_of. Expected mappings are 1:1 (one person of ontoA.owl corresponds to exactly one person of ontoB.owl and vice versa).
In the sandbox test, the reference alignment containing the set of expected mappings is also provided (i.e., refalign.rdf).
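As a starting point, the relevant instances can be loaded and grouped with a library such as rdflib; the sketch below is illustrative only (the library choice, the RDF/XML format assumption, and the helper names are ours, not part of the task):

```python
from rdflib import Graph, Literal, Namespace, RDF

IM = Namespace("http://islab.di.unimi.it/imoaei2015#")

def authors_with_publications(path):
    """Map each Person instance to the set of literal values of its publications."""
    g = Graph()
    g.parse(path, format="xml")  # assumes the .owl files are serialized as RDF/XML
    authors = {}
    for person in g.subjects(RDF.type, IM.Person):
        evidence = set()
        for pub in g.objects(person, IM.author_of):
            # collect every literal attached to the publication (title, year, ...)
            evidence.update(str(o) for o in g.objects(pub, None)
                            if isinstance(o, Literal))
        authors[person] = evidence
    return authors

source = authors_with_publications("ontoA.owl")
target = authors_with_publications("ontoB.owl")
```

From these evidence sets, a simple overlap measure (e.g., Jaccard) between a source and a target person already yields a usable similarity score for the 1:1 assignment.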
Evaluation strategy. The mappings produced by the participants will be compared against a reference alignment; evaluation will be performed through precision, recall, and F-measure.
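Concretely, with both the produced alignment and the reference alignment viewed as sets of (source, target) pairs, the three measures reduce to a few lines (a sketch, not the official evaluation code):

```python
def evaluate(produced, reference):
    """Precision, recall, and F-measure over sets of (source, target) URI pairs."""
    true_positives = len(produced & reference)
    precision = true_positives / len(produced) if produced else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure
```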
Access to datasets. Datasets can be downloaded as a zip file:
Dataset identifiers for the SEALS OMT client:
Submission procedure. The task evaluation will be executed with the support of SEALS.
Participants are requested to adapt their matching tool so that it can be invoked on the SEALS platform (see OAEI 2015 evaluation details).
Participants are also requested to provide the results in TSV format, one mapping per line:
`uri_of_source_instance\turi_of_target_instance\tsimilarity_value\n`
and to send a text file containing these mappings to the contact person.
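For instance, a conforming mapping file could be written as follows (the URIs and file name are placeholders):

```python
def write_mappings(mappings, path="mappings.tsv"):
    """Write (source URI, target URI, similarity) triples, one per line, tab-separated."""
    with open(path, "w", encoding="utf-8") as out:
        for source_uri, target_uri, similarity in mappings:
            out.write(f"{source_uri}\t{target_uri}\t{similarity}\n")

write_mappings([("urn:ex:sourcePerson1", "urn:ex:targetPerson7", 0.93)])
```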
Contact person. For any question, please contact Alfio Ferrara (alfio 'dot' ferrara 'at' unimi 'dot' it).
The goal of the author-rec task is to associate a person (i.e., author) with the corresponding publication report, which contains aggregated information about the person's publication activity, such as number of publications, h-index, years of activity, and number of citations.
Dataset description. The sandbox scale is around 1k instances. The mainbox scale is around 10k instances.
What we expect from participants.
Participants are requested to match **ONLY** the instances of the class http://islab.di.unimi.it/imoaei2015#Person in the source dataset (i.e., ontoA.owl) against the instances of the class http://islab.di.unimi.it/imoaei2015#Person in the target dataset (i.e., ontoB.owl). In the source dataset, author publications are given as instances of the class http://islab.di.unimi.it/imoaei2015#Publication and are associated with the corresponding person instance through the property http://islab.di.unimi.it/imoaei2015#author_of. In the target dataset, each person is associated only with a single publication titled ‘Publication report’ containing the aggregated information. The challenge is to link each person in the source dataset with the person in the target dataset that holds the corresponding publication report. Expected mappings are 1:1 (one person of ontoA.owl corresponds to exactly one person of ontoB.owl and vice versa).
In the sandbox test, the reference alignment containing the set of expected mappings is also provided (i.e., refalign.rdf).
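Since the target side exposes only aggregated figures, one plausible strategy (our suggestion, not a prescribed method) is to recompute the same aggregates from the source publications and compare them against the report values. As an example, the h-index can be derived from per-publication citation counts:

```python
def h_index(citations):
    """Largest h such that at least h publications have at least h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

assert h_index([10, 8, 5, 4, 3]) == 4  # four papers with >= 4 citations each
assert h_index([25, 3]) == 2           # two papers with >= 2 citations each
```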
Evaluation strategy. The mappings produced by the participants will be compared against a reference alignment; evaluation will be performed through precision, recall, and F-measure.
Access to datasets. Datasets can be downloaded as a zip file:
Dataset identifiers for the SEALS OMT client:
Submission procedure. The task evaluation will be executed with the support of SEALS.
Participants are requested to adapt their matching tool so that it can be invoked on the SEALS platform (see OAEI 2015 evaluation details).
Participants are also requested to provide the results in TSV format, one mapping per line:
`uri_of_source_instance\turi_of_target_instance\tsimilarity_value\n`
and to send a text file containing these mappings to the contact person.
Contact person. For any question, please contact Alfio Ferrara (alfio 'dot' ferrara 'at' unimi 'dot' it).
The goal of the val-sem task is to determine when two OWL instances describe the same Creative Work. The datasets of the val-sem task have been produced by altering a set of original data through value-based and semantics-aware transformations.
Dataset description. A dataset is composed of a Tbox and a corresponding Abox. The source and target datasets share the same Tbox. Ontology instances are described in the source through 22 classes, 31 datatype properties, and 85 object properties; of these, 1 is an InverseFunctionalProperty and 2 are FunctionalProperties. The sandbox scale is 10k instances. The mainbox scale is 100k instances.
What we expect from participants.
Participants are requested to match instances in the source dataset (Abox1.ttl) against instances in the target dataset (Abox2.ttl). The goal is to produce a set of mappings between pairs of instances that are found to refer to the same real-world entity. An instance in the source dataset can have zero or one matching counterpart in the target dataset. We ask the participants to map ***ONLY*** Creative Works (NewsItem, BlogPost, and Programme) and not the instances of the other classes.
In the sandbox test, the reference alignment containing the set of expected mappings is also provided (i.e., refalign.rdf).
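To honor the "Creative Works only" restriction, candidates can be filtered by class before matching. The sketch below filters on class local names because the exact namespace is defined by the Tbox shipped in the zip file (an assumption on our part):

```python
from rdflib import Graph, RDF

# Class local names to keep; the full namespace comes from the downloaded Tbox,
# so we match on the local name only.
CREATIVE_WORK_CLASSES = {"NewsItem", "BlogPost", "Programme"}

def creative_works(path):
    """Return the set of instances typed as one of the Creative Work classes."""
    g = Graph()
    g.parse(path, format="turtle")  # the Aboxes are distributed as .ttl files
    instances = set()
    for subject, cls in g.subject_objects(RDF.type):
        local_name = str(cls).rsplit("/", 1)[-1].rsplit("#", 1)[-1]
        if local_name in CREATIVE_WORK_CLASSES:
            instances.add(subject)
    return instances

source_cw = creative_works("Abox1.ttl")
target_cw = creative_works("Abox2.ttl")
```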
Evaluation strategy. The mappings produced by the participants will be compared against a reference alignment in which an instance i in the source dataset is associated with all the instances in the target dataset that represent an altered description of i. Evaluation will be performed through precision, recall, and F-measure.
Access to datasets. Datasets can be downloaded as a zip file:
Dataset identifiers for the SEALS OMT client:
Submission procedure. The task evaluation will be executed with the support of SEALS.
Participants are requested to adapt their matching tool so that it can be invoked on the SEALS platform (see OAEI 2015 evaluation details).
Participants are also requested to provide the results in TSV format, one mapping per line:
`uri_of_source_instance\turi_of_target_instance\tsimilarity_value\n`
and to send a text file containing these mappings to the contact person.
Contact person. For any question, please contact Tzanina Saveta (jsaveta 'at' ics 'dot' forth 'dot' gr) and Irini Fundulaki (fundul 'at' ics 'dot' forth 'dot' gr).
The goal of the val-struct task is to determine when two OWL instances describe the same Creative Work. The datasets of the val-struct task have been produced by altering a set of original data through value-based and structure-based transformations.
Dataset description. A dataset is composed of a Tbox and a corresponding Abox. The source and target datasets share almost the same Tbox (with some differences at the property level, due to the structure-based transformations). Ontology instances are described in the source through 22 classes, 31 datatype properties, and 85 object properties; of these, 1 is an InverseFunctionalProperty and 2 are FunctionalProperties. The sandbox scale is 10k instances. The mainbox scale is 100k instances.
What we expect from participants.
Participants are requested to match instances in the source dataset (Abox1.ttl) against instances in the target dataset (Abox2.ttl). The goal is to produce a set of mappings between pairs of instances that are found to refer to the same real-world entity. An instance in the source dataset can have zero or one matching counterpart in the target dataset. We ask the participants to map ***ONLY*** Creative Works (NewsItem, BlogPost, and Programme) and not the instances of the other classes.
In the sandbox test, the reference alignment containing the set of expected mappings is also provided (i.e., refalign.rdf).
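Because structure-based transformations may move values between properties or nest them behind intermediate nodes, a common hedge is to compare instances on the set of literal values reachable from them rather than on a fixed property layout. A minimal sketch (the depth limit and lowercasing are our assumptions):

```python
from rdflib import Graph, Literal

def literal_bag(g: Graph, node, depth=2, seen=None):
    """All literal values reachable from `node` within `depth` hops,
    regardless of which properties carry them."""
    seen = set() if seen is None else seen
    if depth == 0 or node in seen:
        return set()
    seen.add(node)
    bag = set()
    for obj in g.objects(node, None):
        if isinstance(obj, Literal):
            bag.add(str(obj).lower())
        else:
            bag |= literal_bag(g, obj, depth - 1, seen)
    return bag

def jaccard(a, b):
    """Set overlap in [0, 1]; usable as a similarity score between two bags."""
    return len(a & b) / len(a | b) if a | b else 0.0
```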
Evaluation strategy. The mappings produced by the participants will be compared against a reference alignment in which an instance i in the source dataset is associated with all the instances in the target dataset that represent an altered description of i. Evaluation will be performed through precision, recall, and F-measure.
Access to datasets. Datasets can be downloaded as a zip file:
Dataset identifiers for the SEALS OMT client:
Submission procedure. The task evaluation will be executed with the support of SEALS.
Participants are requested to adapt their matching tool so that it can be invoked on the SEALS platform (see OAEI 2015 evaluation details).
Participants are also requested to provide the results in TSV format, one mapping per line:
`uri_of_source_instance\turi_of_target_instance\tsimilarity_value\n`
and to send a text file containing these mappings to the contact person.
Contact person. For any question, please contact Tzanina Saveta (jsaveta 'at' ics 'dot' forth 'dot' gr) and Irini Fundulaki (fundul 'at' ics 'dot' forth 'dot' gr).
The goal of the val-struct-sem task is to determine when two OWL instances describe the same Creative Work. The datasets of the val-struct-sem task have been produced by altering a set of original data through value-based, structure-based, and semantics-aware transformations.
Dataset description. A dataset is composed of a Tbox and a corresponding Abox. The source and target datasets share almost the same Tbox (with some differences at the property level, due to the structure-based transformations). Ontology instances are described in the source through 22 classes, 31 datatype properties, and 85 object properties; of these, 1 is an InverseFunctionalProperty and 2 are FunctionalProperties. The sandbox scale is 10k instances. The mainbox scale is 100k instances.
What we expect from participants.
Participants are requested to match instances in the source dataset (Abox1.ttl) against instances in the target dataset (Abox2.ttl). The goal is to produce a set of mappings between pairs of instances that are found to refer to the same real-world entity. An instance in the source dataset can have zero or one matching counterpart in the target dataset. We ask the participants to map ***ONLY*** Creative Works (NewsItem, BlogPost, and Programme) and not the instances of the other classes.
In the sandbox test, the reference alignment containing the set of expected mappings is also provided (i.e., refalign.rdf).
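When value-based changes (typos, abbreviations, reformatted dates) are layered on top of structural and semantic ones, exact value comparison becomes brittle; approximate matching of literal values, as sketched below with the standard library, is one way to cope (the 0.8 threshold is an arbitrary assumption):

```python
from difflib import SequenceMatcher

def value_similarity(a: str, b: str) -> float:
    """Normalized edit-based similarity in [0, 1] between two literal values."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def soft_overlap(bag_a, bag_b, threshold=0.8):
    """Fraction of values in bag_a that have an approximate counterpart in bag_b."""
    if not bag_a:
        return 0.0
    hits = sum(1 for a in bag_a
               if any(value_similarity(a, b) >= threshold for b in bag_b))
    return hits / len(bag_a)
```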
Evaluation strategy. The mappings produced by the participants will be compared against a reference alignment in which an instance i in the source dataset is associated with all the instances in the target dataset that represent an altered description of i. Evaluation will be performed through precision, recall, and F-measure.
Access to datasets. Datasets can be downloaded as a zip file:
Dataset identifiers for the SEALS OMT client:
Submission procedure. The task evaluation will be executed with the support of SEALS.
Participants are requested to adapt their matching tool so that it can be invoked on the SEALS platform (see OAEI 2015 evaluation details).
Participants are also requested to provide the results in TSV format, one mapping per line:
`uri_of_source_instance\turi_of_target_instance\tsimilarity_value\n`
and to send a text file containing these mappings to the contact person.
Contact person. For any question, please contact Tzanina Saveta (jsaveta 'at' ics 'dot' forth 'dot' gr) and Irini Fundulaki (fundul 'at' ics 'dot' forth 'dot' gr).