OAEI 2016::Interactive Track

NEWS:

2016 results available now!

Description

The growth of the ontology alignment area over the past ten years has led to the development of many ontology alignment tools. After several years of experience in the OAEI, we have observed that the results can only be slightly improved in terms of alignment quality (precision, recall and F-measure). This suggests that fully automatic ontology matching approaches are slowly reaching an upper bound on the alignment quality they can achieve. Work by (Jimenez-Ruiz et al., 2012) has shown that simulated user interaction with a 30% error rate during the alignment process led to the same results as non-interactive matching. Thus, in addition to the validation of automatically generated alignments by domain experts, we believe that there is further room for improving the quality of the generated alignments by incorporating user interaction. User involvement during the matching process has been identified as one of the challenges facing the ontology alignment community (Shvaiko et al., 2013), and user interaction with a system is an integral part of it.

At the same time, as ontologies grow in size, so does the alignment problem. It is not feasible for a user to, for instance, validate all candidate mappings generated by a system, so tool developers should aim at reducing unnecessary user interventions. All of the effort required of the human has to be taken into account, and it has to be in appropriate proportion to the result. Thus, besides the quality of the alignment, other measures such as the number of interactions are meaningful when deciding which matching system is best suited to a given matching task. So far, the other OAEI tracks focus on fully automatic matching, and semi-automatic matching is not evaluated although such systems already exist (see the overview in (Ivanova et al., 2015)). As long as the evaluation of such systems is not driven forward, it is hardly possible to systematically compare the quality of interactive matching approaches.

Goal

The OAEI's interactive track aims at offering a systematic and automated evaluation of matching systems with user interaction to compare the quality of interactive matching approaches in terms of F-measure and number of required interactions. To this end we rely on the datasets of the OAEI 2016 Conference, Anatomy, Largebio, and Phenotype tracks. We have used the reference alignments of each track as oracles in order to simulate the interaction with a domain expert (see (Dragisic et al., 2016) and (Paulheim et al., 2013)).
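As an illustration of the two measures named above, the following minimal sketch computes precision, recall and F-measure for a system alignment against a reference alignment. It assumes, purely for illustration, that an alignment is represented as a set of (source entity, target entity) pairs; the function name and the toy data are hypothetical and are not part of the SEALS client.

```python
def precision_recall_f_measure(system, reference):
    """Compare two alignments given as collections of (source, target) pairs."""
    system, reference = set(system), set(reference)
    true_positives = len(system & reference)
    # Guard against empty alignments to avoid division by zero.
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy alignments (hypothetical entity names).
reference_alignment = {("a", "x"), ("b", "y"), ("d", "w"), ("e", "v")}
system_alignment = {("a", "x"), ("b", "y"), ("c", "z")}

p, r, f = precision_recall_f_measure(system_alignment, reference_alignment)
print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

In an interactive setting, the same computation would be reported together with the number of questions the system put to the oracle.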

In this track we currently focus on one of the challenges regarding user interaction. The goal of this track is to show in general that exploiting user interaction allows further improvement of the results of ontology matching systems in terms of F-measure. We would also like to see which semi-automatic methods exist, which ones perform best, and which ones need the smallest number of interactions, i.e., make the best use of the scarce resource of users' time. Besides the number of user interactions, the type of interaction and the point at which the user is involved are also of interest. Do matching systems involve the user before or during the matching process? Do they ask the user only to verify single correspondences or complete alignments? Altogether, we aim to promote the development of semi-automatic ontology matching systems and methods to overcome the limitations of fully automatic techniques. Furthermore, the track will encourage a discussion of different interactive matching techniques as well as a set of relevant interaction primitives. Currently, this track does not evaluate the user experience or the user interfaces of the systems.

Evaluation

The evaluation of this track will also be run with the support of SEALS. This requires that you wrap your matching system in a way that allows us to execute it on the SEALS platform (see OAEI 2016 evaluation details). Note that in this track we allow systems to interact with the SEALS client to check whether a given alignment is correct or not (see the oracle tutorial, which describes the additional methods for the interactive track). To check whether the matching task allows interactivity, you can call the method Oracle.isInteractive(). This method returns a boolean value: true if the task belongs to an interactive track and false otherwise.

We will also simulate domain experts with a variable error rate (see (Dragisic et al., 2016)), which reflects a more realistic scenario in which a (simulated) user does not always provide a correct answer. In these scenarios, asking the user a large number of questions may also have a negative impact. The error rates that will be simulated, on a scale from 0 to 1, are 0.1, 0.2 and 0.3. The error rate can be set as an input parameter when running the SEALS client; please refer to the OAEI 2016 tutorial.
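To make the idea of an erroneous oracle concrete, here is a minimal sketch of a simulated domain expert that answers from a reference alignment and flips its answer with a given probability. It is an assumption-laden illustration, not the SEALS implementation: the class name SimulatedOracle, the check method, the representation of correspondences as (source, target) pairs and the independent random draw per question are all hypothetical choices made for this example.

```python
import random


class SimulatedOracle:
    """Hypothetical stand-in for a domain expert backed by a reference alignment."""

    def __init__(self, reference, error_rate=0.0, seed=None):
        if not 0.0 <= error_rate <= 1.0:
            raise ValueError("error_rate must lie between 0 and 1")
        self.reference = set(reference)
        self.error_rate = error_rate
        self.rng = random.Random(seed)  # seeded for repeatable experiments
        self.interactions = 0           # number of questions asked so far

    def check(self, source, target):
        """Return the (possibly wrong) answer to 'is this correspondence correct?'."""
        self.interactions += 1
        truth = (source, target) in self.reference
        # With probability error_rate the oracle gives the opposite answer.
        if self.rng.random() < self.error_rate:
            return not truth
        return truth


# Toy usage with one of the error rates used in the track.
oracle = SimulatedOracle({("a", "x"), ("b", "y")}, error_rate=0.1, seed=42)
candidates = [("a", "x"), ("b", "z"), ("b", "y")]
accepted = [c for c in candidates if oracle.check(*c)]
print(len(accepted), "accepted after", oracle.interactions, "interactions")
```

Because every question counts as an interaction and each answer may be wrong, a system that asks many questions under a non-zero error rate can end up accepting more incorrect correspondences, which is the negative impact mentioned above.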

Data sets

The interactive track relies on the datasets of the OAEI 2016 Conference, Anatomy, Largebio, and Phenotype tracks. Please refer to the identifiers of each of the tracks and use the -i parameter as described in the OAEI 2016 tutorial. Note that the latest version of the SEALS OMT client is required.

The Pistoia Alliance will sponsor a prize of $7,500 for the winner of the Phenotype track as an incentive for participation.

Relevant references

Zlatan Dragisic, Valentina Ivanova, Patrick Lambrix, Daniel Faria, Ernesto Jimenez-Ruiz and Catia Pesquita. "User validation in ontology alignment". ISWC 2016. [paper] [technical report]

Heiko Paulheim, Sven Hertling, Dominique Ritze. "Towards Evaluating Interactive Ontology Matching Tools". ESWC 2013. [pdf]

Ernesto Jimenez-Ruiz, Bernardo Cuenca Grau, Yujiao Zhou, Ian Horrocks. "Large-scale Interactive Ontology Matching: Algorithms and Implementation". ECAI 2012. [pdf]

Valentina Ivanova, Patrick Lambrix, Johan Åberg. "Requirements for and evaluation of user support for large-scale ontology alignment". ESWC 2015. [publisher page]

Pavel Shvaiko, Jérôme Euzenat. "Ontology matching: state of the art and future challenges". IEEE Transactions on Knowledge and Data Engineering, 2013. [publisher page]

Contact

This track is currently organized by Zlatan Dragisic, Daniel Faria, Valentina Ivanova, Ernesto Jimenez-Ruiz, Patrick Lambrix, and Catia Pesquita. If you have any problems working with the ontologies or any suggestions related to this track, feel free to write an email to ernesto [at] cs [.] ox [.] ac [.] uk or ernesto [.] jimenez [.] ruiz [at] gmail [.] com.

Acknowledgements

We thank Dominique Ritze and Heiko Paulheim, the organisers of the 2013 and 2014 editions of this track.

The track is partially supported by the Optique project.