Rules when participating to an evaluation
Until 2011, OAEI participation rules were usually defined within the individual
Some of these rules were obsoleted by the use of the SEALS platform,
but new problems appeared.
So we recall these rules and to clarify them here since
they seem to be difficult to understand.
The goal of OAEI is to forster the advancement of ontology matching
technology. So, it aims at evaluating ontology matchers, not "OAEI
ontology" matchers. This means that any effort made to look good at
OAEI, as opposed to, improve a matcher, is obviously not in the spirit
The basic rules are as follows:
- Participants should use one system over all tracks,
- with one single set of parameters,
- with whichever resources that are not specifically designed to be
adapted to the OAEI data sets.
The interpretations of such rules may not be fully clear. For
instance, a matching system may automatically assign
different values to its parameters depending on the input
characteristics of the matching task. This may seem to be adapted to
OAEI in some cases, especially if the tool design have used OAEI data
sets as guidelines.
One way to be less at odds with OAEI organisation, is to make
any adaptative behaviour explicit in the system desription submitted
to the OAEI organizers (see examples below).
Similarly, if a system re-uses another matching systems or some of
its core libraries, this has to be made explicit in the system
description (this is basically, attribution commonsense). In such a
case it is required that the authors clarify exactly in how far the
included libraries have been extended and what is the expected impact
on the results.
In case they want to clarify compliance, organisers reserve the right
to ask relevant questions to participants and, if answers are not
satisfactory, to ask them for their source code. Participant may
decide or not to disclose their code; organisers reserve the right to
publish results or not.
How to interpret OAEI rules
The rules above entail that, unless explicitely stated the contrary:
- General purpose resources are authorised; OAEI taylored resources
- The program of the matcher should not be tweaked for passing (or
even recognising) the tests;
- Using network connections for accessing a general purpose resource
(Wikipedia) is authorised; accessing a specific resources made for
OAEI is prohibited;
- Using published OAEI tests (with or without reference alignments)
for learning should be considered as prohibited;
- The program should not be used for storing data about the OAEI
test set or to rely on users;
- It is authorised for a system to behave differently depending on
the input which is provided as long as it is not to recognise OAEI
It is not always easy to see whether a certain aspect of a matching
system implementation is valid against the rules.
To enable a better understanding, we present a list of typical examples together with short explanations.
Examples breaking the rules
- Machine learning techniques are used on subsets of (or complete) OAEI datasets to find an optimal weighting of different similarity measures.
Why incorrect: OAEI datasets cover different types of matching tasks, however, this choice is not representative. Learning with these examples results in an inacceptable overfitting. In machine learning there has to be a clear distinction between test data and training data.
- Several predefined configurations that are activated by detecting certain namespaces, e.g. it is checked for the occurence of a specific string like "benchmark".
Why incorrect: No need for an explanation.
- A system is trained on OAEI datasets to automatically choose from a set of predefined settings.
Why incorrect: Such a strategy does not take into account that this approach can result in learning something that works very well for OAEI datasets, but the results of the learning process might nevertheless be completely unreasonable. Moreover, see the general remark on machine learning from above.
Examples not breaking the rules
- A specific large-scale ontology matching strategy is automatically activated, if the ontologies to be matched have more than 1000 concepts.
Why correct: This rule of thumb is intended to work well for both OAEI testcases and other ontology matching problems. It is in general, a reasonable distinction that is guided by detecting a characteristic of the matching problem, which is relevant for solving it.
- The similarity of the ontologies to be matched is measured prior to the core matching process. The higher the similarity is, the more weight you put to smilarity flooding.
Why correct: See remark above.
- Labels are analzed in a preprocessing step. If biomedical terms are detected, activate UMLS is activated as background knowledge.
- Why correct: Again, the same argument as above holds. Given a non OAEI matching problem, it can be argued that the conditional usage of UMLS will have no negative effects on matching non-biomedical ontologies and probably a positive effect on matching biomedical ontologies.
This listing is for sure not complete. Please contact OAEI organizers if you are thinking about an issue that is not clarified by this listing.
In case of doubt, it may be possible to display, and publish, the
conditions under which a matcher has different settings.
Here are two different decision tree drawings showing obviously incorrect and
$Id: oaei-rules.2.html,v 1.4 2012/12/12 17:41:50 euzenat Exp $