Consensus workshop: results
The final results were presented at the Ontology Matching workshop 2006 during the so-called 'Consensus building workshop'.
During the presentation, 'controversial' mappings were discussed. The outcome of this argumentation process is now being processed.
The presentation from the 'Consensus building workshop' is available here.
Evaluation
Given the nature of this track, we are mainly interested in
"interesting" mappings ("nuggets"). Although traditional evaluation was not
our intention, we carried out some evaluation as a side-effect of processing the
results from our six participants.
All the statistics, as well as precision and recall, have been provisionally computed by the track organisers and can
therefore be subjective; the focus of the track is on interesting individual
alignments and repeated patterns rather than on precision/recall figures.
The table above contains several numerical statistics for the results of the six participants, named after
their systems (Automs, Coma++, OWL-CtxMatch, Falcon, HMatch and RiMOM). The final row gives the overall statistics.
The columns are explained below:
- measure shows whether a mapping is strictly true/false or is scaled between 0 and 1,
- # alignments shows the number of alignments per system,
- # mappings shows the number of all mappings included in the assessment,
- # correct shows the number of correct mappings, i.e. the number of true positive mappings,
- # uniq correct shows the number of unique correct mappings,
i.e. o1:entity1 o2:entity2 is counted once if at least one system includes it in its result.
Moreover, o1:entity1 o2:entity2 and o2:entity2 o1:entity1 are treated as the same mapping (see the sketch after this list),
- # interesting shows the number of "interesting" correct mappings, i.e. correspondences which are not easy to find at first sight (e.g. because a string-based approach is not enough),
- # incorrect shows the number of incorrect mappings, i.e. false positive mappings,
- # subsumptions shows the number of mappings wrongly classified as equivalence, whereas the correct
relation is subsumption, in either direction,
- # siblings shows the number of mappings wrongly classified as equivalence, whereas the mapped elements are rather siblings,
- # inversion shows the number of mappings wrongly classified as equivalence, whereas the mapped relations are rather inverse of each other,
- # relClass shows the number of mappings between a class and a relation, or vice versa,
- # int FP shows the number of "interesting" incorrect mappings, i.e. "interesting" false positives.
These are incorrect correspondences which do not belong to any of the groups above (e.g. subsumptions, siblings, etc.),
- # unclear shows the number of unclear mappings, where the evaluator was not able to decide whether the mapping is correct or not.
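The counting of unique correct mappings can be illustrated by a minimal sketch (hypothetical Python with made-up entity names; this is not the actual evaluation script):

    # Count unique correct mappings across all systems: each mapping is counted
    # once, and o1:entity1 o2:entity2 equals o2:entity2 o1:entity1.
    def normalize(mapping):
        entity1, entity2 = mapping
        return tuple(sorted((entity1, entity2)))

    def unique_correct(correct_per_system):
        # correct_per_system: dict mapping system name -> list of (entity1, entity2) pairs
        return {normalize(m) for pairs in correct_per_system.values() for m in pairs}

    # Illustrative input only:
    correct_per_system = {
        "Falcon": [("o1:Paper", "o2:Contribution")],
        "RiMOM":  [("o2:Contribution", "o1:Paper"), ("o1:Author", "o2:Writer")],
    }
    print(len(unique_correct(correct_per_system)))  # 2 unique correct mappings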
The following columns deal with precision and recall:
- precision (P) is computed as the ratio of the number of correct mappings to the number of all mappings,
- rrecall (rR) is computed as the ratio of the number of unique correct mappings found by a given system to the number of all unique correct mappings found by any of the systems.
This is our "relative" recall (see the sketch below).
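Under these definitions, the two measures can be sketched as follows (illustrative Python; the numbers are examples, not values from the results table):

    # precision: correct mappings / all mappings produced by one system
    def precision(n_correct, n_mappings):
        return n_correct / n_mappings

    # relative recall: unique correct mappings of one system /
    # unique correct mappings found by any of the participating systems
    def relative_recall(n_uniq_correct_system, n_uniq_correct_all):
        return n_uniq_correct_system / n_uniq_correct_all

    print(precision(45, 60))         # 0.75 (example values)
    print(relative_recall(40, 120))  # 0.33... (example values)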
Additional comments:
- in the case of the HMatch system, 4529 relations with a similarity measure of 0.64 were not included in the assessment in order to keep the results manually processable,
- in the case of the RiMOM system, mappings with a measure higher than 0.199 were included in the assessment; the number of mappings with a similarity measure of 0.199 or lower is 2477 (see the sketch below),
- the right-hand corner of the table states the number of all unique correct mappings, which is used for our relative (approximate) assessment of recall.
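The threshold filtering mentioned in the comments above amounts to a simple cut-off on the similarity measure (a sketch under an assumed data layout, not the actual processing code):

    # Keep only mappings whose similarity measure exceeds a threshold,
    # e.g. 0.199 as used for the RiMOM results.
    def filter_by_threshold(mappings, threshold=0.199):
        # mappings: list of (entity1, entity2, similarity) triples
        return [m for m in mappings if m[2] > threshold]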
http://oaei.ontologymatching.org/2006/results/conference/