Ontology Alignment Evaluation Initiative - OAEI-2012 Campaign

MultiFarm Results for OAEI 2012

On this page you can find the results of the OAEI 2012 campaign for the MultiFarm track. Details on the dataset can be found on the MultiFarm dataset page. If you notice any kind of error (wrong numbers, incorrect information about a matching system, etc.), do not hesitate to contact us (see the contact information at the end of this page).

Generated alignments

You can download the complete set of all generated alignments. These alignments have been generated by executing the tools with the help of the SEALS infrastructure. They are based on one run. All results presented in the following are based on analyzing these alignments.
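The generated alignments follow the Alignment format (RDF/XML) used within the SEALS infrastructure. As a rough illustration only, the following Python sketch extracts the correspondences of one alignment file; it matches on local element names (Cell, entity1, entity2) rather than hard-coding the exact namespace URI, and the file name is a placeholder:

    # Minimal sketch: extract the correspondences (entity pairs) from one
    # alignment file in the Alignment format (RDF/XML).
    import xml.etree.ElementTree as ET

    RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

    def read_alignment(path):
        """Return the set of (entity1, entity2) URI pairs of one alignment file."""
        pairs = set()
        for cell in ET.parse(path).iter():
            if not cell.tag.endswith("Cell"):
                continue
            entities = {}
            for child in cell:
                name = child.tag.rsplit("}", 1)[-1]   # strip the namespace
                if name in ("entity1", "entity2"):
                    entities[name] = child.get(RDF_RESOURCE)
            if "entity1" in entities and "entity2" in entities:
                pairs.add((entities["entity1"], entities["entity2"]))
        return pairs

    # Example (placeholder file name):
    # print(len(read_alignment("some-matcher-cmt-de-confOf-en.rdf")))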

Experimental setting

In the 2012 evaluation campaign, we have used a subset of the whole MultiFarm dataset, omitting all pairs of matching tasks involving the ontologies edas and ekaw (resulting in 36 x 25 matching tasks). This allows the omitted test cases to be used as blind evaluation tests in the future. In contrast to OAEI 2011.5, we have included the Chinese and Russian translations.
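For illustration, the 36 x 25 figure can be reproduced as follows (a sketch; the language codes and the five retained ontology names are assumptions based on the dataset description):

    # Sketch: counting the matching tasks used in 2012. Nine languages give
    # 36 unordered language pairs; with edas and ekaw omitted, five ontologies
    # remain, giving 5 x 5 = 25 ontology combinations per language pair.
    from itertools import combinations

    languages = ["cn", "cz", "de", "en", "es", "fr", "nl", "pt", "ru"]
    ontologies = ["cmt", "conference", "confOf", "iasted", "sigkdd"]

    language_pairs = list(combinations(languages, 2))       # 36 pairs
    tasks = [(o1, l1, o2, l2)
             for (l1, l2) in language_pairs
             for o1 in ontologies
             for o2 in ontologies]                           # 36 * 25 = 900 tasks

    print(len(language_pairs), len(tasks))                   # 36 900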

Within the MultiFarm dataset, we can distinguish two types of matching tasks: (i) those test cases where two different ontologies have been translated into different languages (cmt--confOf, for instance); and (ii) those test cases where the same ontology has been translated into different languages (cmt--cmt, for instance). For the test cases of type (ii), good results are not directly related to the use of specific techniques for dealing with ontologies in different natural languages, but rather to the ability to exploit the fact that both ontologies have an identical structure (and that the reference alignment covers all entities described in the ontologies). It can be supposed that these test cases are dominated by specific techniques designed for matching different versions of the same ontology.
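In terms of the dataset, the distinction only depends on whether the two translated ontologies stem from the same original ontology; a trivial sketch (ontology names as in the previous sketch):

    # Sketch: per language pair, split the 25 ontology combinations into
    # type (i) (different ontologies) and type (ii) (same ontology translated).
    ontologies = ["cmt", "conference", "confOf", "iasted", "sigkdd"]

    type_i  = [(o1, o2) for o1 in ontologies for o2 in ontologies if o1 != o2]  # 20 tasks
    type_ii = [(o, o) for o in ontologies]                                      #  5 tasks

    print(len(type_i), len(type_ii))   # 20 5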

This year, seven participating systems (out of 21) use specific multilingual methods: ASE, AUTOMSv2, GOMMA, MEDLEY, WeSeE, Wmatch, and YAM++. The other systems are not specifically designed to match ontologies in different languages, nor do they make use of a component that could be used for that purpose. Please refer to the OAEI 2012 paper (MultiFarm section) for further details on each tool.

Overall results

Before discussing the results per pair of languages, we present the aggregated results for the test cases within each type of matching task -- types (i) and (ii) -- in the table below. These results are based on a single run (a sketch of how such figures can be aggregated is given after the table). The systems not listed in this table have generated empty alignments for all test cases, have thrown exceptions (as in the case of ASE), or have not been evaluated due to their execution requirements. AROMA was not able to generate alignments for test cases of type (i).

MultiFarm aggregated results per matcher, for each type of matching task -- types (i) and (ii). Runtime is measured in minutes (time for completing the 36x25 matching tasks).

                                            Different ontologies (i)           Same ontologies (ii)
    System                        Runtime   Precision  F-measure  Recall       Precision  F-measure  Recall
    Specific multilingual matchers
      AUTOMSv2                      512.7   .49        .36        .10          .69        .24        .06
      GOMMA                          35.0   .29        .31        .36          .63        .38        .29
      MEDLEY                         76.5   .16        .16        .07          .34        .18        .09
      WeSeE                          14.7   .61        .41        .32          .90        .41        .27
      Wmatch                       1072.0   .22        .21        .22          .43        .17        .11
      YAM++                         367.1   .50        .40        .36          .91        .60        .49
    Non-specific matchers
      AROMA                           6.9   --         --         --           .31        .01        .01
      CODI                              x   .17        .08        .02          .82        .62        .50
      Hertuda                        23.5   .00        .01        1.00         .02        .03        1.00
      HotMatch                       16.5   .00        .01        .00          .40        .04        .02
      LogMap                         14.9   .17        .09        .02          .35        .03        .01
      LogMapLt                        5.5   .12        .07        .02          .30        .03        .01
      MaasMatch                     125.0   .02        .03        .14          .14        .14        .14
      MapSSS                         17.3   .08        .09        .04          .97        .66        .50
      Optima                        142.5   .00        .01        .59          .02        .03        .41
    (--: no alignments generated for this type of task)
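The aggregated figures above are computed from the individual alignments by comparing each generated alignment with its reference alignment. A minimal sketch of such an aggregation (micro-averaged over all test cases of one type; the exact aggregation used by the official evaluation may differ in detail) is:

    # Sketch: micro-averaged precision, recall and F-measure over a set of
    # test cases, given the generated and reference correspondence sets.
    def aggregate(results):
        """results: iterable of (generated_pairs, reference_pairs) per test case."""
        found = correct = expected = 0
        for generated, reference in results:
            found += len(generated)
            correct += len(generated & reference)
            expected += len(reference)
        precision = correct / found if found else float("nan")
        recall = correct / expected if expected else float("nan")
        f_measure = (2 * precision * recall / (precision + recall)
                     if found and expected and (precision + recall) > 0
                     else float("nan"))
        return precision, f_measure, recall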

We can observe significant differences between the results obtained for each type of matching task, especially in terms of precision. While the systems that implement specific multilingual techniques clearly generate the best results for test cases of type (i), only one of these systems (YAM++) is among the top three F-measures for type (ii) test cases. For these test cases, the leading systems are MapSSS and CODI, respectively, which implement good strategies for dealing with ontologies that share structural similarities.

As observed in the 2011.5 campaign and corroborated in 2012, MapSSS and CODI have generated very good results on the Benchmark track. This may suggest a strong correlation between the ranking in Benchmark and the ranking for MultiFarm test cases of type (ii), while there is, on the other hand, no (or only a very weak) correlation between the results for test cases of type (i) and those of type (ii). For that reason, we analyze in the following only the results for test cases of type (i).
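Such a correlation claim can be checked, for instance, with a rank correlation coefficient over the two system rankings. The sketch below uses Spearman's coefficient with placeholder rankings; it does not reproduce the actual Benchmark or MultiFarm ranks:

    # Sketch: Spearman rank correlation between two rankings of the same systems
    # (no-ties formula). The example dictionaries are placeholders only.
    def spearman(rank_a, rank_b):
        """rank_a, rank_b: dicts mapping system name -> rank (1 = best)."""
        systems = sorted(rank_a)
        n = len(systems)
        d_squared = sum((rank_a[s] - rank_b[s]) ** 2 for s in systems)
        return 1 - 6 * d_squared / (n * (n ** 2 - 1))

    benchmark_rank    = {"MapSSS": 1, "CODI": 2, "YAM++": 3, "LogMap": 4}  # placeholder
    multifarm_ii_rank = {"MapSSS": 1, "CODI": 2, "YAM++": 3, "LogMap": 4}  # placeholder
    print(spearman(benchmark_rank, multifarm_ii_rank))   # 1.0 for identical rankings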

Language specific results

As expected and already reported above, the systems that apply specific strategies to deal with multilingual labels outperform all other systems (in terms of overall F-measure for both types of test cases): YAM++, followed by WeSeE, GOMMA, AUTOMSv2, Wmatch and MEDLEY, respectively. Wmatch is able to deal with all pairs of languages, which is not the case for AUTOMSv2 and MEDLEY, especially for the pairs involving the Chinese, Czech and Russian languages.

Most of the systems that translate non-English ontology labels into English have better scores on pairs where English is present (by group of pairs, YAM++ is the typical case). This is due to the fact that multiple translations (pt->en and fr->en for matching pt-fr, for instance) may result in ambiguous translated concepts, which makes it harder to find correct correspondences. Furthermore, as somewhat expected, good results are also obtained for pairs of languages that have at least a minimal overlap in their vocabularies (es-pt and fr-pt, for instance). These two observations may explain the top F-measures of the specific multilingual methods: AUTOMSv2 (es-pt, en-es, de-nl, en-nl), GOMMA (en-pt, es-pt, cn-en, de-en), MEDLEY (en-fr, en-pt, cz-en, en-es), WeSeE (en-es, es-fr, en-pt, es-pt, fr-pt), YAM++ (cz-en, cz-pt, en-pt). Wmatch shows an interesting set of pair scores, where Russian appears in the top F-measures: nl-ru, en-es, en-nl, fr-ru, es-ru. This may be explained by the use of Wikipedia multilingual inter-links, which are not limited to English, or by language similarities.
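A simplified view of the translate-to-English strategy discussed above is sketched below. The tiny dictionaries stand in for whatever translation service or resource a matcher actually uses; they are hypothetical and not part of any of the listed tools:

    # Sketch of the indirect-translation issue: for a pt-fr task, both label
    # sets are first translated to English and only then compared, so
    # translation ambiguity can accumulate on both sides.
    TOY_DICT = {
        ("pt", "artigo"): "paper",     # could also be translated as "article"
        ("fr", "article"): "paper",    # could also be "article" or "item"
        ("pt", "autor"): "author",
        ("fr", "auteur"): "author",
    }

    def translate(label, source_lang):
        """Hypothetical stand-in for a translation service or dictionary lookup."""
        return TOY_DICT.get((source_lang, label.lower()), label)

    def match_via_english(labels_pt, labels_fr):
        """Pair pt and fr labels whose English translations are identical."""
        return [(lp, lf)
                for lp in labels_pt
                for lf in labels_fr
                if translate(lp, "pt") == translate(lf, "fr")]

    print(match_via_english(["Artigo", "Autor"], ["Article", "Auteur"]))
    # [('Artigo', 'Article'), ('Autor', 'Auteur')]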

Regarding the non-specific systems, while none of them can deal at all with the Chinese and Russian languages, MapSSS, LogMap and CODI are ahead of the other non-specific systems. These systems perform better for some specific pairs: MapSSS (es-pt, en-es, de-en), LogMap and LogMapLt (es-pt, de-en, en-es, cz-pt), CODI (es-pt, de-en, cz-pt). For all of these systems, the pairs es-pt and de-en are among their best F-measures. Again, we can see that similarities in the language vocabularies play an important role in the matching task. On the other hand, although it is likely harder to find correspondences for cz-pt than for es-pt, for some systems the best scores include such combinations (cz-pt, for CODI and LogMapLt). This can be explained by the specific way in which systems combine their internal matching techniques (ontology structure, reasoning, coherence, linguistic similarities, etc.).

MultiFarm results per pair of languages. For each matcher, the three columns give precision (P), F-measure (F) and recall (R).

algo AUTOMSv2 CODI GOMMA Hertuda HotMatch LogMap LogMapLt MaasMatch MapSSS MEDLEY Optima WeSeE Wmatch YAM++
test P F R P F R P F R P F R P F R P F R P F R P F R P F R P F R P F R P F R P F R P F R
cn-cz NaN NaN 0.00 0.00 NaN 0.00 0.38 0.34 0.31 0.00 0.01 1.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 0.02 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.50 0.29 0.21 0.35 0.21 0.15 0.40 0.35 0.31
cn-de NaN NaN 0.00 0.00 NaN 0.00 0.42 0.34 0.29 0.00 0.01 1.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 0.02 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.70 0.25 0.15 0.13 0.08 0.06 0.40 0.35 0.30
cn-en NaN NaN 0.00 0.00 NaN 0.00 0.59 0.41 0.32 0.00 0.01 1.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.00 0.01 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.56 0.31 0.22 0.15 0.10 0.07 0.48 0.41 0.36
cn-es NaN NaN 0.00 0.00 NaN 0.00 0.34 0.33 0.31 0.00 0.01 1.00 0.01 0.01 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.00 0.02 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.53 0.34 0.25 0.19 0.14 0.11 0.41 0.20 0.13
cn-fr NaN NaN 0.00 0.00 NaN 0.00 0.38 0.35 0.33 0.00 0.01 1.00 0.03 0.01 0.01 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.00 0.01 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.52 0.32 0.23 0.14 0.09 0.07 0.44 0.39 0.35
cn-nl NaN NaN 0.00 0.00 NaN 0.00 0.23 0.25 0.27 0.00 0.01 1.00 0.01 0.01 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.00 0.01 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.42 0.23 0.16 0.17 0.10 0.07 0.38 0.34 0.30
cn-pt NaN NaN 0.00 0.00 NaN 0.00 0.37 0.31 0.27 0.00 0.01 1.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.01 0.01 0.03 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.53 0.30 0.21 0.14 0.09 0.07 0.40 0.35 0.31
cn-ru NaN NaN 0.00 0.00 NaN 0.00 0.29 0.27 0.25 0.00 0.01 1.00 0.01 0.01 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.00 0.01 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.45 0.27 0.20 0.22 0.13 0.09 0.41 0.33 0.27
cz-de NaN NaN 0.00 0.39 0.10 0.06 0.17 0.24 0.41 0.00 0.01 1.00 0.00 NaN 0.00 0.39 0.10 0.06 0.30 0.09 0.05 0.04 0.06 0.22 0.12 0.07 0.05 0.43 0.19 0.12 0.00 0.01 1.00 0.82 0.41 0.27 0.27 0.24 0.22 0.50 0.45 0.41
cz-en NaN NaN 0.00 0.31 0.07 0.04 0.36 0.36 0.37 0.00 0.01 1.00 0.00 NaN 0.00 0.25 0.05 0.03 0.18 0.04 0.02 0.04 0.06 0.23 0.12 0.08 0.06 0.32 0.28 0.26 0.00 0.01 1.00 0.68 0.48 0.37 0.28 0.24 0.22 0.58 0.58 0.57
cz-es NaN NaN 0.00 0.44 0.11 0.07 0.22 0.30 0.48 0.00 0.01 1.00 0.00 NaN 0.00 0.44 0.11 0.07 0.36 0.11 0.07 0.03 0.06 0.21 0.17 0.11 0.08 0.33 0.13 0.08 0.00 0.01 1.00 0.56 0.47 0.41 0.26 0.25 0.24 0.56 0.20 0.12
cz-fr NaN NaN 0.00 0.09 0.01 0.01 0.10 0.16 0.39 0.00 0.01 1.00 0.03 0.01 0.01 0.09 0.01 0.01 0.07 0.01 0.01 0.02 0.04 0.14 0.02 0.01 0.01 0.20 0.08 0.05 0.00 0.01 1.00 0.61 0.47 0.38 0.25 0.20 0.17 0.54 0.53 0.52
cz-nl NaN NaN 0.00 0.38 0.09 0.05 0.14 0.21 0.48 0.00 0.01 1.00 0.00 NaN 0.00 0.19 0.04 0.02 0.15 0.04 0.02 0.04 0.07 0.24 0.09 0.05 0.04 0.21 0.09 0.06 0.00 0.01 1.00 0.65 0.48 0.39 0.24 0.21 0.18 0.56 0.55 0.53
cz-pt NaN NaN 0.00 0.52 0.15 0.09 0.31 0.37 0.45 0.00 0.01 1.00 0.00 NaN 0.00 0.47 0.13 0.07 0.40 0.13 0.08 0.03 0.06 0.22 0.19 0.12 0.09 0.41 0.18 0.11 0.00 0.01 1.00 0.60 0.44 0.35 0.12 0.12 0.13 0.57 0.57 0.57
cz-ru NaN NaN 0.00 0.00 NaN 0.00 0.17 0.21 0.27 0.00 0.01 1.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.00 0.01 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.61 0.47 0.38 0.18 0.17 0.16 0.52 0.49 0.47
de-en 0.81 0.38 0.25 0.60 0.20 0.12 0.43 0.41 0.39 0.00 0.01 1.00 0.00 NaN 0.00 0.60 0.22 0.13 0.52 0.20 0.12 0.04 0.06 0.23 0.22 0.16 0.13 0.34 0.27 0.22 0.00 0.01 0.76 0.69 0.39 0.28 0.26 0.28 0.30 0.56 0.52 0.48
de-es 0.72 0.35 0.24 0.29 0.06 0.03 0.30 0.35 0.43 0.00 0.01 1.00 0.00 NaN 0.00 0.46 0.12 0.07 0.22 0.06 0.03 0.04 0.06 0.23 0.22 0.15 0.12 0.32 0.15 0.10 0.00 0.01 1.00 0.65 0.41 0.30 0.21 0.24 0.27 0.56 0.20 0.12
de-fr 0.91 0.32 0.20 0.20 0.04 0.02 0.15 0.21 0.37 0.00 0.01 1.00 0.00 NaN 0.00 0.20 0.04 0.02 0.15 0.04 0.02 0.03 0.05 0.18 0.19 0.13 0.10 0.31 0.13 0.09 0.00 0.01 1.00 0.81 0.41 0.27 0.24 0.25 0.26 0.50 0.46 0.43
de-nl 0.81 0.39 0.26 0.23 0.05 0.03 0.17 0.24 0.43 0.00 0.01 1.00 0.00 NaN 0.00 0.21 0.04 0.02 0.17 0.04 0.02 0.04 0.06 0.23 0.21 0.15 0.12 0.24 0.12 0.08 0.00 0.01 1.00 0.66 0.37 0.26 0.27 0.25 0.24 0.44 0.40 0.36
de-pt 0.83 0.35 0.22 0.35 0.08 0.04 0.34 0.36 0.39 0.00 0.01 1.00 0.00 NaN 0.00 0.33 0.07 0.04 0.26 0.07 0.04 0.03 0.05 0.18 0.10 0.06 0.04 0.33 0.15 0.10 0.00 0.01 1.00 0.65 0.35 0.24 0.20 0.22 0.24 0.45 0.42 0.39
de-ru NaN NaN 0.00 0.00 NaN 0.00 0.35 0.33 0.31 0.00 0.01 1.00 0.01 0.01 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.01 0.01 0.04 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.73 0.42 0.29 0.27 0.26 0.26 0.53 0.47 0.43
en-es 0.60 0.42 0.33 0.23 0.04 0.02 0.44 0.40 0.37 0.00 0.01 1.00 0.00 NaN 0.00 0.50 0.15 0.09 0.18 0.04 0.02 0.05 0.08 0.28 0.24 0.18 0.15 0.28 0.28 0.27 0.00 0.01 1.00 0.65 0.52 0.44 0.24 0.30 0.40 0.58 0.23 0.15
en-fr 0.56 0.31 0.22 0.21 0.04 0.02 0.33 0.36 0.41 0.00 0.01 1.00 0.01 0.01 0.00 0.28 0.06 0.04 0.14 0.04 0.02 0.05 0.09 0.33 0.19 0.13 0.10 0.34 0.33 0.33 0.00 0.01 1.00 0.69 0.48 0.37 0.25 0.27 0.29 0.56 0.53 0.51
en-nl 0.64 0.39 0.28 0.36 0.10 0.06 0.36 0.38 0.41 0.00 0.01 1.00 0.00 NaN 0.00 0.29 0.08 0.04 0.30 0.10 0.06 0.05 0.09 0.31 0.21 0.15 0.12 0.27 0.24 0.21 0.00 0.01 1.00 0.65 0.49 0.40 0.28 0.29 0.30 0.56 0.53 0.51
en-pt 0.62 0.37 0.26 0.35 0.08 0.04 0.53 0.45 0.39 0.00 0.01 1.00 0.00 NaN 0.00 0.28 0.06 0.03 0.24 0.06 0.04 0.04 0.06 0.23 0.11 0.07 0.05 0.30 0.30 0.29 0.00 0.01 1.00 0.63 0.51 0.43 0.20 0.26 0.35 0.58 0.56 0.54
en-ru 0.00 NaN 0.00 0.00 NaN 0.00 0.45 0.34 0.27 0.00 0.01 1.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.01 0.01 0.04 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.61 0.43 0.33 0.22 0.25 0.29 0.51 0.47 0.44
es-fr 0.61 0.37 0.27 0.09 0.01 0.01 0.21 0.29 0.50 0.00 0.01 1.00 0.00 NaN 0.00 0.31 0.07 0.04 0.07 0.01 0.01 0.05 0.08 0.29 0.09 0.06 0.04 0.15 0.06 0.04 0.00 0.01 1.00 0.63 0.52 0.44 0.20 0.24 0.31 0.57 0.20 0.12
es-nl 0.53 0.38 0.29 0.00 NaN 0.00 0.25 0.33 0.51 0.00 0.01 1.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.03 0.05 0.17 0.02 0.01 0.01 0.07 0.03 0.02 0.00 0.01 1.00 0.56 0.46 0.40 0.25 0.27 0.30 0.49 0.16 0.10
es-pt 0.56 0.44 0.37 0.47 0.22 0.15 0.37 0.44 0.54 0.00 0.01 1.00 0.00 NaN 0.00 0.53 0.24 0.16 0.47 0.23 0.15 0.06 0.11 0.40 0.31 0.23 0.19 0.37 0.22 0.16 0.00 0.01 1.00 0.58 0.51 0.45 0.18 0.25 0.41 0.59 0.25 0.16
es-ru 0.00 NaN 0.00 0.00 NaN 0.00 0.21 0.21 0.20 0.00 0.01 1.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.00 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.57 0.47 0.40 0.25 0.28 0.31 0.55 0.19 0.11
fr-nl 0.51 0.27 0.19 0.45 0.13 0.07 0.14 0.21 0.44 0.00 0.01 1.00 0.01 0.01 0.00 0.44 0.13 0.07 0.38 0.12 0.07 0.04 0.07 0.24 0.17 0.11 0.09 0.32 0.16 0.11 0.00 0.01 1.00 0.59 0.43 0.34 0.25 0.28 0.31 0.48 0.47 0.47
fr-pt 0.62 0.35 0.24 0.00 NaN 0.00 0.26 0.32 0.43 0.00 0.01 1.00 0.01 0.01 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.03 0.06 0.21 0.03 0.02 0.01 0.17 0.08 0.05 0.00 0.01 1.00 0.65 0.50 0.41 0.18 0.23 0.31 0.53 0.53 0.54
fr-ru 0.00 NaN 0.00 0.00 NaN 0.00 0.21 0.24 0.30 0.00 0.01 1.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.00 0.02 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.62 0.47 0.37 0.26 0.28 0.30 0.48 0.46 0.44
nl-pt 0.59 0.37 0.27 0.19 0.04 0.02 0.26 0.32 0.43 0.00 0.01 1.00 0.00 NaN 0.00 0.09 0.01 0.01 0.07 0.01 0.01 0.03 0.05 0.18 0.04 0.02 0.02 0.12 0.06 0.04 0.00 0.01 1.00 0.62 0.47 0.38 0.16 0.20 0.27 0.54 0.51 0.49
nl-ru 0.00 NaN 0.00 0.00 NaN 0.00 0.17 0.22 0.28 0.00 0.01 1.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 0.02 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 0.43 0.55 0.42 0.34 0.31 0.31 0.30 0.44 0.42 0.40
pt-ru 0.00 NaN 0.00 0.00 NaN 0.00 0.38 0.30 0.24 0.00 0.01 1.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.00 0.02 0.00 NaN 0.00 0.00 NaN 0.00 0.00 0.01 1.00 0.60 0.45 0.36 0.15 0.17 0.20 0.47 0.44 0.42

n/a: result alignment not provided or not readable.
NaN: division by zero, likely due to an empty alignment.
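The NaN entries follow directly from these definitions; a short sketch of how such values arise when an alignment is empty or entirely wrong:

    # Sketch: an empty generated alignment gives a 0/0 precision;
    # precision = recall = 0 gives a 0/0 F-measure.
    def safe_div(a, b):
        return a / b if b else float("nan")

    def scores(generated, reference):
        correct = len(generated & reference)
        p = safe_div(correct, len(generated))   # NaN if the alignment is empty
        r = safe_div(correct, len(reference))
        f = safe_div(2 * p * r, p + r)          # NaN if p + r == 0, or if p is NaN
        return p, f, r

    print(scores(set(), {("a", "b")}))          # (nan, nan, 0.0)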
