We have evaluated the (modified or new) systems submitted to OAEI 2011.5, together with the (unmodified) OAEI 2011 systems that were able to cope with the Anatomy track.
In order to also evaluate scalability, we have executed all systems in two different settings: (1) a standard laptop with 2 cores and 4 GB of RAM, and (2) a high-performance server with 16 CPUs and 10 GB of RAM. In total, 9 (out of 16) systems were able to cope with at least one of the matching problems of the track. AgrMaker, AUTOMSv2, Lily and YAM++ failed to process the input ontologies and threw an exception. We should emphasise that AgrMaker and Lily did not submit a version to OAEI 2011.5 and thus did not test against the input ontologies of the track. CODI, Hertuda and WeSeE, on the other hand, threw exceptions related to insufficient memory when processing the smallest matching problem. Note that CODI was evaluated in a different setting with only 7 GB of RAM.
Together with precision, recall, F-value and runtimes, we have also evaluated the coherence of the alignments. We report (1) the number of unsatisfiable classes obtained when reasoning (with HermiT) over the input ontologies extended with the computed mappings, (2) the ratio/degree of unsatisfiable classes with respect to the size of the merged ontology (based on the unsatisfiability measure proposed in [1]), and (3) an approximation of the root unsatisfiability. The root unsatisfiability aims at providing a more precise count of errors, since many of the unsatisfiabilities may be derived (i.e., a subclass of an unsatisfiable class will also be reported as unsatisfiable). The provided approximation is based on LogMap's (incomplete) repair facility and shows the number of classes that this facility needed to repair in order to solve (most of) the unsatisfiabilities [2].
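As an illustration of how measures (1) and (2) can be computed, the following minimal sketch uses the OWL API together with HermiT to count the unsatisfiable classes of the merged ontology and derive the unsatisfiability degree. The file name `merged.owl` is a placeholder for an ontology containing FMA, NCI and the evaluated mappings (translated into OWL axioms); this is an illustrative sketch, not the actual evaluation code.

```java
import java.io.File;
import java.util.Set;

import org.semanticweb.HermiT.ReasonerFactory;
import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.OWLClass;
import org.semanticweb.owlapi.model.OWLOntology;
import org.semanticweb.owlapi.model.OWLOntologyManager;
import org.semanticweb.owlapi.reasoner.OWLReasoner;

public class IncoherenceCheck {
    public static void main(String[] args) throws Exception {
        OWLOntologyManager man = OWLManager.createOWLOntologyManager();
        // "merged.owl" is a placeholder: an ontology containing FMA, NCI and
        // the computed mappings represented as OWL equivalence/subsumption axioms.
        OWLOntology merged = man.loadOntologyFromOntologyDocument(new File("merged.owl"));

        OWLReasoner reasoner = new ReasonerFactory().createReasoner(merged);
        // The bottom node groups owl:Nothing with every unsatisfiable class.
        Set<OWLClass> unsat = reasoner.getUnsatisfiableClasses().getEntitiesMinusBottom();

        int total = merged.getClassesInSignature().size();
        double degree = 100.0 * unsat.size() / total;  // unsatisfiability degree [1]
        System.out.printf("Unsatisfiable classes: %d (%.2f%% of %d)%n",
                          unsat.size(), degree, total);
        reasoner.dispose();
    }
}
```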
The tables below summarise the results obtained by each system on the three matching problems. Systems are ordered by F-value with respect to the refined UMLS-based reference alignment. This curated reference alignment has been created automatically and represents the first step towards an error-free silver standard; for the next OAEI we aim at "harmonising" the output of the different matching tools with the current UMLS-based mapping sets. Note that we have evaluated the GOMMA system with two different configurations: one with the use of background knowledge activated (bk) and one with this feature deactivated (nobk). The background knowledge of GOMMA-bk involves the application of mapping-composition techniques and the reuse of mappings from FMA-UMLS and NCI-UMLS. Although the reference alignment of this track is also based on UMLS, we consider the alignments provided by GOMMA-bk very interesting.
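For reference, the reported measures are the standard ones: given a system alignment $A$ and a reference alignment $R$ (refined or original),

$$
\mathrm{Precision} = \frac{|A \cap R|}{|A|}, \qquad
\mathrm{Recall} = \frac{|A \cap R|}{|R|}, \qquad
\text{F-value} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.
$$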
It is worth mentioning that LogMapLt has been used as a baseline, since it only applies simple string-equivalence techniques (see the sketch below).
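To make the baseline concrete, the following toy matcher pairs classes whose normalised labels coincide. It is only a sketch of the general idea behind string-equivalence matching, not LogMapLt's actual implementation; the label maps are assumed to have been extracted from the two ontologies beforehand.

```java
import java.util.*;

// A toy string-equivalence matcher (illustrative sketch, not LogMapLt's code).
public class StringEquivalenceMatcher {

    // Normalise a label: lower-case, strip punctuation, collapse whitespace.
    static String normalise(String label) {
        return label.toLowerCase()
                    .replaceAll("[_\\-,.()]", " ")
                    .replaceAll("\\s+", " ")
                    .trim();
    }

    // Index the normalised labels of the first ontology by class URI,
    // then probe that index with the labels of the second ontology.
    static List<String[]> match(Map<String, String> labels1,   // URI -> label
                                Map<String, String> labels2) {
        Map<String, String> index = new HashMap<>();
        for (Map.Entry<String, String> e : labels1.entrySet())
            index.put(normalise(e.getValue()), e.getKey());

        List<String[]> mappings = new ArrayList<>();
        for (Map.Entry<String, String> e : labels2.entrySet()) {
            String uri1 = index.get(normalise(e.getValue()));
            if (uri1 != null)
                mappings.add(new String[] { uri1, e.getKey() });  // candidate mapping
        }
        return mappings;
    }
}
```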
For the matching problem involving the small (overlapping) modules of FMA and NCI, GOMMA-bk obtained the best results in terms of both recall and F-value, while GOMMA-nobk provided the most precise alignments. GOMMA (in both configurations) and LogMap are a bit ahead of Aroma, MaasMatch, CSA and MapSSS in terms of F-value; furthermore, our baseline LogMapLt also outperformed these systems in terms of both precision and recall. MapSSS provided good precision, but its F-value suffered from the low recall of its mappings. Nevertheless, these tools can deal with large ontologies such as FMA and NCI, and they will be very helpful for the creation of the future silver-standard reference alignment for the track.
MapPSO and MapEVO are two special cases for which we did not obtain meaningful alignments: both systems generated large alignments but found only a few correct correspondences. Furthermore, when running on the server, MapPSO threw an exception related to the parallelisation of its algorithm. The low quality of these results is mostly due to the configuration of MapEVO and MapPSO for this track: both algorithms work iteratively, converging towards an optimal alignment, and are designed to be executed on a large parallel infrastructure. Due to limited infrastructure availability, their authors significantly reduced the number of iterations and parallel threads.
The runtimes were quite good in general: only MaasMatch and CSA needed more than 2.5 and 4 hours, respectively, to complete the task, and MaasMatch did not finish after more than 12 hours of execution in the laptop setting. We can also appreciate that, in some cases, the server setting reduces runtimes by more than 50%.
Regarding mapping coherence, only LogMap generated an almost clean output. The table below shows that even the most precise mappings (those of GOMMA-nobk) lead to a huge number of unsatisfiable classes when reasoning together with the input ontologies, which demonstrates the importance of using techniques to assess the coherence of the generated alignments. Unfortunately, LogMap and CODI are the only systems participating in OAEI 2011.5 that use such techniques; the sketch below illustrates the basic idea behind them.
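The core idea behind such repair techniques can be sketched as a greedy loop: while the merged ontology contains unsatisfiable classes, discard the lowest-confidence mapping involved in one of them and re-check. The sketch below (again using the OWL API and HermiT) is only an illustration of this idea under the assumption that mappings are available as OWL axioms with confidence values; LogMap's actual repair facility relies on far more efficient, specialised reasoning techniques [2].

```java
import java.util.*;
import org.semanticweb.HermiT.ReasonerFactory;
import org.semanticweb.owlapi.model.*;
import org.semanticweb.owlapi.reasoner.OWLReasoner;

// Simplified greedy mapping repair (illustrative sketch only).
public class GreedyRepair {

    // 'mappings' maps each mapping axiom (e.g. an EquivalentClasses axiom
    // between an FMA and an NCI class) to its confidence value.
    static Set<OWLAxiom> repair(OWLOntology merged,
                                Map<OWLAxiom, Double> mappings) {
        OWLOntologyManager man = merged.getOWLOntologyManager();
        Set<OWLAxiom> removed = new HashSet<>();

        while (true) {
            // Naive: a fresh classification per iteration; real repair
            // facilities avoid this cost.
            OWLReasoner r = new ReasonerFactory().createReasoner(merged);
            Set<OWLClass> unsat =
                r.getUnsatisfiableClasses().getEntitiesMinusBottom();
            r.dispose();
            if (unsat.isEmpty()) break;

            // Greedily pick the lowest-confidence mapping whose classes
            // are involved in an unsatisfiability.
            OWLAxiom worst = null;
            for (OWLAxiom ax : mappings.keySet())
                if (!removed.contains(ax)
                        && !Collections.disjoint(ax.getClassesInSignature(), unsat)
                        && (worst == null
                            || mappings.get(ax) < mappings.get(worst)))
                    worst = ax;
            if (worst == null) break;   // remaining unsatisfiabilities not mapping-induced

            man.removeAxiom(merged, worst);
            removed.add(worst);
        }
        return removed;                 // the "repair": discarded mappings
    }
}
```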
P, R and F denote precision, recall and F-value against the refined and the original UMLS-based reference alignments.

| System | # Mappings | P (refined) | R (refined) | F (refined) | P (original) | R (original) | F (original) | Time server (s) | Time laptop (s) | Unsat. | Degree | Root unsat. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GOMMA-bk | 2,878 | 0.925 | 0.918 | 0.921 | 0.957 | 0.910 | 0.933 | 34 | 67 | 6,292 | 61.78% | 251 |
| LogMap | 2,739 | 0.935 | 0.884 | 0.909 | 0.952 | 0.863 | 0.905 | 20 | 41 | 2 | 0.02% | 2 |
| GOMMA-nobk | 2,628 | 0.945 | 0.857 | 0.899 | 0.973 | 0.846 | 0.905 | 27 | 50 | 2,130 | 20.92% | 132 |
| LogMapLt | 2,483 | 0.942 | 0.807 | 0.869 | 0.969 | 0.796 | 0.874 | 10 | 12 | 2,104 | 20.66% | 118 |
| Aroma | 2,575 | 0.802 | 0.713 | 0.755 | 0.824 | 0.702 | 0.758 | 68 | 140 | 7,558 | 74.21% | 489 |
| MaasMatch | 3,696 | 0.580 | 0.744 | 0.652 | 0.597 | 0.730 | 0.657 | 9,437 | - | 9,718 | 95.42% | 1,952 |
| CSA | 3,607 | 0.514 | 0.640 | 0.570 | 0.528 | 0.629 | 0.574 | 14,414 | 26,580 | 9,590 | 94.17% | 3,808 |
| MapSSS | 1,483 | 0.840 | 0.430 | 0.569 | 0.860 | 0.422 | 0.566 | 571 | 937 | 565 | 5.55% | 87 |
| MapPSO | 3,654 | 0.021 | 0.025 | 0.023 | 0.022 | 0.027 | 0.024 | - | 41,686 | 10,145 | 99.62% | 5,176 |
| MapEVO | 633 | 0.003 | 0.001 | 0.002 | 0.003 | 0.001 | 0.002 | 2,985 | 5,252 | 9,164 | 89.98% | 370 |
MaasMatch failed to complete the matching problem based on the extended (overlapping) modules of FMA and NCI: it threw an exception related to memory requirements when running on the server with 10 GB of RAM, and it did not finish after 5 days of execution with 20 GB.
LogMap provided the best results in terms of precision and F-value, whereas GOMMA-bk obtained the best recall. F-values decreased considerably with respect to the previous matching problem, since this problem involves many more candidate mappings. CSA is, however, an exception: surprisingly, it maintained exactly the same results.
The runtimes were still very good for GOMMA-bk, GOMMA-nobk, LogMap and LogMapLt, but the time required by Aroma increased considerably. Furthermore, CSA threw an exception related to memory requirements when running on the laptop.
| System | # Mappings | P (refined) | R (refined) | F (refined) | P (original) | R (original) | F (original) | Time server (s) | Time laptop (s) | Unsat. | Degree | Root unsat. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LogMap | 2,664 | 0.877 | 0.806 | 0.840 | 0.887 | 0.782 | 0.831 | 72 | 150 | 5 | 0.01% | 2 |
| GOMMA-bk | 2,942 | 0.817 | 0.830 | 0.823 | 0.838 | 0.815 | 0.826 | 216 | 557 | 7,304 | 13.41% | 161 |
| GOMMA-nobk | 2,631 | 0.856 | 0.777 | 0.815 | 0.873 | 0.760 | 0.813 | 160 | 541 | 2,127 | 3.91% | 74 |
| LogMapLt | 3,219 | 0.726 | 0.807 | 0.764 | 0.748 | 0.796 | 0.771 | 26 | 26 | 12,682 | 23.29% | 446 |
| CSA | 3,607 | 0.514 | 0.640 | 0.570 | 0.528 | 0.629 | 0.574 | 14,048 | - | 49,831 | 91.51% | 1,794 |
| Aroma | 3,796 | 0.471 | 0.616 | 0.534 | 0.484 | 0.607 | 0.539 | 2,088 | 4,484 | 23,298 | 42.79% | 1,551 |
| MapSSS | 2,314 | 0.459 | 0.366 | 0.407 | 0.471 | 0.360 | 0.408 | 20,352 | 31,560 | 8,401 | 15.43% | 247 |
| MapPSO | 25,510 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | - | 1,491 | 54,451 | 99.99% | - |
| MapEVO | 14,482 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 119 | 156 | 54,449 | 99.99% | - |
| MaasMatch | - | - | - | - | - | - | - | - | - | - | - | - |
For the matching problem involving the whole FMA and NCI ontologies, LogMap obtained the best results in terms of precision and F-value when compared against the refined reference alignment, whereas GOMMA-bk was the best tool in terms of both recall and F-value when compared against the original UMLS-based alignment. Again, the F-values decreased with respect to the previous matching problem.
Regarding scalability, MapSSS joined CSA in failing to complete the matching task in the laptop setting. The runtimes for GOMMA-bk and GOMMA-nobk also increased significantly; nevertheless, they still required less than 20 minutes (in the server setting) to complete the largest matching problem of the track.
| System | # Mappings | P (refined) | R (refined) | F (refined) | P (original) | R (original) | F (original) | Time server (s) | Time laptop (s) | Unsat. | Degree | Root unsat. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LogMap | 2,658 | 0.868 | 0.796 | 0.830 | 0.875 | 0.769 | 0.819 | 126 | 247 | 9 | 0.01% | 2 |
| GOMMA-bk | 2,983 | 0.806 | 0.830 | 0.818 | 0.826 | 0.815 | 0.820 | 1,093 | 2,880 | 17,005 | 11.67% | 280 |
| GOMMA-nobk | 2,665 | 0.845 | 0.777 | 0.810 | 0.862 | 0.759 | 0.807 | 960 | 2,876 | 5,238 | 3.60% | 97 |
| LogMapLt | 3,466 | 0.675 | 0.807 | 0.735 | 0.695 | 0.796 | 0.742 | 57 | 66 | 26,429 | 18.14% | 574 |
| CSA | 3,607 | 0.514 | 0.640 | 0.570 | 0.528 | 0.629 | 0.574 | 14,068 | - | 122,296 | 83.93% | 3,734 |
| Aroma | 4,080 | 0.467 | 0.657 | 0.546 | 0.480 | 0.647 | 0.551 | 9,503 | 15,725 | 117,314 | 80.51% | 1,561 |
| MapSSS | 2,440 | 0.426 | 0.359 | 0.390 | 0.438 | 0.353 | 0.391 | 170,056 | - | 33,186 | 22.78% | 339 |
| MapPSO | 59,706 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | - | 8,090 | 145,704 | 99.99% | - |
| MapEVO | 33,771 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 440 | 531 | 145,677 | 99.99% | - |
| MaasMatch | - | - | - | - | - | - | - | - | - | - | - | - |
[1] Christian Meilicke and Heiner Stuckenschmidt. Incoherence as a basis for measuring the quality of ontology mappings. In Proc. of the 3rd International Workshop on Ontology Matching (OM), 2008. [url]
[2] Ernesto Jimenez-Ruiz and Bernardo Cuenca Grau. LogMap: Logic-based and scalable ontology matching. In Proc. of the 10th International Semantic Web Conference (ISWC), 2011. [url]
This track is organised by Ernesto Jimenez Ruiz, Bernardo Cuenca Grau and Ian Horrocks, and supported by the SEALS and LogMap projects. If you have any questions or suggestions related to the results of this track, feel free to write an email to ernesto [at] cs [.] ox [.] ac [.] uk