We have collected all generated alignments and made them available in a zip file via the following link. These alignments are the raw results on which the following report is based.
>>> download raw results
We conducted experiments by executing each system in its standard setting and compared precision, recall, F-measure and recall+. The measure recall+ indicates the proportion of detected non-trivial correspondences, i.e., correspondences whose matched entities do not have the same normalized label. The approach that generates only trivial correspondences is referred to as the baseline StringEquiv in the following section. We used a different machine compared to previous years, so runtime results are not fully comparable across years.
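The measures above can be sketched as follows. This is a minimal illustration, not the official evaluation code: alignments are assumed to be sets of (entity1, entity2) pairs, and `label` is an assumed mapping from each entity to its normalized label (e.g., lowercased, punctuation stripped).

```python
def precision(found, reference):
    """Fraction of found correspondences that are correct."""
    return len(found & reference) / len(found)

def recall(found, reference):
    """Fraction of reference correspondences that were found."""
    return len(found & reference) / len(reference)

def f_measure(found, reference):
    """Harmonic mean of precision and recall."""
    p = precision(found, reference)
    r = recall(found, reference)
    return 2 * p * r / (p + r) if p + r else 0.0

def recall_plus(found, reference, label):
    """Recall restricted to non-trivial reference correspondences,
    i.e., pairs whose normalized labels differ."""
    nontrivial = {(a, b) for (a, b) in reference if label[a] != label[b]}
    return len(found & nontrivial) / len(nontrivial)

# Hypothetical toy data: one trivial and one non-trivial correspondence.
labels = {"mouse:heart": "heart", "human:heart": "heart",
          "mouse:liver": "liver", "human:hepar": "hepar"}
reference = {("mouse:heart", "human:heart"), ("mouse:liver", "human:hepar")}
found = {("mouse:heart", "human:heart")}
# precision = 1.0, recall = 0.5, recall+ = 0.0 (only the trivial pair was found)
```

A matcher that only finds pairs with identical normalized labels (like the StringEquiv baseline) thus scores recall+ = 0 by construction.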
This year we ran the systems on a server with 3.46 GHz (6 cores) and 8GB RAM allocated to the matching systems. Further, we used the SEALS
client to execute our evaluation. However, similar to last year's evaluation, we slightly changed the way precision and recall are computed, i.e., the results generated by the SEALS client vary in some cases by 0.5% compared to the results presented below. In particular, we removed trivial correspondences in the oboInOwl namespace like
http://...oboInOwl#Synonym = http://...oboInOwl#Synonym
as well as correspondences expressing relations other than equivalence. Using the Pellet reasoner, we also checked whether each generated alignment is coherent, i.e., whether there are no unsatisfiable concepts when the ontologies are merged with the alignment.
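The filtering step described above can be sketched as follows. This is a hedged illustration, not the actual evaluation script: correspondences are assumed to be (entity1, entity2, relation) triples, and the full oboInOwl namespace URI (abbreviated in the text as http://...oboInOwl#) is assumed to be the standard GO one.

```python
# Assumed full form of the namespace abbreviated above as http://...oboInOwl#
OBO_IN_OWL = "http://www.geneontology.org/formats/oboInOwl#"

def clean_alignment(cells):
    """Keep only equivalence correspondences, dropping trivial
    self-correspondences of oboInOwl annotation properties."""
    kept = []
    for entity1, entity2, relation in cells:
        if relation != "=":
            continue  # drop relations other than equivalence
        if entity1 == entity2 and entity1.startswith(OBO_IN_OWL):
            continue  # drop e.g. oboInOwl#Synonym = oboInOwl#Synonym
        kept.append((entity1, entity2, relation))
    return kept

# Hypothetical example with one trivial, one genuine, one subsumption cell.
cells = [
    (OBO_IN_OWL + "Synonym", OBO_IN_OWL + "Synonym", "="),
    ("http://a#Heart", "http://b#Heart", "="),
    ("http://a#Organ", "http://b#Heart", ">"),
]
# clean_alignment(cells) keeps only the Heart = Heart cell
```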
In the following, we analyze all participating systems that could generate an alignment in less than ten hours. The listing comprises 20 entries. Four systems participated with two different versions each: AML and GOMMA with versions that use background knowledge (indicated by the suffix "-bk"), LogMap with the lightweight version LogMapLite that uses only some core components, and XMap with the versions XMapSig and XMapGen, which use two different algorithms. GOMMA and HerTUDA participated with the same systems as last year (indicated by * in the table). In addition to these two tools, eight more systems that participated in 2012 now participated with new versions (HotMatch, LogMap, MaasMatch, MapSSS, ServOMap, WeSeEMatch, WikiMatch and Yam++). Due to software and hardware incompatibilities, Yam++ had to be run on a different machine, and therefore its runtime (indicated by **) is not fully comparable to the other systems' runtimes. For more details, we refer the reader to the papers presenting the systems. In total, 20 different systems generated an alignment within the given time frame. Four participants (CroMatcher, RiMOM2013, OntoK and Synthesis) did not finish in time or threw an exception.
Nine systems finished in less than 100 seconds, compared to 8 systems in OAEI 2012 and 2 systems in OAEI 2011. This year 20 out of 24 systems generated results, which is comparable to last year, when 14 out of 18 systems generated results within the given time frame. The top systems in terms of runtime are LogMap, GOMMA, IAMA and AML. Depending on the specific version, these systems require between 7 and 15 seconds to match the ontologies. The table shows that there is no correlation between the quality of the generated alignment, in terms of precision and recall, and the required runtime. This has also been observed in previous OAEI campaigns.
The table also shows the results for precision, recall and F-measure. In terms of F-measure, the two top-ranked systems are AML-bk and GOMMA-bk. These systems use background knowledge, i.e., they are based on mapping composition techniques and the reuse of mappings between UMLS, Uberon and FMA.
AML-bk and GOMMA-bk are followed by a group of matching systems (YAM++, AML, LogMap, GOMMA) generating alignments that are very similar with respect to precision, recall and F-measure (between 0.87 and 0.91 F-measure). To our knowledge, these systems either do not use background knowledge specific to the biomedical domain or use it only in a very limited way. The results of these systems are at least as good as the results of the best systems in OAEI 2007-2010; only AgreementMaker, which used additional background knowledge in 2011, generated better results. This year 8 out of 20 systems achieved an F-measure lower than the baseline, which is based on (normalized) string equivalence (StringEquiv in the table). This is a negative trend compared to last year, when only 4 out of 17 systems produced alignments with an F-measure lower than the baseline.
Moreover, nearly all systems find many non-trivial correspondences. Exceptions are IAMA and WeSeE, which generate alignments quite similar to the alignment generated by the baseline approach.
Among the systems that participated last year, WikiMatch showed a considerable improvement: it increased its precision from 0.864 to 0.987 (and its F-measure from 0.758 to 0.797). The other systems produced results very similar to those of the previous year. One exception is WeSeE, whose F-measure is almost 50% lower than in the previous year.
This year 24 systems participated in the anatomy track, out of which 20 produced results within 10 hours. This is so far both the highest number of participants and the highest number of systems producing results within the time constraints for the anatomy track. However, the number of systems achieving an F-measure lower than the baseline has increased.
As last year, we witnessed a positive trend in runtimes, as the majority of systems (16 out of 20) finished execution in less than an hour. This year's result for AML-bk improves on the best F-measure, set by a previous version of the system in 2010, and is thus the top result for the anatomy track to date.
We would like to thank Christian Meilicke for his advice and support with the organization of this track.
This track is organized by Zlatan Dragisic, Valentina Ivanova and Patrick Lambrix. If you have any problems working with the ontologies, any questions related to tool wrapping, or any suggestions related to the anatomy track, feel free to write an email to oaei-anatomy [at] ida [.] liu [.] se.