We have collected all generated alignments from the participating systems for this track and make them available in a zip-file via the following link. These alignments are the raw results that the following report is based on.
>>> download raw results (alignments)
We conducted experiments by executing each system in its standard setting, and we compare precision, F-measure, recall and recall+. The recall+ measure indicates the amount of detected non-trivial correspondences, i.e., correspondences in which the matched entities do not have the same normalized label.
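As a rough illustration of these measures, the following sketch computes precision, recall, F-measure and recall+ from sets of correspondence pairs. The data shapes (pairs of URIs, a label lookup) and the normalization function are simplifying assumptions for illustration, not the actual MELT/SEALS implementation.

```python
def normalize(label: str) -> str:
    # Illustrative normalization: lowercase and strip non-alphanumerics
    # (the normalization used in the actual evaluation may differ).
    return "".join(c for c in label.lower() if c.isalnum())

def evaluate(system, reference, labels):
    """Compute precision, recall, F-measure and recall+.

    `system` and `reference` are sets of (entity1, entity2) pairs and
    `labels` maps an entity to its label; these names and shapes are
    assumptions for this sketch.
    """
    tp = system & reference
    precision = len(tp) / len(system)
    recall = len(tp) / len(reference)
    f_measure = 2 * precision * recall / (precision + recall)
    # recall+ is recall restricted to non-trivial reference
    # correspondences, i.e. those whose entities do not share a
    # normalized label.
    nontrivial = {(a, b) for (a, b) in reference
                  if normalize(labels[a]) != normalize(labels[b])}
    recall_plus = (len(system & nontrivial) / len(nontrivial)
                   if nontrivial else 0.0)
    return precision, recall, f_measure, recall_plus
```

A system that only finds correspondences between entities with identical normalized labels can still reach a reasonable recall but will score a recall+ of 0.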
Except for Agent-OM, DRAL-OA and LogMapLLM, we ran all matchers on a machine with 16GB of RAM. As in the last four years, we used the MELT platform to execute our evaluations for all systems except ALIN and MDMapper, for which we used the SEALS client. The generated alignments and runtimes for Agent-OM, DRAL-OA and LogMapLLM were provided directly by their respective developers. Please note that the reported runtimes for Agent-OM and LogMapLLM are approximate rather than exact; for instance, the runtime may vary due to fluctuations in the duration of LLM API calls.
As in earlier years, we slightly changed the way precision and recall are computed, i.e., the results generated by the MELT and SEALS clients vary in some cases by 0.5% compared to the results presented below. In particular, we removed trivial correspondences in the oboInOwl namespace like:
http://...oboInOwl#Synonym = http://...oboInOwl#Synonym
as well as correspondences expressing relations different from equivalence.
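This filtering step can be sketched as follows. The triple representation and the function name are assumptions for illustration; the oboInOwl namespace URI is the standard one, but the concrete filtering code in the evaluation clients may differ.

```python
# Standard oboInOwl namespace (assumption: the evaluation targets this URI).
OBO_IN_OWL = "http://www.geneontology.org/formats/oboInOwl#"

def filter_alignment(correspondences):
    """Drop trivial oboInOwl self-mappings and non-equivalence relations.

    Each correspondence is an illustrative (source, target, relation)
    triple, not the internal MELT/SEALS format.
    """
    kept = []
    for source, target, relation in correspondences:
        if relation != "=":
            continue  # keep only equivalence correspondences
        if source == target and source.startswith(OBO_IN_OWL):
            continue  # trivial oboInOwl correspondence
        kept.append((source, target, relation))
    return kept
```

Removing such trivial and non-equivalence correspondences before scoring is what causes the small (up to 0.5%) deviation from the raw client output mentioned above.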
Using the Pellet reasoner, we also checked whether the generated alignment is coherent, i.e., whether there are no unsatisfiable concepts when the ontologies are merged with the alignment.
In the following, we analyze all participating systems that could generate an alignment. The listing comprises 11 entries. LogMap participated with different versions, namely LogMap, LogMapBio, LogMapKG and, as in previous years, a lightweight version, LogMapLite, that uses only some core components. Three systems, Agent-OM, DRAL-OA and LogMapLLM, participate in the anatomy track for the first time this year. The remaining systems have participated in OAEI for more than two years. Of the systems participating this year, Matcha, MDMapper, ALIN and LogMap (all versions except LogMapKG) also participated last year. LogMap has been a constant participant since 2011 and ALIN joined in 2016. Matcha has been participating since 2022 and MDMapper joined last year (2024). For more details, we refer the reader to the papers presenting the systems. Thus, this year we have 8 different systems (not counting different versions) that generated an alignment.
This year, 5 out of 11 systems were able to complete the alignment task in less than 100 seconds (they require between 2 and 47 seconds to match the ontologies). These are LogMapLite, LogMap, LogMapKG, LSMatch and Matcha. In 2024 and 2023, there were 3 out of 7 and 5 out of 9 systems, respectively, that generated an alignment in this time frame. As in the last 13 years, LogMapLite has the shortest runtime (2 seconds). The table shows that there is no correlation between the time required for running and the quality of the generated alignment with respect to any specific metric. This has also been observed in previous OAEI campaigns.
The table also shows the results for F-measure, recall+ and the size of the alignments. Regarding F-measure, the top 3 ranked systems are Matcha, Agent-OM and ALIN. Among these, Matcha achieved the highest F-measure (0.941), the same as last year. Among the systems that also participated last year, ALIN shows an increase in F-measure from 0.851 in 2024 to 0.912 in 2025. Regarding recall+, Matcha, LogMap and LogMapLite show results similar to earlier years, and Matcha again achieved the highest recall+ (0.82), as last year. ALIN also shows an increase in recall+, from 0.489 in 2024 to 0.7 in 2025. Regarding the number of correspondences, Matcha, LogMap and LogMapLite computed a similar number of correspondences as last year. Compared with last year's results, ALIN, LogMapBio and MDMapper generated 267, 12 and 42 more correspondences, respectively.
This year, 10 out of 11 systems achieved an F-measure higher than the baseline, which is based on (normalized) string equivalence (StringEquiv in the table). Among these, Agent-OM, LogMapKG and DRAL-OA are new participants. This year, 4 systems produced coherent alignments: LogMap, LogMapKG, LogMapBio and LogMapLLM.
The number of participating systems varies between the years. In 2025, there are 4 more participants than in 2024 and 2 more than in 2023. As noted earlier, there are newly-joined systems as well as long-term participants.
This year, Matcha sets the top result for the anatomy track with respect to the F-measure (same as last year), followed by Agent-OM and ALIN.
This track is organized by Mina Abd Nikooie Pour, Huanyu Li, Ying Li, and Patrick Lambrix. If you have any problems working with the ontologies, any questions related to tool wrapping, or any suggestions related to the anatomy track, feel free to write an email to oaei-anatomy [at] ida [.] liu [.] se.