On this page, we report the results of the OAEI 2025 campaign for the MultiFarm track. This year, the dataset has been extended with Turkish and Indian (Hindi and related languages) resources, further broadening the multilingual scope and diversity of the evaluation. Details on this dataset can be found on the MultiFarm web page.
If you notice any kind of error (wrong numbers, incorrect information on a matching system, etc.), do not hesitate to contact us (contact addresses are given in the last paragraph of this page).
We have conducted an evaluation based on the blind dataset. This dataset includes the matching tasks involving the edas and ekaw ontologies (resulting in 55 x 24 tasks). Participants were able to test their systems on the open subset of tasks. The open subset comprises 45 x 25 tasks and does not include Italian translations.
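For reference, if these figures are read as on previous MultiFarm pages, i.e., as (number of language pairs) x (number of test cases per pair), multiplying them out gives the total number of matching tasks per subset:

$$
55 \times 24 = 1320 \ \text{(blind)}, \qquad 45 \times 25 = 1125 \ \text{(open)}.
$$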
We distinguish two types of matching tasks: (i) those where two different ontologies (edas-ekaw, for instance) have been translated into two different languages; and (ii) those where the same ontology (edas-edas, for instance) has been translated into two different languages.
As observed in previous evaluations, for tasks of type (ii), i.e., matching translations of the same ontology, good results are not directly related to the use of specific cross-lingual techniques, but rather to the ability to exploit the fact that both ontologies share an identical structure. This year, we report the results for the tasks of type (i), i.e., different ontologies.
This year, four systems participated in the MultiFarm track: LogMap, LogMapLt, LSMatch, and Matcha. The number of participating tools remains comparable to previous campaigns (four in 2023, five in 2022, six in 2021, and six in 2020). In comparison to last year, we welcomed back LSMatch and introduced Matcha as a new entrant to the evaluation. The inclusion of these systems highlights the continued engagement of both established and emerging approaches in multilingual ontology matching. Participants may refer to their respective OAEI system papers for a detailed description of the strategies and architectures adopted.
The systems were executed on a Windows Server 2025 machine equipped with 96 GB of RAM and an Intel Xeon Silver 4114 @ 2.20 GHz CPU, along with a Tesla P40 GPU. All measurements are based on a single execution. As in previous campaigns, we observed substantial variation in the time required for systems to complete the 55 × 24 matching tasks: LogMap (~7 minutes), LogMapLt (~17 minutes), LSMatch (~36 minutes), and Matcha (~408 minutes). Although the experiments were conducted on a different machine this year, the performance of the LogMap family remained consistent with previous evaluations. In contrast, LSMatch and Matcha required considerably longer runtimes, reflecting their more resource-intensive processing pipelines. These runtime measurements are provided solely for indicative comparison of system efficiency under a common environment.
The table below presents the MultiFarm aggregated results per matcher for the different-ontologies tasks (i). Time is measured in minutes (for completing the 55 x 24 matching tasks).
| System | Time (Min) | Prec. | F-m. | Rec. |
|---|---|---|---|---|
| LogMap | ~6.7 | 0.87 | 0.18 | 0.10 |
| LogMapLt | ~16.9 | 0.84 | 0.008 | 0.004 |
| LSMatch | ~36 | 0.79 | 0.44 | 0.30 |
| Matcha | ~408 | 0.26 | 0.26 | 0.25 |
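As a quick consistency check (a standard reading of the table, not an additional official result), the F-m. column corresponds to the harmonic mean of precision and recall; for LogMap, for instance:

$$
F = \frac{2 \cdot P \cdot R}{P + R} = \frac{2 \times 0.87 \times 0.10}{0.87 + 0.10} \approx 0.18.
$$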
The results show clear performance variations across the four systems. LogMap achieved the highest precision (0.87) but a relatively low recall (0.10), leading to a modest F-measure of 0.18. LogMapLt performed similarly in precision (0.84) but with very low recall and F-measure, indicating limited overall effectiveness. LSMatch demonstrated more balanced performance, achieving a macro F1-score of 0.37 and a micro F1 of 0.43, with precision and recall values of 0.79 and 0.30, respectively. Finally, Matcha showed comparable precision and recall values (0.26 and 0.25) with a total runtime of approximately 6 hours and 48 minutes. Overall, LSMatch displayed the strongest balance between precision and recall, while LogMap remained the fastest system with the highest precision.
The 2025 MultiFarm campaign continued to attract a diverse set of ontology matching systems, maintaining a consistent level of participation across recent years. This year’s evaluation incorporated new Turkish and Indian datasets, further enriching the multilingual spectrum of the benchmark. The comparative analysis revealed that while LogMap remains highly efficient and precise, LSMatch achieved a more balanced trade-off between precision and recall. LogMapLt demonstrated stable precision but limited recall performance, and Matcha exhibited steady yet computationally intensive behavior. Overall, these results reaffirm the ongoing challenges in multilingual ontology alignment, particularly in balancing accuracy and scalability across diverse language pairs. The inclusion of new linguistic resources this year provides valuable insights into the robustness of current systems and sets a foundation for future cross-lingual alignment research.
[1] Christian Meilicke, Raul Garcia-Castro, Fred Freitas, Willem Robert van Hage, Elena Montiel-Ponsoda, Ryan Ribeiro de Azevedo, Heiner Stuckenschmidt, Ondrej Svab-Zamazal, Vojtech Svatek, Andrei Tamilin, Cassia Trojahn, Shenghui Wang. MultiFarm: A Benchmark for Multilingual Ontology Matching. Accepted for publication in the Journal of Web Semantics.
An author's version of the paper can be found at the MultiFarm homepage, where the dataset is described in detail.
This track is organized by Beyza Yaman, Abhisek Sharma, Sarika Jain and Cassia Trojahn dos Santos. If you have any problems working with the ontologies, any questions or suggestions, feel free to write an email to beyza [.] yaman [at] adaptcentre [.] ie, jasarika [at] nitkkr [.] ac [.] in, and cassia [.] trojahn [at] irit [.] fr.