In our preliminary evaluation, only four systems (LogMap, LogMapBio, LogMapLt and Matcha) managed to generate an output for at least one of the track's tasks.
We conducted the experiments using the MELT client, executing each system with its default settings and computing precision, recall and F-measure. The reported execution times cover the whole processing pipeline, starting from ontology upload and environment preparation.
We ran the evaluation on two different machines: a Windows 10 (64-bit) desktop with an Intel Core i7-4770 CPU @ 3.40GHz x 4 and 16GB of allocated RAM, and a macOS laptop with a 2 GHz quad-core Intel Core i5, also with 16GB of allocated RAM.
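For reference, the three reported scores follow the standard definitions over a system alignment and a reference alignment. The sketch below is a minimal illustration (it is not the MELT implementation); the mapping tuples and the `evaluate` helper are hypothetical names introduced here for clarity.

```python
def evaluate(system: set, reference: set) -> tuple:
    """Compute precision, recall and F-measure of a system alignment
    against a reference alignment, both given as sets of mapping
    tuples such as (source_entity, target_entity)."""
    true_positives = len(system & reference)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

# Hypothetical toy alignments for illustration only.
system = {("envo:A", "sweet:A"), ("envo:B", "sweet:B"), ("envo:C", "sweet:X")}
reference = {("envo:A", "sweet:A"), ("envo:B", "sweet:B"), ("envo:D", "sweet:D")}
p, r, f = evaluate(system, reference)
print(round(p, 3), round(r, 3), round(f, 3))  # → 0.667 0.667 0.667
```

The counts in the tables below ("# Mappings") correspond to the size of each system's output alignment.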
1. Results for the ENVO-SWEET matching task
| System | Time (HH:MM:SS) | # Mappings | Precision | Recall | F-measure |
|--------|-----------------|------------|-----------|--------|-----------|
| LogMap | 00:00:25 | 676 | 0.781 | 0.656 | 0.713 |
| LogMapBio | 01:00:03 | 697 | 0.753 | 0.652 | 0.699 |
| LogMapLt | 00:07:32 | 576 | 0.829 | 0.594 | 0.692 |
2. Results for the ANAEETHES-GEMET matching task
| System | Time (HH:MM:SS) | # Mappings | Precision | Recall | F-measure |
|--------|-----------------|------------|-----------|--------|-----------|
| LogMapLt | 00:00:03 | 182 | 0.840 | 0.458 | 0.593 |
3. Results for the AGROVOC-NALT matching task
| System | Time (HH:MM:SS) | # Mappings | Precision | Recall | F-measure |
|--------|-----------------|------------|-----------|--------|-----------|
| LogMapLt | 00:00:10 | 19185 | 0.744 | 0.953 | 0.836 |
4. Results for the NCBITAXON-TAXREFLD matching task
| System | Time (HH:MM:SS) | # Mappings | Precision | Recall | F-measure |
|--------|-----------------|------------|-----------|--------|-----------|
| LogMapLt | 00:00:43 | 72010 | 0.665 | 0.993 | 0.796 |
| System | Time (HH:MM:SS) | # Mappings | Precision | Recall | F-measure |
|--------|-----------------|------------|-----------|--------|-----------|
| LogMap | 00:00:01 | 304 | 0.575 | 1.000 | 0.730 |
| LogMapLt | 00:00:00 | 290 | 0.600 | 0.994 | 0.748 |
| Matcha | 00:00:05 | 303 | 0.577 | 1.000 | 0.732 |
| System | Time (HH:MM:SS) | # Mappings | Precision | Recall | F-measure |
|--------|-----------------|------------|-----------|--------|-----------|
| LogMap | 00:00:04 | 2218 | 0.623 | 0.985 | 0.764 |
| LogMapLt | 00:00:01 | 2165 | 0.637 | 0.982 | 0.773 |
| Matcha | 00:00:15 | 2219 | 0.623 | 0.984 | 0.763 |
| System | Time (HH:MM:SS) | # Mappings | Precision | Recall | F-measure |
|--------|-----------------|------------|-----------|--------|-----------|
| LogMap | 00:00:39 | 12949 | 0.783 | 0.998 | 0.878 |
| LogMapLt | 00:00:07 | 12929 | 0.783 | 0.997 | 0.877 |
| Matcha | 00:00:51 | 12936 | 0.785 | 0.999 | 0.879 |
| System | Time (HH:MM:SS) | # Mappings | Precision | Recall | F-measure |
|--------|-----------------|------------|-----------|--------|-----------|
| LogMapLt | 00:00:17 | 26359 | 0.746 | 0.987 | 0.849 |
| Matcha | 00:01:15 | 26675 | 0.741 | 0.992 | 0.848 |
| System | Time (HH:MM:SS) | # Mappings | Precision | Recall | F-measure |
|--------|-----------------|------------|-----------|--------|-----------|
| LogMap | 00:00:01 | 496 | 0.719 | 1.000 | 0.837 |
| LogMapLt | 00:00:00 | 477 | 0.746 | 0.997 | 0.853 |
| Matcha | 00:00:11 | 494 | 0.722 | 1.000 | 0.839 |
This evaluation was run by Naouel Karam, Alsayed Algergawy and Amir Laadhar. If you have any problems working with the ontologies, any questions related to tool wrapping, or any suggestions for the Biodiv track, feel free to send an email to: naouel [.] karam [at] fokus [.] fraunhofer [.] de