OAEI Evaluation Report: Beyond Equivalence 2025

Traditional (Precision, Recall, F1-Score) and isAmong (Precision*, Recall*, F1-Score*) evaluation across datasets and matchers. Values are shown as percentages.

Datasets (Industrial Classification Standards): 5
Datasets (STROMA/TaSeR): 5
Effective Matchers: 5

Overview & Interpretation

Description

This first-year track evaluates five matchers (LogMap, LogMap-bio, LogMap-kg, Matcha, and MDMapper) across 10 datasets from two families: industrial classification standards (ECLASS–GPC, ECLASS–UNSPSC, ETIM–ECLASS, GPC–UNSPSC, GPC–UNSPSCplus) and STROMA/TaSeR (G1, G2, G3, G5, G7). We report both traditional metrics (Precision, Recall, F1-Score), which reward only correspondences identical to those in the reference alignment, and isAmong metrics (Precision*, Recall*, F1-Score*), which also give credit for partially correct relations such as superclass/subclass and overlap.
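As a rough illustration of the traditional metrics, the sketch below scores a predicted alignment against a reference by exact match. It assumes a correspondence can be represented as a (source class, target class, relation) triple. The function name `traditional_prf` and the toy class names are hypothetical and are not taken from the track's evaluation code.

```python
def traditional_prf(predicted, reference):
    """Exact-match Precision, Recall and F1 over sets of correspondences."""
    predicted, reference = set(predicted), set(reference)
    correct = len(predicted & reference)  # only identical triples count
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (hypothetical): "=" is equivalence, "<" is subclass.
reference = {("a", "x", "="), ("b", "y", "<"), ("c", "z", "=")}
predicted = {("a", "x", "="), ("b", "y", "=")}

p, r, f1 = traditional_prf(predicted, reference)
print(f"P={p:.2%} R={r:.2%} F1={f1:.2%}")
```

In this toy case the pair (b, y) is predicted with the wrong relation, so it earns no credit under exact matching. That is the kind of near miss the isAmong metrics are meant to reward partially.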

Overall, the produced alignments suggest that most systems have limited support for non-equivalence relations; an exception is MDMapper, which also captures superclass/subclass/overlap.

What is isAmong? In brief, an alignment is transformed so that each class is mapped to a covering set of descendant classes in the other ontology (its “isAmong” set). Precision, Recall, and F1-Score are then computed per class from the overlap between the predicted and reference descendant sets, and averaged across the source and target sides. This yields a fine-grained, relation-aware score that fairly rewards containment and overlap, even when a system does not predict the exact relation in the reference alignment. The approach is designed for classification ontologies and avoids hand-tuned weights while supporting inference of border relations (≡, ≤, ≥, ≃).
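The scoring step described above can be sketched as follows. This is a minimal illustration, not the track's implementation: it assumes the isAmong sets have already been derived and are given as dictionaries from a class to its set of classes in the other ontology. It also assumes that an empty set scores 0 and that averaging runs over every class present in either the prediction or the reference. All function names are hypothetical.

```python
from statistics import mean


def set_prf(predicted, reference):
    """Precision, Recall and F1 from the overlap of two sets."""
    overlap = len(predicted & reference)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def side_scores(predicted, reference):
    """Average the class-level scores over one side (source or target)."""
    classes = sorted(set(predicted) | set(reference))
    if not classes:
        return 0.0, 0.0, 0.0
    per_class = [
        set_prf(predicted.get(c, set()), reference.get(c, set()))
        for c in classes
    ]
    return tuple(mean(column) for column in zip(*per_class))


def is_among_scores(pred_src, ref_src, pred_tgt, ref_tgt):
    """Average the source-side and target-side scores."""
    src = side_scores(pred_src, ref_src)
    tgt = side_scores(pred_tgt, ref_tgt)
    return tuple((s + t) / 2 for s, t in zip(src, tgt))


# Toy data (hypothetical): source class A covers x1..x3 in the reference,
# but the system only finds x1 and x2.
pred_src, ref_src = {"A": {"x1", "x2"}}, {"A": {"x1", "x2", "x3"}}
pred_tgt, ref_tgt = {"x1": {"A"}}, {"x1": {"A"}}

p_star, r_star, f1_star = is_among_scores(pred_src, ref_src, pred_tgt, ref_tgt)
print(f"P*={p_star:.2%} R*={r_star:.2%} F1*={f1_star:.2%}")
```

The incomplete source-side prediction still earns partial credit through the set overlap, whereas exact matching would count it as a miss.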

Analysis



Overall Results

Highlighted cells indicate the best score within each dataset for the given metric. Matchers are listed alphabetically.

Macro-Averages by Matcher — All (10 datasets)

Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap           30.69%     17.44%     13.99%     22.33%     19.48%     19.62%
LogMap-bio       45.33%     15.30%     19.31%     20.82%     17.50%     17.92%
LogMap-kg        38.78%     16.68%     16.62%     21.29%     18.37%     18.59%
Matcha           25.57%     12.02%      6.10%     13.81%     11.44%     11.01%
MDMapper         47.94%     10.51%     16.40%     21.33%     16.12%     16.80%

Macro-Averages by Matcher — Industrial Classification Standards

Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap           24.60%      5.02%      6.29%     10.98%      8.52%      8.64%
LogMap-bio       36.49%      5.03%      7.88%     10.85%      8.35%      8.49%
LogMap-kg        32.86%      5.03%      7.47%     10.85%      8.35%      8.49%
Matcha           13.53%      0.07%      0.14%      3.69%      1.65%      1.70%
MDMapper         29.74%      5.87%      9.13%     17.09%     12.31%     12.66%
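The macro-averages appear to be unweighted means of the per-dataset scores, with F1-Score averaged directly rather than recomputed from the averaged Precision and Recall. The sketch below checks this for LogMap's traditional metrics, using the LogMap rows of the five industrial per-dataset tables in this report. The names `logmap_industrial` and `macro_average` are illustrative only.

```python
from statistics import mean

# LogMap rows from the five industrial per-dataset tables:
# (Precision, Recall, F1-Score), all in percent.
logmap_industrial = {
    "eclass-gpc": (32.35, 0.09, 0.17),
    "eclass-unspsc": (17.34, 0.04, 0.07),
    "etim-eclass": (40.32, 24.69, 30.62),
    "gpc-unspsc": (24.51, 0.13, 0.25),
    "gpc-unspscplus": (8.49, 0.16, 0.31),
}


def macro_average(rows):
    """Unweighted mean of each metric column across datasets."""
    return tuple(mean(column) for column in zip(*rows))


macro_p, macro_r, macro_f1 = macro_average(logmap_industrial.values())

# For contrast: F1 recomputed from the averaged Precision and Recall.
f1_of_means = 2 * macro_p * macro_r / (macro_p + macro_r)

print(f"macro P={macro_p:.2f} R={macro_r:.2f} F1={macro_f1:.2f}")
print(f"F1 from averaged P and R: {f1_of_means:.2f}")
```

The three means agree with the reported 24.60%, 5.02%, and 6.29% to within the rounding of the per-dataset inputs. The F1 recomputed from the averaged Precision and Recall is noticeably higher, so a macro F1-Score in these tables should not be read as the harmonic mean of the macro Precision and Recall beside it.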

Macro-Averages by Matcher — STROMA/TaSeR

Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap           36.77%     29.86%     21.69%     33.68%     30.44%     30.61%
LogMap-bio       54.18%     25.56%     30.73%     30.79%     26.66%     27.34%
LogMap-kg        44.70%     28.34%     25.78%     31.74%     28.39%     28.69%
Matcha           37.62%     23.96%     12.06%     23.94%     21.23%     20.32%
MDMapper         66.15%     15.15%     23.67%     25.57%     19.92%     20.93%

Per-Dataset Results

eclass-gpc

                       Traditional                        isAmong
Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap           32.35%      0.09%      0.17%      4.09%      1.67%      1.83%
LogMap-bio       30.56%      0.09%      0.17%      4.19%      1.65%      1.82%
LogMap-kg        30.56%      0.09%      0.17%      4.19%      1.65%      1.82%
MDMapper         12.83%      0.19%      0.37%     10.81%      6.09%      6.29%

eclass-unspsc

                       Traditional                        isAmong
Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap           17.34%      0.04%      0.07%      3.97%      1.59%      1.74%
LogMap-bio       16.10%      0.04%      0.08%      4.31%      1.75%      1.91%
LogMap-kg        16.10%      0.04%      0.08%      4.31%      1.75%      1.91%
Matcha           14.09%      0.03%      0.05%      3.32%      1.14%      1.28%
MDMapper         11.56%      0.11%      0.21%     11.86%      5.04%      5.56%

etim-eclass

                       Traditional                        isAmong
Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap           40.32%     24.69%     30.62%     37.00%     34.17%     34.44%
LogMap-bio       88.13%     24.82%     38.74%     37.15%     34.35%     34.60%
LogMap-kg        70.02%     24.82%     36.65%     37.15%     34.35%     34.60%
MDMapper         96.77%     28.55%     44.09%     42.17%     38.73%     39.50%

gpc-unspsc

                       Traditional                        isAmong
Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap           24.51%      0.13%      0.25%      6.92%      3.11%      3.23%
LogMap-bio       24.51%      0.13%      0.25%      6.92%      3.11%      3.23%
LogMap-kg        24.51%      0.13%      0.25%      6.92%      3.11%      3.23%
Matcha           21.83%      0.16%      0.31%      9.14%      4.28%      4.32%
MDMapper         13.73%      0.29%      0.56%     16.35%      8.75%      9.07%

gpc-unspscplus

                       Traditional                        isAmong
Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap            8.49%      0.16%      0.31%      2.91%      2.07%      1.96%
LogMap-bio       23.15%      0.09%      0.17%      1.68%      0.90%      0.92%
LogMap-kg        23.15%      0.09%      0.17%      1.68%      0.90%      0.92%
Matcha           18.18%      0.10%      0.20%      2.30%      1.17%      1.19%
MDMapper         13.79%      0.22%      0.44%      4.23%      2.93%      2.88%

g1-web

                       Traditional                        isAmong
Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap            3.20%     53.64%      6.05%     50.10%     50.22%     48.76%
LogMap-bio       60.81%     40.91%     48.91%     44.59%     41.25%     41.50%
LogMap-kg        16.47%     53.64%     25.20%     50.10%     50.22%     48.76%
MDMapper         88.24%     36.36%     51.50%     45.96%     39.20%     40.69%

g2-diseases

                       Traditional                        isAmong
Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap           51.67%     69.77%     59.38%     45.81%     47.64%     45.92%
LogMap-bio       60.17%     61.02%     60.59%     44.36%     44.55%     43.85%
LogMap-kg        57.14%     62.15%     59.54%     43.59%     44.25%     43.31%
Matcha            2.50%     70.34%      4.83%     28.04%     36.68%     28.64%
MDMapper         57.45%      7.63%     13.47%     11.03%      7.64%      8.07%

g3-text

                       Traditional                        isAmong
Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap           43.75%      7.35%     12.58%     19.88%      9.11%     11.56%
LogMap-bio       43.75%      7.35%     12.58%     19.88%      9.11%     11.56%
LogMap-kg        43.75%      7.35%     12.58%     19.88%      9.11%     11.56%
Matcha           39.85%      6.96%     11.84%     19.79%      8.91%     11.47%
MDMapper         38.53%      5.51%      9.64%     14.97%      6.49%      8.37%

g5-groceries

                       Traditional                        isAmong
Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap           20.51%      5.13%      8.21%     16.71%     14.00%     14.24%
LogMap-bio       27.59%      5.13%      8.65%     15.70%     13.60%     13.80%
LogMap-kg        27.59%      5.13%      8.65%     15.70%     13.60%     13.80%
Matcha           23.53%      5.13%      8.42%     16.98%     14.02%     14.32%
MDMapper         46.51%     12.82%     20.10%     28.84%     24.81%     24.61%

g7-literature

                       Traditional                        isAmong
Matcher       Precision     Recall   F1-Score Precision*    Recall*  F1-Score*
LogMap           64.71%     13.41%     22.22%     35.90%     31.24%     32.56%
LogMap-bio       78.57%     13.41%     22.92%     29.43%     24.77%     26.01%
LogMap-kg        78.57%     13.41%     22.92%     29.43%     24.77%     26.01%
Matcha           84.62%     13.41%     23.16%     30.94%     25.31%     26.87%
MDMapper        100.00%     13.41%     23.66%     27.07%     21.47%     22.92%