Five alignment systems enrolled in the complex track: AML, AMLC, AROA, CANARD, and Lily. Besides these five systems, we also evaluated other alignment systems. The systems were run on the GeoLink dataset, and the output alignments were evaluated as described below.
In this subtrack, the alignments are evaluated automatically over the GeoLink dataset, which contains around 74k instance data drawn from real-world datasets. The quality of a mapping can be assessed along two dimensions. The first is whether the mapping contains the correct entities, i.e., those that the reference alignment says should be involved. The second is the relationship that holds between these entities, e.g., equivalence or subsumption. Accordingly, we break the evaluation procedure down into three subtasks: entity identification, relationship identification, and full complex identification.
In entity identification, for each entity in the source ontology, the alignment system is asked to list all of the entities in the target ontology that are related to it in some way. For example, consider the following reference axiom:
owl:equivalentClasses(ont1:A1 owl:intersectionOf(ont2:B1 owl:someValuesFrom(ont2:B2 ont2:B3)))
The goal in this task is to find the entities in ont2 that are most relevant to the class ont1:A1. In this case, the best output would be ont2:B1, ont2:B2, and ont2:B3.
The result is evaluated based on precision, recall, and f-measure.
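As an illustration of this step, the following minimal sketch computes precision, recall, and f-measure for the example above using set overlap. The function name `entity_scores` and the system output `found` (including the spurious `ont2:B4`) are hypothetical and only serve the example; this is not the evaluation script used for the track.

```python
def entity_scores(found, expected):
    """Precision, recall, and f-measure of found entities against expected ones."""
    found, expected = set(found), set(expected)
    correct = len(found & expected)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(expected) if expected else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Reference entities for ont1:A1, taken from the example axiom.
expected = {"ont2:B1", "ont2:B2", "ont2:B3"}
# Hypothetical system output: two correct entities and one spurious one.
found = {"ont2:B1", "ont2:B2", "ont2:B4"}

precision, recall, f_measure = entity_scores(found, expected)
print(precision, recall, f_measure)
```

With two of three returned entities correct and two of three reference entities found, all three measures equal 2/3 in this example.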
In relationship identification, for each alignment, the system should then try to find the concrete relationship that holds between the entities, such as equivalence, subsumption, intersection, value restriction, and so on. For the example above, an alignment system eventually needs to determine that the relationship between the two sides is equivalence. Table 1 shows the similarity values that we used in the evaluation for the different situations. We do not penalise an incorrect relationship with a similarity of zero, because that would discard the entity identification output entirely, without considering whether the result is reasonable or completely incorrect.
Table 1. Similarity for Relationship Identification
|Found Relation||Correct Relation||Similarity||Comment|
|⊂||=||0.8||return less information, but correct|
|=||⊃||0.8||return less information, but correct|
|⊃||=||0.6||return more information, but incorrect|
|=||⊂||0.6||return more information, but incorrect|
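The rows of Table 1 can be read as a lookup keyed on the pair (found relation, correct relation). The sketch below encodes only the four rows listed above; the names `SIMILARITY` and `relation_similarity` are hypothetical, and pairs that are not listed in the table are returned as `None` rather than guessed.

```python
# (found relation, correct relation) -> similarity, copied from Table 1.
SIMILARITY = {
    ("⊂", "="): 0.8,  # returns less information, but correct
    ("=", "⊃"): 0.8,  # returns less information, but correct
    ("⊃", "="): 0.6,  # returns more information, but incorrect
    ("=", "⊂"): 0.6,  # returns more information, but incorrect
}


def relation_similarity(found, correct):
    """Return the Table 1 similarity, or None when the pair is not listed."""
    return SIMILARITY.get((found, correct))


print(relation_similarity("⊂", "="))
print(relation_similarity("=", "⊂"))
```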
Full complex identification combines the former two steps. We multiply the results from entity identification by the similarity of the relationship to obtain the relaxed precision, recall, and f-measure. Multiplication is only one possible choice; other aggregation functions could also have been used, and might combine the two scores better.
relaxed_precision = entity_precision * similarity of relationship
relaxed_recall = entity_recall * similarity of relationship
relaxed_f-measure = 2 * relaxed_precision * relaxed_recall / (relaxed_precision + relaxed_recall)
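A minimal sketch of the three formulas above, assuming the entity precision and recall and the relationship similarity have already been computed. The function name `relaxed_scores` and the input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def relaxed_scores(entity_precision, entity_recall, similarity):
    """Scale entity precision and recall by the relationship similarity."""
    relaxed_precision = entity_precision * similarity
    relaxed_recall = entity_recall * similarity
    total = relaxed_precision + relaxed_recall
    # Guard against division by zero when both relaxed scores are zero.
    relaxed_f = 2 * relaxed_precision * relaxed_recall / total if total else 0.0
    return relaxed_precision, relaxed_recall, relaxed_f


# Hypothetical inputs: entity precision 1.0, entity recall 0.5, similarity 0.8.
rp, rr, rf = relaxed_scores(1.0, 0.5, 0.8)
print(rp, rr, rf)
```

For these inputs the relaxed precision is 0.8, the relaxed recall is 0.4, and the relaxed f-measure is 2 * 0.8 * 0.4 / 1.2, which is about 0.533.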
The output alignments as well as the detailed results of the systems over the GeoLink dataset are downloadable here.
Table 2. The Performance of All Alignment Systems on the Hydrography Benchmark
|Systems||(1:1)||(1:n)||(m:n)||relaxed precision||relaxed recall||relaxed f-measure|
Lu Zhou, Michelle Cheatham, Adila Krisnadhi, Pascal Hitzler. A Complex Alignment Benchmark: GeoLink Dataset. In: International Semantic Web Conference, Proceedings, Part II, pp. 273-288. Springer, 2018.
Marc Ehrig, Jérôme Euzenat. Relaxed Precision and Recall for Ontology Matching. In: K-CAP 2005 Workshop on Integrating Ontologies, Banff, Canada, 2005.