Ontology Alignment Evaluation Initiative - OAEI-2014 Campaign

Results for OAEI 2014 - Library Track

The following content is (mainly) based on the final version of the library section in the OAEI results paper.
If you notice any kind of error (wrong numbers, incorrect information on a matching system) do not hesitate to contact us.

Reference Alignment

The reference alignment we used for the evaluation is now available. Download Reference Alignment


Libraries play an important role in the linked data web, and they widely agree that linked data technologies are ideal to integrate the data of libraries around the world and to foster collaboration on cataloguing among libraries. Library data consists not only of the vast amount of cataloguing data, but also -- and probably more interesting for other communities -- of authority data, i.e., normed descriptions of locations, events, persons, corporate bodies, and subject concepts. The subject concepts are usually organized in more or less hierarchical knowledge organization systems, together with semantic relations between the concepts. A thesaurus is such a knowledge organization system that is used for indexing purposes and that provides quasi-synonymous, describing labels for each concept. Thesauri are sometimes referred to as lightweight ontologies; however, we will see that this characterization can be misleading.

Thesauri, and authority data in general, have a long history in libraries and are actively used and maintained by information professionals and domain experts. Due to their high quality and their long-term development, they could function as a "backbone of the Semantic Web".

Most thesauri are domain-dependent and specialized to be used within a certain field, e.g., to index publications with an economic focus. During previous experiments, we examined the topical overlap between the two thesauri used in this challenge: TheSoz (social sciences) and STW (economics). Not only do they share a lot of concepts, there is also a manually created alignment that can be used as a reference. Many thesauri exist that cover the same or overlapping domains, often in different languages. Multilingual thesauri are an important means to bridge the gap between catalogs in different languages, so that users can search for relevant literature using their own language. Another possibility is the creation of links between concepts across different thesauri, possibly in different languages. Such alignments -- or correspondences or cross-concordances -- can be exploited to mutually add further information to both thesauri and subsequently improve retrieval. Therefore, for many thesauri, alignments exist that were manually created by domain experts. Nevertheless, the automatic identification of alignments is strongly desired, mainly for two reasons. First, the manual creation of alignments between all existing thesauri is not feasible, so additional alignments have to be created, possibly by exploiting existing alignments (e.g., their transitivity). Second, automatically created alignments can be used to improve and enhance existing alignments, after approval by a domain expert. This is necessary, as most existing alignments are not complete, and even if they are supposed to be complete, they have to be maintained just like the thesauri themselves, i.e., a constant effort is required to keep them up to date.

This library track is a new track within OAEI. However, there has already been a library track from 2007 to 2009 using different thesauri, as well as other thesaurus tracks like the food track and the environment track. A common motivation is that these tracks use a real-world scenario, i.e., real thesauri. For us, it is still a motivation to develop a better understanding of how thesauri differ from ontologies and how these differences affect state-of-the-art ontology matchers. We hope that the community accepts the challenge and that significant improvements follow that raise the quality of automatic alignments between thesauri. Furthermore, we will pass the matching results to the maintainers of the reference alignment as input for improving it. While a full manual evaluation of all matching results is certainly not feasible, this way we constantly improve the reference alignment and mitigate possible weaknesses and incompleteness.

Test data

The library track uses two real-world thesauri that are comparable in many respects. They have roughly the same size, were both originally developed in German, are today both multilingual, both have English translations, and, most importantly, despite being from two different domains, they have huge overlapping areas. Not least, both are freely available in RDF using SKOS.


The STW Thesaurus for Economics provides vocabulary on any economic subject: more than 6,000 standardized subject headings (skos:Concepts, with preferred labels in English and German) and 19,000 additional keywords (skos:altLabels) in both languages. The vocabulary was developed for indexing purposes in libraries and economic research institutions and includes technical terms used in law, sociology, or politics, and geographic names. The entries are richly interconnected by 16,000 skos:broader/narrower and 10,000 skos:related relations. An additional hierarchy of main categories provides a high-level overview. The vocabulary is maintained on a regular basis by ZBW German National Library of Economics - Leibniz Centre for Economics and has been translated into SKOS.


The Thesaurus for the Social Sciences (TheSoz) serves as a crucial instrument for indexing documents and research information in the social sciences. It contains about 12,000 keywords overall, of which 8,000 are standardized subject headings (in English and German) and 4,000 are additional keywords. The thesaurus covers all topics and sub-disciplines of the social sciences. Additionally, terms from associated and related disciplines are included in order to support accurate and adequate indexing of interdisciplinary, practice-oriented and multi-cultural documents. The thesaurus is owned and maintained by GESIS - Leibniz Institute for the Social Sciences and is available in SKOS.

Reference Alignment

A mapping between STW and TheSoz already exists and has been manually created by domain experts in the KoMoHe project [Mayr2008]. However, it does not cover the changes and enhancements in both thesauri since 2006. It is available in SKOS with the different matching types skos:exactMatch, skos:broadMatch and skos:narrowMatch. Within the reference alignment, concepts of one thesaurus may be aligned to more than one concept of the second thesaurus. Thus, we face an n:m mapping of the concepts. All in all, 4,285 TheSoz concepts and 2,320 STW concepts are aligned with 2,839 exact matches, 34 broader matches and 1,416 narrower matches. It is important to note that the reference alignment only contains alignments between the descriptors of both thesauri, i.e., the concepts that are actually used for document indexing. The upper part of the hierarchy consists of non-descriptor concepts (or categories) that are only used to organize the descriptors below them. We take this peculiarity into account by assessing only the generated alignments between descriptors and ignoring alignments between non-descriptors. However, this might change in the future, as the results of this track could be used to extend the reference alignment to the upper part of the hierarchy.


Ontology matching systems taking part in the OAEI only work on OWL ontologies and are not (yet) ready to deal with the specialties of a thesaurus. To get first results and to lower the barrier of taking part in this challenge, we provide OWL versions of the thesauri, generated as follows:

skos:Concept ➔ owl:Class
skos:prefLabel, skos:altLabel ➔ rdfs:label
skos:scopeNote, skos:notation ➔ rdfs:comment
skos:narrower ➔ rdfs:superClassOf
skos:broader ➔ rdfs:subClassOf
skos:related ➔ rdfs:seeAlso
This transformation is obviously not lossless. First and foremost, within the ontology, it is not recognizable which label is the preferred one and which ones are alternative labels. Since matching systems mostly have to focus on the labels, this transformation might lead to suboptimal results. There are, however, more fundamental differences between ontologies and thesauri that we show in the next section.
This year, we also provide an OWL version including skos:prefLabel and skos:altLabel as annotation properties.
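
The mapping listed above can be pictured as a simple triple-rewriting pass. The following is a minimal sketch, not the conversion script actually used for the track: triples are plain Python tuples, and the `convert` function and the sample concept identifiers are hypothetical. The name `rdfs:superClassOf` is kept only because the list above uses it; it is not a standard RDFS property. The sketch also makes the label loss visible, since both label types end up as `rdfs:label`.

```python
# Property mapping as listed above (prefixed names kept as plain strings).
PROPERTY_MAP = {
    "skos:prefLabel": "rdfs:label",
    "skos:altLabel": "rdfs:label",
    "skos:scopeNote": "rdfs:comment",
    "skos:notation": "rdfs:comment",
    "skos:narrower": "rdfs:superClassOf",  # name taken from the list; not standard RDFS
    "skos:broader": "rdfs:subClassOf",
    "skos:related": "rdfs:seeAlso",
}


def convert(triples):
    """Rewrite SKOS triples into the OWL-style vocabulary; unmapped triples are dropped."""
    result = []
    for subject, predicate, obj in triples:
        if predicate == "rdf:type" and obj == "skos:Concept":
            result.append((subject, "rdf:type", "owl:Class"))
        elif predicate in PROPERTY_MAP:
            result.append((subject, PROPERTY_MAP[predicate], obj))
    return result


# Hypothetical sample concept (identifiers and labels are made up).
skos_triples = [
    ("ex:c1", "rdf:type", "skos:Concept"),
    ("ex:c1", "skos:prefLabel", "Tropical fruit"),
    ("ex:c1", "skos:altLabel", "Pineapple"),
    ("ex:c1", "skos:broader", "ex:c0"),
]

owl_triples = convert(skos_triples)
# After conversion, preferred and alternative labels can no longer be told apart.
labels = [o for (_, p, o) in owl_triples if p == "rdfs:label"]
print(labels)
```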


Thesauri -- and other, similar knowledge structures like classifications or taxonomies -- are often called lightweight ontologies. However, ontologies and thesauri fundamentally differ. This is also reflected by the fact that with SKOS a specific model for thesauri exists that is formulated in OWL. There, a skos:Concept is not an owl:Class. Concepts sometimes represent classes, for example the STW concept Commodities. However, this is not true for every skos:Concept, e.g., the STW concept Germany is an instance, not a class. The subordinate concepts of Commodities mostly do represent classes, like Metals -- Metal Products -- Razor. Nevertheless, the relation in SKOS between these concepts is skos:broader, not rdfs:subClassOf. A subclass relationship states that if a class B is a subclass of a class A, then all instances of B will also be instances of A. Here, all metals are commodities, but not all metal products are metals: the razor consists partly of metal, but it is not a metal.

Thesauri are created for a very specific purpose and are used in a predetermined way. This is reflected, among other things, by the distinction between descriptors and non-descriptors. Only descriptors are assigned to publications during indexing or classification. All non-descriptors serve as additional information to provide the correct context or to build up a proper hierarchy. Such a distinction typically does not exist in an ontology.

The quasi-synonymy of the describing labels for a concept is very difficult for ontology matchers (not necessarily only automatic ones). A skos:altLabel is often used to indicate subconcepts that should be subsumed under the concept in question to avoid extensive subclassing. As an example, the STW descriptor 14117-2 with the preferred English label Tropical fruit has German alternative labels like pineapple, avocado, and kiwi. In an (OWL) ontology, these alternative labels should be modeled as instances of the class Tropical fruit. In contrast, other alternative labels might really indicate alternative, synonymous terms for the preferred label.

Finally, instead of the arbitrary semantic relations that are part of an ontology, thesauri contain relations like skos:related or compoundEquivalence in TheSoz. They often carry information for the (manual) use of the thesaurus for indexing, i.e., which descriptor should be used in which case or how combinations of descriptors are to be used. Transferring them to ontological relations is not always possible and often depends on the individual case. It can be seen that the development of a thesaurus matcher is indeed a challenge that differs from ontology matching. Nevertheless, the commonalities between thesauri and ontologies are large enough to pave the way for further developments by means of current ontology matchers.
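
Why the difference between skos:broader and rdfs:subClassOf matters can be shown with a small sketch. It assumes the Commodities chain from the example above is read as a chain of subclass relations and computes the transitive closure; the helper `superclasses` is hypothetical and stands in for what a reasoner would infer. Under subclass semantics every razor would be inferred to be a metal, which is the wrong conclusion described above.

```python
# skos:broader links from the example, (mis)read as rdfs:subClassOf.
SUBCLASS_OF = {
    "Razor": ["Metal Products"],
    "Metal Products": ["Metals"],
    "Metals": ["Commodities"],
}


def superclasses(cls, subclass_of):
    """Return all direct and inherited superclasses of cls (transitive closure)."""
    seen = set()
    stack = list(subclass_of.get(cls, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, []))
    return seen


inferred = superclasses("Razor", SUBCLASS_OF)
# "Metals" is among the inferred superclasses, although a razor is not a metal.
print(sorted(inferred))
```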

Experimental Setting

To compare the created alignments with the reference alignment, we use the Alignment API. We only included equivalence relations (skos:exactMatch).
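
The comparison comes down to the set overlap between generated and reference correspondences. The following is a minimal sketch under stated assumptions: correspondences are represented as (source, target, relation) tuples, all identifiers are made up, and the code neither uses nor reproduces the Alignment API. It keeps only skos:exactMatch correspondences and computes precision, recall and the F-measure with beta = 1, the measures reported for this track.

```python
def equivalences(alignment):
    """Keep only (source, target) pairs whose relation is skos:exactMatch."""
    return {(s, t) for (s, t, rel) in alignment if rel == "skos:exactMatch"}


def evaluate(system, reference):
    """Return (precision, recall, f1) for two sets of correspondences."""
    system, reference = set(system), set(reference)
    true_positives = len(system & reference)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy alignments.
reference = [
    ("stw:a", "thesoz:1", "skos:exactMatch"),
    ("stw:b", "thesoz:2", "skos:exactMatch"),
    ("stw:c", "thesoz:3", "skos:narrowMatch"),  # ignored: not an equivalence
]
system = [
    ("stw:a", "thesoz:1", "skos:exactMatch"),
    ("stw:b", "thesoz:9", "skos:exactMatch"),
]

precision, recall, f1 = evaluate(equivalences(system), equivalences(reference))
print(precision, recall, f1)
```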

The generated alignments are available here.

All matching processes have been performed on a Debian machine with one 2.4GHz core and 7GB RAM allocated to each system. The evaluation has been executed by using SEALS technologies. Each participating system uses the OWL version. We computed precision, recall and F-measure (beta=1) for each matcher. Moreover, we measured the runtime and the size of the created alignment, and checked whether a 1:1 alignment has been created. To assess the results of the matchers, we developed three straightforward matching strategies that use the original SKOS version of the thesauri; they are referred to below as MatcherPrefDE, MatcherPrefEN and MatcherAllLabels.


Of all 12 participating matchers (or variants), 7 were able to generate an alignment within 8 hours. The results can be found in the table above.

The best systems in terms of F-measure are AML and LogMap. AML* and LogMap* denote the runs of these systems on the OWL dataset with SKOS annotations. For both systems, using this ontology version increases the F-measure by up to 7%, which shows that the additional information is useful. Except for AML, all systems are below the MatcherPrefDE and MatcherAllLabels strategies. A group of matchers including LogMap, LogMapLite, and XMap2 are above the MatcherPrefEN baseline. Compared to last year's evaluation, the results are similar: the baselines with preferred labels are still very good and can only be beaten by one system. AML* has a better F-measure than any other system before (4% increase compared to the best matcher of last year).

As in previous years, a domain expert carried out an additional intellectual evaluation of the automatically established alignments to further improve the reference alignment. Since the competing ontology alignment tools predominantly apply lexical approaches for the mapping of the two vocabularies, they mostly establish new correspondences on the character level. The main approaches they apply here are Levenshtein distance or string recognition, where character strings could consist of up to a whole part of a compound word, partly used as an adjective. These character-level and string-level matching approaches, together with the three straightforward matching strategies described above, could lead to different types of systematic mismatches.

Especially in the case of short terms, Levenshtein distance could lead to wrong correspondences, e.g., “Ziege” (Eng. goat) and “Zeuge” (Eng. witness) or “Dumping” (Eng. dumping) and “Doping” (Eng. doping). Mere string match could also lead to wrong correspondences. This could happen when the longest string is at the beginning of a word, e.g., “Monopson” (Eng. monopsony) and “Monotonie” (Eng. monotony), when the longest string is at the end of a word, e.g., “Zession” (Eng. cession) and “Rezession” (Eng. recession), or when the longest corresponding string is in the middle of a word, e.g., “Rohrleitungsbau” (Eng. pipeline construction) and “Jugendleiter” (Eng. youth leader). Mismatches could also occur when the longest string consists of an independently occurring word, e.g., “Kraftfahrtversicherung” (Eng. motor-vehicle insurance) and “Zusatzversicherung” (Eng. supplementary insurance), or when the longest occurring word is an adjective, e.g., “Arabisch” (Eng. Arab) and “Arabische Liga” (Eng. Arab League). Both sources of mismatch, Levenshtein distance and string match, could also occur in one single mapping case, e.g., “Leasinggesellschaft” (Eng. leasing company) and “Leistungsgesellschaft” (Eng. achieving society).

Since the translations were equally used to build up mappings, they could also lead to a number of mismatches, e.g., “Brand” (Eng. incendiary) and “Marke” (Eng. brand). The same applies to indications of homonyms, e.g., “Samen (Volk)” (Eng. sami (people)) and “Volk” (Eng. people).
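
The two lexical measures behind these mismatches can be reproduced in a few lines. The following is a minimal sketch in plain Python, not the implementation of any participating matcher; the function names are chosen for illustration. It computes the Levenshtein distance with unit costs and the longest common substring (compared case-insensitively, which is an assumption) for some of the pairs named above.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion and substitution."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def longest_common_substring(a, b):
    """Longest contiguous string shared by a and b, compared case-insensitively."""
    a, b = a.lower(), b.lower()
    best_length, best_end = 0, 0
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                current[j] = previous[j - 1] + 1
                if current[j] > best_length:
                    best_length, best_end = current[j], i
        previous = current
    return a[best_end - best_length:best_end]


print(levenshtein("Ziege", "Zeuge"))
print(levenshtein("Dumping", "Doping"))
print(longest_common_substring("Zession", "Rezession"))
```

Both German pairs are only two edits apart, and “Zession” is fully contained in “Rezession”, so a matcher that relies on these scores alone would rate unrelated concepts as near-identical.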


The overall improvement of the performance is encouraging in this challenge. While it might not look impressive at first sight to beat simple baselines such as ours, it is actually a notable achievement. The baselines are not only tailored for very high precision, benefiting from the fact that in many cases a consistent terminology is used, they also exploit additional knowledge about the labels. The matchers are general-purpose matchers that have to perform well in all challenges of the OAEI. Using the SKOS properties as annotation properties is a first step towards making use of the many concept hierarchies provided on the Web. The intellectual evaluation of new mappings which have been created automatically has shown that matching tools are apparently still based exclusively on lexical approaches (comparison at string level). It becomes obvious that, instead, context knowledge is needed to avoid false mappings. This context knowledge must clearly go beyond the mere consideration of translations and synonyms. One approach could be to consider the classification schemes of the thesauri before establishing new mappings. Taking into account the reference alignment, the highest confidence values should be assigned to those mapping candidates which come from those classification schemes which have been most commonly mapped in the reference alignment.
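
The idea in the last two sentences can be sketched as follows. This is only one possible reading of the proposal, not an evaluated method: each concept is assumed to carry a single top-level category, category pairs are counted over the reference alignment, and the lexical score of a candidate is scaled by how often its category pair has been mapped. All category names, identifiers, scores and the scoring formula are hypothetical.

```python
from collections import Counter


def category_pair_counts(reference, category_of):
    """Count how often each (source category, target category) pair occurs in the reference."""
    return Counter((category_of[s], category_of[t]) for s, t in reference)


def rescore(candidates, category_of, pair_counts):
    """Scale each candidate's lexical score by the relative frequency of its category pair."""
    most_common = max(pair_counts.values(), default=0)
    rescored = []
    for source, target, score in candidates:
        pair = (category_of[source], category_of[target])
        weight = pair_counts[pair] / most_common if most_common else 0.0
        rescored.append((source, target, score * weight))
    return sorted(rescored, key=lambda item: item[2], reverse=True)


# Hypothetical toy data: concept -> top-level category.
category_of = {
    "stw:1": "Economics", "stw:2": "Economics", "stw:3": "Law", "stw:4": "Economics",
    "thesoz:1": "Economy", "thesoz:2": "Economy", "thesoz:3": "Law", "thesoz:4": "Economy",
}
reference = [("stw:1", "thesoz:1"), ("stw:2", "thesoz:2"), ("stw:3", "thesoz:3")]
candidates = [
    ("stw:3", "thesoz:4", 0.9),  # lexically close, but the Law/Economy pair was never mapped
    ("stw:4", "thesoz:4", 0.8),  # category pair frequently mapped in the reference
]

pair_counts = category_pair_counts(reference, category_of)
ranked = rescore(candidates, category_of, pair_counts)
print(ranked)
```

In this toy setting the lexically stronger candidate drops to the bottom because its category pair never occurs in the reference alignment. A real system would probably smooth the weights rather than discard unseen pairs outright.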


Dominique Ritze (Research Group Data and Web Science, University of Mannheim) dominique[.][at]informatik[.]uni-mannheim[.]de
Kai Eckert (Research Group Data and Web Science, University of Mannheim)
Benjamin Zapilko (GESIS)
Andreas Oskar Kempf (GESIS)
Joachim Neubert (ZBW)

Original page: http://web.informatik.uni-mannheim.de/oaei-library/2014/