The Knowledge Graph Track contains isolated knowledge graphs with instance and schema data. The goal of the task is to match both the instances and the schema. The knowledge graphs were created in the course of the DBkWik project by running the DBpedia extraction framework on wikis from the Fandom wiki hosting platform. The evaluation process of the Knowledge Graph Track will be supported by the SEALS platform.
The preliminary results for OAEI have been generated and are available now: Link to results page
The data set is available from this Web page (see below). You can either download it for local analysis and processing, or use the SEALS platform directly. To run the Knowledge Graph test suite, you will have to specify the following input parameters:
For the evaluation, we use a gold standard of correspondences on both the schema and the instance level. While the schema-level correspondences were created by experts, the instance correspondences were extracted from the "External links" section of the wiki pages. Due to the large number of instances and classes, this gold standard is only a partial gold standard.
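Because the gold standard is only partial, a correspondence that is missing from it is not necessarily wrong. One possible way to account for this is sketched below: only system correspondences that mention an entity covered by the gold standard are judged. This is an assumption made for illustration, not a description of the track's actual evaluation procedure; the function name `judgeable` and the example URIs are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (source entity URI, target entity URI)


def judgeable(system: Set[Pair], gold: Set[Pair]) -> Set[Pair]:
    """Keep only system pairs whose source or target occurs in the gold standard."""
    gold_sources = {s for s, _ in gold}
    gold_targets = {t for _, t in gold}
    return {(s, t) for s, t in system if s in gold_sources or t in gold_targets}


# Hypothetical example data.
gold = {("a:Luke", "b:Luke_Skywalker")}
system = {
    ("a:Luke", "b:Luke_Skywalker"),  # present in the gold standard
    ("a:Luke", "b:Lando"),           # source is covered by gold, but the pair is not in it
    ("a:Yoda", "b:Yoda"),            # neither entity is covered, so it is not judged
}

kept = judgeable(system, gold)
false_positives = kept - gold
```

Under this assumption, the pair `("a:Yoda", "b:Yoda")` neither helps nor hurts the score, since the gold standard says nothing about either entity.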
The following table describes the knowledge graphs and the sources they were created from:
Source | Source URL | Language | Hub | Topic | #Instances | #Properties | #Classes | Dump |
Star Wars Wiki | http://starwars.wikia.com | en | Movies | Entertainment | 145,033 | 700 | 269 | rdf/xml |
The Old Republic Wiki | http://swtor.wikia.com | en | Games | Gaming | 4,180 | 368 | 101 | rdf/xml |
Star Wars Galaxies Wiki | http://swg.wikia.com | en | Games | Gaming | 9,634 | 148 | 67 | rdf/xml |
Marvel Database | http://marvel.wikia.com | en | Comics | Comics | 210,996 | 139 | 186 | rdf/xml |
Marvel Cinematic Universe Wiki | http://marvelcinematicuniverse.wikia.com | en | Movies | Entertainment | 17,187 | 147 | 55 | rdf/xml |
Memory Alpha | http://memory-alpha.wikia.com | en | TV | Entertainment | 45,828 | 325 | 181 | rdf/xml |
Star Trek Expanded Universe | http://stexpanded.wikia.com | en | TV | Entertainment | 13,426 | 202 | 283 | rdf/xml |
Memory Beta | http://memory-beta.wikia.com | en | Books | Entertainment | 51,323 | 423 | 240 | rdf/xml |
Name | Comment | Example/Preview |
anchor-text | (wiki page, dbkwik:wikiPageWikiLinkText, text) triples; the text appears as link text of the wiki page | preview |
article-categories | (wiki page, dct:subject, category) triples | preview |
category-labels | (category, rdfs:label, label) triples | preview |
disambiguations | (wiki page, dbkwik:wikiPageDisambiguates, wiki page) triples | preview |
external-links | (wiki page, dbkwik:wikiPageExternalLink, url) triples; all links to external URIs | preview |
images | triples with foaf:depiction, foaf:thumbnail, dc:rights | preview |
infobox-properties | triples extracted from infoboxes; URI of property: http://dbkwik.webdatacommons.org/{wiki}/property/{name} | preview |
infobox-property-definitions | defines the type of properties to be rdf:Property and contains corresponding labels | preview |
infobox-template-type | (wiki page, rdf:type, class) triples, where template name contains "infobox" | preview |
infobox-template-type-definitions | defines the type of properties to be rdf:Property and contains corresponding labels | preview |
labels | (wiki page, rdfs:label, label) triples; the label is usually the title of the wiki page | preview |
long-abstracts | (wiki page, dbkwik:abstract, abstract) triples; the abstract is the text up to the table of contents or the first header | preview |
short-abstracts | (wiki page, rdfs:comment, comment) triples; short abstract with a length in [200,600] (see extractor) | preview |
skos-categories | skos:prefLabel and skos:broader of categories (category tree) | preview |
template-type | (wiki page, rdf:type, class) triples | preview |
template-type-definitions | label and type of classes. | preview |
page-links | (wiki page, dbkwik:wikiPageWikiLink, wiki page) triples; represents any link between two pages | preview |
The alignments will be evaluated based on Precision, Recall and F-Measure. We will compare the overall performance, as well as the performance on the instance and schema level in isolation. The matching system does not need to match instances of the classes "http://dbkwik.webdatacommons.org/ontology/Image" and skos:Concept. These instances are included only to help the system find good matches; they are not evaluated at all, so matching them makes no difference to the results.
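As a rough illustration of the metrics named above, the following sketch computes precision, recall and F-measure for a system alignment against a reference alignment, overall and per level. The `(source, target, level)` tuple layout, the helper names `prf` and `by_level`, and the example URIs are assumptions made for this illustration; they are not the alignment format or tooling used by the track.

```python
from typing import Iterable, Set, Tuple

# (source URI, target URI, level), where level is "instance" or "schema".
Correspondence = Tuple[str, str, str]


def prf(system: Set[Correspondence], gold: Set[Correspondence]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) of `system` against `gold`."""
    true_positives = len(system & gold)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def by_level(alignment: Iterable[Correspondence], level: str) -> Set[Correspondence]:
    """Select the correspondences of one level from an alignment."""
    return {c for c in alignment if c[2] == level}


# Hypothetical example data.
gold = {
    ("a:Person", "b:Character", "schema"),
    ("a:Luke", "b:Luke_Skywalker", "instance"),
    ("a:Leia", "b:Leia_Organa", "instance"),
}
system = {
    ("a:Person", "b:Character", "schema"),
    ("a:Luke", "b:Luke_Skywalker", "instance"),
    ("a:Han", "b:Leia_Organa", "instance"),
}

overall = prf(system, gold)
instance_only = prf(by_level(system, "instance"), by_level(gold, "instance"))
schema_only = prf(by_level(system, "schema"), by_level(gold, "schema"))
```

In this toy example the wrong instance correspondence lowers the instance-level scores while the schema-level scores stay perfect, which is why reporting the two levels separately can be informative.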
[1] Sven Hertling and Heiko Paulheim. DBkWik: A Consolidated Knowledge Graph from Thousands of Wikis. International Conference on Big Knowledge 2018. [pdf]
[2] Alexandra Hofmann, Samresh Perchani, Jan Portisch, Sven Hertling, and Heiko Paulheim. DBkWik: Towards Knowledge Graph Creation from Thousands of Wikis. International Semantic Web Conference (Posters & Demos) 2017. [pdf]