Benchmark test

The goal of the benchmark test is to offer a set of tests that are wide in feature coverage, progressive and stable. It serves to evaluate the strengths and weaknesses of matchers (by being progressive and having wide coverage) and to measure the progress of matchers (by being stable and reusable over the years).

Data sets

The benchmark test library consists of data sets that are built from reference ontologies of different sizes and from different domains. The bibliographic ontology described here has been the main reference ontology since the beginning of OAEI campaigns. Since OAEI 2011, we have used new, systematically generated benchmarks based on ontologies other than the bibliographic one.

As in previous campaigns, benchmark test suites (or data sets) will be generated from seed ontologies. The following (incomplete) table summarizes the sizes of these ontologies. In 2013, we plan to generate more difficult tests, likely from the same bibliographic ontology or from a third ontology. For these new tests, the evaluation will be conducted in a blind fashion, i.e., the participants will have no access to the original ontologies.


Test set (ontology size)	biblio	finance
classes+prop	97	633
instances	112	1113
entities	209	1746
triples	1332	21979
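
The figures in the table appear to be related: in both columns, the entity count equals the sum of the classes+prop and instances rows. A minimal Python sketch of that sanity check follows; the dictionary layout and the function name are illustrative choices, and only the numbers come from the table.

```python
# Sizes copied from the table above; the dictionary layout is illustrative.
sizes = {
    "biblio": {"classes+prop": 97, "instances": 112, "entities": 209, "triples": 1332},
    "finance": {"classes+prop": 633, "instances": 1113, "entities": 1746, "triples": 21979},
}


def entities_consistent(row):
    """Return True when entities == classes+prop + instances for one test set."""
    return row["classes+prop"] + row["instances"] == row["entities"]


for name, row in sizes.items():
    # Triples per entity gives a rough idea of how densely described each ontology is.
    density = row["triples"] / row["entities"]
    print(name, entities_consistent(row), round(density, 1))
```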

Testing your tool

It is no longer necessary to download the data sets (it has always been better to get them from the web). The SEALS platform will provide the data sets.

Participants can test their tools using the standard benchmark data set described above, which can be downloaded here. They can reinforce this testing with a subset of the data sets built from reference ontologies used in previous campaigns, which are stored in the Test Data Repository accessible through the SEALS portal. Note that two of these data sets were built from the same reference ontologies (biblio and finance) that will be used in OAEI 2013.

All those data sets maintain the structure explained in the Example of a complete benchmark data set section, and testing with those data sets can be done by using the SEALS client. This client iterates over the tests in a data set whose identifier is provided as a parameter. In all cases, the ontologies found in the data set directories are matched (either against the ontology found in 101/onto.rdf or, when both ontologies to match are in the same directory, against each other). The resulting alignments must be output in the alignment format. They are placed in a local directory that is also given as a parameter to the client.
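
The SEALS client performs this iteration itself. To illustrate the first case only (each test matched against 101/onto.rdf), here is a minimal Python sketch that enumerates the pairs of ontologies in a local copy of a data set. It assumes one sub-directory per test, each containing a file named onto.rdf; the function name and its parameters are hypothetical and are not part of the SEALS client.

```python
from pathlib import Path


def list_test_pairs(dataset_root, reference_test="101", filename="onto.rdf"):
    """Return (test_id, reference_path, test_path) for each test directory.

    Assumed layout: <dataset_root>/<test_id>/onto.rdf, with the reference
    ontology stored in <dataset_root>/101/onto.rdf.
    """
    root = Path(dataset_root)
    reference = root / reference_test / filename
    pairs = []
    # Sort directories so that tests are visited in a stable order.
    for test_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        candidate = test_dir / filename
        # Directories without an ontology file are skipped.
        if candidate.is_file():
            pairs.append((test_dir.name, reference, candidate))
    return pairs
```

A matcher under test would then be called once per returned pair, writing one alignment file per test into the output directory.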

The identifiers of data sets for testing with the SEALS client are given below:

Biblio data set

Repository: http://seals-test.sti2.at/tdrs-web/
Suite-ID: biblio-dataset
Version-ID: biblio-dataset-r1

Finance data set

Repository: http://seals-test.sti2.at/tdrs-web/
Suite-ID: finance-dataset
Version-ID: finance-dataset-r1

We encourage you to use the Alignment API for manipulating and generating your alignments and, in particular, for evaluating your results.
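
Matcher results are commonly evaluated by comparing the produced alignment with a reference alignment in terms of precision and recall. The following Python sketch illustrates that computation on alignments represented as sets of (entity1, entity2, relation) triples. It is not the Alignment API; the function name, the representation, and the example URIs are assumptions made for illustration.

```python
def precision_recall(found, reference):
    """Compare two alignments given as collections of (entity1, entity2, relation)."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    # Precision: share of returned correspondences that are in the reference.
    precision = correct / len(found) if found else 0.0
    # Recall: share of reference correspondences that were returned.
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical correspondences, for illustration only.
reference = {
    ("http://example.org/a#Book", "http://example.org/b#Volume", "="),
    ("http://example.org/a#Author", "http://example.org/b#Writer", "="),
}
found = {
    ("http://example.org/a#Book", "http://example.org/b#Volume", "="),
    ("http://example.org/a#Book", "http://example.org/b#Writer", "="),
}
print(precision_recall(found, reference))
```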

Contacts

Contact address is Jerome : Euzenat # inria : fr