Alignment format and API

The Alignment format

We use a simple alignment format, an example of which can be found in the alignments of the benchmark dataset.

This format is described in the API documentation.

This format can be manipulated through the Alignment API (see below).

How to use the alignment API for generating evaluation results

The instructions here work with version 3.0 of the Alignment API (the current version is 3.6).

Getting the API

See the instructions on http://alignapi.gforge.inria.fr or download the API directly from: http://gforge.inria.fr/frs/?group_id=117

All required libraries are included.

Please always download the latest version.

Installing

$ mkdir test
$ cd test
$ curl http://gforge.inria.fr/frs/download.php/1775/align-3.0.zip -o align.zip
$ unzip align.zip

Checking that it works

First, here is a simple test to check that it works (setenv is csh syntax; with sh or bash, use export CWD=`pwd` instead):

$ setenv CWD `pwd`
$ java -jar lib/procalign.jar --help
$ java -jar lib/procalign.jar file://$CWD/rdf/onto1.owl file://$CWD/rdf/onto2.owl

The last instruction should output some XML.

Using the API for the tests

You can run all the tests at once by using the GroupAlign class.

The systematic benchmark tests

$ curl http://oaei.ontologymatching.org/2009/benchmarks/bench.zip -o bench.zip
$ unzip bench.zip
$ cd benchmarks
$ java -cp $CWD/lib/procalign.jar fr.inrialpes.exmo.align.util.GroupAlign -o stringeq  -n http://oaei.inrialpes.fr/2009/benchmarks/101/onto.rdf -i fr.inrialpes.exmo.align.impl.method.StringDistAlignment -Dnoinst=1

This will output the results of the test in the Alignment format required by the organizers, as one file named stringeq.rdf in each test directory.

Note on ISO-Latin: to use the ISO-Latin files, first replace each onto.rdf file by its onto-iso8859.rdf counterpart.
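The replacement described in this note can be scripted. The following is a minimal Python sketch, assuming each test directory sits directly under the benchmarks directory and that the ISO-Latin variant, where provided, is named onto-iso8859.rdf as stated above. The function name use_iso_latin and the backup name onto-orig.rdf are choices made for this illustration, not part of the benchmark or of the Alignment API.

```python
import shutil
import tempfile
from pathlib import Path


def use_iso_latin(bench_root):
    """Replace onto.rdf by onto-iso8859.rdf in every test directory that has one.

    The original onto.rdf is kept as onto-orig.rdf (a name chosen for this
    sketch) so that the swap can be undone by hand.
    Returns the sorted names of the directories that were changed.
    """
    changed = []
    for iso_file in sorted(Path(bench_root).glob("*/onto-iso8859.rdf")):
        test_dir = iso_file.parent
        target = test_dir / "onto.rdf"
        backup = test_dir / "onto-orig.rdf"
        # Back up the original only once, so a second run does not overwrite it.
        if target.exists() and not backup.exists():
            shutil.copyfile(target, backup)
        shutil.copyfile(iso_file, target)
        changed.append(test_dir.name)
    return changed


# Demonstration on a throwaway tree: test 101 has an ISO-Latin variant, 102 does not.
demo_root = Path(tempfile.mkdtemp())
for name, has_iso in (("101", True), ("102", False)):
    d = demo_root / name
    d.mkdir()
    (d / "onto.rdf").write_text("default encoding", encoding="ascii")
    if has_iso:
        (d / "onto-iso8859.rdf").write_text("iso-latin", encoding="ascii")

changed = use_iso_latin(demo_root)
print(changed)
```

On the real data, the call would be use_iso_latin("benchmarks"), run before GroupAlign.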

The directory tests

$ mkdir directory
$ cd directory
$ curl http://dit.unitn.it/~accord/ev_subs.zip -o ev_subs.zip
$ unzip ev_subs.zip
$ java -Xmx100000k -cp $CWD/lib/procalign.jar fr.inrialpes.exmo.align.util.GroupAlign -o stringeq -s source.owl -t target.owl -i fr.inrialpes.exmo.align.impl.method.StringDistAlignment

This will output the results of the test in the Alignment format required by the organizers, as one file named stringeq.rdf in each test directory.

Currently (2007), this test breaks after having dealt with most of the tests (it breaks at test 97 in alphabetical order with the standard 64 MB of heap). We are currently working on spotting memory leaks; a temporary fix is the one given above: increasing the maximum heap size with -Xmx100000k.
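When a run breaks part-way, it helps to know which test directories did not receive their stringeq.rdf file. The following Python sketch lists them. It assumes one subdirectory per test directly under the directory where GroupAlign was started; the actual layout of the archive is not described here, so adapt the root accordingly. The function name missing_outputs is hypothetical.

```python
import tempfile
from pathlib import Path


def missing_outputs(root, output_name="stringeq.rdf"):
    """Return the sorted names of test directories that lack the output file."""
    return sorted(
        d.name
        for d in Path(root).iterdir()
        if d.is_dir() and not (d / output_name).exists()
    )


# Demonstration on a throwaway tree: tests 1 and 2 were processed, test 3 was not.
demo_root = Path(tempfile.mkdtemp())
for name, done in (("1", True), ("2", True), ("3", False)):
    d = demo_root / name
    d.mkdir()
    (d / "source.owl").write_text("", encoding="ascii")
    (d / "target.owl").write_text("", encoding="ascii")
    if done:
        (d / "stringeq.rdf").write_text("", encoding="ascii")

missing = missing_outputs(demo_root)
print(missing)
```

Note that the names are sorted as strings, which matches the alphabetical order mentioned above rather than numeric order.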

Currently the anatomy test breaks on loading ontologies.

Evaluating your results

In the case of the benchmark tests, the reference alignments are available, so you can directly measure the performance of the method used with the GroupEval class:

$ java -cp $CWD/lib/procalign.jar fr.inrialpes.exmo.align.util.GroupEval -c -l "edna" > result.html

which will output the precision and recall results in the result.html file, which looks like this:

algo	edna
test	Prec.	Rec.
101	0.90	0.99
103	0.90	0.99
104	0.90	0.99
...
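As a reminder of what the Prec. and Rec. columns mean, here is a minimal Python sketch of the standard definitions: precision is the share of found correspondences that are in the reference alignment, and recall is the share of reference correspondences that were found. This is not the code of GroupEval, whose evaluators may treat relations or confidence values differently. Correspondences are represented here as plain (entity1, entity2, relation) tuples with made-up names, and returning 0.0 for an empty set is a convention chosen for this sketch.

```python
def precision_recall(found, reference):
    """Compute (precision, recall) of a found alignment against a reference.

    Both arguments are collections of hashable correspondences, here
    (entity1, entity2, relation) tuples.
    """
    found, reference = set(found), set(reference)
    correct = found & reference
    # Convention for this sketch: an empty denominator yields 0.0.
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical data: 4 reference correspondences, 5 found, 3 of them correct.
reference = {
    ("o1#Book", "o2#Volume", "="),
    ("o1#Author", "o2#Writer", "="),
    ("o1#Title", "o2#Name", "="),
    ("o1#Publisher", "o2#Editor", "="),
}
found = {
    ("o1#Book", "o2#Volume", "="),
    ("o1#Author", "o2#Writer", "="),
    ("o1#Title", "o2#Name", "="),
    ("o1#Page", "o2#Sheet", "="),
    ("o1#Year", "o2#Date", "="),
}

precision, recall = precision_recall(found, reference)
print(f"Prec. {precision:.2f}  Rec. {recall:.2f}")  # prints "Prec. 0.60  Rec. 0.75"
```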

This class will only work if you are connected to the web, because the reference alignments look up their ontologies on the Internet.

Note on ISO-Latin: to use the ISO-Latin files, first replace each onto.rdf file by its onto-iso8859.rdf counterpart.

It is not possible to use this for the directory and anatomy tests because reference alignments are not available (but you can create your own and check your performance in the same way).

Implementing your matcher within the API

In order for your method to be used in this way, you have to implement the Alignment API (or rather, extend our implementation of the Alignment API).

Some instructions for implementing the API can be found in the API documentation.

This basically amounts to defining a subclass of BasicAlignment that gets its parameters from a Parameters structure and fills the alignment with the results of the algorithm.

Passing parameters

If you use the Procalign or GroupAlign class, it is very convenient to pass parameters stored in a file (e.g., param.xml) with the -p option:

$ java -cp ... -p param.xml

Such a parameter file can be sent to the organizers of the evaluation.

Parameter files are in a simple XML format:

<Whatever>
<param name="name₁">value₁</param>
...
<param name="name_n">value_n</param>
</Whatever>
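Such a file can also be generated rather than written by hand. The following Python sketch builds the format shown above from a dictionary. The root tag Parameters is an arbitrary choice (the example above suggests the root name is free), noinst is the parameter used earlier with -Dnoinst=1, and threshold is a made-up parameter name used only for illustration.

```python
import xml.etree.ElementTree as ET


def build_param_xml(params, root_tag="Parameters"):
    """Serialize a dict of parameters as <param name="...">value</param> elements."""
    root = ET.Element(root_tag)
    for name, value in params.items():
        param = ET.SubElement(root, "param", {"name": name})
        param.text = str(value)
    return ET.tostring(root, encoding="unicode")


# "noinst" comes from the command line shown earlier; "threshold" is hypothetical.
xml_text = build_param_xml({"noinst": 1, "threshold": 0.8})
print(xml_text)
```

Saving the printed text as param.xml gives a file that can be passed with -p as described above.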

$Id: align.html,v 1.3 2009/08/24 09:38:33 euzenat Exp $