These results are those presented at the EON workshop, with the results of test #206 added.
More changes will come in another context.

Table of results

Below is the table of precision and recall results computed on the output provided by the participants, with the help of the GroupEval class of the Alignment API.

java -cp lib/procalign.jar fr.inrialpes.exmo.align.util.GroupEval -c -l "karlsruhe2,umontreal,fujitsu,stanford"

algo karlsruhe2 umontreal fujitsu stanford
test Prec. Rec. Prec. Rec. Prec. Rec. Prec. Rec.
101 n/a n/a 0.59 0.97 0.99 1.00 0.99 1.00
102 NaN NaN 0.00 NaN NaN NaN NaN NaN
103 n/a n/a 0.55 0.90 0.99 1.00 0.99 1.00
104 n/a n/a 0.56 0.91 0.99 1.00 0.99 1.00
201 0.43 0.51 0.44 0.71 0.98 0.92 1.00 0.11
202 n/a n/a 0.38 0.63 0.95 0.42 1.00 0.11
204 0.62 1.00 0.55 0.90 0.95 0.91 0.99 1.00
205 0.47 0.60 0.49 0.80 0.79 0.63 0.95 0.43
206 0.48 0.60 0.46 0.75 0.85 0.64 1.00 0.46
221 n/a n/a 0.61 1.00 0.98 0.88 0.99 1.00
222 n/a n/a 0.55 0.90 0.99 0.92 0.98 0.95
223 0.59 0.96 0.59 0.97 0.95 0.87 0.95 0.96
224 0.97 0.97 0.97 1.00 0.99 1.00 0.99 1.00
225 n/a n/a 0.59 0.97 0.99 1.00 0.99 1.00
228 n/a n/a 0.38 1.00 0.91 0.97 1.00 1.00
230 0.60 0.95 0.46 0.92 0.97 0.95 0.99 0.93
301 0.85 0.36 0.49 0.61 0.89 0.66 0.93 0.44
302 1.00 0.23 0.23 0.50 0.39 0.60 0.94 0.65
303 0.85 0.73 0.31 0.50 0.51 0.50 0.85 0.81
304 0.91 0.92 0.44 0.62 0.85 0.92 0.97 0.97

In the next section, I provide some comments about these results. I also comment on a number of modifications that were made to the provided files.

Analysis of the results

Here are some considerations on the results obtained by the various participants. These are not statistically backed up and correspond only to a rough analysis. More explanations can be found in the papers presented by the participants.

There were two groups of competitors...

In this test, there are clear winners: it seems that the results provided by Stanford and Fujitsu/Tokyo outperform those provided by Karlsruhe and Montréal/INRIA.

In fact, these can be considered as two groups of programs. The Stanford and Fujitsu programs are very different but both strongly based on the labels attached to entities. For that reason they performed especially well when labels were preserved (i.e., most of the time). The Karlsruhe and INRIA systems tend to rely on many different features and thus to balance the influence of individual features, so they tend to reduce the influence of the fact that labels were preserved.

This intuition should be further examined in the light of more systematic tests, which were planned but never carried out.

...and indeed three groups of tests

Without going through a thorough statistical analysis of the results, it seems that the separation between the sets of tests that we presented (indicated by the first digit of their numbers) is significant for the participants as well.

Additional remarks

It is indeed very difficult to compare these results. There were different tests and not all go in the same direction, so the results can be aggregated in many ways (e.g., averaging, global precision/recall, counting dominance). Moreover, these results are based on two measures, precision and recall, which are easily understood but dual, in the sense that increasing one often decreases the other. This means that one algorithm can sometimes have the same results as another, yet the two are found non-comparable in the table.
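For reference, the measures appearing in these tables can all be derived from three counts: the number of correspondences found by an algorithm, the number expected (those in the reference alignment), and the number of found correspondences that are correct. The sketch below uses made-up counts, not figures from any actual submission, and assumes the standard definitions of F-measure (harmonic mean) and of the "overall" measure, O = R(2 - 1/P), which is consistent with the figures in the tables of Appendice 3.

```shell
#!/bin/sh
# Hypothetical counts (not taken from any actual submission):
#   correct  = found correspondences also present in the reference alignment
#   found    = correspondences returned by the algorithm
#   expected = correspondences in the reference alignment
correct=55 found=61 expected=61
awk -v c="$correct" -v f="$found" -v e="$expected" 'BEGIN {
  p = c / f                  # precision
  r = c / e                  # recall
  fm = 2 * p * r / (p + r)   # F-measure: harmonic mean of p and r
  o = r * (2 - 1 / p)        # overall: can be negative when p < 0.5
  printf "Prec. %.2f Rec. %.2f FMeas. %.2f Over. %.2f\n", p, r, fm, o
}'
# prints: Prec. 0.90 Rec. 0.90 FMeas. 0.90 Over. 0.80
```

Note that the overall measure penalizes low precision much more heavily than recall does, which is why it drops below zero in several cells of the tables while precision and recall stay positive.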

As an indication, the average values are given below:

algo karlsruhe2 umontreal fujitsu stanford
test Prec. Rec. Prec. Rec. Prec. Rec. Prec. Rec.
1xx n/a n/a 0.57 0.93 0.99 1.00 0.99 1.00
2xx 0.59 0.80 0.54 0.88 0.94 0.84 0.99 0.75
3xx 0.90 0.56 0.37 0.56 0.66 0.67 0.92 0.72
total 0.71 0.71 0.48 0.82 0.89 0.83 0.97 0.78
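These group averages are plain means over the rows of the main table above; for instance, the 3xx precision of stanford is the mean of its precision on tests 301 through 304, which can be checked directly:

```shell
#!/bin/sh
# Sanity check of one cell of the averages table: stanford precision
# on the 3xx group, i.e. the mean over tests 301, 302, 303 and 304.
awk 'BEGIN { printf "%.2f\n", (0.93 + 0.94 + 0.85 + 0.97) / 4 }'
# prints: 0.92
```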

Modifications made to the files

Establishing these results has been automated as much as possible. However, we had to modify the provided files in a number of ways, which are listed below.

The complete results can be retrieved from this archive.

A first remark is that we had some problems with accented characters that we did not solve (test #206).

Karlsruhe

Karlsruhe did not provide results for all the tests and we did not complete their tests for them.

The Karlsruhe results for 204-206 were inverted (i.e., ontology 2 was that of test 101). We used the ParserPrinter utility of the Alignment API to restore the correct order. Without this correction, the results would have been 0.

Most of the Karlsruhe results contained references to the first test series (notably Techreport and others in #230), which have been discarded.

They also contained references to special ontologies (onto-noinst) not provided with the tests, and references to alignments with FOAF entities (perfectly valid but discarded for technical reasons). It is believed that this discarding tends to improve the results of the tests anyway.

The first set of result files from Karlsruhe was based on the initially released tests, had no threshold applied, and contained some KAON objects in the results. After we asked them to redo the tests, these problems were solved.

Fujitsu and Kyoto university

The ontology namespaces and filenames provided in these tests were based on a local file system (moreover, the filenames onto.rdf were changed to refonto.rdf). They have thus been automatically converted to those of the contest.

INRIA and University of Montréal

The results provided by INRIA and University of Montréal used the Alignment API. The only change that was made was in the namespace of the Alignment API itself, since that of the implementation did not correspond to that of the specification.

Stanford

The results of Stanford were provided as pure RDF files, with heavy use of references and of XML attributes for datatyped attributes. Since our alignment parser expects RDF/XML files and works only on the XML (i.e., it is unable to resolve rdf references), we developed an XSLT stylesheet solving these problems.

Results #302 and #221 were, however, slightly different and required separate treatment.

Results #230, #201 and #202 were not referring to the latest files, so they have been corrected as well as possible (by discarding the cells causing trouble).

Most URIs were local (referring to the submitter's own file system), so we had to adjust this too.
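The kind of rewriting this required is sketched below. The local path in the sample input is invented for illustration (the actual paths in the submission were different); the target URI is the contest namespace used elsewhere in this document, as in the Fujitsu case of the appendix script.

```shell
#!/bin/sh
# Create a one-line sample file standing in for a submitted result file;
# the file:// path is a made-up example, not the submitter's real path.
printf '<Ontology rdf:about="file:///C:/contest/101/onto.rdf"/>\n' > sample.rdf
# Rewrite the local prefix into the contest namespace.
sed -e 's;file:///C:/contest/;http://co4.inrialpes.fr/align/Contest/;g' \
    sample.rdf > sample-fixed.rdf
cat sample-fixed.rdf
# prints: <Ontology rdf:about="http://co4.inrialpes.fr/align/Contest/101/onto.rdf"/>
```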

Stanford provided two additional tests that were not included in these results. They are, however, perfectly legitimate tests that should be included in further campaigns.

Lockheed Martin

Lockheed Martin provided a set of results in Notation 3 (N3). Due to the size of the result files and the need to create another XSLT stylesheet for the XML/RDF files returned by CWM, it was not possible to include these before the meeting.

Appendix 1: shell script for adjusting most of the problems

#!/bin/sh

CWD=`pwd`

for i in `ls -d ???`
do
if [ -a $i/refalign.rdf ]
then
ed $i/refalign.rdf << EOF
1,$ s:http\://exmo.inrialpes.fr/align/1.0:http\://knowledgeweb.semanticweb.org/heterogeneity/alignment:
w
EOF
else
	echo no $i/refalign.rdf
fi
if [ -a $i/umontreal.rdf ]
then
ed $i/umontreal.rdf << EOF
1,$ s:http\://exmo.inrialpes.fr/align/1.0:http\://knowledgeweb.semanticweb.org/heterogeneity/alignment:
w
EOF
else
	echo no $i/umontreal.rdf
fi
if [ -a $i/karlsruhe.rdf ]
then
	echo "Normally we should have to suppress KAON objects"
	echo "Normally we should have to suppress non-existing objects"
ed $i/karlsruhe.rdf << EOF
1,$ s;101/onto.rdf#Techreport;101/onto.rdf#TechReport;
w
EOF
else
	echo no $i/karlsruhe.rdf
fi
if [ -a $i/fujitsu.rdf ]
then
ed $i/fujitsu.rdf << EOF
1,$ s;file:///D:/align/align/oacontest/;http://co4.inrialpes.fr/align/Contest/;
1,$ s;file:/D:/align/align/oacontest/;http://co4.inrialpes.fr/align/Contest/;
1,$ s;refonto.rdf;onto.rdf;
1,$ s:../../../dtd:../dtd:
w
EOF
else
	echo no $i/fujitsu.rdf
fi
if [ -a $i/stanford.rdf ]
then
	xsltproc stan.xsl $i/stanford.rdf > $i/stan.rdf
	mv $i/stan.rdf $i/stanford.rdf
else
	echo no $i/stanford.rdf
fi
done

Appendix 2: XSLT stylesheet for transforming RDF into parsable XML/RDF

This can be found here.

Appendix 3: Complementary results

This first table displays, together with the results of the participants in the contest, those obtained by the demonstration aligners provided with the Alignment API, as well as the first results obtained by Karlsruhe.

java -cp /Volumes/Phata/JAVA/ontoalign/lib/procalign.jar fr.inrialpes.exmo.align.util.GroupEval -l "std,nea,ssda5,edna5,sdna5,karlsruhe,karlsruhe2,umontreal,fujitsu,stanford" -f "pr" -c

algo std nea ssda5 edna5 sdna5 karlsruhe karlsruhe2 umontreal fujitsu stanford
test Prec. Rec. Prec. Rec. Prec. Rec. Prec. Rec. Prec. Rec. Prec. Rec. Prec. Rec. Prec. Rec. Prec. Rec. Prec. Rec.
101 0.89 0.36 0.89 0.98 0.87 0.99 0.87 0.99 0.87 0.99 n/a n/a n/a n/a 0.59 0.97 0.99 1.00 0.99 1.00
102 0.00 NaN 0.00 NaN 0.00 NaN 0.00 NaN 0.00 NaN n/a n/a NaN NaN 0.00 NaN NaN NaN NaN NaN
103 0.89 0.36 0.90 0.99 0.87 0.99 0.87 0.99 0.87 0.99 n/a n/a n/a n/a 0.55 0.90 0.99 1.00 0.99 1.00
104 0.89 0.36 0.89 0.98 0.86 0.98 0.86 0.98 0.87 0.99 n/a n/a n/a n/a 0.56 0.91 0.99 1.00 0.99 1.00
201 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.43 0.51 0.43 0.51 0.44 0.71 0.98 0.92 1.00 0.11
202 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 n/a n/a n/a n/a 0.38 0.63 0.95 0.42 1.00 0.11
204 0.83 0.22 0.85 0.66 0.71 0.78 0.84 0.96 0.70 0.77 0.00 0.00 0.62 1.00 0.55 0.90 0.95 0.91 0.99 1.00
205 0.60 0.07 0.61 0.21 0.36 0.34 0.39 0.32 0.40 0.34 0.00 0.00 0.47 0.60 0.49 0.80 0.79 0.63 0.95 0.43
221 0.89 0.36 0.89 0.98 0.86 0.98 0.86 0.98 0.86 0.98 n/a n/a n/a n/a 0.61 1.00 0.98 0.88 0.99 1.00
222 0.85 0.31 0.89 0.93 0.82 0.93 0.84 0.93 0.83 0.93 n/a n/a n/a n/a 0.55 0.90 0.99 0.92 0.98 0.95
223 0.78 0.32 0.85 0.93 0.83 0.95 0.82 0.93 0.83 0.95 0.59 0.96 0.59 0.96 0.59 0.97 0.95 0.87 0.95 0.96
224 0.89 0.36 0.89 0.98 0.87 0.99 0.87 0.99 0.86 0.98 0.97 0.98 0.97 0.97 0.97 1.00 0.99 1.00 0.99 1.00
225 0.89 0.36 0.90 0.99 0.86 0.98 0.86 0.98 0.87 0.99 n/a n/a n/a n/a 0.59 0.97 0.99 1.00 0.99 1.00
228 0.92 1.00 0.79 1.00 0.67 1.00 0.63 1.00 0.69 1.00 n/a n/a n/a n/a 0.38 1.00 0.91 0.97 1.00 1.00
230 0.86 0.33 0.87 0.92 0.70 0.97 0.77 0.97 0.76 0.99 0.60 0.95 0.60 0.95 0.46 0.92 0.97 0.95 0.99 0.93
301 0.93 0.21 0.94 0.25 0.60 0.80 0.76 0.79 0.75 0.79 n/a n/a 0.85 0.36 0.49 0.61 0.89 0.66 0.93 0.44
302 0.91 0.21 0.97 0.58 0.41 0.65 0.57 0.60 0.54 0.65 0.67 0.21 1.00 0.23 0.23 0.50 0.39 0.60 0.94 0.65
303 0.87 0.27 0.81 0.46 0.43 0.79 0.52 0.81 0.46 0.79 n/a n/a 0.85 0.73 0.31 0.50 0.51 0.50 0.85 0.81
304 0.87 0.36 0.85 0.61 0.77 0.96 0.77 0.95 0.79 0.95 n/a n/a 0.91 0.92 0.44 0.62 0.85 0.92 0.97 0.97

The following table presents the F-measure and overall results for the same set of algorithms.

java -cp /Volumes/Phata/JAVA/ontoalign/lib/procalign.jar fr.inrialpes.exmo.align.util.GroupEval -l "std,nea,ssda5,edna5,sdna5,karlsruhe,karlsruhe2,umontreal,fujitsu,stanford" -f "mo" -c

algo std nea ssda5 edna5 sdna5 karlsruhe karlsruhe2 umontreal fujitsu stanford
test FMeas. Over. FMeas. Over. FMeas. Over. FMeas. Over. FMeas. Over. FMeas. Over. FMeas. Over. FMeas. Over. FMeas. Over. FMeas. Over.
101 0.52 0.32 0.93 0.86 0.92 0.84 0.92 0.84 0.92 0.84 n/a n/a n/a n/a 0.73 0.30 0.99 0.99 0.99 0.99
102 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN n/a n/a NaN NaN NaN NaN NaN NaN NaN NaN
103 0.52 0.32 0.94 0.88 0.92 0.84 0.92 0.84 0.92 0.84 n/a n/a n/a n/a 0.68 0.16 0.99 0.99 0.99 0.99
104 0.52 0.32 0.93 0.86 0.91 0.81 0.91 0.81 0.92 0.84 n/a n/a n/a n/a 0.69 0.19 0.99 0.99 0.99 0.99
201 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.46 -0.16 0.46 -0.16 0.54 -0.20 0.95 0.90 0.20 0.11
202 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN n/a n/a n/a n/a 0.48 -0.38 0.58 0.40 0.20 0.11
204 0.35 0.18 0.74 0.54 0.74 0.46 0.89 0.77 0.73 0.44 NaN NaN 0.76 0.38 0.68 0.16 0.93 0.87 0.99 0.99
205 0.12 0.02 0.31 0.08 0.35 -0.27 0.35 -0.17 0.37 -0.16 NaN NaN 0.53 -0.06 0.61 -0.03 0.70 0.46 0.59 0.41
206 0.06 -0.01 0.24 0.01 0.33 -0.27 0.44 0.03 0.41 0.00 0.54 -0.04 0.54 -0.04 0.57 -0.14 0.73 0.53 0.63 0.46
221 0.52 0.32 0.93 0.86 0.91 0.81 0.91 0.81 0.91 0.81 n/a n/a n/a n/a 0.76 0.36 0.92 0.86 0.99 0.99
222 0.45 0.25 0.91 0.81 0.87 0.73 0.89 0.76 0.88 0.75 n/a n/a n/a n/a 0.68 0.16 0.95 0.91 0.96 0.92
223 0.45 0.23 0.89 0.77 0.88 0.75 0.87 0.73 0.88 0.75 0.73 0.30 0.73 0.30 0.73 0.30 0.91 0.82 0.95 0.90
224 0.52 0.32 0.93 0.86 0.92 0.84 0.92 0.84 0.91 0.81 0.97 0.95 0.97 0.93 0.98 0.97 0.99 0.99 0.99 0.99
225 0.52 0.32 0.94 0.88 0.91 0.81 0.91 0.81 0.92 0.84 n/a n/a n/a n/a 0.73 0.30 0.99 0.99 0.99 0.99
228 0.96 0.91 0.88 0.73 0.80 0.52 0.78 0.42 0.81 0.55 n/a n/a n/a n/a 0.55 -0.66 0.94 0.88 1.00 1.00
230 0.48 0.28 0.90 0.79 0.82 0.56 0.86 0.69 0.86 0.67 0.73 0.31 0.73 0.31 0.62 -0.14 0.96 0.92 0.96 0.92
301 0.35 0.20 0.39 0.23 0.69 0.28 0.77 0.54 0.77 0.52 n/a n/a 0.51 0.30 0.54 -0.01 0.75 0.57 0.60 0.41
302 0.34 0.19 0.73 0.56 0.50 -0.29 0.59 0.17 0.59 0.10 0.32 0.10 0.37 0.23 0.31 -1.20 0.47 -0.35 0.77 0.60
303 0.41 0.23 0.59 0.35 0.55 -0.27 0.63 0.06 0.58 -0.12 n/a n/a 0.79 0.60 0.38 -0.60 0.51 0.02 0.83 0.67
304 0.50 0.30 0.71 0.50 0.85 0.67 0.85 0.67 0.86 0.70 n/a n/a 0.92 0.83 0.51 -0.17 0.89 0.76 0.97 0.95