These results are those presented at the EON workshop, to which the results of test #206 have been added. More changes will come in another context.
Below is the table of precision and recall results computed on the output provided by the participants, with the help of the GroupEval class of the Alignment API.
java -cp lib/procalign.jar fr.inrialpes.exmo.align.util.GroupEval -c -l "karlsruhe2,umontreal,fujitsu,stanford"
test | karlsruhe2 Prec. | karlsruhe2 Rec. | umontreal Prec. | umontreal Rec. | fujitsu Prec. | fujitsu Rec. | stanford Prec. | stanford Rec. |
---|---|---|---|---|---|---|---|---|
101 | n/a | n/a | 0.59 | 0.97 | 0.99 | 1.00 | 0.99 | 1.00 |
102 | NaN | NaN | 0.00 | NaN | NaN | NaN | NaN | NaN |
103 | n/a | n/a | 0.55 | 0.90 | 0.99 | 1.00 | 0.99 | 1.00 |
104 | n/a | n/a | 0.56 | 0.91 | 0.99 | 1.00 | 0.99 | 1.00 |
201 | 0.43 | 0.51 | 0.44 | 0.71 | 0.98 | 0.92 | 1.00 | 0.11 |
202 | n/a | n/a | 0.38 | 0.63 | 0.95 | 0.42 | 1.00 | 0.11 |
204 | 0.62 | 1.00 | 0.55 | 0.90 | 0.95 | 0.91 | 0.99 | 1.00 |
205 | 0.47 | 0.60 | 0.49 | 0.80 | 0.79 | 0.63 | 0.95 | 0.43 |
206 | 0.48 | 0.60 | 0.46 | 0.75 | 0.85 | 0.64 | 1.00 | 0.46 |
221 | n/a | n/a | 0.61 | 1.00 | 0.98 | 0.88 | 0.99 | 1.00 |
222 | n/a | n/a | 0.55 | 0.90 | 0.99 | 0.92 | 0.98 | 0.95 |
223 | 0.59 | 0.96 | 0.59 | 0.97 | 0.95 | 0.87 | 0.95 | 0.96 |
224 | 0.97 | 0.97 | 0.97 | 1.00 | 0.99 | 1.00 | 0.99 | 1.00 |
225 | n/a | n/a | 0.59 | 0.97 | 0.99 | 1.00 | 0.99 | 1.00 |
228 | n/a | n/a | 0.38 | 1.00 | 0.91 | 0.97 | 1.00 | 1.00 |
230 | 0.60 | 0.95 | 0.46 | 0.92 | 0.97 | 0.95 | 0.99 | 0.93 |
301 | 0.85 | 0.36 | 0.49 | 0.61 | 0.89 | 0.66 | 0.93 | 0.44 |
302 | 1.00 | 0.23 | 0.23 | 0.50 | 0.39 | 0.60 | 0.94 | 0.65 |
303 | 0.85 | 0.73 | 0.31 | 0.50 | 0.51 | 0.50 | 0.85 | 0.81 |
304 | 0.91 | 0.92 | 0.44 | 0.62 | 0.85 | 0.92 | 0.97 | 0.97 |
I further provide some comments about these results in the next section. I also comment on a number of modifications that have been made to the provided files.
Here are some considerations on the results obtained by the various participants. These are not statistically backed up and correspond only to a rough analysis. More explanations can be found in the papers presented by the participants.
In these tests there are clear winners: it seems that the results provided by Stanford and Fujitsu/Tokyo outperform those provided by Karlsruhe and Montréal/INRIA.
In fact, these can be considered as constituting two groups of programs. The Stanford and Fujitsu programs are very different, but both rely strongly on the labels attached to entities. For that reason they performed especially well when labels were preserved (i.e., most of the time). The Karlsruhe and INRIA systems tend to rely on many different features and thus to balance the influence of individual features, so they take less advantage of the fact that labels were preserved.
This intuition should be examined further in the light of more systematic tests, which were planned but never carried out.
Without going through a thorough statistical analysis of the results, it seems that the separation between the sets of tests that we presented (indicated by the first digit of their numbers) is significant for the participants as well.
It is indeed very difficult to compare these results, because the tests differ and do not all point in the same direction. Aggregating the results can therefore be done in many ways (e.g., averaging, global precision/recall, counting dominance). Moreover, these results are based on two measures, precision and recall, which are easily understood but dual in the sense that increasing one often decreases the other. This means that one algorithm can sometimes perform as well as another overall while the two remain non-comparable in the table.
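To make these aggregation options concrete, the small sketch below (not part of the evaluation tools; the correspondence counts are hypothetical) contrasts two of them: averaging the per-test precision and recall versus computing a single "global" precision and recall over the pooled correspondences of all tests.

```java
public class AggregationSketch {
    public static void main(String[] args) {
        // Per test: {correct, found, expected} correspondence counts (hypothetical values).
        int[][] tests = { {90, 95, 100}, {40, 80, 100}, {70, 70, 100} };
        double avgP = 0, avgR = 0;
        int correct = 0, found = 0, expected = 0;
        for (int[] t : tests) {
            avgP += (double) t[0] / t[1];   // per-test precision
            avgR += (double) t[0] / t[2];   // per-test recall
            correct += t[0]; found += t[1]; expected += t[2];
        }
        avgP /= tests.length;
        avgR /= tests.length;
        double globalP = (double) correct / found;     // pooled ("global") precision
        double globalR = (double) correct / expected;  // pooled ("global") recall
        System.out.printf("average: P=%.2f R=%.2f%n", avgP, avgR);
        System.out.printf("global:  P=%.2f R=%.2f%n", globalP, globalR);
    }
}
```

The two strategies can rank systems differently when the tests have different sizes or difficulties, which is one more reason why no single aggregate figure is conclusive.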
As an indication, the average values are given below:
test | karlsruhe2 Prec. | karlsruhe2 Rec. | umontreal Prec. | umontreal Rec. | fujitsu Prec. | fujitsu Rec. | stanford Prec. | stanford Rec. |
---|---|---|---|---|---|---|---|---|
1xx | n/a | n/a | 0.57 | 0.93 | 0.99 | 1.00 | 0.99 | 1.00 |
2xx | 0.59 | 0.80 | 0.54 | 0.88 | 0.94 | 0.84 | 0.99 | 0.75 |
3xx | 0.90 | 0.56 | 0.37 | 0.56 | 0.66 | 0.67 | 0.92 | 0.72 |
total | 0.71 | 0.71 | 0.48 | 0.82 | 0.89 | 0.83 | 0.97 | 0.78 |
Establishing these results has been automated as far as possible. However, we had to modify the provided files in a number of ways, which are listed below.
The complete results can be retrieved from this archive.
A first remark is that we had some problems with accented characters (test #206) that we did not solve.
Karlsruhe did not provide results for all the tests, and we did not complete their tests for them.
The Karlsruhe results for 204-206 were inverted (i.e., ontology 2 was that of test 101). We used the ParserPrinter utility of the Alignment API to restore the correct order; without this correction, the computed results would have been 0.
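For illustration, the inversion itself can be expressed with the Alignment API roughly as follows. This is only a sketch: the class and method names are those of the Alignment API documentation (the actual correction was done with the ParserPrinter utility, and the exact signatures in the release used at the time may differ).

```java
import java.io.PrintWriter;

import org.semanticweb.owl.align.Alignment;
import fr.inrialpes.exmo.align.parser.AlignmentParser;
import fr.inrialpes.exmo.align.impl.renderer.RDFRendererVisitor;

public class InvertAlignment {
    public static void main(String[] args) throws Exception {
        // args[0]: URI of an alignment whose ontologies are in the wrong order
        AlignmentParser parser = new AlignmentParser(0);
        Alignment al = parser.parse(args[0]);
        Alignment inverted = al.inverse();            // swap ontology 1 and ontology 2
        PrintWriter out = new PrintWriter(System.out);
        inverted.render(new RDFRendererVisitor(out)); // output in the Alignment RDF/XML format
        out.flush();
    }
}
```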
Most of the Karlsruhe results contained references to the first test series (notably Techreport, and others in test #230), which have been discarded.
They also contained references to special ontologies (onto-noinst) not provided with the tests, and references to alignments with FOAF entities (perfectly valid, but discarded for technical reasons). We believe that discarding these tends to improve the results of the tests anyway.
The first set of result files from Karlsruhe was based on the initially released tests, had no threshold applied, and contained some KAON objects. After we asked for the tests to be redone, these problems were solved.
The ontology namespaces and filenames provided in these tests were based on a local file system (moreover, the filenames onto.rdf had been changed to refonto.rdf). They have thus been automatically converted into those of the contest.
The results provided by INRIA and the University of Montréal used the Alignment API. The only change made was in the namespace of the Alignment API itself, since that of the implementation did not correspond to that of the specification.
The Stanford results were provided as pure RDF files making heavy use of references and of XML attributes for datatyped values. Since our alignment parser expects RDF/XML files and works only at the XML level (i.e., it is unable to resolve RDF references), we developed an XSLT stylesheet solving these problems.
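As an illustration, applying such a stylesheet (here stan.xsl, the name used in the correction script below) can also be done from Java with the standard JAXP API; the script itself simply calls xsltproc.

```java
import java.io.File;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ApplyStylesheet {
    public static void main(String[] args) throws Exception {
        // Load the correcting stylesheet (stan.xsl, as named in the script below).
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new File("stan.xsl")));
        // e.g. args[0] = "101/stanford.rdf", args[1] = "101/stan.rdf"
        t.transform(new StreamSource(new File(args[0])),
                    new StreamResult(new File(args[1])));
    }
}
```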
Results #302 and #221 were, however, slightly different and required separate treatment.
Results #230, #201 and #202 did not refer to the latest files, so they were corrected as well as possible (by discarding the cells causing trouble).
Most URIs were local (referring to my own file system), so we had to adjust this too.
Stanford provided two additional tests that were not included in these results. They are, however, perfectly legitimate tests that should be included in further campaigns.
Lockheed Martin provided a set of results in Notation 3 (N3). Due to the size of the result files and the need to create another XSLT stylesheet for the RDF/XML files returned by CWM, it was not possible to include these before the meeting.
```sh
#!/bin/sh
# Script used to normalise the provided result files before evaluation.
CWD=`pwd`
for i in `ls -d ???`
do
    # Fix the Alignment API namespace in the reference alignments
    if [ -a $i/refalign.rdf ]
    then
        ed $i/refalign.rdf << EOF
1,$ s:http\://exmo.inrialpes.fr/align/1.0:http\://knowledgeweb.semanticweb.org/heterogeneity/alignment:
w
EOF
    else
        echo no $i/refalign.rdf
    fi
    # Fix the same namespace in the Montréal/INRIA results
    if [ -a $i/umontreal.rdf ]
    then
        ed $i/umontreal.rdf << EOF
1,$ s:http\://exmo.inrialpes.fr/align/1.0:http\://knowledgeweb.semanticweb.org/heterogeneity/alignment:
w
EOF
    else
        echo no $i/umontreal.rdf
    fi
    # Fix entity names in the Karlsruhe results
    if [ -a $i/karlsruhe.rdf ]
    then
        echo "Normally we should have to suppress Kaon objects"
        echo "Normally we should have to suppress non existing objects"
        ed $i/karlsruhe.rdf << EOF
1,$ s;101/onto.rdf#Techreport;101/onto.rdf#TechReport;
w
EOF
    else
        echo no $i/karlsruhe.rdf
    fi
    # Rewrite local file URIs and filenames in the Fujitsu results
    if [ -a $i/fujitsu.rdf ]
    then
        ed $i/fujitsu.rdf << EOF
1,$ s;file:///D:/align/align/oacontest/;http://co4.inrialpes.fr/align/Contest/;
1,$ s;file:/D:/align/align/oacontest/;http://co4.inrialpes.fr/align/Contest/;
1,$ s;refonto.rdf;onto.rdf;
1,$ s:../../../dtd:../dtd:
w
EOF
    else
        echo no $i/fujitsu.rdf
    fi
    # Apply the correcting stylesheet to the Stanford results
    if [ -a $i/stanford.rdf ]
    then
        xsltproc stan.xsl $i/stanford.rdf > $i/stan.rdf
        mv $i/stan.rdf $i/stanford.rdf
    else
        echo no $i/stanford.rdf
    fi
done
```
This can be found here.
This first table displays, together with the results of the participants in the contest, those obtained by the demonstration aligners provided with the Alignment API, as well as the first results obtained by Karlsruhe.
java -cp /Volumes/Phata/JAVA/ontoalign/lib/procalign.jar fr.inrialpes.exmo.align.util.GroupEval -l "std,nea,ssda5,edna5,sdna5,karlsruhe,karlsruhe2,umontreal,fujitsu,stanford" -f "pr" -c
test | std Prec. | std Rec. | nea Prec. | nea Rec. | ssda5 Prec. | ssda5 Rec. | edna5 Prec. | edna5 Rec. | sdna5 Prec. | sdna5 Rec. | karlsruhe Prec. | karlsruhe Rec. | karlsruhe2 Prec. | karlsruhe2 Rec. | umontreal Prec. | umontreal Rec. | fujitsu Prec. | fujitsu Rec. | stanford Prec. | stanford Rec. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
101 | 0.89 | 0.36 | 0.89 | 0.98 | 0.87 | 0.99 | 0.87 | 0.99 | 0.87 | 0.99 | n/a | n/a | n/a | n/a | 0.59 | 0.97 | 0.99 | 1.00 | 0.99 | 1.00 |
102 | 0.00 | NaN | 0.00 | NaN | 0.00 | NaN | 0.00 | NaN | 0.00 | NaN | n/a | n/a | NaN | NaN | 0.00 | NaN | NaN | NaN | NaN | NaN |
103 | 0.89 | 0.36 | 0.90 | 0.99 | 0.87 | 0.99 | 0.87 | 0.99 | 0.87 | 0.99 | n/a | n/a | n/a | n/a | 0.55 | 0.90 | 0.99 | 1.00 | 0.99 | 1.00 |
104 | 0.89 | 0.36 | 0.89 | 0.98 | 0.86 | 0.98 | 0.86 | 0.98 | 0.87 | 0.99 | n/a | n/a | n/a | n/a | 0.56 | 0.91 | 0.99 | 1.00 | 0.99 | 1.00 |
201 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.43 | 0.51 | 0.43 | 0.51 | 0.44 | 0.71 | 0.98 | 0.92 | 1.00 | 0.11 |
202 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | n/a | n/a | n/a | n/a | 0.38 | 0.63 | 0.95 | 0.42 | 1.00 | 0.11 |
204 | 0.83 | 0.22 | 0.85 | 0.66 | 0.71 | 0.78 | 0.84 | 0.96 | 0.70 | 0.77 | 0.00 | 0.00 | 0.62 | 1.00 | 0.55 | 0.90 | 0.95 | 0.91 | 0.99 | 1.00 |
205 | 0.60 | 0.07 | 0.61 | 0.21 | 0.36 | 0.34 | 0.39 | 0.32 | 0.40 | 0.34 | 0.00 | 0.00 | 0.47 | 0.60 | 0.49 | 0.80 | 0.79 | 0.63 | 0.95 | 0.43 |
221 | 0.89 | 0.36 | 0.89 | 0.98 | 0.86 | 0.98 | 0.86 | 0.98 | 0.86 | 0.98 | n/a | n/a | n/a | n/a | 0.61 | 1.00 | 0.98 | 0.88 | 0.99 | 1.00 |
222 | 0.85 | 0.31 | 0.89 | 0.93 | 0.82 | 0.93 | 0.84 | 0.93 | 0.83 | 0.93 | n/a | n/a | n/a | n/a | 0.55 | 0.90 | 0.99 | 0.92 | 0.98 | 0.95 |
223 | 0.78 | 0.32 | 0.85 | 0.93 | 0.83 | 0.95 | 0.82 | 0.93 | 0.83 | 0.95 | 0.59 | 0.96 | 0.59 | 0.96 | 0.59 | 0.97 | 0.95 | 0.87 | 0.95 | 0.96 |
224 | 0.89 | 0.36 | 0.89 | 0.98 | 0.87 | 0.99 | 0.87 | 0.99 | 0.86 | 0.98 | 0.97 | 0.98 | 0.97 | 0.97 | 0.97 | 1.00 | 0.99 | 1.00 | 0.99 | 1.00 |
225 | 0.89 | 0.36 | 0.90 | 0.99 | 0.86 | 0.98 | 0.86 | 0.98 | 0.87 | 0.99 | n/a | n/a | n/a | n/a | 0.59 | 0.97 | 0.99 | 1.00 | 0.99 | 1.00 |
228 | 0.92 | 1.00 | 0.79 | 1.00 | 0.67 | 1.00 | 0.63 | 1.00 | 0.69 | 1.00 | n/a | n/a | n/a | n/a | 0.38 | 1.00 | 0.91 | 0.97 | 1.00 | 1.00 |
230 | 0.86 | 0.33 | 0.87 | 0.92 | 0.70 | 0.97 | 0.77 | 0.97 | 0.76 | 0.99 | 0.60 | 0.95 | 0.60 | 0.95 | 0.46 | 0.92 | 0.97 | 0.95 | 0.99 | 0.93 |
301 | 0.93 | 0.21 | 0.94 | 0.25 | 0.60 | 0.80 | 0.76 | 0.79 | 0.75 | 0.79 | n/a | n/a | 0.85 | 0.36 | 0.49 | 0.61 | 0.89 | 0.66 | 0.93 | 0.44 |
302 | 0.91 | 0.21 | 0.97 | 0.58 | 0.41 | 0.65 | 0.57 | 0.60 | 0.54 | 0.65 | 0.67 | 0.21 | 1.00 | 0.23 | 0.23 | 0.50 | 0.39 | 0.60 | 0.94 | 0.65 |
303 | 0.87 | 0.27 | 0.81 | 0.46 | 0.43 | 0.79 | 0.52 | 0.81 | 0.46 | 0.79 | n/a | n/a | 0.85 | 0.73 | 0.31 | 0.50 | 0.51 | 0.50 | 0.85 | 0.81 |
304 | 0.87 | 0.36 | 0.85 | 0.61 | 0.77 | 0.96 | 0.77 | 0.95 | 0.79 | 0.95 | n/a | n/a | 0.91 | 0.92 | 0.44 | 0.62 | 0.85 | 0.92 | 0.97 | 0.97 |
The following table presents the F-measure and overall results for the same set of algorithms.
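As a reminder, both measures derive from precision (P) and recall (R): F-measure is the harmonic mean of the two, and overall is R(2 - 1/P), which we assume to be the definitions used by GroupEval. The minimal sketch below computes them; note that overall becomes negative as soon as precision drops below 0.5.

```java
public class DerivedMeasures {
    // Harmonic mean of precision and recall.
    static double fmeasure(double p, double r) {
        return 2 * p * r / (p + r);
    }
    // "Overall" measure: recall penalised by the wrong correspondences found;
    // it is negative whenever precision is below 0.5.
    static double overall(double p, double r) {
        return r * (2 - 1 / p);
    }
    public static void main(String[] args) {
        System.out.printf("F=%.2f Over=%.2f%n", fmeasure(0.99, 1.00), overall(0.99, 1.00));
        System.out.printf("F=%.2f Over=%.2f%n", fmeasure(0.44, 0.71), overall(0.44, 0.71));
    }
}
```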
java -cp /Volumes/Phata/JAVA/ontoalign/lib/procalign.jar fr.inrialpes.exmo.align.util.GroupEval -l "std,nea,ssda5,edna5,sdna5,karlsruhe,karlsruhe2,umontreal,fujitsu,stanford" -f "mo" -c
test | std FMeas. | std Over. | nea FMeas. | nea Over. | ssda5 FMeas. | ssda5 Over. | edna5 FMeas. | edna5 Over. | sdna5 FMeas. | sdna5 Over. | karlsruhe FMeas. | karlsruhe Over. | karlsruhe2 FMeas. | karlsruhe2 Over. | umontreal FMeas. | umontreal Over. | fujitsu FMeas. | fujitsu Over. | stanford FMeas. | stanford Over. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
101 | 0.52 | 0.32 | 0.93 | 0.86 | 0.92 | 0.84 | 0.92 | 0.84 | 0.92 | 0.84 | n/a | n/a | n/a | n/a | 0.73 | 0.30 | 0.99 | 0.99 | 0.99 | 0.99 |
102 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | n/a | n/a | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
103 | 0.52 | 0.32 | 0.94 | 0.88 | 0.92 | 0.84 | 0.92 | 0.84 | 0.92 | 0.84 | n/a | n/a | n/a | n/a | 0.68 | 0.16 | 0.99 | 0.99 | 0.99 | 0.99 |
104 | 0.52 | 0.32 | 0.93 | 0.86 | 0.91 | 0.81 | 0.91 | 0.81 | 0.92 | 0.84 | n/a | n/a | n/a | n/a | 0.69 | 0.19 | 0.99 | 0.99 | 0.99 | 0.99 |
201 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.46 | 0.0-16 | 0.46 | 0.0-16 | 0.54 | 0.0-20 | 0.95 | 0.90 | 0.20 | 0.11 |
202 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | n/a | n/a | n/a | n/a | 0.48 | 0.0-38 | 0.58 | 0.40 | 0.20 | 0.11 |
204 | 0.35 | 0.18 | 0.74 | 0.54 | 0.74 | 0.46 | 0.89 | 0.77 | 0.73 | 0.44 | NaN | NaN | 0.76 | 0.38 | 0.68 | 0.16 | 0.93 | 0.87 | 0.99 | 0.99 |
205 | 0.12 | 0.02 | 0.31 | 0.08 | 0.35 | 0.0-27 | 0.35 | 0.0-17 | 0.37 | 0.0-16 | NaN | NaN | 0.53 | 0.0-6 | 0.61 | 0.0-3 | 0.70 | 0.46 | 0.59 | 0.41 |
206 | 0.06 | 0.0-1 | 0.24 | 0.01 | 0.33 | 0.0-27 | 0.44 | 0.03 | 0.41 | 0.00 | 0.54 | 0.0-4 | 0.54 | 0.0-4 | 0.57 | 0.0-14 | 0.73 | 0.53 | 0.63 | 0.46 |
221 | 0.52 | 0.32 | 0.93 | 0.86 | 0.91 | 0.81 | 0.91 | 0.81 | 0.91 | 0.81 | n/a | n/a | n/a | n/a | 0.76 | 0.36 | 0.92 | 0.86 | 0.99 | 0.99 |
222 | 0.45 | 0.25 | 0.91 | 0.81 | 0.87 | 0.73 | 0.89 | 0.76 | 0.88 | 0.75 | n/a | n/a | n/a | n/a | 0.68 | 0.16 | 0.95 | 0.91 | 0.96 | 0.92 |
223 | 0.45 | 0.23 | 0.89 | 0.77 | 0.88 | 0.75 | 0.87 | 0.73 | 0.88 | 0.75 | 0.73 | 0.30 | 0.73 | 0.30 | 0.73 | 0.30 | 0.91 | 0.82 | 0.95 | 0.90 |
224 | 0.52 | 0.32 | 0.93 | 0.86 | 0.92 | 0.84 | 0.92 | 0.84 | 0.91 | 0.81 | 0.97 | 0.95 | 0.97 | 0.93 | 0.98 | 0.97 | 0.99 | 0.99 | 0.99 | 0.99 |
225 | 0.52 | 0.32 | 0.94 | 0.88 | 0.91 | 0.81 | 0.91 | 0.81 | 0.92 | 0.84 | n/a | n/a | n/a | n/a | 0.73 | 0.30 | 0.99 | 0.99 | 0.99 | 0.99 |
228 | 0.96 | 0.91 | 0.88 | 0.73 | 0.80 | 0.52 | 0.78 | 0.42 | 0.81 | 0.55 | n/a | n/a | n/a | n/a | 0.55 | 0.0-66 | 0.94 | 0.88 | 1.00 | 1.00 |
230 | 0.48 | 0.28 | 0.90 | 0.79 | 0.82 | 0.56 | 0.86 | 0.69 | 0.86 | 0.67 | 0.73 | 0.31 | 0.73 | 0.31 | 0.62 | 0.0-14 | 0.96 | 0.92 | 0.96 | 0.92 |
301 | 0.35 | 0.20 | 0.39 | 0.23 | 0.69 | 0.28 | 0.77 | 0.54 | 0.77 | 0.52 | n/a | n/a | 0.51 | 0.30 | 0.54 | 0.0-1 | 0.75 | 0.57 | 0.60 | 0.41 |
302 | 0.34 | 0.19 | 0.73 | 0.56 | 0.50 | 0.0-29 | 0.59 | 0.17 | 0.59 | 0.10 | 0.32 | 0.10 | 0.37 | 0.23 | 0.31 | -1.0-20 | 0.47 | 0.0-35 | 0.77 | 0.60 |
303 | 0.41 | 0.23 | 0.59 | 0.35 | 0.55 | 0.0-27 | 0.63 | 0.06 | 0.58 | 0.0-12 | n/a | n/a | 0.79 | 0.60 | 0.38 | 0.0-60 | 0.51 | 0.02 | 0.83 | 0.67 |
304 | 0.50 | 0.30 | 0.71 | 0.50 | 0.85 | 0.67 | 0.85 | 0.67 | 0.86 | 0.70 | n/a | n/a | 0.92 | 0.83 | 0.51 | 0.0-17 | 0.89 | 0.76 | 0.97 | 0.95 |