
Evaluation

Confusion Matrix

A confusion matrix shows how the submitted classifications are distributed over the true classes:

              Predicted label
               0   1   2
True label  0  a   b   c
            1  d   e   f
            2  g   h   i

0 corresponds to irrelevant sentences (I), 1 to relevant sentences (R) and 2 to correct answers (C). For instance, "b" is the number of "irrelevant" examples ("0") classified as "relevant" ("1"), and "e" is the number of correctly classified "relevant" examples.
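
As an illustration, the matrix can be computed with a few lines of Python. This is only a sketch, not the official evaluation code; the function name, the integer label encoding (0, 1, 2) and the example lists are assumptions made here for clarity.

    def confusion_matrix(true_labels, predicted_labels, n_classes=3):
        # matrix[t][p] counts examples whose true label is t and predicted label is p
        matrix = [[0] * n_classes for _ in range(n_classes)]
        for t, p in zip(true_labels, predicted_labels):
            matrix[t][p] += 1
        return matrix

    # Example: matrix[0][1] corresponds to "b" in the table above, matrix[1][1] to "e".
    cm = confusion_matrix([0, 0, 1, 2, 2], [0, 1, 1, 2, 0])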

In Competition 1, only the viewed sentences affect the result. In Competition 2, all sentences are taken into account, including the unseen ones, so participants should make their best guess about the relevance of the unseen sentences.

Accuracy

Classification results for both the validation and test set are ranked according to accuracy, the number of correct classifications divided by the total number of examples in the set, or (a+e+i) divided by the sum of all nine elements. Note that the exact classification accuracy can thus be calculated directly from the confusion matrix.
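
In terms of the confusion matrix sketched above, the computation is simply (again an illustrative sketch rather than the official scoring script):

    def accuracy(cm):
        # diagonal elements (a, e, i) divided by the sum of all elements
        correct = sum(cm[k][k] for k in range(len(cm)))
        total = sum(sum(row) for row in cm)
        return correct / total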

Balanced Error Rate

In addition to accuracy, the Balanced Error Rate (BER) is calculated for each submitted result. BER is the average over the classes of the proportion of wrong classifications in each class, or ((b+c)/(a+b+c) + (d+f)/(d+e+f) + (g+h)/(g+h+i))/3. BER is used to determine the winner in the unlikely case that two (or more) contestants obtain equal accuracy.
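
The same matrix yields BER as follows (again only an illustrative sketch, reusing the hypothetical confusion_matrix layout defined above):

    def balanced_error_rate(cm):
        # for each true class (row), the fraction of misclassified examples,
        # e.g. (b+c)/(a+b+c) for class 0; the three fractions are then averaged
        per_class_error = []
        for k, row in enumerate(cm):
            wrong = sum(row) - row[k]
            per_class_error.append(wrong / sum(row))
        return sum(per_class_error) / len(per_class_error)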
