clusteval

CLUSTEVAL

Evaluate clusterings by various performance measures

      E = CLUSTEVAL(LABC,LABT,TYPE)
      E = CLUSTEVAL(LABC,A,TYPE)
      E = LABC*CLUSTEVAL(LABT,TYPE)
      E = LABC*CLUSTEVAL(A,TYPE)

Input
LABC Index array, size [M,N], indices of cluster prototypes for M objects in N clusterings.
LABT Double array, size [M,1] with true object labels.
A PRTools dataset used for obtaining LABC by some clustering procedure.
TYPE String with desired performance measure, see below.

Output
E Evaluation result using the performance measure given by TYPE. See below for possibilities.

Description

Computes a set of cluster performance measures between estimated cluster labels LABC and true object labels LABT. If LABC is a multilevel clustering (N>1), the result E is a structure ready to be plotted by PLOTE. If E is omitted (no output), the result is plotted directly. Performance measures that do not generate a curve (see below) are printed on the screen.

If LABC is a single clustering (N==1), only scalar results are returned (except when TYPE is 'roc', which generates two values, see CLUSTROC). If TYPE is omitted, the default measure for multilevel clustering is 'actl'. For a single clustering, all measures are returned in a structure or printed on the screen.

It is assumed that the cluster labels LABC are indices to cluster prototypes, with true labels as given by the corresponding entries in LABT. The true labels LABT can be derived as doubles from a PRTools dataset A by LABT = GETNLAB(A);
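
The prototype-index convention can be illustrated with a small sketch in plain MATLAB. It is not taken from the CLUSTEVAL source: it only follows the description of the 'actl' measure given below (every object receives the true class of its cluster prototype), and the toy labels and the variable names labp and e_actl are made up for illustration.

```matlab
% Hypothetical toy data: 6 objects with true labels 1 or 2.
labt = [1; 1; 1; 2; 2; 2];

% Two clustering levels (N = 2). Each entry is the index of the object
% that serves as prototype of the cluster the row's object belongs to.
% Level 1: prototypes are objects 2 and 5. Level 2: a single cluster
% with object 2 as prototype.
labc = [2 2; 2 2; 5 2; 5 2; 5 2; 5 2];

[m, n] = size(labc);

% True class of the prototype of every object, per clustering level.
labp = reshape(labt(labc), m, n);

% Fraction of objects whose prototype class differs from their own class,
% one value per clustering level.
e_actl = mean(labp ~= repmat(labt, 1, n), 1);

disp(e_actl)
```

In level 1 only object 3 is assigned to a prototype of the wrong class, so the first error is 1/6; in level 2 all objects of class 2 are assigned to a class 1 prototype, giving 1/2.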

The following performance measures are available:

ROC
Relative operating characteristics, see CLUSTROC.
AUC
Scalar, area under the ROC curve [e1,e2].
MINE
Scalar, MIN(E1+E2) with E1 and E2 the two errors of the ROC.
ACTL
Classification error of assigning all objects to the true class of the cluster prototypes (active learning).
COMB
Classification error based on the true class of the cluster prototypes and combining multilevel cluster confidences.
NEST
Classification error based on the true class of the cluster prototypes after nested combining of cluster levels, see RECLUSTN.
ADRI
The adjusted Rand index, see Wikipedia. It is 1 for consistent clusterings and close to 0 for random ones; as generally defined, it can also be slightly negative.
NMI
Normalised mutual information between 0 and 1.
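
For the two label-agreement measures, ADRI and NMI, a minimal plain-MATLAB sketch of the textbook definitions may help. It is an illustration only, not the code used by CLUSTEVAL. In particular, the normalisation of the mutual information by the geometric mean of the two entropies is an assumption (other normalisations exist), and the toy labels and all variable names are hypothetical.

```matlab
% Hypothetical toy data: true labels and one clustering given as
% prototype indices (objects 2 and 5 are the prototypes).
labt = [1; 1; 1; 2; 2; 2];
labc = [2; 2; 5; 5; 5; 5];

% Contingency table: rows = clusters, columns = true classes.
[~, ~, ic] = unique(labc);
[~, ~, it] = unique(labt);
C = accumarray([ic(:), it(:)], 1);
m = numel(labt);

% --- Adjusted Rand index (standard pair-counting formula) ---
pairs = @(x) x .* (x - 1) / 2;            % number of pairs, "x choose 2"
sum_cells = sum(sum(pairs(C)));           % pairs together in both labelings
sum_rows  = sum(pairs(sum(C, 2)));        % pairs together in the clustering
sum_cols  = sum(pairs(sum(C, 1)));        % pairs together in the true labels
expected  = sum_rows * sum_cols / pairs(m);
maximum   = (sum_rows + sum_cols) / 2;
ari = (sum_cells - expected) / (maximum - expected);

% --- Normalised mutual information ---
% Assumption: normalised by sqrt(H(clusters) * H(classes)).
P  = C / m;                               % joint distribution
Pr = sum(P, 2);                           % cluster marginal (column vector)
Pc = sum(P, 1);                           % class marginal (row vector)
E  = Pr * Pc;                             % product of the marginals
nz = P > 0;                               % skip empty cells, 0*log(0) = 0
mi = sum(P(nz) .* log(P(nz) ./ E(nz)));
Hr = -sum(Pr .* log(Pr));
Hc = -sum(Pc .* log(Pc));
nmi = mi / sqrt(Hr * Hc);

fprintf('ARI = %.4f, NMI = %.4f\n', ari, nmi);
```

For this toy example the contingency table is [2 0; 1 3] and the adjusted Rand index evaluates to 12/37, about 0.324. Both expressions are undefined when either labeling consists of a single group, since the denominators become zero; the sketch does not handle that case.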

See also