Evaluate clusterings by various performance measures
E = CLUSTEVAL(LABC,LABT,TYPE)
E = CLUSTEVAL(LABC,A,TYPE)
E = LABC*CLUSTEVAL(LABT,TYPE)
E = LABC*CLUSTEVAL(A,TYPE)
| LABC|| Index array, size [M,N], indices of cluster prototypes for M objects in N clusterings.|
| LABT|| Double array, size [M,1] with true object labels.|
| A|| PRTools dataset used for obtaining LABC by some clustering procedure.|
| TYPE|| String with desired performance measure, see below.|
| E|| Evaluation result using the performance measure given by TYPE. See below for possibilities.|
Computation of a set of cluster performance measures between estimated cluster labels LABC and true object labels LABT. In case LABC is a multilevel clustering (N>1) the result E is a structure ready to be plotted by PLOTE. If E is omitted (no output) the result is directly plotted. Performance measures that do not generate a curve (see below) are plotted on the screen.
In case LABC is a single clustering (N==1) just scalar results are returned (except when TYPE is 'roc', which generates two values, see CLUSTROC). In case TYPE is omitted the default measure for multilevel clustering is 'actl'. For a single clustering with TYPE omitted, all measures are returned in a structure or, if there is no output argument, printed on the screen.
It is assumed that the cluster labels LABC are indices to cluster prototypes with true labels as given by the corresponding entries in LABT. The true labels LABT can be derived as doubles from a PRTools dataset A by LABT = GETNLAB(A).
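The prototype-index convention can be illustrated with a minimal sketch in plain MATLAB. It is not the code used by CLUSTEVAL; the toy data and the variable names LABP and ERR are assumptions made for illustration. Each object inherits the true label of the prototype it points to, and the error is the fraction of objects for which that label is wrong.

```matlab
% Toy data (hypothetical): 6 objects, two true classes.
labt = [1 1 1 2 2 2]';

% Two clusterings (N = 2), one per column. Every entry is the index of
% the object that serves as prototype of the cluster the object is in.
%   column 1: objects 1-4 -> prototype 2, objects 5-6 -> prototype 5
%   column 2: objects 1-3 -> prototype 1, objects 4-6 -> prototype 4
labc = [2 2 2 2 5 5; 1 1 1 4 4 4]';

% Each object inherits the true label of its prototype (result is M x N).
labp = labt(labc);

% Per clustering: fraction of objects with a wrong inherited label.
err = mean(labp ~= repmat(labt, 1, size(labc, 2)), 1);
```

In the first clustering only object 4 receives the wrong label, so the first entry of ERR is 1/6; the second clustering separates the classes exactly and gives 0.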
The following performance measures are available:
- Relative operating characteristics, see CLUSTROC.
- Scalar, area under the ROC curve [e1,e2].
- Scalar, MIN(E1+E2) with E1 and E2 the two errors of the ROC.
- Classification error of assigning all objects to the true class of the cluster prototypes (active learning).
- Classification error based on the true class of the cluster prototypes and combining multilevel cluster confidences.
- Classification error based on the true class of the cluster prototypes after nested combining cluster levels, see RECLUSTN.
- The adjusted Rand index, see Wikipedia. It is at most 1, which is reached for identical clusterings. Its expected value is 0 for random labelings, and in general it can be slightly negative.
- Normalised mutual information between 0 and 1.
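As an illustration of the last two measures, the following plain MATLAB sketch computes the adjusted Rand index and a normalised mutual information from the contingency table of two labelings. It is not taken from CLUSTEVAL. The toy labels and all variable names are assumptions, and so is the normalisation of the mutual information by the geometric mean of the two entropies, since several normalisations are in use.

```matlab
% Two labelings of the same 6 objects (hypothetical toy data).
lab1 = [1 1 1 2 2 2]';
lab2 = [1 1 2 2 3 3]';

% Contingency table: C(i,j) = number of objects with the i-th label
% of lab1 and the j-th label of lab2.
[~, ~, i1] = unique(lab1);
[~, ~, i2] = unique(lab2);
C = accumarray([i1(:) i2(:)], 1);
n = numel(lab1);

% --- Adjusted Rand index ---
pairs = @(x) x .* (x - 1) / 2;        % number of pairs, "x choose 2"
sumC  = sum(pairs(C(:)));             % pairs joined in both labelings
sumA  = sum(pairs(sum(C, 2)));        % pairs joined in lab1
sumB  = sum(pairs(sum(C, 1)));        % pairs joined in lab2
expct = sumA * sumB / pairs(n);       % expected sumC for random labelings
ari   = (sumC - expct) / ((sumA + sumB) / 2 - expct);

% --- Normalised mutual information (assumed: geometric-mean variant) ---
P  = C / n;                           % joint distribution of the labels
pa = sum(P, 2);                       % marginal of lab1 (column vector)
pb = sum(P, 1);                       % marginal of lab2 (row vector)
PP = pa * pb;                         % product of the marginals
nz = P > 0;                           % skip empty cells, 0*log(0) = 0
mi = sum(P(nz) .* log(P(nz) ./ PP(nz)));
ha = -sum(pa .* log(pa));             % entropy of lab1
hb = -sum(pb .* log(pb));             % entropy of lab2
nmi = mi / sqrt(ha * hb);
```

Both quantities are undefined in this sketch when one of the labelings puts all objects in a single cluster, because the denominators become zero; a practical implementation has to handle that case separately.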
datasets, mappings, knnc, cluste, clusth, clustk, clustkh, clustm, clustf, clustr, dcluste, dclustf, dclusth, dclustk, dclustm, dclustr, reclustn, clustcerr, clustc, clustnum, clustroc, plote
|This file has been automatically generated. If badly readable, use the help-command in Matlab.|