Cluster classification error on individual clusterings


 LABC Index array, size [M,K], indices of cluster prototypes for M objects in K clusterings.
 LABT Vector of M elements with true object labels.  Default: true labels used in some other routine.
 A PRTools labeled dataset used for the clustering.

 E Vector with classification errors of the clusterings in LABC.
 N Vector with number of clusters per clustering.


This routine evaluates a clustering of a dataset A by comparing the true labels LABT of A with the cluster labels derived from each cluster's prototype object. This can be understood as evaluating the clusterings by active labeling.

A is a labeled dataset with M objects. LABT is a column vector containing the true numeric labels of A, i.e. LABT = GETNLAB(A). LABC is a set of K clustering results; for every object it holds the index of that object's cluster prototype in A.

E is the fraction of misclassified objects in A when every cluster is assigned the class of its prototype, i.e. the object indicated by LABC. This corresponds to a training set of N labeled prototypes. The classification error E is computed over all objects, including the prototypes.

In case LABC is an MxK result of a multilevel clustering, E and N are  vectors with K elements. Use MCLUSTCERR for a classification result  that combines clusterings.
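
The computation described above can be sketched in plain MATLAB, without PRTools. This is only an illustration of the definition of E and N, not the actual implementation of CLUSTCERR. The toy data and the variable names (labt, labc, e, n, assigned) are hypothetical, and the row orientation of e and n is an assumption.

```matlab
% Hypothetical toy data: M = 6 objects, K = 2 clusterings.
labt = [1 1 1 2 2 2]';     % true numeric labels, M x 1

% Each entry of labc is the index of the prototype object of the
% cluster that the row's object belongs to.
labc = [1 1;               % column 1: one cluster, prototype is object 1
        1 1;               % column 2: two clusters, prototypes 1 and 5
        1 1;
        1 5;
        1 5;
        1 5];

[m,k] = size(labc);
e = zeros(1,k);            % classification error per clustering
n = zeros(1,k);            % number of clusters per clustering

for j = 1:k
    assigned = labt(labc(:,j));        % every object gets its prototype's label
    e(j) = mean(assigned ~= labt);     % fraction misclassified, prototypes included
    n(j) = numel(unique(labc(:,j)));   % number of distinct prototypes
end

disp([n; e])               % first row: cluster counts, second row: errors
```

With a single cluster all objects receive the label of object 1, so the three objects of class 2 are misclassified (e(1) = 0.5, n(1) = 1). With two clusters whose prototypes come from different classes, all objects are labeled correctly (e(2) = 0, n(2) = 2).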


 randreset; % take care of reproducibility
 A = gendat(mnist8,25000);
 randreset; labc1 = A*clustm(false);   % no nesting
 randreset; labc2 = A*clustm;          % nesting
 [e1,n1] = clustcerr(labc1,A);
 [e2,n2] = clustcerr(labc2,A);
 semilogx(n1,e1); hold on
 semilogx(n2,e2);                      % second curve: nested clustering
 legend('no nesting','nesting')
 title(['Active learning curve: ' getname(A)])
 xlabel('Training set size - number of clusters')
 ylabel('Classification error')


