ClusterTools Contents

ClusterTools User Guide



Hierarchical clustering, faster version


 A Feature-based dataset or double array with M objects (rows).
 K Vector with the desired numbers of clusters. Default: a sampling of [2:M].
 TYPE Linkage rule: 's', 'a', 'r' or 'c'. Alternatively 'single', 'average', 'central' or 'complete'. Default 'single'.
 MSIZE Number of objects (M) above which the dataset is preclustered by CLUSTM, reducing it to MSIZE objects. Default MSIZE = 3000. Use MSIZE = inf to avoid preclustering.

 LAB Index array, size [M,length(K)], with the indices of the cluster prototypes. Columns refer to different clusterings and are ordered by increasing number of clusters.
 DEN Dendrogram, see PLOTDG or DENDROGRAM


This routine performs a hierarchical clustering in feature space with a linkage type given by TYPE. The clusterings with the numbers of clusters given in the vector K are returned in the columns of LAB. The routine uses DCLUSTH on the Euclidean distance matrix between the objects. Because this can be prohibitive for large datasets, a preclustering is used, see below.
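
The idea can be illustrated with a minimal sketch in plain Matlab that does not call ClusterTools. It assumes a tiny hand-made dataset X and a naive merge loop for single linkage; all variable names are illustrative. Unlike CLUSTH, the columns of LAB in this sketch hold arbitrary cluster identifiers rather than prototype indices, and this is not how the toolbox implements the routine.

```matlab
% Toy data: two tight groups and one outlier in 2-D (illustrative values).
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10; 5 20];
M = size(X,1);
K = [2 3];                          % desired numbers of clusters (no duplicates)

% Euclidean distance matrix between all objects.
D = zeros(M);
for i = 1:M
    for j = 1:M
        D(i,j) = norm(X(i,:) - X(j,:));
    end
end

labels = (1:M)';                    % start with every object in its own cluster
LAB = zeros(M, numel(K));           % one column per requested clustering
n = M;                              % current number of clusters
while n > min(K)
    % Single linkage: find the closest pair of objects in different clusters.
    best = inf; a = 0; b = 0;
    for i = 1:M-1
        for j = i+1:M
            if labels(i) ~= labels(j) && D(i,j) < best
                best = D(i,j); a = labels(i); b = labels(j);
            end
        end
    end
    labels(labels == b) = a;        % merge cluster b into cluster a
    n = n - 1;
    col = find(K == n);
    if ~isempty(col)
        LAB(:,col) = labels;        % store the clustering with n clusters
    end
end
```

The pairwise search makes this sketch far too slow for large M, which is the kind of cost the preclustering described below is meant to avoid.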

The prototypes referred to by LAB are the cluster medoids, except for single linkage clustering, in which case they are the cluster centres.
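
To make the notion of a medoid concrete, here is a small sketch in plain Matlab for a single cluster. It assumes the common definition of a medoid as the object with the smallest summed distance to the other cluster members. For comparison it also finds the object closest to the cluster mean, which is only one possible reading of "cluster centre"; the exact definitions used by the toolbox may differ. The data and variable names are illustrative.

```matlab
% Five 1-D objects forming one cluster (illustrative values).
X = [0; 1; 2; 3; 10];
M = size(X,1);

% Pairwise Euclidean distances within the cluster.
D = zeros(M);
for i = 1:M
    for j = 1:M
        D(i,j) = norm(X(i,:) - X(j,:));
    end
end

% Medoid: the object with the smallest summed distance to all cluster members.
[~, medoid] = min(sum(D,2));

% For comparison (an assumption, see above): the object closest to the mean.
mu = mean(X,1);
dmu = zeros(M,1);
for i = 1:M
    dmu(i) = norm(X(i,:) - mu);
end
[~, nearestToMean] = min(dmu);
```

In this example the two choices differ: the outlier at 10 pulls the mean towards it, while the medoid stays inside the dense part of the cluster.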

Use CLUSTHC to create a classifier based on the cluster result which is consistent with TYPE.

If K is given, its values are reduced to less than M/5 to keep the routine feasible. Moreover, if M > MSIZE, the dataset A is preclustered by PRECLUST using CLUSTM. Unless specific values of K < 100 are needed, K = [] is recommended for fast processing. Speed may be increased further by using smaller values of MSIZE, e.g. MSIZE = 500.


 randreset;                     % take care of reproducibility
 data = gendatclust1(20000);    % generate 20000 objects in 10 clusters
                                % Run Single Linkage clustering
 lab = clusth(data,[2 5 10 18 30 50 100],'s',2000);
                                % Show scatterplot for 10 clusters
 figure; scatn(lab(:,3),data,'Single Linkage'); 
 figure; clusteval(lab,data);   % Evaluation by active learning

See also

datasets, mappings, dclusth, cluste, clustf, clustk, clustm, clustkh, clusteval, clustcerr, clustc, clustnum, clusthc, plotdg, preclust

This file has been automatically generated. If it is hard to read, use the help command in Matlab.