ClusterTools Contents

ClusterTools User Guide

preclust

PRECLUST

Pre-cluster to reduce dataset, cluster and merge

    LAB = PRECLUST(A,CLUSTT,CLUSTP,MSIZE)

Input
 A Feature-based dataset or double array with M objects (rows).
 CLUSTT PRTools mapping of target clustering routine asking for clusterings of sizes K
 CLUSTP PRTools mapping of preclustering routine
 MSIZE Number of objects (M) above which the dataset is preclustered by
 CLUSTP, reducing it to at most MSIZE objects. Default MSIZE = 5000.

Output
 LAB M*NUMEL(K) array with the results of the multilevel clustering for the M objects. The columns refer to the clusterings. For each object they give the prototype index of the cluster it belongs to.

Description

This routine enables various target clustering procedures, such as CLUSTK, to handle very large datasets (e.g. 5000 to 10^6 objects) that would otherwise be prohibitive. First, a preclustering is performed by CLUSTP, a routine that can handle such datasets, e.g. CLUSTM. Next, the prototypes found by this preclustering (at most MSIZE) are clustered by the target routine CLUSTT. Finally, the clusters found by CLUSTP are assigned to the clusters that CLUSTT determined for their prototypes.
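The final merge step can be illustrated with a minimal sketch in plain MATLAB. It does not call PRECLUST or any PRTools routine. The variable names (pre_idx, proto_lab) and the toy values are assumptions made for illustration only, and the labels here are simple cluster numbers rather than the prototype indices that PRECLUST returns.

```matlab
% Hypothetical toy setup: 6 objects, reduced by a preclustering to 3 prototypes.
% pre_idx(i) is the position (1..3) of the prototype that object i was assigned to.
pre_idx = [1; 1; 2; 3; 3; 2];

% Hypothetical result of the target clustering on the 3 prototypes, at two
% levels (one column per clustering). Row p holds the labels of prototype p.
proto_lab = [1 1;
             1 1;
             2 1];

% Merge step: every object inherits the labels of its prototype.
lab = proto_lab(pre_idx, :);   % 6-by-2 array, one column per clustering
% lab is [1 1; 1 1; 1 1; 2 1; 2 1; 1 1]
```

Because the target routine only sees the prototypes, its cost depends on the number of prototypes rather than on the number of objects, and the indexing above propagates its result back to all objects.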

This routine is called by routines such as CLUSTK and CLUSTH.

Example(s)

               % Generate 10^5 2D objects, 10 clusters
 data = gendatclust1(100000);
               % PRTools mapping of CLUSTH asking for 5 clusterings
               % of [2 5 10 17 30] clusters by Single Linkage 
 clustt = clusth([],[2 5 10 17 30]);
               % PRTools mapping of fast modeseeking clustering
 clustp = clustm;
               % Run PRECLUST, use at most 1000 prototypes for CLUSTH
 lab = preclust(data,clustt,clustp,1000);
               % Scatterplot of data showing 10 clusters
 figure; scatn(lab(:,3),data);
               % Learning curve for active learning
 figure; clusteval(lab,data); fontsize(12)

See also

datasets, mappings, clustk, clusth, clusts, cluste, clustm, clusteval

This file has been automatically generated. If it is hard to read, use the help command in MATLAB.