ClusterTools Contents

ClusterTools User Guide

CLUSTM

Multi-level clustering by kNN mode-seeking.

    LAB = CLUSTM(A,K)
    LAB = A*CLUSTM(K)

Input
 A Feature-based dataset with M objects (rows)
 K Scalar or vector of length N with the desired numbers of clusters. Default is a set of N clusterings whose numbers of clusters arise naturally from the data.

Output
 LAB M*N array with the results of the multi-level clusterings for the
 M objects. The columns refer to the N clusterings. Each column gives, for every object, the index of the prototype of the cluster it belongs to.
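
As a small illustration of how one column of LAB can be read, the sketch below converts prototype indices into consecutive cluster numbers and counts the cluster sizes. It uses base MATLAB only. The toy matrix is made up for illustration, and it assumes that a prototype index is the row number of the prototype object in A.

```matlab
% Toy label matrix (hypothetical values): M = 6 objects, N = 2 clusterings.
% Assumption: each entry is the row number of the prototype object of the
% cluster to which the object in that row belongs.
lab = [1 1
       1 1
       1 3
       5 3
       5 5
       5 5];

level = 2;                              % inspect the second clustering

% protos: distinct prototype indices in this column
% id:     consecutive cluster number (1..C) for every object
[protos,~,id] = unique(lab(:,level));

sizes = accumarray(id,1);               % number of objects per cluster

disp([protos sizes]);                   % columns: prototype index, cluster size
```

The vector id can be convenient wherever labels 1..C are expected instead of prototype indices.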

Description

A kNN mode-seeking method is used to assign each object to its nearest density mode. Object densities are related to the distances to neighbors. Modes are determined by recursively jumping to the object with the highest density in the neighborhood. As many clusters are found as there are objects that are the mode in their own neighborhood.
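
The procedure described above can be sketched in a few lines of base MATLAB. This is a minimal illustration, not the code used by CLUSTM: the density proxy (inverse distance to the k-th nearest neighbor), the neighborhood size k and the toy data are all assumptions made for this example.

```matlab
% Toy data: two well-separated 2-D groups (values chosen for illustration)
a = [ 0  0;  0  1;  1  0;  1  1;  0.5  0.5; ...
     10 10; 10 11; 11 10; 11 11; 10.5 10.5];
k = 3;                                  % neighborhood size (assumed value)
m = size(a,1);

% Squared Euclidean distances between all pairs of objects
sq = sum(a.^2,2);
d  = max(sq + sq' - 2*(a*a'), 0);       % clip tiny negatives from rounding

% k nearest neighbors of every object; the object itself comes first
% because its distance to itself is 0 (no duplicate points in this toy set)
[ds,idx] = sort(d,2);
nn = idx(:,1:k);

% Assumed density proxy: inverse of the distance to the k-th neighbor
dens = 1 ./ (sqrt(ds(:,k)) + eps);

% Every object points to the densest object in its own neighborhood
[~,best] = max(dens(nn),[],2);
ptr = nn(sub2ind([m k],(1:m)',best));

% Jump along the pointers until every object has reached a mode
lab = ptr;
while any(lab ~= ptr(lab))
    lab = ptr(lab);
end

modes = unique(lab);                    % objects that are their own mode
disp(modes');
```

In this toy set the two group centers (objects 5 and 10) have the smallest k-th neighbor distance, so they end up as the modes and every other object is labeled with one of them.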

Because mode-seeking clustering does not return a predefined number of clusters, a desired number of clusters K is obtained by reclustering with RECLUSTK.

Note that finding exactly K clusters is computationally heavy for K > 500 and might be impossible for K > 5000. In such cases, users are advised to run this procedure with K = [] and select a useful clustering with CLUSTNUM.
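
The idea of picking a useful level from the naturally arising clusterings can be illustrated without any toolbox call. The sketch below is not CLUSTNUM, whose interface is not shown on this page. It is a plain MATLAB stand-in that counts the clusters in every column of a multi-level label matrix and selects the column closest to a wanted number. The label matrix is a made-up toy example of what a call with K = [] might return.

```matlab
% Toy multi-level label matrix (hypothetical values): 8 objects, 3 levels.
% Entries are prototype indices, as in the LAB output described above.
lab = [1 1 1
       1 1 1
       3 1 1
       3 1 1
       5 5 1
       5 5 1
       7 5 1
       7 5 1];

wanted = 2;                             % desired number of clusters

% Number of distinct prototypes (= clusters) in every column
n = size(lab,2);
nclust = zeros(1,n);
for j = 1:n
    nclust(j) = numel(unique(lab(:,j)));
end

% Select the level whose cluster count is closest to the wanted number
% (on ties, min returns the first such level)
[~,level] = min(abs(nclust - wanted));
chosen = lab(:,level);

fprintf('cluster counts per level: %s, using level %d\n', ...
        mat2str(nclust), level);
```

Counting distinct labels per column is cheap compared with forcing an exact K, which is the motivation the note above gives for working from the natural levels when K is large.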

This routine is based on the same algorithm as DCLUSTM (for dissimilarity data), MODECLUST and MODECLUSTF. It calls DCLUSTM if the distance matrix fits into memory; otherwise it calls MODECLUSTF. Users should call these routines directly for more dedicated parameter settings.

Clusterings can be evaluated by CLUSTEVAL, CLUSTCERR or CLUSTC on the  basis of (some) true labels.

Example(s)

 randreset;                     % take care of reproducibility
 data = gendatclust1(20000);    % generate 20000 objects in 10 clusters
                                % Run kNN mode-seeking clustering
 lab = clustm(data,[2 5 10 18 30 50 100]);
                                % Show scatterplot for 10 clusters
 figure; scatn(lab(:,3),data,'Mode Seeking'); 
 figure; clusteval(lab,data);   % Evaluation by active learning

Reference(s)

Y. Cheng, Mean shift, mode seeking, and clustering, IEEE Transactions on PAMI, vol. 17, no. 8, 1995, 790-799.

R.P.W. Duin, A.L.N. Fred, M. Loog, and E. Pekalska, Mode Seeking Clustering by KNN and Mean Shift Evaluated, Proc. SSPR & SPR 2012, LNCS, vol. 7626, Springer, 2012, 51-59.

R.P.W. Duin and S. Verzakov, Fast kNN mode seeking clustering applied to active learning, arXiv:1712.07454, 2017, 1-23.

See also

datasets, mappings, dclustm, cluste, clustf, clustk, clusth, clusts, modeclust, modeclustf, reclustn, reclustk, reclusth, clusteval, clustcerr, clustc
