modeclust

MODECLUST

### KNN mode-seeking clustering

[LAB,NNLAB,NDIST] = MODECLUST(A,K,DIST,NEST)

| Input | Description |
|---|---|
| A | Dataset of M objects. |
| K | Vector with numbers of neighbours to search for the local mode. Default: smart sampling. |
| DIST | Distance function (name or handle) or mapping to be used for clustering (optional; default: @DISTM). If DIST is a function, it is expected to take two double arrays as input arguments: if D = DIST(A1,A2) then SIZE(D) is [SIZE(A1,1) SIZE(A2,1)]. If DIST is a mapping, it should be possible to use it like D = A1*(A2*DIST), i.e. it has to be a trainable mapping and it should be able to convert double arrays to datasets automatically. E.g., for PROXM it is possible to call MODECLUST(A,K,PROXM), but for DISTM we need to use MODECLUST(A,K,@DISTM) or MODECLUST(A,K,'distm'). |
| NEST | Logical. If TRUE, the output set of clusterings (columns of LAB) will be made nested by RECLUSTN. Default: TRUE. |
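To illustrate the function-handle contract for DIST, here is a minimal sketch of a city-block distance in plain MATLAB. The name `cityblock` is hypothetical and not part of the toolbox; the only requirement taken from the description above is that the result has size [SIZE(A1,1) SIZE(A2,1)].

```matlab
% Hypothetical user-defined distance: city-block (L1) distance between rows.
% a1 is [n1,d], a2 is [n2,d]; the result is [n1,n2].
cityblock = @(a1,a2) sum(abs(bsxfun(@minus, ...
    permute(a1,[1 3 2]), permute(a2,[3 1 2]))), 3);

a1 = [0 0; 1 1; 2 2];
a2 = [0 0; 3 4];
d  = cityblock(a1,a2);     % 3-by-2 matrix, one row per object in a1

% With the toolbox on the path, the handle would be passed as the third argument:
% lab = modeclust(a,k,cityblock);
```

The call to MODECLUST is left commented out so that the sketch runs without the toolbox.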

| Output | Description |
|---|---|
| LAB | Indices of mode samples, size [M,N], with N the number of clusterings (NUMEL(K)). |
| NNLAB | [M,1] vector with indices of nearest neighbors. |
| NDIST | Total number of distance calculations. |
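Since a column of LAB holds indices of mode objects rather than consecutive cluster numbers, it can be convenient to renumber them. A small sketch in base MATLAB, using a made-up label column in place of a real column of LAB:

```matlab
labcol = [3 3 3 8 8 3]';           % made-up example: objects assigned to modes 3 and 8
[modes, ~, c] = unique(labcol);    % modes: sorted mode indices; c: cluster numbers
c = c(:);                          % force a column vector
nclust = numel(modes);             % number of clusters in this clustering
```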

### Description

A KNN mode-seeking method is used to assign objects to their nearest mode. Object densities are defined as one over the distance to the K-th nearest neighbour. Clusters are found by recursively jumping, for every object, to the object with the highest density in its local neighborhood until a mode is reached.
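The procedure described above can be sketched in a few lines of base MATLAB. This is an illustrative sketch, not the toolbox implementation: it computes the full distance matrix, handles a single K, and assumes that each object counts as its own first neighbour. The toy data are made up.

```matlab
% Toy 1-D data: two well-separated groups (made up for illustration)
x = [0 3 4 5 8 20 23 24 25 28]';
k = 3;                                    % neighbourhood size; the object itself counts
m = size(x,1);

d = abs(bsxfun(@minus, x, x'));           % full m-by-m distance matrix
[ds, nn] = sort(d, 2);                    % per object: neighbours sorted nearest first
dens = 1 ./ ds(:,k);                      % density = 1 / distance to the k-th neighbour

% Each object points to the densest object among its k nearest neighbours
[~, j] = max(dens(nn(:,1:k)), [], 2);
lab = nn(sub2ind(size(nn), (1:m)', j));

% Follow the pointers until every object has reached a mode (a fixed point)
while any(lab ~= lab(lab))
    lab = lab(lab);
end
```

For these data, `lab` is `[3 3 3 3 3 8 8 8 8 8]'`: objects 3 and 8 are the two modes, and every other object is assigned to the mode of its own group.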

K can be a vector of neighborhood sizes, which is much faster than calling MODECLUST separately for each size. Default K: a set of values determined by a geometric series.
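The exact default series is not specified here. As an illustration only, a geometrically spaced set of neighbourhood sizes could be built as follows; the range (2 to M/10) and the number of values (8) are assumptions, not the toolbox defaults.

```matlab
m = 1000;                                  % number of objects (example value)
k = logspace(log10(2), log10(m/10), 8);    % 8 geometrically spaced values from 2 to m/10
k = unique(round(k));                      % integer sizes, sorted, duplicates removed

% With the toolbox on the path, all sizes would be handled in one call:
% lab = modeclust(a,k);
```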

Computing time is in the order of (M/1000)^2 seconds, with M the number of objects in A, e.g. roughly 25 seconds for 5000 objects.

### Reference(s)

R.P.W. Duin, A.L.N. Fred, M. Loog, and E. Pekalska, Mode Seeking Clustering by KNN and Mean Shift Evaluated, Proc. SSPR & SPR 2012, LNCS, vol. 7626, Springer, 2012, 51-59.

### Example(s)

delfigs
a = gendatm(5000);      % generate 5000 objects in 8 classes
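k = [10 20 50 100 200 500];     % neighbourhood sizes, chosen for illustration
lab = modeclust(a,k);           % one clustering per column of lab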
for j = 1:size(lab,2)
    nclust = numel(unique(lab(:,j)));    % number of clusters for k(j)
    if nclust < 20 && nclust > 1
        figure; scattern(prdataset(a,lab(:,j)));
        title(['K = ' int2str(k(j)) ' --> ' int2str(nclust) ' Clusters']);
    end
end
showfigs