KNN mode-seeking clustering


 A Dataset of M objects
 K Vector with numbers of neighbours to search for local mode.  Default: smart sampling.
 DIST Distance function (name or handle) or mapping to be used for clustering (optional; default: @DISTM). If DIST is a function, it is expected to take two double arrays as input arguments: if D = DIST(A1,A2) then SIZE(D) is [SIZE(A1,1) SIZE(A2,1)]. If DIST is a mapping, it should be possible to use it like D = A1*(A2*DIST), i.e. it has to be a trainable mapping and it should be able to convert double arrays to datasets automatically. E.g., for PROXM it is possible to call MODECLUST(A,K,PROXM), but for DISTM we need to use
 NEST Logical, if TRUE the output set of clusterings (columns of LAB) will be made nested by RECLUSTN. Default: TRUE
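
As an illustration of the function form of DIST described above, the following minimal sketch defines a city-block (L1) distance as an anonymous function and checks that its output has the required size. The name mydist and the toy arrays are hypothetical and not part of the toolbox.

```matlab
% Hypothetical user-defined distance (city-block / L1) between the rows of
% a1 and the rows of a2. The result has size [size(a1,1) size(a2,1)].
mydist = @(a1,a2) sum(abs(bsxfun(@minus, ...
                  permute(a1,[1 3 2]), ...      % n1-by-1-by-d
                  permute(a2,[3 1 2]))),3);     % 1-by-n2-by-d

a1 = [0 0; 1 1; 2 2];      % 3 objects with 2 features
a2 = [0 0; 3 4];           % 2 objects with 2 features
d  = mydist(a1,a2);        % 3-by-2 matrix of distances
disp(size(d));
```

Given the description of DIST, such a handle could presumably be passed as the third argument, e.g. MODECLUST(A,K,mydist).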

 LAB Indices of mode samples, size [M,N] with N the number of clusterings (NUMEL(K)).
 NNLAB [M,1] vector with indices of nearest neighbors.
 NDIST Total number of distance calculations.


A KNN mode-seeking method is used to assign objects to their nearest mode. The density of an object is defined as one over the distance to its K-th nearest neighbour. Clusters are found by recursively jumping, for every object, to the object with the highest density in its local neighborhood until a mode is reached.
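
The procedure described above can be sketched in plain MATLAB as follows. This is an illustration only, not the MODECLUST source: it assumes Euclidean distances, a single K, counts every object as its own first neighbour, and builds the full M-by-M distance matrix. The names x, idx and dens are hypothetical; lab and nnlab merely mirror the outputs documented above.

```matlab
% Toy data: two well separated blobs of 50 objects each
x = [randn(50,2); randn(50,2) + 8];
k = 10;                                   % neighbourhood size
m = size(x,1);

% Full matrix of squared Euclidean distances (O(M^2) memory)
sq = sum(x.^2,2);
d  = bsxfun(@plus,sq,sq') - 2*(x*x');
d  = max(d,0);                            % remove small negative round-off
d(1:m+1:end) = 0;                         % exact zeros on the diagonal

[ds,idx] = sort(d,2);                     % neighbours of each object, nearest first
dens = 1 ./ sqrt(ds(:,k));                % density = 1 / distance to k-th neighbour

% Every object points to the densest object among its k nearest neighbours
[~,best] = max(dens(idx(:,1:k)),[],2);
nnlab = idx(sub2ind([m k],(1:m)',best));

% Follow the pointers until every object has arrived at a mode,
% i.e. at an object that points to itself
lab = nnlab;
while any(lab ~= nnlab(lab))
  lab = nnlab(lab);
end
nclust = numel(unique(lab));              % number of modes found
```

Objects that share the same value in lab belong to the same cluster. A larger k smooths the density estimate and generally yields fewer modes.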

K can be a vector of neighborhood sizes, which is much faster than calling the routine separately for every value. Default K: a set of values determined by a geometric series.
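
The exact default series is not specified here. As a rough sketch of how a geometric series of neighborhood sizes might be built by hand, assuming a growth factor of 2 and an upper limit of M/2 (both are assumptions for illustration, not the values used by the routine):

```matlab
m     = 5000;                       % number of objects (example value)
ratio = 2;                          % assumed growth factor of the series
kmax  = floor(m/2);                 % assumed upper limit for the neighbourhood size
n     = floor(log(kmax)/log(ratio)); % largest exponent with ratio^n <= kmax
k     = unique(round(ratio.^(1:n))); % increasing sizes, duplicates removed
disp(k);
```

For M = 5000 this gives the 11 values 2, 4, 8, ..., 2048. Such a vector could then be supplied as the second argument, e.g. MODECLUST(A,k).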

Computing time is in the order of (M/1000)^2 seconds, with M the number of objects in A.


R.P.W. Duin, A.L.N. Fred, M. Loog, and E. Pekalska, Mode Seeking Clustering by KNN and Mean Shift Evaluated, Proc. SSPR & SPR 2012, LNCS, vol. 7626, Springer, 2012, 51-59.


 a = gendatm(5000);      % generate 5000 objects in 8 classes
 [lab,k] = modeclust(a); 
 for j=1:size(lab,2)     % loop over all clusterings
   nclust = numel(unique(lab(:,j)));
   if nclust < 20 && nclust > 1   % show only clusterings with 2 to 19 clusters
     figure; scattern(prdataset(a,lab(:,j)));
     title(['K = ' int2str(k(j)) ' --> ' int2str(nclust) ' Clusters']);
   end
 end

