ClusterTools Contents

ClusterTools User Guide

modeclust_batch

MODECLUST_BATCH

KNN mode-seeking clustering, batch processing for large datasets

    [LAB, K] = MODECLUST_BATCH(A, K, DIST)

Input
 A Dataset
 K Vector with the numbers of neighbours used to search for the local mode. Default: a set of values determined by a geometric series.
 DIST Distance function (name or handle) or mapping to be used for clustering (optional; default: @DISTM). If DIST is a function, it is expected to take two double arrays as input arguments: if D = DIST(A1, A2), then SIZE(D) is [SIZE(A1, 1) SIZE(A2, 1)]. If DIST is a mapping, it should be possible to use it like D = A1*(A2*DIST), i.e. it has to be a trainable mapping and it should be able to automatically convert double arrays to datasets. E.g., for PROXM it is possible to call MODECLUST(A, K, PROXM()), but for DISTM we need to use MODECLUST(A, K, @DISTM) or MODECLUST(A, K, 'distm').
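
The size contract for a function-valued DIST can be illustrated with a small stand-alone sketch. The handle name cityblock and the test arrays are hypothetical and are not part of the toolbox; the sketch only checks that D = DIST(A1, A2) has size [SIZE(A1, 1) SIZE(A2, 1)].

```matlab
% Hypothetical city-block distance obeying the D = DIST(A1, A2) contract.
% A1 is n1-by-d and A2 is n2-by-d (double arrays); D is n1-by-n2.
cityblock = @(A1, A2) sum(abs(bsxfun(@minus, ...
    permute(A1, [1 3 2]), permute(A2, [3 1 2]))), 3);

A1 = [0 0; 1 1; 2 0];      % 3 objects, 2 features
A2 = [0 0; 3 4];           % 2 objects, 2 features
D  = cityblock(A1, A2);    % one row per object in A1, one column per object in A2
disp(size(D));
```

A handle written this way should be usable as the DIST argument, e.g. modeclust_batch(a, [], cityblock), provided it accepts plain double arrays as described above.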

Output
 LAB Indices of mode samples
 K The K vector that was used (useful if K was not specified in the call and was automatically generated)

Description

A K-NN mode-seeking method is used to assign objects to their nearest mode. Object densities are defined as one over the distance to the K-th nearest neighbour. Clusters are defined by recursively jumping, for every object, to the object with the highest density in its local neighborhood.
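
The procedure described above can be sketched in plain MATLAB for a single K. This is a minimal illustration, not the toolbox implementation: the toy data X, the Euclidean distance, and the convention that an object counts as its own first neighbour are all assumptions made for the example.

```matlab
% Minimal KNN mode-seeking sketch for a single K (not the toolbox code).
X = [0 0; 0.1 0; 0 0.1; 0.1 0.1; 0.05 0.05; ...
     5 5; 5.1 5; 5 5.1; 5.1 5.1; 5.05 5.05];   % two small groups of 5 objects
K = 3;
n = size(X, 1);

% Pairwise Euclidean distances
D = zeros(n);
for i = 1:n
    for j = 1:n
        D(i, j) = norm(X(i, :) - X(j, :));
    end
end

[Ds, idx] = sort(D, 2);   % row-wise sort; column 1 is the object itself
dens = 1 ./ Ds(:, K);     % density = one over the distance to the K-th neighbour

% Each object points to the densest object among its K nearest neighbours
ptr = zeros(n, 1);
for i = 1:n
    nb = idx(i, 1:K);
    [~, m] = max(dens(nb));
    ptr(i) = nb(m);
end

% Follow the pointers until every object has reached a mode
lab = ptr;
while any(lab ~= ptr(lab))
    lab = ptr(lab);
end
disp(lab');
```

In this sketch lab holds, for every object, the index of the mode object it ends up at, which matches the meaning of the LAB output for one value of K.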

K can be a vector of neighborhood sizes, which is much faster than calling the routine separately for each size. Default K: a set of values determined by a geometric series.
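
As an illustration of what a geometric series of neighborhood sizes can look like, the sketch below builds one by hand. The ratio and the upper bound are assumptions chosen for the example; the values the routine generates by default may differ.

```matlab
% Hypothetical geometric series of neighborhood sizes (illustration only;
% the ratio and limits used by the toolbox default are not documented here).
n     = 5000;                 % number of objects
ratio = 2;                    % assumed common ratio
kmax  = floor(sqrt(n));       % assumed upper bound on K
K = ratio .^ (1:floor(log(kmax) / log(ratio)));
disp(K);
```

Such a vector can be passed as the second argument, e.g. [lab, k] = modeclust_batch(a, K), after which column j of lab corresponds to k(j).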

This routine is about twice as fast as MODECLUST because it stores the distances that are needed on disk.

The multilevel clustering can be made nested by RECLUSTN.

Reference(s)

R.P.W. Duin, A.L.N. Fred, M. Loog, and E. Pekalska, Mode Seeking Clustering by KNN and Mean Shift Evaluated, Proc. SSPR & SPR 2012, LNCS, vol. 7626, Springer, 2012, 51-59.

Example(s)

 delfigs
 a = gendatm(5000);      % generate 5000 objects in 8 classes
 [lab,k] = modeclust(a); 
 for j=1:size(lab,2)
   nclust = numel(unique(lab(:,j)));
   if nclust < 20 & nclust > 1
     figure; scattern(prdataset(a,lab(:,j)));
     title(['K = ' int2str(k(j)) ' --> ' int2str(nclust) ' Clusters']);
   end
 end
 showfigs

 lab = modeclust_batch(a, [], @distm); % explicitly using DISTM
 lab = modeclust_batch(a, [], proxm([], 'm', 1)); % using PROXM mapping

See also

mappings, datasets, distm, proxm, dclustm, modeclust_batch, modeclustf, reclustn


This file has been automatically generated. If badly readable, use the help-command in Matlab.