ClusterTools Contents

ClusterTools User Guide

EXEMPLAR

Exemplar clustering

    LAB = EXEMPLAR(D,P,F)

Input
 D Square dissimilarity matrix, size M*M
 P Vector (length N) with preferred self-distance >= 0
 F Damping factor, default 0.5

Output
 LAB M*N array whose columns hold the cluster indices of the M objects, one column per element of P

Description

This routine performs a clustering based on message passing between data points, see [1]. It follows the original code supplied by the authors, with a few exceptions and extensions:

  • The dissimilarity matrix D is automatically converted into a similarity matrix with values between 0 (corresponding to the largest dissimilarity) and 1 (corresponding to a dissimilarity of 0).
  • The diagonal of the similarity matrix is replaced by a constant value equal to one entry of P.
  • The routine is run for all N elements of P, resulting in N clusterings.
  • The number of iterations is set to inf. The iteration loop is broken when an identical clustering is found three times in a row. Alternatively, updating is stopped by PRTIME. For larger datasets (M > 1000) and large self-similarities P, updating is time consuming and large values of PRTIME (e.g. > 500) might be needed to obtain useful results.
  • The default value for the damping factor (averaging between two successive iterations) is set to 0.5, as proposed by the authors. Much faster results (fewer iterations) are obtained with higher damping factors such as 0.95.
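
The steps listed above can be sketched in a few lines of plain Python. This is only an illustration of the described procedure, not the code of EXEMPLAR itself. The function names (to_similarity, cluster_once, exemplar_sketch), the linear scaling 1 - D/max(D), the max_iter cap (a stand-in for the PRTIME limit) and the 0-based exemplar indices used as labels are all assumptions made for this sketch. The message updates follow the standard responsibility and availability rules of [1].

```python
def to_similarity(d):
    """Scale dissimilarities to [0, 1]: largest -> 0, zero -> 1 (linear map is an assumption)."""
    dmax = max(max(row) for row in d)
    if dmax == 0:
        return [[1.0] * len(d) for _ in d]
    return [[1.0 - v / dmax for v in row] for row in d]


def cluster_once(s, pref, damping=0.5, needed=3, max_iter=10000):
    """One clustering for a single preference value; requires at least 2 objects."""
    m = len(s)
    s = [row[:] for row in s]
    for i in range(m):
        s[i][i] = pref                      # constant diagonal taken from one entry of P
    r = [[0.0] * m for _ in range(m)]       # responsibilities
    a = [[0.0] * m for _ in range(m)]       # availabilities
    last, streak = None, 0
    for _ in range(max_iter):               # hypothetical cap standing in for a time limit
        for i in range(m):
            vals = [a[i][k] + s[i][k] for k in range(m)]
            k1 = max(range(m), key=vals.__getitem__)
            first = vals[k1]
            second = max(v for k, v in enumerate(vals) if k != k1)
            for k in range(m):
                new = s[i][k] - (second if k == k1 else first)
                r[i][k] = damping * r[i][k] + (1.0 - damping) * new
        for k in range(m):
            pos = [max(r[i][k], 0.0) for i in range(m)]
            total = sum(pos) - pos[k]       # positive responsibilities from all i != k
            for i in range(m):
                new = total if i == k else min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1.0 - damping) * new
        labels = [max(range(m), key=lambda k: a[i][k] + r[i][k]) for i in range(m)]
        streak = streak + 1 if labels == last else 1
        last = labels
        if streak >= needed:                # same clustering 'needed' times in a row
            break
    return last


def exemplar_sketch(d, prefs, damping=0.5):
    """Run one clustering per preference; returns M rows with N columns of labels."""
    s = to_similarity(d)
    columns = [cluster_once(s, p, damping) for p in prefs]
    return [list(row) for row in zip(*columns)]
```

Calling exemplar_sketch(d, [0.2, 0.5, 1.0]) on a square list-of-lists dissimilarity matrix d would return one label column per preference value, mirroring the M*N layout of LAB. Each label here is the 0-based index of the object chosen as exemplar, which may differ from the numbering EXEMPLAR uses.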

If P is omitted, an attempt is made to generate a set of values such that a reasonable set of clusterings, with between 2 and M clusters, is obtained.

Reference(s)

[1] B.J. Frey and D. Dueck, Clustering by passing messages between data points, Science, vol. 315, pp. 972-976, 2007.

See also

datasets, mappings, dclustm, dclusth, dclustk, clusteval, clustcerr, clustc

This file has been generated automatically. If it is hard to read, use the help command in MATLAB.