exemplar

EXEMPLAR

Exemplar clustering

LAB = EXEMPLAR(D,P,F)

Input
D Square dissimilarity matrix, size M*M
P Vector (length N) with preferred self-distance >= 0
F Damping factor, default 0.5

Output
LAB M*N array; each of its N columns holds the cluster indices of the M objects for one value of P

Description

This routine performs a clustering based on message passing between data points, see [1]. It follows the original code supplied by the authors, with a few exceptions and extensions:

the dissimilarity matrix D is automatically converted into a similarity matrix with values between 0 (corresponding to the largest dissimilarity) and 1 (corresponding to a dissimilarity of 0).
the diagonal of the similarity matrix is replaced by a set of identical values, all equal to one entry of P.
the routine is run for all N elements of P, resulting in N clusterings.
the maximum number of iterations is set to inf. The iteration loop stops when the same clustering is found 3 times in a row. Alternatively, updating is stopped by PRTIME. For larger datasets (M > 1000) and large self-similarities P, updating is time-consuming and large values of PRTIME (e.g. > 500) might be needed to obtain useful results.
the default value of the damping factor (averaging between two successive iterations) is 0.5, as proposed by the authors. Higher damping factors such as 0.95 give much faster results (fewer iterations).
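
The conversion from dissimilarities to similarities and the replacement of the diagonal can be sketched as follows. This is a minimal illustration in plain Python, not the routine's own code: the function name to_similarity is hypothetical, and the linear rescaling 1 - d/max(d) is an assumption, since the description only states that the result lies between 0 and 1. The preference value is placed on the diagonal as given; whether the routine transforms P first is not stated.

```python
def to_similarity(d, preference):
    """Rescale a square dissimilarity matrix to similarities in [0, 1].

    Assumption: linear rescaling, so the largest dissimilarity maps to 0
    and a dissimilarity of 0 maps to 1. The diagonal is then overwritten
    with one identical value (a single entry of P).
    """
    m = len(d)
    d_max = max(max(row) for row in d)
    if d_max == 0:
        d_max = 1.0  # all objects coincide; avoid division by zero
    s = [[1.0 - d[i][j] / d_max for j in range(m)] for i in range(m)]
    for i in range(m):
        s[i][i] = preference  # identical self-similarity for every object
    return s


def similarities_for_all_preferences(d, p):
    """Build one similarity matrix per entry of P (N entries, N clusterings)."""
    return [to_similarity(d, value) for value in p]
```

Each matrix returned by similarities_for_all_preferences would feed one separate run of the message passing, which is how N entries in P lead to N clusterings.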

    If P is omitted, an attempt is made to generate a set of possible
    values such that a reasonable range of clusterings is obtained, between 2
    and M clusters.
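
The stopping rule and the damping described in the list above can be sketched as follows. This is an illustration in plain Python under stated assumptions, not the routine's implementation: the names damp and run_until_stable are hypothetical, the step argument stands in for one full message-passing update that returns the current cluster labels, and the weighting f*old + (1-f)*new follows the convention of [1]. The time limit imposed by PRTIME is not modelled.

```python
def damp(old, new, f):
    """Blend the previous message with the newly computed one.

    Assumption: f weights the previous value, as in [1]. With the default
    f = 0.5 this is a plain average of two successive iterations.
    """
    return f * old + (1.0 - f) * new


def run_until_stable(step, max_iter=None, patience=3):
    """Call step() until the same labeling is returned `patience` times in a row.

    step     -- hypothetical callable doing one update and returning labels
    max_iter -- None means no iteration limit (the 'inf' setting)
    Returns the last labeling and the number of iterations used.
    """
    last = None
    streak = 0
    iterations = 0
    while max_iter is None or iterations < max_iter:
        labels = step()
        iterations += 1
        if labels == last:
            streak += 1          # same clustering as the previous iteration
        else:
            last, streak = labels, 1
        if streak >= patience:
            break                # identical clustering found 3 times in a row
    return last, iterations
```

With max_iter left at None the loop only ends on a stable clustering, which is why an external time limit is useful for large datasets.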

Reference(s)

[1] B.J. Frey and D. Dueck, Clustering by passing messages between data points, Science, vol. 315, pp. 972-976, 2007
