
Problem Description

The clustering problem reads as follows: given a dataset of points in some d-dimensional Euclidean space, partition the data into K clusters such that some empirical loss function (performance measure) is minimized. A typical loss function is the sum of the squared distances between the points and their respective cluster centres. Clustering problems are described in Section 8.3 of [7].
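
The loss described above can be written out in a few lines. The following is a minimal sketch, not the toolbox's own scoring code: the toy data, the candidate centres and the variable names (centres, d2, loss) are assumptions chosen for illustration.

```matlab
% Toy data: two tight groups in 2-D (n = 4 points, d = 2 dimensions).
data = [0 0; 0 1; 10 10; 10 11];
% Hypothetical candidate centres, one per row (K = 2).
centres = [0 0.5; 10 10.5];

n = size(data, 1);
K = size(centres, 1);

% Squared Euclidean distance from every point to every centre (n x K).
d2 = zeros(n, K);
for j = 1:K
    diffs = data - repmat(centres(j, :), n, 1);
    d2(:, j) = sum(diffs .^ 2, 2);
end

% Each point contributes the squared distance to its closest centre.
loss = sum(min(d2, [], 2));
```

Here every point lies 0.5 away from its nearest centre, so each contributes 0.25 and the loss is 1.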


NCE.m This program finds a set of cluster means via the CE method with (independent) Normal updating.
Call the program from MATLAB, with the following syntax:


Example: [mu,count,score]=NCE(1000,10,0.7,5,DATA,0,1)
N - Number of samples to generate each iteration
g - Number of these samples to use to update parameters
alpha - Smoothing parameter
k - Number of cluster means to find
data - The data we are trying to fit means to (Should be n x d,
    where there are n points, and d dimensions)
modif - If 1, use modified smoothing,
    otherwise use standard smoothing
drplot - If 1, draws the cluster means
    and the data (for 2-dimensions)
c - Optional starting centroids
sigma0 - Optional starting standard deviation
mu - The centroids found via the CE method, using Normal updating
    with the parameter set
count - The number of iterations taken
score - The final score of these centroids
genNCE.m This program is used internally to generate the cluster means.
scoreNCE.m This program is used internally to evaluate the performance of a particular set of means against the data.
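
To make the roles of N, g, alpha and k concrete, the loop below sketches a generic CE iteration with independent Normal updating and standard smoothing. It is an illustration only and is not the code in NCE.m: the initialisation, the stopping rule (maxIter, tol) and all variable names other than N, g, alpha, k, data, mu and count are assumptions.

```matlab
% Toy data: two well-separated groups in 2-D.
data = [randn(50, 2); randn(50, 2) + 8];
[n, d] = size(data);

N = 200; g = 10; alpha = 0.7; k = 2;   % same roles as in the list above
maxIter = 100; tol = 1e-3;             % assumed stopping rule

% Assumed initialisation: all means at the data centroid,
% standard deviations equal to the spread of the data.
mu = repmat(mean(data, 1), k, 1);        % k x d
sigma = repmat(std(data, 0, 1), k, 1);   % k x d

for count = 1:maxIter
    samples = zeros(k, d, N);
    scores = zeros(N, 1);
    for s = 1:N
        % Draw one candidate set of k centroids from Normal(mu, sigma^2).
        c = mu + sigma .* randn(k, d);
        % Score it: sum of squared distances to the nearest centroid.
        d2 = zeros(n, k);
        for j = 1:k
            diffs = data - repmat(c(j, :), n, 1);
            d2(:, j) = sum(diffs .^ 2, 2);
        end
        scores(s) = sum(min(d2, [], 2));
        samples(:, :, s) = c;
    end
    % Keep the g best (lowest-score) samples as the elite set.
    [~, order] = sort(scores);
    elite = samples(:, :, order(1:g));
    % Smoothed update of the sampling parameters.
    mu = alpha * mean(elite, 3) + (1 - alpha) * mu;
    sigma = alpha * std(elite, 0, 3) + (1 - alpha) * sigma;
    % Stop once the sampling distribution has collapsed.
    if max(sigma(:)) < tol
        break;
    end
end
```

Because the samples are random, the resulting mu and count differ from run to run.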
MCE.m This program finds a set of cluster means via the CE method, treating clustering as a "mincut" type problem.
Call the program from MATLAB, with the following syntax:


Example: x=MCE(2000,0.01,0.6,3,DATA)
N - Number of samples to generate each iteration
rho - The fraction of samples used to update the probabilities
alpha - Smoothing parameter
k - Number of clusters to assign points to
data - The data we are trying to assign to clusters (Should be n x d,
    where there are n points, and d dimensions)
x - The best found assignment of the data points
genMCE.m This program is used internally to assign data points to clusters.
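
The assignment-based view can be sketched as follows: each point carries a probability vector over the k clusters, assignments are sampled from it, and the probabilities are updated from the best fraction rho of the samples. This is an illustrative sketch under stated assumptions, not the code in MCE.m or genMCE.m: the uniform initialisation, the iteration cap, the stopping threshold and the names P, cumP, freq and nElite are all assumed.

```matlab
% Toy data: two well-separated groups in 2-D.
data = [randn(20, 2); randn(20, 2) + 8];
[n, d] = size(data);

N = 500; rho = 0.1; alpha = 0.6; k = 2;   % same roles as in the list above
nElite = ceil(rho * N);

P = ones(n, k) / k;   % assumed start: every cluster equally likely
for iter = 1:50       % assumed iteration cap
    X = zeros(n, N);
    scores = zeros(N, 1);
    cumP = cumsum(P, 2);
    for s = 1:N
        % Sample a cluster label for every point from its row of P.
        u = rand(n, 1);
        x = min(sum(repmat(u, 1, k) > cumP, 2) + 1, k);
        % Score: sum of squared distances to the cluster's own mean.
        total = 0;
        for j = 1:k
            members = data(x == j, :);
            if ~isempty(members)
                m = mean(members, 1);
                diffs = members - repmat(m, size(members, 1), 1);
                total = total + sum(diffs(:) .^ 2);
            end
        end
        scores(s) = total;
        X(:, s) = x;
    end
    % Elite set: the fraction rho of samples with the lowest score.
    [~, order] = sort(scores);
    elite = X(:, order(1:nElite));
    % Smoothed update: label frequencies within the elite set.
    freq = zeros(n, k);
    for j = 1:k
        freq(:, j) = mean(double(elite == j), 2);
    end
    P = alpha * freq + (1 - alpha) * P;
    if all(max(P, [], 2) > 0.99)
        break;
    end
end
[~, x] = max(P, [], 2);   % most probable label for each point
```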

MCEJ.m This is a slight modification of the above program, using the "injection" idea (due to Zdravko Botev). It generally produces superior results to the unmodified MCE method.
Call the program from MATLAB, with the following syntax:


Example: x=MCEJ(1400,0.03,0.75,4,DATA)
N - Number of samples to generate each iteration
rho - The fraction of samples used to update the probabilities
alpha - Smoothing parameter
k - Number of clusters to assign points to
data - The data we are trying to assign to clusters (Should be n x d,
    where there are n points, and d dimensions)
x - The best found assignment of the data points

scoreMCE.m This program is used internally to evaluate the performance of a particular assignment against the data.
clNCE.m This is a small program which labels the points in a dataset according to a set of cluster means.
Call the program from MATLAB, with the following syntax:


Example: x=clNCE(mu,DATA,5)
c - A set of cluster means
data - The data we are trying to assign to clusters (Should be n x d,
    where there are n points, and d dimensions)
k - Number of clusters to assign points to
x - The assignment of the data points to clusters
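
Labelling by nearest mean can be sketched as below. This is a minimal illustration of the idea, assuming Euclidean distance; it is not the source of clNCE.m, and the toy data and centres are made up.

```matlab
% Toy data (n x d) and a hypothetical set of cluster means (k x d).
data = [0 0; 0 1; 10 10; 10 11];
c = [0 0.5; 10 10.5];

n = size(data, 1);
k = size(c, 1);

% Squared distance from each point to each mean.
d2 = zeros(n, k);
for j = 1:k
    diffs = data - repmat(c(j, :), n, 1);
    d2(:, j) = sum(diffs .^ 2, 2);
end

% Label each point with the index of its closest mean.
[~, x] = min(d2, [], 2);
```

For this toy input the first two points receive label 1 and the last two label 2.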
cMCE.m This is a small program which calculates cluster means from a given cluster labelling.
Call the program from MATLAB, with the following syntax:


Example: mu=cMCE(x,data,6)
x - A labelling of data points
data - The data we are trying to fit means to (Should be n x d,
    where there are n points, and d dimensions)
k - Number of cluster means
mu - The cluster means calculated for this labelling of data points
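
Computing means from a labelling amounts to averaging the points that share a label. The sketch below illustrates this; it is not the source of cMCE.m, and returning NaN for a cluster with no points is an assumed convention.

```matlab
% Toy data (n x d) with a hypothetical labelling into k = 2 clusters.
data = [0 0; 0 2; 10 10; 10 12];
x = [1; 1; 2; 2];
k = 2;

% One row per cluster; NaN marks a cluster with no points (assumption).
mu = nan(k, size(data, 2));
for j = 1:k
    members = data(x == j, :);
    if ~isempty(members)
        mu(j, :) = mean(members, 1);
    end
end
```

With this input, mu is [0 1; 10 11].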

imim.mat - the image data file. It contains 2 variables: I, the 20 by 20 subimage (i.e. the actual image), and Ib, the 256 by 25 set of all 5x5 subimages of I (one subimage per row).

NCEICJ.m - NCE with component updating and injection

clim.m - creates all pxp subimages of a given image (i.e. it can be used to generate Ib)
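
The construction of a subimage matrix can be sketched as follows. This is an illustration only, not the source of clim.m: the scan order and the column-wise flattening of each subimage are assumptions, and the 20 by 20 input is a stand-in for a real image.

```matlab
% Stand-in 20 x 20 "image" and subimage size p.
I = reshape(1:400, 20, 20);
p = 5;

[h, w] = size(I);
nr = h - p + 1;   % number of vertical positions
nc = w - p + 1;   % number of horizontal positions

% One flattened p x p subimage per row.
Ib = zeros(nr * nc, p * p);
idx = 0;
for r = 1:nr
    for q = 1:nc
        idx = idx + 1;
        sub = I(r:r+p-1, q:q+p-1);
        Ib(idx, :) = sub(:)';
    end
end
```

For a 20 by 20 image and p = 5 there are 16 x 16 = 256 positions, so Ib is 256 by 25, matching the dimensions quoted for imim.mat.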

mu2im.m - converts the K cluster means back into pxp matrices so that they are easier to plot.
