Understanding concept of Gaussian Mixture Models

Question

asked Jul 15, 2019 in Machine Learning by ParasSharma1 (19k points)

I'm trying to understand GMM by reading the sources available online. I have achieved clustering using K-Means and was seeing how GMM would compare to K-means.

Here is what I have understood, please let me know if my concept is wrong:

GMM is like KNN, in the sense that clustering is achieved in both cases. But in GMM each cluster has it's own independent mean and covariance. Furthermore, k-means performs hard assignments of data points to clusters whereas in GMM we get a collection of independent Gaussian distributions, and for each data point, we have a probability that it belongs to one of the distributions.

To understand it better I have used MatLab to code it and achieve the desired clustering. I have used SIFT features for the purpose of feature extraction. And have used k-means clustering to initialize the values. (This is from the VLFeat documentation)

%images is a 459 x 1 cell array where each cell contains the training image
[locations, all_feats] = vl_dsift(single(images{1}), 'fast', 'step', 50); %all_feats will be 128 x no. of keypoints detected
for i=2:(size(images,1))
    [locations, feats] = vl_dsift(single(images{i}), 'fast', 'step', 50);
    all_feats = cat(2, all_feats, feats); %cat column wise all features
end
numClusters = 50; %Just a random selection.
% Run KMeans to pre-cluster the data
[initMeans, assignments] = vl_kmeans(single(all_feats), numClusters, ...
    'Algorithm','Lloyd', ...
    'MaxNumIterations',5);
initMeans = double(initMeans); %GMM needs it to be double
% Find the initial means, covariances and priors
for i=1:numClusters
    data_k = all_feats(:,assignments==i);
    initPriors(i) = size(data_k,2) / numClusters;
    if size(data_k,1) == 0 || size(data_k,2) == 0
        initCovariances(:,i) = diag(cov(data'));
    else
        initCovariances(:,i) = double(diag(cov(double((data_k')))));
    end
end
% Run EM starting from the given parameters
[means,covariances,priors,ll,posteriors] = vl_gmm(double(all_feats), numClusters, ...
    'initialization','custom', ...
    'InitMeans',initMeans, ...
    'InitCovariances',initCovariances, ...
    'InitPriors',initPriors);

Based on the above I have means, covariances, and priors. My main question is, What now? I am kind of lost now.

Also the means, covariances vectors are each of the size 128 x 50. I was expecting them to be 1 x 50 since each column is a cluster, won't each cluster have only one mean and covariance? (I know 128 are the SIFT features but I was expecting means and covariances).

In k-means I used the MatLab command knnsearch(X,Y) which basically finds the nearest neighbour in X for each point in Y.

So how to achieve this in GMM, I know its a collection of probabilities, and of course the nearest match from that probability will be our winning cluster. And this is where I am confused. All tutorials online have taught how to achieve the means, covariances values, but do not say much in how to actually use them in terms of clustering.

Thank you

1 Answer

Anurag · Answer 1 · 2019-07-15T14:44:01+0000

A Gaussian Mixture is a function that is comprised of many Gaussians, each identified by k ∈ {1,…, K}, where K is the number of clusters of our dataset. Each Gaussian k in the mixture is comprised of the following parameters:

A mean μ that defines its center.
A covariance Σ that defines its width. This would be equivalent to the dimensions of an ellipsoid in a multivariate scenario.
A mixing probability π that defines how big or small the Gaussian function will be.

Code:

% load Fisher Iris dataset
load fisheriris
% project it down to 2 dimensions for the sake of visualization
[~,data] = pca(meas,'NumComponents',2);
mn = min(data); mx = max(data);
D = size(data,2); % data dimension
% inital kmeans step used to initialize EM
K = 3; % number of mixtures/clusters
cInd = kmeans(data, K, 'EmptyAction','singleton');
% fit a GMM model
gmm = fitgmdist(data, K, 'Options',statset('MaxIter',1000), ...
'CovType','full', 'SharedCov',false, 'Regularize',0.01, 'Start',cInd);
% means, covariances, and mixing-weights
mu = gmm.mu;
sigma = gmm.Sigma;
p = gmm.PComponents;
% cluster and posterior probablity of each instance
% note that: [~,clustIdx] = max(p,[],2)
[clustInd,~,p] = cluster(gmm, data);
tabulate(clustInd)
% plot data, clustering of the entire domain, and the GMM contours
clrLite = [1 0.6 0.6 ; 0.6 1 0.6 ; 0.6 0.6 1];
clrDark = [0.7 0 0 ; 0 0.7 0 ; 0 0 0.7];
[X,Y] = meshgrid(linspace(mn(1),mx(1),50), linspace(mn(2),mx(2),50));
C = cluster(gmm, [X(:) Y(:)]);
image(X(:), Y(:), reshape(C,size(X))), hold on
gscatter(data(:,1), data(:,2), species, clrDark)
h = ezcontour(@(x,y)pdf(gmm,[x y]), [mn(1) mx(1) mn(2) mx(2)]);
set(h, 'LineColor','k', 'LineStyle',':')
hold off, axis xy, colormap(clrLite)
title('2D data and fitted GMM'), xlabel('PC1'), ylabel('PC2')

Hope this answer helps.

Understanding concept of Gaussian Mixture Models

1 Answer

Related questions

Browse Categories

Browse By Domains

Popular Courses

Popular Tutorials

Popular Resources