The Wikipedia article on determining the number of clusters in a dataset indicates that I do not need to worry about this problem when using hierarchical clustering. However, when I tried scikit-learn's agglomerative clustering, I found that I have to pass the number of clusters as the parameter "n_clusters", and if I leave it out I get the hardcoded default of two clusters. How can I choose the right number of clusters for the dataset in this case? Is the wiki article wrong?

The steps involved in agglomerative hierarchical clustering are as follows:

  • At the starting point, treat each data point as its own cluster. The number of clusters at the start is therefore K, where K is the number of data points.

  • Form one cluster by joining the two closest data points resulting in K-1 clusters.

  • Form more clusters by joining the two closest clusters resulting in K-2 clusters.

  • Repeat the previous step, each time joining the two closest clusters, until one big cluster is formed.

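  • Once a single cluster is formed, a dendrogram (the tree that records the order and distance of the merges) is used to divide it into multiple clusters, depending on the problem: cutting the tree at a chosen height yields the clusters. We will study the concept of the dendrogram in detail in an upcoming section.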
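The steps above can be sketched in plain Python to show why the number of clusters does not have to be fixed up front. This is only an illustration, not scikit-learn's implementation: the function names `single_linkage_merges` and `suggest_n_clusters` are hypothetical, single linkage is an assumed choice of cluster distance, and "cut at the largest jump in merge distance" is one common heuristic rather than the definitive rule.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage_merges(points):
    """Merge clusters bottom-up and return the distance of each merge."""
    # Every point starts as its own cluster (K clusters for K points).
    clusters = [[p] for p in points]
    merge_distances = []
    while len(clusters) > 1:
        # Find the two closest clusters. Single linkage (an assumption):
        # cluster distance = smallest distance between any two members.
        (i, j), d = min(
            (
                ((i, j), min(dist(a, b) for a in clusters[i] for b in clusters[j]))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        # Merge them, which reduces the cluster count by one (i < j here).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        merge_distances.append(d)
    return merge_distances


def suggest_n_clusters(merge_distances):
    """Heuristic: cut the dendrogram inside the largest jump between merges."""
    gaps = [b - a for a, b in zip(merge_distances, merge_distances[1:])]
    if not gaps:
        return 1  # fewer than three points: nothing to compare
    biggest = max(range(len(gaps)), key=gaps.__getitem__)
    # Stopping after merge number `biggest` (0-based) leaves this many clusters.
    return len(merge_distances) - biggest


# Two well-separated groups of three points each (made-up data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
merges = single_linkage_merges(points)
print(merges)
print(suggest_n_clusters(merges))
```

On this toy data the first four merges happen at distance 1 and the last at roughly 13.45, so the heuristic suggests 2 clusters. In scikit-learn, the same idea is available by passing `n_clusters=None` together with `distance_threshold` to `AgglomerativeClustering`, which cuts the tree at a distance instead of at a fixed cluster count.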

