Clustering
The goal of clustering is to partition a dataset into subgroups of similar or homogeneous data points.
The definition of similarity depends on the domain and application.
K-Means Clustering
K-means partitions the data into $K$ clusters, where the number of clusters $K$ is pre-specified.
- There is no definitive way to find the optimal $K$ in advance.
- Hierarchical clustering does not require a pre-specified cluster number.
- You could run hierarchical clustering first to get a sense of how many clusters you want, and then run K-means with that $K$.
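
To make the role of the pre-specified $K$ concrete, here is a minimal sketch of K-means using the standard Lloyd iteration (assign each point to its nearest centroid, then move each centroid to the mean of its points). It assumes squared Euclidean distance as the similarity measure and random initialisation from the data points. The function names (`k_means`, `squared_distance`, `mean_point`) and the toy data are made up for illustration, and the hierarchical-clustering step is not shown: `k` is simply supplied by the caller.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iters=100, seed=0):
    """Lloyd's algorithm: returns (centroids, labels) for a pre-specified k."""
    rng = random.Random(seed)
    # Assumption: initialise the centroids from k distinct data points.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = mean_point(members)
    return centroids, labels


# Toy data (made up for illustration): two well-separated groups in 2-D.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
centroids, labels = k_means(points, k=2)
print(centroids)
print(labels)
```

On this toy data the first three points should share one label and the last three the other; which group is called 0 and which is called 1 depends on the initialisation. In general the result of K-means depends on both the chosen $K$ and the starting centroids, so running it with several seeds is a common precaution.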