Starting from:
$30

$24

Lab 1: K-Means Clustering Solution

Problem Description




Implement (in C++) the K-means clustering algorithm [MacQueen, 1967] with a Euclidean distance metric. See online tutorials at:




http://www.saedsayad.com/clustering kmeans.htm







Use the implemented K-means algorithm to cluster the following 8 examples (table 1) into 3 clusters.




When running K-means, set the initial seeds (initial centroid of each cluster)




as examples 1, 4 and 7.



Table 1: Data (examples have two attributes: X, Y , both in range: [1, 10]).




Example Number
X
Y






1
2
10






2
2
5






3
8
4






4
5
8






5
7
5






6
6
4






7
1
2






8
4
9












Question 1: How many iterations are needed for k-means to converge?




In a text le output the results of each iteration (for each cluster, list the examples that fall into each cluster), and the centroids of each cluster, e.g.:




Iteration 1




Cluster 1: 1, 2, 3




Centroid: (3.0, 9.5)




Cluster 2: 4, 5, 6




Centroid: (6.5, 5.25)




Cluster 3: 7, 8




Centroid: (1.5, 3.5)










Iteration N




Cluster 1: 8, 7, 6




Centroid: (1.5, 3.5)




Cluster 2: 5, 4, 3




Centroid: (6.5, 5.25)




Cluster 3: 2, 1




Centroid: (3.0, 9.5)




In a ZIP le, place the source code, make le, and the output text le (answer to question 1). Upload the ZIP le to Vula before 10.00 AM.


References




[MacQueen, 1967] MacQueen, J. (1967). Some methods for classi cation and analysis of multivari-ate observations. In Proceedings of the Berkeley Symposium on Mathematics, Statistics and Probability, pages 281{297, Berkeley, USA. University of California Press.




























































































































































3

More products