Note: The assignment will be autograded. It is important that you do not use additional libraries or change the inputs and outputs of the provided functions.
Part 1: Setup
• Connect remotely to an EWS machine.
ssh (netid)@remlnx.ews.illinois.edu
• Load the Python module; this will also load pip and virtualenv.
module load python/3.4.3
• Reuse the virtual environment from mp1.
source ~/cs446sp_2018/bin/activate
• Copy mp8 into your svn directory, and change directory to mp8.
cd ~/netid
svn cp https://subversion.ews.illinois.edu/svn/sp18-cs446/_shared/mp8 .
cd mp8
• Install the requirements through pip.
pip install -r requirements.txt
• Unzip assingment8_data.zip (you’ll find this in your svn directory).
unzip assingment8_data.zip -d data/
• Prevent svn from checking in the data directory.
svn propset svn:ignore data .
Part 2: Exercise
In this exercise you will write your own code to do K-Means clustering. We provide you with the Iris Dataset, which describes different kinds of flowers using four features: sepal length, sepal width, petal length, and petal width. For more information on the dataset, refer to https://archive.ics.uci.edu/ml/datasets/iris. (You have to use the data as provided in the data file. Do not download the data from the website.)
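As a rough illustration of the four-feature layout, the sketch below parses rows of comma-separated values using only the standard library. It assumes each row holds the four measurements followed by a class label; that layout, the function name parse_iris, and the two sample rows are illustrative assumptions, so check them against the provided data file.

```python
import csv
import io


def parse_iris(text):
    """Parse comma-separated rows into (features, labels).

    Assumption: each non-empty row is four floats (sepal length, sepal
    width, petal length, petal width) followed by a class label.
    """
    features, labels = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # csv.reader yields an empty list for blank lines
            continue
        features.append([float(value) for value in row[:4]])
        labels.append(row[4] if len(row) > 4 else None)
    return features, labels


# Two illustrative rows in the assumed layout, plus a trailing blank line.
sample = "5.1,3.5,1.4,0.2,Iris-setosa\n7.0,3.2,4.7,1.4,Iris-versicolor\n\n"
X, y = parse_iris(sample)
```

Clustering uses only the four numeric features; the labels are kept here just so the rows can be inspected.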
The file k_means.py contains a skeleton of the code you have to write.
The dataset is in data/iris.data. You are expected to return the final cluster centers, up to an error of 10e−3.
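For orientation, here is a minimal sketch of the standard K-Means iteration (Lloyd's algorithm) in plain Python. It alternates between assigning each point to its nearest center and moving each center to the mean of its assigned points. The function names, arguments, and stopping rule are assumptions made for illustration and do not reflect the signatures in the provided skeleton, which you must not change.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means_sketch(points, initial_centers, max_iters=100, tol=1e-6):
    """Run Lloyd's algorithm and return the final cluster centers.

    Hypothetical helper: the name, arguments, and stopping rule are
    illustrative, not the interface defined in the assignment skeleton.
    """
    centers = [list(c) for c in initial_centers]
    for _ in range(max_iters):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda j: squared_distance(p, centers[j]))
            clusters[nearest].append(p)

        # Update step: move each center to the mean of its members.
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:
                new_centers.append([sum(m[d] for m in members) / len(members)
                                    for d in range(len(old))])
            else:
                new_centers.append(old)  # leave an empty cluster's center in place

        # Stop once no center moved by more than tol (in squared distance).
        shift = max(squared_distance(o, n) for o, n in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers


# Toy data: two well-separated pairs of points.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
centers = k_means_sketch(points, initial_centers=[[1.0, 1.0], [9.0, 1.0]])
```

On this toy input the centers settle at the midpoints of the two pairs. The result of K-Means generally depends on the initial centers, so follow whatever initialization the skeleton specifies.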
Part 3: Testing the Code
In test.py we have provided a basic test case. If your code is correct, it should return ok. To test the code, run
nose2
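If you want additional checks of your own, nose2 also runs unittest-style test cases. The sketch below compares two sets of centers coordinate by coordinate within a tolerance. It takes the stated tolerance literally as 10e-3 (that is, 0.01); the helper centers_match and the test class are hypothetical and are not part of the provided test.py, whose actual comparison may differ.

```python
import unittest


def centers_match(actual, expected, tol=10e-3):
    """Return True if both sets of centers agree within tol, ignoring order.

    Hypothetical helper. Centers are paired after sorting, which is a
    simplification that can mis-pair centers whose leading coordinates
    are nearly equal.
    """
    if len(actual) != len(expected):
        return False
    for a, e in zip(sorted(actual), sorted(expected)):
        if len(a) != len(e):
            return False
        if any(abs(x - y) > tol for x, y in zip(a, e)):
            return False
    return True


class CentersToleranceTest(unittest.TestCase):
    def test_small_error_is_accepted(self):
        self.assertTrue(centers_match([[1.0, 2.0]], [[1.005, 1.996]]))

    def test_large_error_is_rejected(self):
        self.assertFalse(centers_match([[1.0, 2.0]], [[1.05, 2.0]]))
```

Keep any such extra tests separate from the provided test case so the autograded files stay unchanged.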
Part 4: Submit
Submitting the code is equivalent to committing the code. This can be done with the following command:
svn commit -m "Some meaningful comment here."
Lastly, double-check in your browser that you can see your code at
https://subversion.ews.illinois.edu/svn/sp18-cs446/(netid)/mp8/