Data Science Lab Exercise (kNN) Solution


    I. In this lab, you will learn how to classify data points using a kNN classifier. The Iris data set is provided; it consists of 3 classes and 150 data points.

    • Load libraries

import pandas
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier




        (a) Load data set using pandas library

names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pandas.read_csv("iris.data", names=names)


        (b) Print the size of the data set using dataset.shape. The result should be (150, 5): 4 features and 1 class column.

        (c) Display the class distribution

Use dataset.groupby('class').size()

        (d) Now, divide your data using hold out approach (80% for training and 20% for testing)

# train / test dataset

array = dataset.values

X = array[:,0:4]

Y = array[:,4]

t_size = 0.20

seed = 7

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=t_size, random_state=seed)


    (e) Apply the kNN classifier. See the documentation below. You need to import the necessary classes.

http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html
# Make predictions

knn = KNeighborsClassifier()

knn.fit(X_train, Y_train)

predictions = knn.predict(X_test)

print(accuracy_score(Y_test, predictions))

print(confusion_matrix(Y_test, predictions))

print(classification_report(Y_test, predictions))

    (f) Repeat (e) by changing the value of k (k = 1, 2, 3, …, 10). Print only the accuracy.
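One possible way to do step (f) is a simple loop over k. This is only a sketch: it assumes scikit-learn's bundled copy of Iris (load_iris) as a stand-in for iris.data so that the block runs on its own, and the dictionary name accuracy_by_k is purely illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the bundled Iris copy replaces iris.data for a self-contained run.
X, Y = load_iris(return_X_y=True)

# Same hold-out split as step (d): 20% test data, seed = 7.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.20, random_state=7
)

accuracy_by_k = {}
for k in range(1, 11):
    # n_neighbors is the k in kNN.
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, Y_train)
    accuracy_by_k[k] = accuracy_score(Y_test, knn.predict(X_test))
    print(f"k={k}: accuracy={accuracy_by_k[k]:.3f}")
```

Keeping the split fixed while k changes means any difference in accuracy comes from k alone.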

        (g) Repeat (e) by changing the value of seed (seed = 1, 2, 3, …, 10). Print only the accuracy.
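Step (g) can be sketched the same way, this time re-splitting the data for each seed and keeping the classifier at its default setting (n_neighbors=5 in scikit-learn). As an assumption, the bundled load_iris copy again stands in for iris.data, and accuracy_by_seed is an illustrative name.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the bundled Iris copy replaces iris.data for a self-contained run.
X, Y = load_iris(return_X_y=True)

accuracy_by_seed = {}
for seed in range(1, 11):
    # A different seed gives a different 80/20 hold-out split.
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y, test_size=0.20, random_state=seed
    )
    knn = KNeighborsClassifier()  # default k, as in step (e)
    knn.fit(X_train, Y_train)
    accuracy_by_seed[seed] = accuracy_score(Y_test, knn.predict(X_test))
    print(f"seed={seed}: accuracy={accuracy_by_seed[seed]:.3f}")
```

The spread of these ten values gives a rough idea of how much a single hold-out estimate depends on the particular split.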




    II. Repeat (I) using the Occupancy Detection dataset. Ignore the date attribute. Of course, steps (d) and (g) are not applicable since separate training and test data are given. http://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+
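A minimal sketch for part II follows. Several things here are assumptions rather than part of the exercise text: the column names (date, Temperature, Humidity, Light, CO2, HumidityRatio, Occupancy), the comma-separated layout with a header row, and the helper names load_occupancy and knn_accuracy. Check them against the files you actually download.

```python
import pandas
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

# Assumed column names of the Occupancy Detection files.
FEATURES = ["Temperature", "Humidity", "Light", "CO2", "HumidityRatio"]
TARGET = "Occupancy"


def load_occupancy(path):
    """Read one file and drop the date attribute (hypothetical helper)."""
    return pandas.read_csv(path).drop(columns=["date"])


def knn_accuracy(train, test, k=5):
    """Fit kNN on the training frame and return accuracy on the test frame."""
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(train[FEATURES].values, train[TARGET].values)
    predictions = knn.predict(test[FEATURES].values)
    return accuracy_score(test[TARGET].values, predictions)
```

With the training and test files loaded through load_occupancy, calling knn_accuracy for k = 1, …, 10 would repeat step (f) on this dataset; no call to train_test_split is needed because the split is already given.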

    III. Now, instead of using the built-in library, write your own code for a kNN classifier in any language and repeat I and II. You must use the following chi-squared distance function
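The distance formula itself is not reproduced in this text, so the sketch below assumes one commonly used form of the chi-squared distance, d(x, y) = Σ (x_i − y_i)² / (x_i + y_i), for non-negative features. Some references add a factor of 1/2; use whichever form your handout specifies. The function names chi_squared_distance and knn_predict are illustrative only.

```python
from collections import Counter


def chi_squared_distance(x, y):
    """Assumed form: sum of (x_i - y_i)^2 / (x_i + y_i) over all features."""
    total = 0.0
    for a, b in zip(x, y):
        denom = a + b
        if denom != 0:  # skip terms where both features are zero (0/0)
            total += (a - b) ** 2 / denom
    return total


def knn_predict(X_train, Y_train, query, k=5):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(
        zip(X_train, Y_train),
        key=lambda pair: chi_squared_distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    # On a tie, the label that appears first among the nearest neighbours wins.
    return votes.most_common(1)[0][0]
```

To repeat I and II, call knn_predict once per test row and compare the results with the true labels to obtain the accuracy. Sorting every training point for each query is simple but slow on larger data such as the occupancy set; it is kept here for clarity.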
