$29
Problem 1
(Logistic Regression, 60pts) Implement logistric regression. For your training data, generate 1000 training instances in two sets of random data points (500 in each) from multi-variate normal distribution with
1
= [1; 0]; 2
= [0; 1:5]; 1
=
0:75
1
; 2
=
0:75
0:1
(1)
1
0:75
1
75
and label them 0 and 1. Generate testing data in the same manner but include 500 instances for each class, i.e., 1000 in total. You will implement a logistic regression from scratch. Use sigmoid function for your activation function and cross entropy for your objective function. Stop your training when the l1-norm of your gradient is less than 0:001 (i.e., close to 0) or the number of iteration reaches 100000. Don’t forget to include bias term!
1. (30pt) Perform batch training using gradient descent. Divide the derivative with the total number of training dataset as you go through iteration (it is very likely that you will get NaN if you don’t do
this.). Change your learning rate as = f1; 0:1; 0:01; 0:001g. Your report should include: 1) scatter plot of the testing data and the trained decision boundary, 2) gure of changes of training error (cross entropy) w.r.t. iteration, 3) gure of changes of norm of gradient w.r.t. iteration. Also, report the number iterations it took for training and the accuracy that you have.
2. (30pt) Perform online training using gradient descent. Here, you do not need to normalize the gradient with the training dataset size since each iteration takes one training sample at a time. Try various
learning rate as = f1; 0:1; 0:01; 0:001g. Set your maximum number of iterations to 100000. Your
report should include: 1) scatter plot of the testing data and the trained decision boundary, 2) gure of changes of training error (cross entropy) w.r.t. iteration, 3) gure of changes of norm of gradient w.r.t. iteration. Also, report the number iterations it took for training and the accuracy that you have. Write your brief observation comparing this result from the result from batch training.
Instructor: W. H. Kim (won.kim@uta.edu), TA: Xin Ma (xin.ma@mavs.uta.edu) Page 1 of 2
CSE4334/5334 Data Mining Assignment 3
Problem 2
(Tensor ow and Keras, 40pts) Try out the tutorial for Deep Learning using Tensor ow at https://www.
tensorflow.org/tutorials. It has the following lines of code.
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([ tf.keras.layers.Flatten(), tf.keras.layers.Dense(512, activation=tf.nn.relu), tf.keras.layers.Dropout(0.2), tf.keras.layers.Dense(10, activation=tf.nn.softmax) ])
model.compile(optimizer=’adam’,loss=’sparse_categorical_crossentropy’, metrics=[’accuracy’])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
1. (10pt) In the report, write comments for each line of code given above and explain what this framework is doing.
2. (10pt) Change the number of hidden nodes to 5, 10, 128 and 512. Report how the testing accuracy changes for the testing data. Report the result and your observation in the report.
3. (20pt) Now, remove the hidden layer in the code and train the model. The trained model contains the weights that it has learned from training. Plot the \new representation" that it has learned for each number from training and include them in the report. That is, reshape the learned weights (i.e., vector) to the image dimension (in 2D, i.e., 28x28) and show them. You will see some number-like features.
Instructor: W. H. Kim (won.kim@uta.edu), TA: Xin Ma (xin.ma@mavs.uta.edu) Page 2 of 2