Neural Networks Assignment 1 Solution

Question 1. [20 points]

A single neuron receives input from m input neurons with weights w_i, where i ∈ [1, m]. The neuron is expected to predict the probability that the output t belongs to Class A (t = 1) versus Class B (t = −1). A dataset of N training samples is available with inputs x_n and outputs y_n (n ∈ [1, N]). You are told that the maximum a posteriori estimate of the network weights is obtained by solving the following optimization problem:

$$\hat{W} = \arg\min_{W} \left\{ \sum_{n} \big( y_n - h(x_n; W) \big)^2 + \lambda \sum_{i} w_i^2 \right\} \tag{1}$$
where W is the vector of weights w_i, λ is a scalar constant, and h(·) is the output of the neuron. According to this estimate, analytically derive the prior probability distribution of the network weights.
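A sketch of the reasoning, assuming the outputs are observed with i.i.d. zero-mean Gaussian noise of variance σ² (an assumption, not stated in the question): the MAP estimate maximizes the log-posterior, so

```latex
\begin{aligned}
\hat{W}_{\text{MAP}}
  &= \arg\max_W \Big[ \log p(\mathbf{y} \mid \mathbf{x}, W) + \log p(W) \Big] \\
  &= \arg\min_W \Big[ \tfrac{1}{2\sigma^2} \textstyle\sum_n \big(y_n - h(x_n; W)\big)^2
     - \log p(W) \Big].
\end{aligned}
```

Matching the regularization term λ Σ_i w_i² (after scaling the whole objective by 2σ²) to −log p(W) requires log p(W) = −(λ/2σ²) Σ_i w_i² + const, i.e. p(W) ∝ Π_i exp(−λ w_i²/2σ²): an independent zero-mean Gaussian prior on each weight, with variance σ²/λ.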
Question 2. [25 points]

An engineer would like to design a neural network with a single hidden layer with four input neurons (with binary inputs) and a single output neuron to implement: (X1 OR NOT X2) XOR (NOT X3 OR NOT X4)

Assume a hidden layer with four hidden units, and a unipolar activation function (i.e., the step function). Answer the questions below.


    a) For each hidden unit, analytically derive the set of inequalities based on which a set of weights and an activation threshold can be selected.
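To illustrate the form these inequalities take (a sketch for one possible decomposition, where a hidden unit with weights w1, w2, zero weights on X3 and X4, and threshold θ is asked to compute the sub-expression X1 OR NOT X2), the step unit must fire exactly on the truth-table rows where the sub-expression is 1:

```latex
\begin{aligned}
(x_1,x_2)=(0,0):&\quad 0 \ge \theta \\
(x_1,x_2)=(1,0):&\quad w_1 \ge \theta \\
(x_1,x_2)=(1,1):&\quad w_1 + w_2 \ge \theta \\
(x_1,x_2)=(0,1):&\quad w_2 < \theta
\end{aligned}
```

One solution of this system is w1 = 1, w2 = −1, θ = 0; any weights satisfying all four inequalities work equally well for noiseless inputs.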


    b) Choose a particular weight vector (including the bias term), and show that the designed network achieves 100% performance in implementing the desired logic.
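One possible choice of weights can be verified exhaustively over all 16 binary input patterns. The decomposition below is an illustration, not the only solution: hidden unit h1 detects (X1 OR NOT X2) AND X3 AND X4, unit h2 detects (NOT X1 AND X2) AND (NOT X3 OR NOT X4), and the output unit ORs the two (mutually exclusive) detectors; the remaining two hidden units can be given zero output weights.

```python
import itertools

def step(s):
    # unipolar (step) activation: 1 if s >= 0, else 0
    return 1 if s >= 0 else 0

W_hidden = [[1, -1, 2, 2],      # h1: (X1 OR NOT X2) AND X3 AND X4
            [-2, 2, -1, -1]]    # h2: (NOT X1 AND X2) AND (NOT X3 OR NOT X4)
b_hidden = [-4, -1]             # biases (negated thresholds)
w_out, b_out = [1, 1], -1       # output unit ORs the two detectors

def network(x):
    h = [step(sum(wi * xi for wi, xi in zip(row, x)) + b)
         for row, b in zip(W_hidden, b_hidden)]
    return step(sum(wi * hi for wi, hi in zip(w_out, h)) + b_out)

# exhaustive check over all 16 binary input patterns
for x in itertools.product([0, 1], repeat=4):
    x1, x2, x3, x4 = x
    target = int((x1 or not x2) != ((not x3) or (not x4)))
    assert network(x) == target
print("all 16 patterns correct")
```

Since the check covers every input combination, passing it demonstrates 100% performance on the noiseless logic.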


    c) Now assume that the input data samples are subject to small random fluctuations due to noise. Will the network you designed in part a function robustly under noisy conditions? Find the set of weights and the activation threshold for the most robust decision boundary.


    d) Generate 100 input samples by first concatenating 25 samples from each input vector. Generate a random noise vector of length 2 for each training sample, assuming a zero-mean Gaussian distribution with an std of 0.2. Form validation samples for testing the NNs by linearly superposing the input samples and the random noise samples. Evaluate the classification performance (i.e., percentage correct) of the networks designed in parts a and c on the validation samples. Interpret your results.
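A minimal sketch of the evaluation procedure, under several assumptions: the four input vectors and the network weights below are the illustrative choices from the earlier parts (not prescribed by the question), and noise is superposed on every input component. The part (c) network would use the same weights with margin-centred thresholds (e.g. biases −3.5, −0.5, −0.5 instead of −4, −1, −1).

```python
import numpy as np

rng = np.random.default_rng(0)

def step(s):
    return 1 if s >= 0 else 0

# hypothetical weights from part (b)
W_h = np.array([[1.0, -1.0, 2.0, 2.0],
                [-2.0, 2.0, -1.0, -1.0]])
b_h = np.array([-4.0, -1.0])
w_o = np.array([1.0, 1.0])
b_o = -1.0

def network(x):
    h = np.array([step(w @ np.asarray(x, dtype=float) + b)
                  for w, b in zip(W_h, b_h)])
    return step(w_o @ h + b_o)

def logic(x1, x2, x3, x4):
    return int((x1 or not x2) != ((not x3) or (not x4)))

# four example input vectors, 25 copies each -> 100 validation samples
patterns = np.array([[0, 0, 0, 0], [1, 0, 1, 1], [0, 1, 0, 0], [1, 1, 1, 1]])
X = np.repeat(patterns, 25, axis=0).astype(float)
y = np.array([logic(*p) for p in patterns]).repeat(25)

# superpose zero-mean Gaussian noise with std 0.2
X_noisy = X + rng.normal(0.0, 0.2, size=X.shape)

acc = np.mean([network(x) == t for x, t in zip(X_noisy, y)])
print(f"percentage correct: {100 * acc:.1f}%")
```

Comparing this score against the same evaluation run with the margin-centred thresholds makes the robustness difference between parts (a) and (c) concrete.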
Question 3. [35 points]

A researcher would like to process images of alphabet letters with a perceptron. A collection of images was compiled for training and testing the perceptron. The file assign1_data1.h5 contains variables trainims (training images) and testims (testing images) along with the ground truth labels in trainlbls and testlbls. Answer the questions below.


    a) Visualize a sample image for each class. Find correlation coefficients between pairs of sample images that you have selected. Display the correlations in matrix format. Discuss the degree of within-class versus across-class variability.
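The pairwise correlations can be computed in one call once the sample images are flattened. A sketch with random stand-ins for the real data (the actual images would be read from assign1_data1.h5, e.g. with h5py; 26 classes of 28x28 pixels is an assumption about the data shape):

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical stand-ins for one sample image per class
n_classes, h, w = 26, 28, 28
samples = rng.random((n_classes, h, w))

# correlation coefficients between every pair of flattened sample images
flat = samples.reshape(n_classes, -1)
R = np.corrcoef(flat)          # (n_classes, n_classes) matrix

print(R.shape)
```

The matrix R can then be displayed with matplotlib's imshow; strong off-diagonal entries indicate across-class similarity, while the unit diagonal is the self-correlation of each sample.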


    b) Design a single-layer perceptron with an output neuron for each class, using the training data. Set the initial network weights W and bias terms b as random numbers drawn from a Gaussian distribution N(0, 0.01), and assume a sigmoid activation function. Your implementation should not train each output neuron separately; instead, a compound matrix W and a compound vector b should be defined and used to simultaneously update all connections. The online training algorithm should perform 10000 iterations. At each iteration, a sample image should be randomly selected from the training data, the network should be updated according to the gradient-descent learning rule, and W, b, and the mean-squared error (MSE) should be recorded. Tune the learning rate in order to minimize the final value of the MSE. Display the final network weights for each class as a separate image, and describe the visual characteristics.
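A sketch of the online training loop with compound W and b (random stand-ins replace the real images; the dimensions d = 64 and the learning rate below are illustrative, and N(0, 0.01) is interpreted as variance 0.01, i.e. std 0.1):

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical small stand-ins for the flattened training images and labels
d, K, n_train = 64, 26, 500
X = rng.random((n_train, d))
labels = rng.integers(0, K, n_train)
Y = np.eye(K)[labels]                 # one-hot targets

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# compound weight matrix W and bias vector b, drawn from N(0, 0.01)
W = rng.normal(0.0, 0.1, (K, d))
b = rng.normal(0.0, 0.1, K)

eta = 0.1                             # learning rate, to be tuned
n_iters = 10000
mse = np.empty(n_iters)
for it in range(n_iters):
    n = rng.integers(n_train)         # pick a random training sample
    x, t = X[n], Y[n]
    o = sigmoid(W @ x + b)            # forward pass for all K outputs at once
    err = o - t
    delta = err * o * (1.0 - o)       # gradient of 0.5*||o-t||^2 through sigmoid
    W -= eta * np.outer(delta, x)     # simultaneous update of all connections
    b -= eta * delta
    mse[it] = np.mean(err ** 2)

print(f"final MSE: {mse[-1]:.4f}")
```

Each row of the final W can be reshaped back to the image dimensions and displayed as the "template" the corresponding output neuron has learned.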


    c) Now separately repeat the training process using a substantially higher and a substantially lower value than the tuned learning rate. On a single figure, plot the MSE curves (across all 10000 iterations) for the high, low, and tuned learning rates. Discuss your results.


    d) Validate the performance of the trained networks using all samples in the test data. Report the performance values for the three networks with the high, low, and tuned learning rates.
Question 4. [20 points]

The goal of this question is to introduce you to simple two-layer neural networks, and to let you examine the effects of various hyperparameter selections on this classical model. You will be experimenting with a Python demo of a network model. Download demo_tln.zip from Moodle and unzip it. The demo is given as a Jupyter Notebook along with relevant code and data. The easiest way to install Jupyter with all Python and related dependencies is to install Anaconda. After that, you should be able to run through the demo in your browser. The point of this demo is that it takes you through the training algorithms step by step, and you need to inspect the relevant snippets of code for each step to learn about implementation details.

The notebook two_layer_net.ipynb contains demonstrations on a simple two-layer network model. You need to run the demo to the end without any errors; you may have to debug the code in case of any fatal errors. You are supposed to convert the outputs of the completed demo to a PDF file and attach it to the project report. You should also comment on your results and answer any inline questions provided in the notebook.
