Convolutional Neural Networks and Recurrent Neural Networks

1    Theory (50 pts)

1.1    Convolutional Neural Networks (30 pts)

    (a) (5 pts) Given an input image of dimension $10 \times 11$, what will be the output dimension after applying a convolution with a $3 \times 3$ kernel, a stride of 2, and no padding?

    (b) (5 pts) Given an input of dimension $C \times H \times W$, what will be the dimension of the output of a convolutional layer with kernel of size $K \times K$, padding $P$, stride $S$, dilation $D$, and $F$ filters? Assume that $H \ge K$, $W \ge K$.
    (c) (20 pts) For this section, we are going to work with 1-dimensional convolutions. Discrete convolution of a 1-dimensional input $x[n]$ and kernel $k[n]$ is defined as follows:

$$s[n] = (x * k)[n] = \sum_m x[n-m]\,k[m]$$

However, in machine learning, convolution is usually implemented as cross-correlation, which is defined as follows:

$$s[n] = (x * k)[n] = \sum_m x[n+m]\,k[m]$$

Note the difference in signs, which will get the network to learn a "flipped" kernel. In general it doesn't change much, but it's important to keep it in mind. In convolutional neural networks, the kernel $k[n]$ is usually 0 everywhere except a few values near 0: $\forall |n| > M: k[n] = 0$. Then, the formula becomes:

$$s[n] = (x * k)[n] = \sum_{m=-M}^{M} x[n+m]\,k[m]$$
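To make the flipped-kernel relationship concrete, here is a minimal sketch (assuming NumPy; the signals x and k are illustrative, not part of the assignment) showing that cross-correlating with a reversed kernel reproduces the true convolution:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # illustrative input signal
k = np.array([1.0, 0.0, -1.0])            # illustrative kernel

conv = np.convolve(x, k, mode="valid")     # convolution: sum of x[n-m] k[m]
xcorr = np.correlate(x, k, mode="valid")   # cross-correlation: sum of x[n+m] k[m]

# Cross-correlating with the flipped kernel recovers the convolution.
assert np.allclose(conv, np.correlate(x, k[::-1], mode="valid"))
print(conv, xcorr)
```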
Let's consider an input $x[n]$, $x : \{1,2,3,4,5\} \to \mathbb{R}^2$, of dimension 5, with 2 channels, and a convolutional layer $f_W$ with one filter, with kernel size 3, stride of 2, no dilation, and no padding. The only parameter of the convolutional layer is the weight $W$, $W \in \mathbb{R}^{1 \times 2 \times 3}$; there is no bias and no non-linearity.
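For reference, a minimal sketch (assuming PyTorch; the random input is purely illustrative) of how this layer could be instantiated to sanity-check shapes empirically:

```python
import torch
import torch.nn as nn

# The layer described above: 2 input channels, 1 filter, kernel size 3,
# stride 2, no padding, no dilation, and no bias.
f_W = nn.Conv1d(in_channels=2, out_channels=1, kernel_size=3,
                stride=2, padding=0, dilation=1, bias=False)

x = torch.randn(1, 2, 5)   # (batch, channels, length) = (1, 2, 5)
print(f_W(x).shape)        # empirical check of the output dimension
print(f_W.weight.shape)    # matches W in R^{1 x 2 x 3}
```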

(i)    (4 pts) What is the dimension of the output $f_W(x)$? Provide an expression for the value of the elements of the convolutional layer output $f_W(x)$.

Example answer format here and in the following sub-problems: $f_W(x) \in \mathbb{R}^{42 \times 42 \times 42}$, $f_W(x)[i, j, k] = 42$.
(ii)    (4 pts) What is the dimension of $\frac{\partial f_W(x)}{\partial W}$? Provide an expression for the values of the derivative $\frac{\partial f_W(x)}{\partial W}$.

(iii)    (6 pts) What is the dimension of $\frac{\partial f_W(x)}{\partial x}$? Provide an expression for the values of the derivative $\frac{\partial f_W(x)}{\partial x}$.

(iv)    (6 pts) Now, suppose you are given the gradient of the loss $\ell$ w.r.t. the output of the convolutional layer $f_W(x)$, i.e. $\frac{\partial \ell}{\partial f_W(x)}$. What is the dimension of $\frac{\partial \ell}{\partial W}$? Provide an expression for $\frac{\partial \ell}{\partial W}$. Explain the similarities and differences between this expression and the expression in (i).


1.2    Recurrent Neural Networks (20 pts)

In this section we consider a simple recurrent neural network defined as follows:

$$c[t] = \sigma(W_c\, x[t] + W_h\, h[t-1]) \tag{1}$$

$$h[t] = c[t] \odot h[t-1] + (1 - c[t]) \odot W_x\, x[t] \tag{2}$$

where $\sigma$ is the element-wise sigmoid, $x[t] \in \mathbb{R}^n$, $h[t] \in \mathbb{R}^m$, $W_c \in \mathbb{R}^{m \times n}$, $W_h \in \mathbb{R}^{m \times m}$, $W_x \in \mathbb{R}^{m \times n}$, $\odot$ is the Hadamard product, and $h[0] = 0$.
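For intuition, here is a minimal sketch (assuming PyTorch; the dimensions n, m, K and the random inputs are illustrative assumptions, not part of the assignment) of one unrolled forward pass of equations (1)-(2):

```python
import torch

n, m, K = 3, 4, 5            # illustrative input size, hidden size, sequence length
W_c = torch.randn(m, n)
W_h = torch.randn(m, m)
W_x = torch.randn(m, n)

h = torch.zeros(m)           # h[0] = 0
for t in range(1, K + 1):
    x_t = torch.randn(n)     # placeholder input x[t]
    c = torch.sigmoid(W_c @ x_t + W_h @ h)   # eq. (1)
    h = c * h + (1 - c) * (W_x @ x_t)        # eq. (2), * is the Hadamard product
print(h.shape)               # torch.Size([m]), i.e. h[t] in R^m
```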

    (a) (5 pts) Draw a diagram for this recurrent neural network, similar to the RNN diagram we had in class. We suggest using diagrams.net.

    (b) (2 pts) What is the dimension of $c[t]$?

    (c) (10 pts) Suppose that we run the RNN to get a sequence of $h[t]$ for $t$ from 1 to $K$. Assuming we know the derivative $\frac{\partial \ell}{\partial h[t]}$, provide the dimension of, and an expression for the values of, $\frac{\partial \ell}{\partial W_x}$. What are the similarities between the backward pass and the forward pass in this RNN?

    (d) (3 pts) Can this network be subject to vanishing or exploding gradients? Why?


2    Implementation (50 pts)

There are two notebooks in the practical part:

    • Convolutional Neural Networks notebook: hw3_cnn.ipynb

    • Recurrent Neural Networks notebook: hw3_rnn.ipynb

Please use your NYU account to access the notebooks. Both notebooks contain parts marked as TODO, where you should put your code. These notebooks are Google Colab notebooks; you should copy them to your drive, add your solutions, and then download and submit them to NYU Classes.




 
