1. (25 points) Linear algebra refresher.
(a) (12 points) Let A be a square matrix, and further let $AA^T = I$.
i. (3 points) Construct a $2 \times 2$ example of A and derive the eigenvalues and eigenvectors of this example. Show all work (i.e., do not use a computer’s eigenvalue decomposition capabilities). You may not use a diagonal matrix as your $2 \times 2$ example. What do you notice about the eigenvalues and eigenvectors?
ii. (3 points) Show generally that A has eigenvalues with norm 1.
iii. (3 points) Show generally that the eigenvectors of A corresponding to distinct eigenvalues are orthogonal.
iv. (3 points) In words, describe what may happen to a vector x under the transformation Ax.
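Note: a minimal numerical sanity check for part (a), assuming NumPy in the course's Jupyter environment (this is an illustrative sketch, not part of the assignment and not a substitute for the hand derivations): build a random orthogonal matrix Q via a QR factorization, then inspect the moduli of its eigenvalues and the effect of Q on a vector's length.

import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # Q is orthogonal: Q @ Q.T is (numerically) the identity

eigvals = np.linalg.eigvals(Q)                    # eigenvalues are complex in general
print(np.abs(eigvals))                            # their moduli should all be approximately 1

x = rng.standard_normal(4)
print(np.linalg.norm(Q @ x), np.linalg.norm(x))   # the two lengths should agree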
(b) (8 points) Let A be a matrix.
i. (4 points) What is the relationship between the singular vectors of A and the eigenvectors of $AA^T$? What about $A^TA$?
ii. (4 points) What is the relationship between the singular values of A and the eigenvalues of $AA^T$? What about $A^TA$?
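Note: a small sketch for exploring part (b) numerically, assuming NumPy (the names and shapes below are arbitrary choices, not part of the assignment): compute the SVD of a random A alongside the eigendecompositions of $AA^T$ and $A^TA$, and compare the spectra and factors by inspection.

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # singular vectors and values of A
w_left, U_left = np.linalg.eigh(A @ A.T)          # eigenpairs of A A^T
w_right, V_right = np.linalg.eigh(A.T @ A)        # eigenpairs of A^T A

print(s)        # singular values of A
print(w_left)   # eigenvalues of A A^T (ascending order)
print(w_right)  # eigenvalues of A^T A (ascending order)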
(c) (5 points) True or False. Partial credit on an incorrect solution may be awarded if you justify your answer.
i. Every linear operator in an n-dimensional vector space has n distinct eigenvalues.
ii. A non-zero sum of two eigenvectors of a matrix A is an eigenvector.
iii. If a matrix A has the positive semidefinite property, i.e., $x^T A x \geq 0$ for all x, then its eigenvalues must be non-negative.
iv. The rank of a matrix can exceed the number of non-zero eigenvalues.
v. A non-zero sum of two eigenvectors of a matrix A corresponding to the same eigenvalue is always an eigenvector.
2. (22 points) Probability refresher.
(a) (9 points) A jar of coins is equally populated with two types of coins. One is type "H50" and comes up heads with probability 0.5. Another is type "H60" and comes up heads with probability 0.6.
i. (3 points) You take one coin from the jar and flip it. It lands tails. How likely is the coin to be type H50?
ii. (3 points) You put the coin back, take another, and flip it 4 times. It lands T, H, H, H. How likely is the coin to be type H50?
iii. (3 points) A new jar is now equally populated with coins of type H50, H55, and H60 (with probabilities of coming up heads 0.5, 0.55, and 0.6 respectively). You take one coin and flip it 10 times. It lands heads 9 times. How likely is the coin to be of each possible type?
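Note: if you want to check your part (a) answers numerically, one possible sketch is the Bayes update below, assuming NumPy; the helper name and call signature are illustrative only.

import numpy as np

def posterior(p_heads, flips):
    # p_heads: probability of heads for each coin type; flips: string of 'H'/'T' outcomes
    prior = np.full(len(p_heads), 1.0 / len(p_heads))  # the jar is equally populated
    likelihood = np.array([
        np.prod([p if f == 'H' else 1.0 - p for f in flips]) for p in p_heads
    ])
    unnormalized = prior * likelihood
    return unnormalized / unnormalized.sum()           # posterior over coin types

print(posterior([0.5, 0.6], "T"))     # part i: a single tail
print(posterior([0.5, 0.6], "THHH"))  # part ii: T, H, H, H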
(b) (3 points) Consider a pregnancy test with the following statistics.
If the woman is pregnant, the test returns "positive" (or 1, indicating the woman is pregnant) 99% of the time.
If the woman is not pregnant, the test returns "positive" 10% of the time. At any given point in time, 99% of the female population is not pregnant.
What is the probability that a woman is pregnant given she received a positive test? The answer should make intuitive sense; give an explanation of the result that you find.
(c) (5 points) Let $x_1, x_2, \ldots, x_n$ be identically distributed random variables. A random vector, x, is defined as
$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$
What is E (Ax + b) in terms of E(x), given that A and b are deterministic?
(d) (5 points) Let
$\operatorname{cov}(x) = E\left[(x - Ex)(x - Ex)^T\right]$
What is $\operatorname{cov}(Ax + b)$ in terms of $\operatorname{cov}(x)$, given that A and b are deterministic?
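Note: parts (c) and (d) can be sanity checked with a Monte Carlo sketch like the one below (assuming NumPy; the dimensions and distribution are arbitrary illustrations): draw many samples of x, apply the affine map Ax + b, and compare the empirical mean and covariance with whatever closed-form expressions you derive.

import numpy as np

rng = np.random.default_rng(2)
n, m, num_samples = 3, 4, 200_000

A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
X = rng.standard_normal((num_samples, n)) + 1.5  # each row is a sample of x

Y = X @ A.T + b                                  # each row is a sample of Ax + b
print(Y.mean(axis=0))                            # compare with your expression for E(Ax + b)
print(np.cov(Y, rowvar=False))                   # compare with your expression for cov(Ax + b)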
3. (13 points) Multivariate derivatives.
(a) (2 points) Let $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$, and $A \in \mathbb{R}^{n \times m}$. What is $\nabla_x \, x^T A y$?
(b) (2 points) What is $\nabla_y \, x^T A y$?
(c) (3 points) What is $\nabla_A \, x^T A y$?
(d) (3 points) Let $f = x^T A x + b^T x$. What is $\nabla_x f$?
(e) (3 points) Let $f = \operatorname{tr}(AB)$. What is $\nabla_A f$?
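Note: a generic finite-difference checker, assuming NumPy, can be used to verify any of the gradients in this problem numerically; f maps a vector or matrix to a scalar, and grad_fn is your candidate analytic gradient. The names below are illustrative, and the demo at the end uses a function that is not from this problem set.

import numpy as np

def check_gradient(f, grad_fn, Z, eps=1e-6):
    # Compare an analytic gradient against central differences at the point Z.
    numeric = np.zeros_like(Z, dtype=float)
    it = np.nditer(Z, flags=['multi_index'])
    for _ in it:
        idx = it.multi_index
        Zp, Zm = Z.copy(), Z.copy()
        Zp[idx] += eps
        Zm[idx] -= eps
        numeric[idx] = (f(Zp) - f(Zm)) / (2 * eps)
    return np.max(np.abs(numeric - grad_fn(Z)))  # should be tiny if grad_fn is correct

# Demo on f(z) = ||z||^2, whose gradient is 2z (not one of the assignment's functions).
rng = np.random.default_rng(3)
z = rng.standard_normal(5)
print(check_gradient(lambda v: float(v @ v), lambda v: 2 * v, z))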
4. (10 points) Deriving least-squares with matrix derivatives.
In least-squares, we seek to estimate some multivariate output y via the model
$\hat{y} = Wx$
In the training set we’re given paired data examples $(x^{(i)}, y^{(i)})$ for $i = 1, \ldots, n$. Least-squares is the following quadratic optimization problem:
$\min_W \; \frac{1}{2} \sum_{i=1}^{n} \left\| y^{(i)} - W x^{(i)} \right\|_2^2$
Derive the optimal W.
Hint: you may find the following derivatives useful:
◦ $\frac{\partial \operatorname{tr}(WA)}{\partial W} = A^T$
◦ $\frac{\partial \operatorname{tr}(WAW^T)}{\partial W} = WA^T + WA$
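Note: once you have a closed-form W, one way to check it is to generate synthetic paired data, fit W with a generic least-squares solver, and compare. The sketch below assumes NumPy; the shapes and noise level are arbitrary illustrations.

import numpy as np

rng = np.random.default_rng(4)
n, d_in, d_out = 500, 6, 3
X = rng.standard_normal((n, d_in))                          # row i is x^(i)
W_true = rng.standard_normal((d_out, d_in))
Y = X @ W_true.T + 0.01 * rng.standard_normal((n, d_out))   # row i is y^(i)

# np.linalg.lstsq solves min_M ||X M - Y||_F^2, so M.T plays the role of W here
M, *_ = np.linalg.lstsq(X, Y, rcond=None)
W_numeric = M.T
print(np.max(np.abs(W_numeric - W_true)))                   # small if the fit recovers W_true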
5. (30 points) Hello World in Jupyter.
Complete the Jupyter notebook linear regression.ipynb. Print out the Jupyter notebook and submit it to Gradescope.