Homework 1

Problem 1: Gradient Calculation. NOTE: This is not a programming assignment, so you may NOT use programming tools to help solve this problem. Show your work.

In this question you are required to calculate gradients for 2 scalar functions.

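(1) Calculate the gradient of the function $f(x, y) = x^2 + \ln(x) + xy + y^3$. What is the gradient value for $(x, y) = (1, -1)$?

(2) Calculate the gradient of the function $f(x, y, z) = \tanh(x^3 y^3) + \sin(z)$. What is the gradient value for $(x, y, z) = (-1, 0, \pi/2)$?

Solution.

(1)

$$\nabla f(x, y) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \tag{1}$$

$$\frac{\partial f}{\partial x} = 2x + \frac{1}{x} + y \tag{2}$$

$$\frac{\partial f}{\partial y} = x + 3y^2 \tag{3}$$

$$\nabla f(x, y) = \left( 2x + \frac{1}{x} + y,\; x + 3y^2 \right) \tag{4}$$

$$\nabla f(1, -1) = \left( 2 + \frac{1}{1} - 1,\; 1 + 3 \right) = (2, 4) \tag{5}$$

(2)

$$\nabla f(x, y, z) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) \tag{6}$$

$$\frac{\partial f}{\partial x} = \frac{3x^2 y^3}{\cosh^2(x^3 y^3)} \tag{7}$$

$$\frac{\partial f}{\partial y} = \frac{3x^3 y^2}{\cosh^2(x^3 y^3)} \tag{8}$$

$$\frac{\partial f}{\partial z} = \cos(z) \tag{9}$$

$$\nabla f(x, y, z) = \left( \frac{3x^2 y^3}{\cosh^2(x^3 y^3)},\; \frac{3x^3 y^2}{\cosh^2(x^3 y^3)},\; \cos(z) \right) \tag{10}$$

$$\nabla f\left(-1, 0, \frac{\pi}{2}\right) = \left( \frac{3 \cdot 0}{\cosh^2(-1 \cdot 0)},\; \frac{-3 \cdot 0}{\cosh^2(-1 \cdot 0)},\; \cos\left(\frac{\pi}{2}\right) \right) = (0, 0, 0) \tag{11}$$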
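The hand derivation above can be sanity-checked after the fact with a central finite difference. The sketch below is only a check, not part of the required work. It assumes the evaluation points are (1, -1) and (-1, 0, π/2), which is what the worked values imply; the function names `f1`, `f2`, `grad_f1`, `grad_f2`, and `numeric_grad` are illustrative.

```python
import math


def f1(x, y):
    # f(x, y) = x^2 + ln(x) + xy + y^3
    return x**2 + math.log(x) + x * y + y**3


def grad_f1(x, y):
    # Analytic gradient from equations (2) and (3)
    return (2 * x + 1 / x + y, x + 3 * y**2)


def f2(x, y, z):
    # f(x, y, z) = tanh(x^3 y^3) + sin(z)
    return math.tanh(x**3 * y**3) + math.sin(z)


def grad_f2(x, y, z):
    # Analytic gradient from equations (7), (8), and (9)
    c = math.cosh(x**3 * y**3) ** 2
    return (3 * x**2 * y**3 / c, 3 * x**3 * y**2 / c, math.cos(z))


def numeric_grad(f, point, h=1e-6):
    """Central-difference approximation of the gradient of f at point."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        dn = list(point)
        up[i] += h
        dn[i] -= h
        grad.append((f(*up) - f(*dn)) / (2 * h))
    return tuple(grad)


print(grad_f1(1, -1), numeric_grad(f1, (1, -1)))
print(grad_f2(-1, 0, math.pi / 2), numeric_grad(f2, (-1, 0, math.pi / 2)))
```

The analytic and numeric gradients should agree to within the finite-difference error, which is a quick way to catch a sign or exponent slip.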

Problem 2: Matrix Multiplication. NOTE: This is not a programming assignment, so you may NOT use programming tools to help solve this problem. Show your work.

In this question you are required to perform matrix multiplication.

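(1)

$$
\begin{bmatrix} 1 & -1 & 6 & 7 \\ 9 & 0 & 8 & 1 \\ -8 & 5 & 2 & 3 \\ 10 & 4 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 6 & 2 \\ 0 & -1 \\ -3 & 0 \\ 11 & 4 \end{bmatrix}
=
\begin{bmatrix} 65 & 31 \\ 41 & 22 \\ -21 & -9 \\ 71 & 20 \end{bmatrix}
$$

Work: Calculating the individual matrix coefficients:

$c_{1,1} = (1 \cdot 6) + (-1 \cdot 0) + (6 \cdot -3) + (7 \cdot 11) = 65$

$c_{1,2} = (1 \cdot 2) + (-1 \cdot -1) + (6 \cdot 0) + (7 \cdot 4) = 31$

$c_{2,1} = (9 \cdot 6) + (0 \cdot 0) + (8 \cdot -3) + (1 \cdot 11) = 41$

$c_{2,2} = (9 \cdot 2) + (0 \cdot -1) + (8 \cdot 0) + (1 \cdot 4) = 22$

$c_{3,1} = (-8 \cdot 6) + (5 \cdot 0) + (2 \cdot -3) + (3 \cdot 11) = -21$

$c_{3,2} = (-8 \cdot 2) + (5 \cdot -1) + (2 \cdot 0) + (3 \cdot 4) = -9$

$c_{4,1} = (10 \cdot 6) + (4 \cdot 0) + (0 \cdot -3) + (1 \cdot 11) = 71$

$c_{4,2} = (10 \cdot 2) + (4 \cdot -1) + (0 \cdot 0) + (1 \cdot 4) = 20$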

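(2)

$$
\begin{bmatrix} 4 \\ 10 \\ 8 \\ 1 \end{bmatrix}
\begin{bmatrix} 7 & 3 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix} 4 \, [7 \;\; 3 \;\; 0 \;\; 1] \\ 10 \, [7 \;\; 3 \;\; 0 \;\; 1] \\ 8 \, [7 \;\; 3 \;\; 0 \;\; 1] \\ 1 \, [7 \;\; 3 \;\; 0 \;\; 1] \end{bmatrix}
=
\begin{bmatrix} 28 & 12 & 0 & 4 \\ 70 & 30 & 0 & 10 \\ 56 & 24 & 0 & 8 \\ 7 & 3 & 0 & 1 \end{bmatrix}
$$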
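(3)

$$
\begin{bmatrix} 9 & -3 & 1 & 6 \end{bmatrix}
\begin{bmatrix} 3 \\ -4 \\ 9 \\ 0 \end{bmatrix}
= (9 \cdot 3) + (-3 \cdot -4) + (1 \cdot 9) + (6 \cdot 0) = 48
$$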
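Once the products have been worked by hand, they can be double-checked with a few lines of plain Python. This is a minimal sketch: `matmul` is an illustrative helper, and the operand entries are the ones implied by the coefficient calculations in the work above. The minus signs in part (3) are an assumption chosen to be consistent with the stated result of 48.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Part (1): 4x4 matrix times 4x2 matrix
A = [[1, -1, 6, 7], [9, 0, 8, 1], [-8, 5, 2, 3], [10, 4, 0, 1]]
B = [[6, 2], [0, -1], [-3, 0], [11, 4]]

# Part (2): outer product of a 4x1 column and a 1x4 row
col = [[4], [10], [8], [1]]
row = [[7, 3, 0, 1]]

# Part (3): inner product of a 1x4 row and a 4x1 column (signs assumed)
u = [[9, -3, 1, 6]]
v = [[3], [-4], [9], [0]]

print(matmul(A, B))
print(matmul(col, row))
print(matmul(u, v))
```

Writing the outer and inner products through the same `matmul` helper makes the shape rule visible: a 4x1 times 1x4 product is a 4x4 matrix, while a 1x4 times 4x1 product is a single number.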
Problem 3: Vector Norms. NOTE: This is not a programming assignment, so you may NOT use programming tools to help solve this problem. Show your work.

Consider these two points a and b in space, with difference vector

$$
a - b = \begin{bmatrix} -2 \\ -9 \\ -6 \\ 2 \end{bmatrix}
$$

Calculate their distance using the following norms:

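(1) $\ell_0$: $\|a - b\|_0 = 4$, since all four components of $a - b$ are nonzero.

(2) $\ell_1$: $\|a - b\|_1 = \left\| (-2, -9, -6, 2)^\top \right\|_1 = |-2| + |-9| + |-6| + |2| = 19$

(3) $\ell_2$: $\|a - b\|_2 = \sqrt{(-2)^2 + (-9)^2 + (-6)^2 + (2)^2} = \sqrt{125} = 5\sqrt{5}$

(4) $\ell_\infty$: $\|a - b\|_\infty = \max(|-2|, |-9|, |-6|, |2|) = 9$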
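The four distances can be confirmed from the difference vector alone. The following sketch assumes a - b = (-2, -9, -6, 2), the vector used in the ℓ1 and ℓ2 calculations above; the helper names are illustrative.

```python
import math

# Difference vector a - b, taken from the norm calculations above
diff = [-2, -9, -6, 2]


def l0(v):
    # Number of nonzero components
    return sum(1 for x in v if x != 0)


def l1(v):
    # Sum of absolute values
    return sum(abs(x) for x in v)


def l2(v):
    # Euclidean length
    return math.sqrt(sum(x * x for x in v))


def linf(v):
    # Largest absolute component
    return max(abs(x) for x in v)


print(l0(diff), l1(diff), l2(diff), linf(diff))
```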

Problem 4: Probability Calculation. NOTE: This is not a programming assignment, so you may NOT use programming tools to help solve this problem. Show your work.

Consider a problem where we roll 2 dice, where each die has 6 faces numbered from 1 to 6. Answer the following questions:

(1) What is the sample space? (Written in the form {first roll, second roll}): {(1, 1), (1, 2), (1, 3), ..., (1, 6), (2, 1), (2, 2), ..., (2, 6), ..., (6, 6)}

(2) If the event we are interested in is the sum being 10, what would be the probability of observing such an event? All outcomes where the sum is 10: {(4, 6), (5, 5), (6, 4)}. The size of this set is 3. Number of total outcomes: 6 × 6 = 36. Probability of rolling a 10-sum: 3/36 = 1/12 ≈ 8.3%.

(3) If the event we are interested in is the sum being 6, what would be the probability of observing such an event? All outcomes where the sum is 6: {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}. The size of this set is 5. Number of total outcomes: 6 × 6 = 36. Probability of rolling a 6-sum: 5/36 ≈ 13.9%.
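The counting argument can be reproduced by enumerating the sample space. This is a small sketch for checking the hand count; `sample_space` and `prob_sum` are illustrative names.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first roll, second roll)
sample_space = list(product(range(1, 7), repeat=2))


def prob_sum(target):
    """Exact probability that the two rolls add up to target."""
    favorable = [roll for roll in sample_space if sum(roll) == target]
    return Fraction(len(favorable), len(sample_space))


print(len(sample_space))
print(prob_sum(10), float(prob_sum(10)))
print(prob_sum(6), float(prob_sum(6)))
```

Using `Fraction` keeps the result exact, so the reduction of 3/36 to 1/12 falls out automatically.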

Problem 5: Mean/Variance Calculation. NOTE: This is not a programming assignment, so you may NOT use programming tools to help solve this problem. Show your work.

Assume we have a random variable X with a Uniform probability density function. The uniform probability density is defined as:

$$
f_X(x) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le x \le b \\ 0 & \text{o.w.} \end{cases}
$$
(1) What is the mean of X?

$$
E(X) = \int_a^b \frac{x}{b - a} \, dx = \frac{1}{b - a} \int_a^b x \, dx = \frac{1}{b - a} \left[ \frac{b^2}{2} - \frac{a^2}{2} \right] = \frac{1}{b - a} \cdot \frac{b^2 - a^2}{2} = \frac{(b + a)(b - a)}{2(b - a)} = \frac{a + b}{2}.
$$

(2) What is the standard deviation of X?

The equation for the variance of the uniform distribution is

$$
\frac{(b - a)^2}{12}.
$$

So by taking the square root of that, we get the standard deviation:

$$
\frac{b - a}{\sqrt{12}}.
$$
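The closed forms for the mean and standard deviation can be checked numerically for any concrete interval. The sketch below uses a midpoint Riemann sum over the uniform density. The function name `uniform_moments` and the example endpoints a = 2, b = 5 are assumptions made only for illustration.

```python
import math


def uniform_moments(a, b, n=100_000):
    """Midpoint Riemann sums for the mean and standard deviation of Uniform(a, b)."""
    width = (b - a) / n
    density = 1.0 / (b - a)
    xs = [a + (i + 0.5) * width for i in range(n)]
    mean = sum(x * density * width for x in xs)
    var = sum((x - mean) ** 2 * density * width for x in xs)
    return mean, math.sqrt(var)


a, b = 2.0, 5.0  # hypothetical endpoints
mean, std = uniform_moments(a, b)
print(mean, (a + b) / 2)
print(std, (b - a) / math.sqrt(12))
```

Each printed pair should agree closely, with the numeric value on the left and the closed form on the right.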
Problem 6: Classification Quality Metric Computation. NOTE: This is not a programming assignment, so you may NOT use programming tools to help solve this problem. Show your work.

Following up on the example presented in the class about Taqueria El Tio, assume Martin and Jose have decided to take it to the next level and they have bought a microwave avocado detector to detect tacos with no avocados inside. Here is the confusion matrix of the microwave avocado detector:

                                    ground truth
                                avocado    no avocado

Avocado detector   avocado         37          23
                   no avocado      45          55

(1) What is the accuracy of the detector?

$$
\frac{A + D}{A + B + C + D} = \frac{37 + 55}{37 + 23 + 45 + 55} = \frac{92}{160} = 0.575.
$$

(2) What is the balanced accuracy of the detector?

Avocado accuracy: $\frac{37}{37 + 23} = 0.6167$. No Avocado accuracy: $\frac{55}{55 + 45} = 0.55$. Balanced accuracy $= \frac{0.6167 + 0.55}{2} \approx 0.5833$.

(3) What is the precision of the detector?

$$
\frac{TP}{TP + FP} = \frac{37}{37 + 45} = 0.4512.
$$

(4) What is the recall of the detector?

$$
\frac{TP}{TP + FN} = \frac{37}{37 + 23} = 0.6167.
$$

(5) What is the F-measure of the detector?

$$
2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{2 \cdot 0.4512 \cdot 0.6167}{0.4512 + 0.6167} = 0.5211.
$$
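The five metrics follow mechanically from four counts. The sketch below assumes the reading of the confusion matrix used in the solution's formulas, namely TP = 37, FN = 23, FP = 45, TN = 55. If the table is read the other way around, precision and recall swap. The function name `metrics` is illustrative.

```python
def metrics(tp, fn, fp, tn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    recall = tp / (tp + fn)        # accuracy on the positive class
    specificity = tn / (tn + fp)   # accuracy on the negative class
    precision = tp / (tp + fp)
    return {
        "accuracy": accuracy,
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Counts as assigned in the solution above (an assumption about table orientation)
m = metrics(tp=37, fn=23, fp=45, tn=55)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```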

Problem 7: ROC Computation. NOTE: This is not a programming assignment, so you may NOT use programming tools to help solve this problem. Show your work.

In Problem 6, assume that their microwave avocado detector does not give a binary output regarding the existence of avocados inside the taco. Alternatively, it outputs a probability of such an event. Jose, a CS sophomore who wants to put his knowledge to practice, wants to approximate the AUROC of the detector using 5 points as candidate thresholds: {0, 0.25, 0.5, 0.75, 1}. In a few tests that they ran, the probabilities and their corresponding ground truths were as follows:

predicted    ground truth
10%          0
5%           0
70%          1
50%          0
90%          1
65%          1
35%          1
60%          0
15%          1
20%          0

Please help him by computing the following:

(1) What would be the ROC value for threshold = 0? (1, 1)

(2) What would be the ROC value for threshold = 0.25? (0.4, 0.8)

(3) What would be the ROC value for threshold = 0.5? (0.4, 0.6)

(4) What would be the ROC value for threshold = 0.75? (0, 0.2)

(5) What would be the ROC value for threshold = 1? (0, 0)

(6) What would be the AUROC approximation using the above results? (HINT: remember Riemann sum): The best Riemann sum I can give for this curve is: $\frac{1}{5}(0.2 + 0.6 + 0.8 + 0.9 + 1) = 0.7$

First, plot the outputs on varying thresholds using values (0, 0.25, 0.5, 0.75, 1).

Then, use these values to calculate the confusion matrix values for each threshold, and calculate TPR and FPR.

I used these calculations to arrive at the results above.
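The threshold sweep described above can be written out as a short check. This is a sketch under two assumptions: a taco is predicted positive when its score is greater than or equal to the threshold (which reproduces the ROC values listed), and the area is computed with the trapezoid rule rather than the five-strip Riemann sum in the solution. Both approaches give the same approximation here. The names `roc_point` and `trapezoid_auc` are illustrative.

```python
# Predicted probabilities and ground-truth labels from the table above
scores = [0.10, 0.05, 0.70, 0.50, 0.90, 0.65, 0.35, 0.60, 0.15, 0.20]
labels = [0, 0, 1, 0, 1, 1, 1, 0, 1, 0]


def roc_point(threshold):
    """Return (FPR, TPR) when score >= threshold is predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    pos = sum(labels)
    neg = len(labels) - pos
    return fp / neg, tp / pos


def trapezoid_auc(pts):
    """Area under a piecewise-linear curve through points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


thresholds = [0, 0.25, 0.5, 0.75, 1]
points = sorted(roc_point(t) for t in thresholds)
print(points)
print(trapezoid_auc(points))
```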
Problem 8: Coding K-NN.

This is a coding assignment. Throughout the course, you will have several coding assignments. You are free to choose any language you prefer but our preference, and our hints, are directed towards Python. This can be a good place to get you going with Python if you haven’t already.

In this assignment we will implement k-NN. More specifically, we are interested in seeing the effect of varying k on the performance.

The dataset we will use in this assignment is named Smarket (can be downloaded from https://github.com/jcrouser/islr-python/blob/master/data/Smarket.csv).

Your submission should include a script which can be run seamlessly and performs all the following steps one after another. Any submission with a runtime error would result in lost points.

You may use libraries and you do not need to implement anything from scratch.

(1) Download and read the data. For Python, you may use the pandas library and its read_csv function.

(2) Print the data. What does the data look like? (For Python, you may use the head() function in the pandas library.)

(3) Print the shape of the data. Shape means the dimensions of the data. (In Python, pandas DataFrame instances have a shape attribute.)

(4) Extract the features and the label from the data. The features we are interested in are Lag1 and Lag2 and the label is Direction.

(5) Split the data into a train/test split. (In Python, you can use train_test_split from the sklearn library.)

(6) Apply k-NN to the data. (In Python, you can use the KNeighborsClassifier class from the sklearn library.)

(7) Plot the accuracy of your implementation for k ∈ {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}.
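One possible shape for such a script is sketched below. It is not a reference solution. It assumes the CSV has already been saved locally (the GitHub link above is a page view, so the raw file has to be downloaded first), and the names `accuracy_by_k`, `plot_accuracy`, `run`, the 25% test split, the random seed, and the output file name are all illustrative choices.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def accuracy_by_k(df, ks=range(1, 11), test_size=0.25, seed=0):
    """Steps 4-6: fit k-NN on Lag1/Lag2 -> Direction and score each k."""
    X = df[["Lag1", "Lag2"]]
    y = df["Direction"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    scores = {}
    for k in ks:
        model = KNeighborsClassifier(n_neighbors=k)
        model.fit(X_train, y_train)
        scores[k] = model.score(X_test, y_test)  # mean test accuracy
    return scores


def plot_accuracy(scores, out_path="knn_accuracy.png"):
    """Step 7: plot test accuracy against k and save the figure."""
    import matplotlib.pyplot as plt  # imported here so scoring works without it

    plt.plot(list(scores), list(scores.values()), marker="o")
    plt.xlabel("k")
    plt.ylabel("test accuracy")
    plt.savefig(out_path)


def run(csv_path):
    """Steps 1-7 in order for a locally saved copy of Smarket.csv."""
    df = pd.read_csv(csv_path)   # step 1
    print(df.head())             # step 2
    print(df.shape)              # step 3
    scores = accuracy_by_k(df)   # steps 4-6
    plot_accuracy(scores)        # step 7
    return scores
```

Calling `run("Smarket.csv")` would execute the whole pipeline, where the file name is whatever path the downloaded data was saved under. Fixing `random_state` keeps the split reproducible, so the accuracy curve does not change between runs.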
