A Homework 1



No extension of the deadline is allowed. Late submission will lead to 0 credit.

Discussion is encouraged on Piazza as part of the Q/A. However, all assignments should be done individually.

Instructions

This assignment has no programming, only written questions.

We will be using Gradescope this semester for submission and grading of assignments.

Your write-up must be submitted in PDF form. You may use either LaTeX or Markdown, whichever you prefer. We will not accept handwritten work.

Please start your answer to each question on a new page; this makes it easier to map your answers on Gradescope. When submitting your assignment, you must correctly map the pages of your PDF to each question/subquestion to reflect where they appear. Improperly mapped questions may not be graded correctly.

Please show the calculation process used to arrive at each answer. Submissions with only the final answer and no derivation/calculation process will receive 0 credit.

1    Linear Algebra [30pts]

1.1    Determinant and Inverse of Matrix [15pts]

Given a matrix M:

M = [3 × 3 matrix in r, garbled in the source; surviving tokens in order: 3, r 6 0, 4, 2, 3, r, 5, 4, 7, 3, 1]

    (a) Calculate the determinant of M in terms of r. [4pts]

    (b) For what value(s) of r does $M^{-1}$ not exist? Why? What does this mean in terms of the rank and singularity of M for these values of r? [3pts]

    (c) Calculate $M^{-1}$ by hand for r = 4. [5pts] (Hint: Please double-check your answer and make sure $M M^{-1} = I$.)

    (d) Find the determinant of $M^{-1}$ for r = 4. [3pts]
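
A hand-computed inverse can be checked mechanically. The sketch below is only an illustration under stated assumptions: the matrix `K` is a hypothetical example (it is not the assignment's M), and `det3`, `inverse3` and `matmul` are helper names introduced here. Exact fractions are used so that the identity check is not affected by rounding.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def inverse3(m):
    """Inverse of a 3x3 matrix as adjugate / determinant (exact arithmetic)."""
    det = det3(m)
    if det == 0:
        raise ValueError("matrix is singular, so no inverse exists")
    (a, b, c), (d, e, f), (g, h, i) = m
    # Adjugate = transpose of the cofactor matrix.
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[Fraction(x) / det for x in row] for row in adj]

def matmul(p, q):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(p[r][k] * q[k][col] for k in range(3)) for col in range(3)]
            for r in range(3)]

# Hypothetical example matrix, NOT the M from this assignment.
K = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
K_inv = inverse3(K)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Comparing `matmul(K, K_inv)` with `identity`, and `det3(K_inv)` with `1 / det3(K)`, mirrors the checks suggested in parts (c) and (d).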

1.2    Characteristic Equation [5pts]

Consider the eigenvalue problem:

$$Ax = \lambda x, \quad x \neq 0$$

where $x$ is a non-zero eigenvector and $\lambda$ is the corresponding eigenvalue of $A$. Prove that the determinant $|A - \lambda I| = 0$.

1.3    Eigenvalues and Eigenvectors [10pts]

Given a matrix A:

$$A = \begin{bmatrix} x & 3 \\ 1 & x \end{bmatrix}$$

    (a) Calculate the eigenvalues of A as a function of x [5 pts]

    (b) Find the normalized eigenvectors of matrix A [5 pts]
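
An eigenpair found by hand can be verified numerically by checking that the matrix times the eigenvector equals the eigenvalue times the eigenvector. The sketch below assumes a real 2x2 matrix with real eigenvalues and a non-zero upper-right entry; `eigen_2x2` is a helper name introduced here, and the numeric matrix is a hypothetical example rather than the assignment's A.

```python
import math

def eigen_2x2(a, b, c, d):
    """Real eigenvalues and unit eigenvectors of [[a, b], [c, d]].

    Assumes a non-negative discriminant and b != 0.
    """
    if b == 0:
        raise ValueError("this sketch assumes a non-zero upper-right entry")
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det   # discriminant of lam^2 - trace*lam + det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    pairs = []
    for sign in (1, -1):
        lam = (trace + sign * math.sqrt(disc)) / 2
        # First row of (A - lam I) v = 0: (a - lam) v1 + b v2 = 0,
        # which v = (b, lam - a) satisfies.
        v = (b, lam - a)
        norm = math.hypot(*v)
        pairs.append((lam, (v[0] / norm, v[1] / norm)))
    return pairs

# Hypothetical numeric example, not the assignment's A.
pairs = eigen_2x2(2.0, 1.0, 1.0, 2.0)
```

Each returned eigenvector has unit Euclidean length, which is the usual meaning of "normalized" in part (b).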

2    Expectation, Covariance and Independence [18pts]

Suppose X, Y and Z are three different random variables. Let X obey a Bernoulli distribution. The probability distribution function is

$$p(x) = \begin{cases} 0.5, & x = c \\ 0.5, & x = -c, \end{cases}$$

where c is a constant. Let Y obey a standard Normal (Gaussian) distribution, which can be written as $Y \sim N(0, 1)$. X and Y are independent. Meanwhile, let Z = XY.

    (a) Show that Z also follows a Normal (Gaussian) distribution. Calculate the Expectation and Variance of Z. [9pts] (Hint: Sum rule and conditional probability formula)


    (b) How should we choose c such that Y and Z are uncorrelated (which means $\mathrm{Cov}(Y, Z) = 0$)? [5pts]

    (c) Are Y and Z independent? Use probabilities to justify your conclusion, for example $P(Y \in (-1, 0))$ and $P(Z \in (2c, 3c))$. [4pts]
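
A quick simulation can serve as a sanity check on analytical answers for this section. The sketch below assumes X takes the values c and -c with probability 0.5 each, as described above; the value of `c`, the sample size and the helper names `simulate` and `sample_cov` are illustrative choices introduced here, and a simulation does not replace the requested derivation.

```python
import random
import statistics

def simulate(c, n=100_000, seed=0):
    """Draw n samples of (Y, Z) with X = +/-c (prob. 0.5 each), Y ~ N(0, 1), Z = X * Y."""
    rng = random.Random(seed)
    ys, zs = [], []
    for _ in range(n):
        x = c if rng.random() < 0.5 else -c
        y = rng.gauss(0.0, 1.0)
        ys.append(y)
        zs.append(x * y)
    return ys, zs

def sample_cov(us, vs):
    """Unbiased sample covariance of two equal-length sequences."""
    mu, mv = statistics.fmean(us), statistics.fmean(vs)
    return sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / (len(us) - 1)

c = 2.0  # arbitrary illustrative value
ys, zs = simulate(c)
z_mean = statistics.fmean(zs)
z_var = sample_cov(zs, zs)      # covariance of Z with itself is its variance
yz_cov = sample_cov(ys, zs)
```

Printing `z_mean`, `z_var` and `yz_cov` for a few values of `c` gives empirical numbers to compare against the closed-form results from parts (a) and (b).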

3    Optimization [15 pts]

Optimization problems are concerned with minimizing a function (usually termed a loss, cost or error function) or maximizing a function (such as the likelihood) with respect to some variable x. The Kuhn-Tucker conditions are first-order conditions that provide a unified treatment of constrained optimization. In this question, you will be solving the following optimization problem:


$$
\begin{aligned}
\max_{x, y} \quad & f(x, y) = 2x^2 + 3xy \\
\text{s.t.} \quad & g_1(x, y) = \tfrac{1}{2}x^2 + y \le 4 \\
& g_2(x, y) = -y \le 2
\end{aligned}
$$

    (a) Specify the Lagrange function. [2 pts]

    (b) List the KKT conditions [2 pts]

    (c) Solve for 4 possibilities formed by each constraint being active or inactive [5 pts]

    (d) List all candidate points [4 pts]

    (e) Check for maximality and sufficiency. [2 pts]
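
As a reminder of the general setup (a sketch in generic notation, not the worked solution to this problem): for a problem of the form "maximize $f(x)$ subject to $g_i(x) \le b_i$ for $i = 1, \ldots, m$", with multipliers $\mu_i$ and bounds $b_i$ introduced here as assumed notation, the Lagrange function and the Kuhn-Tucker conditions are commonly written as follows.

```latex
\begin{aligned}
L(x, \mu) &= f(x) - \sum_{i=1}^{m} \mu_i \,\bigl(g_i(x) - b_i\bigr) \\
\nabla_x L(x, \mu) &= 0 && \text{(stationarity)} \\
g_i(x) &\le b_i && \text{(primal feasibility)} \\
\mu_i &\ge 0 && \text{(dual feasibility)} \\
\mu_i \,\bigl(g_i(x) - b_i\bigr) &= 0 && \text{(complementary slackness)}
\end{aligned}
```

A constraint is active when $g_i(x) = b_i$. Complementary slackness forces $\mu_i = 0$ for an inactive constraint, which is what produces the four cases in part (c) when there are two constraints.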

4    Maximum Likelihood [10 + 25 pts]

4.1    Discrete Example [10 pts]

Suppose we have two types of coins, A and B. The probability of a Type A coin showing heads is $\theta$. The probability of a Type B coin showing heads is $2\theta$. We have a bunch of coins, each of either Type A or Type B. Each time, we choose one coin and flip it. We repeat this experiment 10 times; the results are shown in the table below. (Hint: The probabilities mentioned above are for the particular sequence below.)

Coin Type    Result
A            Tail
A            Tail
A            Tail
A            Tail
A            Tail
A            Head
A            Head
B            Head
B            Head
B            Head

    (a) What is the likelihood of the result given $\theta$? [4pts]

    (b) What is the maximum likelihood estimate of $\theta$? [6pts]
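
A closed-form maximum likelihood estimate can be cross-checked against a brute-force search over the parameter. The sketch below assumes the model described above, with heads probability `theta` for Type A and `2 * theta` for Type B, so `theta` must lie strictly between 0 and 0.5. The function names and the toy data are hypothetical and deliberately differ from the assignment's table.

```python
import math

def log_likelihood(theta, flips):
    """Log-likelihood of a sequence of (coin_type, result) pairs.

    Assumes P(head | A) = theta and P(head | B) = 2 * theta.
    """
    total = 0.0
    for coin_type, result in flips:
        p_head = theta if coin_type == "A" else 2 * theta
        total += math.log(p_head if result == "H" else 1 - p_head)
    return total

def grid_mle(flips, steps=10_000):
    """Return the grid point in (0, 0.5) with the largest log-likelihood."""
    grid = [0.5 * k / steps for k in range(1, steps)]
    return max(grid, key=lambda t: log_likelihood(t, flips))

# Hypothetical toy data, deliberately different from the table above.
toy_flips = [("A", "H"), ("A", "T"), ("B", "H"), ("B", "T")]
theta_hat = grid_mle(toy_flips)
```

Swapping in the observed sequence gives a numeric value to compare against the estimate obtained by differentiating the log-likelihood.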




4.2    Normal distribution [15 pts] (Bonus for Undergrads)

Suppose that we observe samples of a known function $g(t) = t^3$ with unknown amplitude $\theta$ at (known) arbitrary locations $t_1, \ldots, t_N$, and these samples are corrupted by Gaussian noise. That is, we observe the sequence of random variables

$$X_n = \theta t_n^3 + Z_n, \qquad n = 1, \ldots, N,$$

where the $Z_n$ are independent and $Z_n \sim \text{Normal}(0, \sigma^2)$.

(a) Given $X_1 = x_1, \ldots, X_N = x_N$, compute the log-likelihood function

$$\ell(\theta; x_1, \ldots, x_N) = \log f_{X_1, \ldots, X_N}(x_1, \ldots, x_N; \theta) = \log\bigl(f_{X_1}(x_1; \theta)\, f_{X_2}(x_2; \theta) \cdots f_{X_N}(x_N; \theta)\bigr)$$

Note that the $X_n$ are independent (as the last equality suggests) but not identically distributed (they have different means). [9pts]

(b) Compute the MLE for $\theta$. [6pts]
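
A derived estimator can be sanity-checked on synthetic data by maximizing the log-likelihood numerically. The sketch below assumes the model stated above (mean equal to the amplitude times $t_n^3$, independent Gaussian noise with a known standard deviation); the amplitude, noise level, sample locations and grid are hypothetical values chosen only for illustration.

```python
import math
import random

def log_likelihood(theta, ts, xs, sigma):
    """Sum of Gaussian log-densities of x_n with mean theta * t_n**3 and std sigma."""
    const = -0.5 * math.log(2 * math.pi * sigma ** 2)
    return sum(const - (x - theta * t ** 3) ** 2 / (2 * sigma ** 2)
               for t, x in zip(ts, xs))

# Hypothetical settings chosen only for illustration.
rng = random.Random(1)
true_theta, sigma = 1.5, 0.5
ts = [0.1 * n for n in range(1, 21)]                      # known sample locations
xs = [true_theta * t ** 3 + rng.gauss(0.0, sigma) for t in ts]

# Coarse grid search over candidate amplitudes in [0, 3].
grid = [k / 1000 for k in range(0, 3001)]
theta_grid = max(grid, key=lambda th: log_likelihood(th, ts, xs, sigma))
```

Evaluating a closed-form answer to part (b) on the same `ts` and `xs` should agree with `theta_grid` up to the grid spacing of 0.001.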



4.3    Bonus for undergrads [10 pts]

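The C.D.F. of independent random variables $X_1, X_2, \ldots, X_n$ is

$$
P(X_i \le x \mid \alpha, \beta) =
\begin{cases}
0, & x < 0 \\
\left(\dfrac{x}{\beta}\right)^{\alpha}, & 0 \le x \le \beta \\
1, & x > \beta
\end{cases}
$$

where $\alpha > 0$, $\beta > 0$.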

    (a) Write down the P.D.F. of the above independent random variables. [4pts]

    (b) Find the MLEs of $\alpha$ and $\beta$. [6pts]
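
To build intuition for this family, samples can be generated by inverse-transform sampling and compared with the C.D.F. The sketch below assumes the C.D.F. has the form (x / beta) ** alpha on the interval from 0 to beta; the parameter values and the names `cdf` and `sample` are hypothetical choices introduced here.

```python
import random

def cdf(x, alpha, beta):
    """Assumed CDF: 0 for x < 0, (x / beta) ** alpha on [0, beta], 1 for x > beta."""
    if x < 0:
        return 0.0
    if x > beta:
        return 1.0
    return (x / beta) ** alpha

def sample(alpha, beta, n, seed=0):
    """Inverse-transform sampling: solve u = (x / beta) ** alpha for x."""
    rng = random.Random(seed)
    return [beta * rng.random() ** (1.0 / alpha) for _ in range(n)]

# Hypothetical parameter values for illustration.
alpha, beta = 2.0, 3.0
draws = sample(alpha, beta, n=50_000)
point = 1.5
empirical = sum(d <= point for d in draws) / len(draws)
```

The fraction `empirical` should be close to `cdf(point, alpha, beta)`, and synthetic draws like these are a convenient test bed for candidate estimators from part (b).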


5    Information Theory [32pts]

5.1    Marginal Distribution [6pts]

Suppose the joint probability distribution of two binary random variables X and Y is given as follows.

X|Y    1      2
0      1/3    1/3
1      0      1/3


    (a) Show the marginal distribution of X and Y , respectively. [3pts]

    (b) Find the mutual information for the joint probability distribution in the previous question. [3pts]
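
Marginals and mutual information follow directly from a joint table, so hand calculations are easy to cross-check. The sketch below assumes the joint distribution is stored as a list of rows and measures information in bits (base-2 logarithm); the two tables are hypothetical extreme cases, not the table from this question.

```python
import math

def marginals(joint):
    """Row and column sums of a joint probability table (list of rows)."""
    row_sums = [sum(row) for row in joint]
    col_sums = [sum(col) for col in zip(*joint)]
    return row_sums, col_sums

def mutual_information(joint):
    """I(X; Y) in bits; zero-probability cells contribute nothing."""
    px, py = marginals(joint)
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                total += p * math.log2(p / (px[i] * py[j]))
    return total

# Hypothetical tables, not the one from the question.
independent = [[0.25, 0.25], [0.25, 0.25]]      # X and Y independent
deterministic = [[0.5, 0.0], [0.0, 0.5]]        # Y is a function of X
```

The independent table has zero mutual information and the deterministic one has exactly one bit, which bracket the value expected for any other 2x2 joint table.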

5.2    Mutual Information and Entropy [19pts]

Given a dataset as below.

Sr. No.    Age            Immunity    Travelled?    Underlying Conditions    Self-quarantine?
1          young          high        no            yes                      no
2          young          high        no            no                       no
3          middle-aged    high        no            yes                      yes
4          senior         medium      no            yes                      yes
5          senior         low         yes           yes                      yes
6          senior         low         yes           no                       no
7          middle-aged    low         yes           no                       yes
8          young          medium      no            yes                      no
9          young          low         yes           yes                      no
10         senior         medium      yes           yes                      yes
11         young          medium      yes           no                       yes
12         middle-aged    medium      no            no                       yes
13         middle-aged    high        yes           yes                      yes
14         senior         medium      no            no                       no


We want to decide whether an individual working in an essential services industry should be allowed to work or should self-quarantine. Each input has four features ($x_1$, $x_2$, $x_3$, $x_4$): Age, Immunity, Travelled, Underlying Conditions. The decision (quarantine vs. not) is represented as Y.

    (a) Find the entropy $H(Y)$. [3pts]

    (b) Find the conditional entropies $H(Y \mid x_1)$ and $H(Y \mid x_4)$. [8pts]

    (c) Find the mutual information $I(x_1; Y)$ and $I(x_4; Y)$ and determine which one ($x_1$ or $x_4$) is more informative. [4pts]

    (d) Find the joint entropy $H(Y, x_3)$. [4pts]
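
Entropy calculations over a small dataset reduce to counting, which makes them straightforward to verify in code. The sketch below assumes labels and feature values are given as equal-length lists and uses base-2 logarithms; the toy columns are hypothetical and are not rows from the dataset above.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def conditional_entropy(labels, feature):
    """H(labels | feature): entropy within each feature value, weighted by its frequency."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())

def information_gain(labels, feature):
    """Mutual information I(feature; labels) = H(labels) - H(labels | feature)."""
    return entropy(labels) - conditional_entropy(labels, feature)

# Hypothetical toy columns, not rows from the dataset above.
toy_y = ["yes", "yes", "no", "no"]
toy_f = ["a", "a", "b", "b"]      # determines the label completely
toy_g = ["a", "b", "a", "b"]      # carries no information about the label
```

Passing the Self-quarantine column as `labels` and a feature column as `feature` reproduces the quantities asked for in parts (a) to (c).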

5.3    Entropy Proofs [7pts]

    (a) Suppose X and Y are independent. Show that $H(X \mid Y) = H(X)$. [2pts]

    (b) Suppose X and Y are independent. Show that $H(X, Y) = H(X) + H(Y)$. [2pts]

    (c) Prove that the mutual information is symmetric, i.e., $I(X; Y) = I(Y; X)$, where $x_i \in X$, $y_i \in Y$. [3pts]
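
A natural starting point for these proofs is the set of standard definitions for discrete random variables, restated here as a sketch in generic notation (the course may use a different but equivalent convention). Independence means $p(x, y) = p(x)\,p(y)$ for all $x$ and $y$.

```latex
\begin{aligned}
H(X) &= -\sum_{x} p(x) \log p(x) \\
H(X, Y) &= -\sum_{x, y} p(x, y) \log p(x, y) \\
H(X \mid Y) &= -\sum_{x, y} p(x, y) \log p(x \mid y) \\
I(X; Y) &= \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}
\end{aligned}
```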

6    Bonus for All [10 pts]

        (a) If a random variable X has a Poisson distribution with mean 8, then calculate the expectation $E[(X + 2)^2]$. [2 pts]

        (b) A person decides to toss a fair coin repeatedly until he gets a head. He will make at most 3 tosses. Let the random variable Y denote the number of heads. Find the variance of Y. [4 pts]

        (c) Two random variables X and Y are distributed according to

$$
f_{X,Y}(x, y) =
\begin{cases}
x + y, & 0 \le x \le 1,\ 0 \le y \le 1 \\
0, & \text{otherwise}
\end{cases}
$$

What is the probability $P(X + Y \le 1)$? [4 pts]
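
Answers to questions like these can be checked numerically. The sketch below is an illustration under stated assumptions: the Poisson mean of 3 is a hypothetical value that deliberately differs from part (a), the density is assumed to be x + y on the unit square, and both helper functions are names introduced here.

```python
import math

def poisson_expectation(func, mean, terms=200):
    """Approximate E[func(X)] for X ~ Poisson(mean) by a truncated series."""
    pmf = math.exp(-mean)          # P(X = 0)
    total = 0.0
    for k in range(terms):
        total += func(k) * pmf
        pmf *= mean / (k + 1)      # P(X = k + 1) from P(X = k)
    return total

def midpoint_integral_2d(density, n=200):
    """Midpoint-rule integral of density(x, y) over the unit square."""
    h = 1.0 / n
    return sum(density((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

# Hypothetical mean of 3, deliberately not the value used in part (a).
second_moment = poisson_expectation(lambda k: (k + 2) ** 2, mean=3.0)

# Assumes the density in part (c) is x + y on the unit square.
total_mass = midpoint_integral_2d(lambda x, y: x + y)
```

For a mean of 3, the identity $E[(X + 2)^2] = \mathrm{Var}(X) + (E[X] + 2)^2$ gives 3 + 25 = 28, which the series reproduces. The second check confirms that the assumed density integrates to 1, and restricting the sum to a sub-region gives the probability of that region.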

















