Intro to Big Data Science: Assignment 3 Solution


Exercise 1

Log into "cookdata.cn", enroll in the corresponding course, and finish the online exercise there.

Exercise 2

We have a dataset with $n$ records in which the $i$-th record has one real-valued input attribute $x_i$ and one real-valued output response $y_i$. We have the following model with one unknown parameter $w$ which we want to learn from data: $y_i \sim N(e^{w x_i}, 1)$. Note that the variance is known and equal to one.

        1. Is the task of estimating w a linear regression problem or a non-linear regression problem?

        2. Suppose you decide to do a maximum likelihood estimation of w. You do the math and figure out that you need w to satisfy one of the following equations. Which one?
(A) $\sum_{i=1}^{n} x_i e^{w x_i} = \sum_{i=1}^{n} x_i y_i e^{w x_i}$

(B) $\sum_{i=1}^{n} x_i e^{2w x_i} = \sum_{i=1}^{n} x_i y_i e^{w x_i}$

(C) $\sum_{i=1}^{n} x_i^2 e^{w x_i} = \sum_{i=1}^{n} x_i y_i e^{w x_i}$

(D) $\sum_{i=1}^{n} x_i e^{2w x_i} = \sum_{i=1}^{n} x_i y_i e^{w x_i/2}$

(E) $\sum_{i=1}^{n} e^{w x_i} = \sum_{i=1}^{n} y_i e^{w x_i}$
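As a quick numerical illustration (mine, not part of the assignment), one can maximize the likelihood on synthetic data and check that the stationarity condition obtained by setting the derivative of the log-likelihood to zero holds at the maximizer. A minimal sketch, assuming SciPy is available; all names are my own:

    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(0)
    n, w_true = 200, 0.5
    x = rng.uniform(-1.0, 1.0, n)
    y = np.exp(w_true * x) + rng.normal(0.0, 1.0, n)  # y_i ~ N(e^{w x_i}, 1)

    # Negative log-likelihood up to constants: (1/2) sum_i (y_i - e^{w x_i})^2.
    nll = lambda w: 0.5 * np.sum((y - np.exp(w * x)) ** 2)
    w_hat = minimize_scalar(nll, bounds=(-5.0, 5.0), method="bounded").x

    # Both sides of the score equation should agree at w_hat.
    lhs = np.sum(x * np.exp(2.0 * w_hat * x))   # sum_i x_i e^{2 w x_i}
    rhs = np.sum(x * y * np.exp(w_hat * x))     # sum_i x_i y_i e^{w x_i}
    print(w_hat, lhs - rhs)                     # difference close to 0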

Exercise 3 (Linear regression as a projection)

Consider the multivariate linear model $y = Xw + \epsilon$ with $y \in \mathbb{R}^{n \times 1}$, $X \in \mathbb{R}^{n \times (d+1)}$, $w \in \mathbb{R}^{(d+1) \times 1}$, and $\epsilon \in \mathbb{R}^{n \times 1}$, where $\epsilon \sim N(0, \sigma^2 I)$ follows the normal distribution.



    1. Show that the linear regression predictor is given by $\hat{y} = X(X^T X)^{-1} X^T y$.

    2. Let $P = X(X^T X)^{-1} X^T$. Show that $P$ has only 0 and 1 as eigenvalues. (A numerical check follows this list.)

    3. Show that $\hat{w} = (X^T X)^{-1} X^T y$ is an unbiased estimator of $w$, i.e., $E(\hat{w}) = w$. Also show that $\mathrm{Var}(\hat{w}) = (X^T X)^{-1} \sigma^2$. (Note that by definition, $\mathrm{Var}(\hat{w}) = E[(\hat{w} - E(\hat{w}))(\hat{w} - E(\hat{w}))^T]$.)
    4. Recall the definition of the $R^2$ score: $R^2 := 1 - \frac{SS_{res}}{SS_{tot}}$, where $SS_{tot} = \sum_{i=1}^{n} (y_i - \bar{y})^2$, $SS_{reg} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$, and $SS_{res} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$. Prove that for linear regression, $SS_{tot} = SS_{reg} + SS_{res}$. (So the $R^2$ score can also be defined as $R^2 = \frac{SS_{reg}}{SS_{tot}}$.)
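A short NumPy check of parts 2 and 4 (my own illustration; the synthetic data and names are assumptions, not part of the assignment). Note that the decomposition $SS_{tot} = SS_{reg} + SS_{res}$ relies on $X$ containing an intercept column, so that the residual vector is orthogonal to both $\hat{y}$ and the constant vector:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 50, 3
    X = np.column_stack([np.ones(n), rng.normal(size=(n, d))])  # intercept + d inputs
    y = X @ rng.normal(size=d + 1) + rng.normal(size=n)

    P = X @ np.linalg.solve(X.T @ X, X.T)        # hat matrix X (X^T X)^{-1} X^T
    print(np.round(np.linalg.eigvalsh(P), 8))    # eigenvalues are only 0 and 1

    y_hat = P @ y
    ss_tot = np.sum((y - y.mean()) ** 2)
    ss_reg = np.sum((y_hat - y.mean()) ** 2)
    ss_res = np.sum((y - y_hat) ** 2)
    print(np.isclose(ss_tot, ss_reg + ss_res))   # True: SS_tot = SS_reg + SS_res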

Exercise 4 (Generalized Cross-Validation) Consider ridge regression:

$$\min_{w} \; (y - Xw)^T (y - Xw) + \lambda \|w\|_2^2.$$

It has the solution $\hat{w} = (X^T X + \lambda I)^{-1} X^T y$ and prediction $\hat{y} = X(X^T X + \lambda I)^{-1} X^T y = Py$, with $P = X(X^T X + \lambda I)^{-1} X^T$ being the projection matrix.
1. Define the leave-one-out cross-validation estimator as
$$\hat{w}^{[k]} = \arg\min_{w} \Big[ \sum_{i=1, i \neq k}^{n} \big(y_i - x_i^T w\big)^2 + \lambda \|w\|_2^2 \Big].$$
Show that $\hat{w}^{[k]} = (X^T X + \lambda I - x_k x_k^T)^{-1}(X^T y - x_k y_k)$. (A numerical check of parts 1-3 appears after part 3.)











2. Define the ordinary cross-validation (OCV) mean squared error as
$$V_0(\lambda) = \frac{1}{n} \sum_{k=1}^{n} \big(x_k^T \hat{w}^{[k]} - y_k\big)^2.$$
Show that $V_0(\lambda)$ can be rewritten as
$$V_0(\lambda) = \frac{1}{n} \sum_{k=1}^{n} \Big(\frac{\hat{y}_k - y_k}{1 - p_{kk}}\Big)^2,$$
where $\hat{y}_k = \sum_{j=1}^{n} p_{kj} y_j$ and $p_{kj}$ is the $(k, j)$-entry of $P$.

(Hint: You may need to use the Sherman-Morrison formula for a nonsingular matrix $A$ and vectors $x$ and $y$ with $y^T A^{-1} x \neq -1$:
$$(A + xy^T)^{-1} = A^{-1} - \frac{A^{-1} x y^T A^{-1}}{1 + y^T A^{-1} x}.)$$

3. Define weights as
$$w_k = \Big(\frac{1 - p_{kk}}{\frac{1}{n} \mathrm{tr}(I - P)}\Big)^2$$
and the weighted OCV as
$$V(\lambda) = \frac{1}{n} \sum_{k=1}^{n} w_k \big(x_k^T \hat{w}^{[k]} - y_k\big)^2.$$
Show that $V(\lambda)$ can be written as
$$V(\lambda) = \frac{\frac{1}{n} \|(I - P)y\|^2}{\big(1 - \mathrm{tr}(P)/n\big)^2}.$$
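The following NumPy sketch checks parts 1-3 numerically (my own illustration; the synthetic data, the value of $\lambda$, and all names are assumptions, not part of the assignment). It compares the leave-one-out closed form against a direct refit with the $k$-th row deleted, and the OCV and weighted-OCV formulas against their refit-based definitions:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, lam = 30, 4, 0.7
    X = rng.normal(size=(n, d))
    y = X @ rng.normal(size=d) + rng.normal(size=n)
    I = np.eye(d)

    P = X @ np.linalg.solve(X.T @ X + lam * I, X.T)  # P = X (X^T X + lam I)^{-1} X^T
    y_hat = P @ y

    def w_loo(k):
        # Ridge fit with the k-th observation deleted.
        m = np.arange(n) != k
        return np.linalg.solve(X[m].T @ X[m] + lam * I, X[m].T @ y[m])

    # Part 1: the deleted-row refit equals the Sherman-Morrison closed form.
    k = 5
    w_sm = np.linalg.solve(X.T @ X + lam * I - np.outer(X[k], X[k]),
                           X.T @ y - X[k] * y[k])
    print(np.allclose(w_loo(k), w_sm))                  # True

    # Part 2: OCV computed by n refits equals the one-shot formula.
    loo_err2 = np.array([(X[k] @ w_loo(k) - y[k]) ** 2 for k in range(n)])
    V0_formula = np.mean(((y_hat - y) / (1.0 - np.diag(P))) ** 2)
    print(np.isclose(loo_err2.mean(), V0_formula))      # True

    # Part 3: weighted OCV equals the GCV expression.
    wts = ((1.0 - np.diag(P)) / (1.0 - np.trace(P) / n)) ** 2
    gcv = (np.sum((y - y_hat) ** 2) / n) / (1.0 - np.trace(P) / n) ** 2
    print(np.isclose(np.mean(wts * loo_err2), gcv))     # True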

Exercise 5 (Solving LASSO by ADMM) The alternating direction method of multipliers (ADMM) is a very useful algorithm for solving the constrained optimization problem:

$$\min_{\theta, z} \; f(\theta) + g(z), \quad \text{subject to} \quad A\theta + Bz = c.$$

The algorithm is derived by introducing a Lagrange multiplier $u$ for the constraint. The details are as follows:




1. $\theta^{(k+1)} = \arg\min_{\theta} L(\theta, z^{(k)}, u^{(k)})$;

2. $z^{(k+1)} = \arg\min_{z} L(\theta^{(k+1)}, z, u^{(k)})$;

3. $u^{(k+1)} = u^{(k)} + A\theta^{(k+1)} + Bz^{(k+1)} - c$;

where $L$ is the augmented Lagrangian defined as

$$L(\theta, z, u) = f(\theta) + g(z) + u^T (A\theta + Bz - c) + \frac{1}{2} \|A\theta + Bz - c\|_2^2.$$

An advantage of ADMM is that it involves no tuning parameter such as the step size in gradient descent. Please write down the ADMM steps for solving the LASSO problem:
$$\min_{w} \; \|y - Xw\|_2^2 + \lambda \|w\|_1.$$

(Hint: In order to use ADMM, you have to introduce an auxiliary variable and a suitable constraint. Please give the explicit formulae by solving "arg min" in each step of ADMM.)
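For reference, a minimal NumPy sketch of the resulting updates (my own illustration, not necessarily the intended official solution). It uses the natural splitting suggested by the hint, $f(w) = \|y - Xw\|_2^2$ and $g(z) = \lambda \|z\|_1$ with constraint $w - z = 0$ (i.e. $A = I$, $B = -I$, $c = 0$), and keeps the penalty coefficient fixed at 1 as in the augmented Lagrangian above; all function and variable names are assumptions:

    import numpy as np

    def soft_threshold(v, t):
        # Elementwise soft-thresholding: the prox operator of t * ||.||_1.
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def lasso_admm(X, y, lam, n_iter=500):
        # ADMM for min_w ||y - Xw||_2^2 + lam * ||w||_1 with the splitting
        # f(w) = ||y - Xw||^2, g(z) = lam * ||z||_1, constraint w - z = 0.
        d = X.shape[1]
        z = np.zeros(d)
        u = np.zeros(d)
        M = 2.0 * X.T @ X + np.eye(d)           # w-step matrix: 2 X^T X + I
        b = 2.0 * X.T @ y
        for _ in range(n_iter):
            w = np.linalg.solve(M, b + z - u)   # w-update: linear solve
            z = soft_threshold(w + u, lam)      # z-update: soft-threshold
            u = u + w - z                       # multiplier update
        return z                                # z carries the exact zeros

    # Usage on synthetic data: recover a sparse coefficient vector.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 10))
    w_true = np.zeros(10)
    w_true[:3] = [2.0, -1.0, 0.5]
    y = X @ w_true + 0.1 * rng.normal(size=100)
    print(np.round(lasso_admm(X, y, lam=5.0), 3))

Here the $z$-update is the soft-thresholding operator $S_{\lambda}(v) = \mathrm{sign}(v)\max(|v| - \lambda, 0)$, the closed-form arg min for the $\ell_1$ term, while the $w$-update is a linear solve with a fixed matrix, so it can be pre-factorized once.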







































