1. [6 points] Linear Regression Basics
Consider a linear model of the form $\hat{y}^{(i)} = w^\top x^{(i)} + b$, where $w, x \in \mathbb{R}^K$ and $b \in \mathbb{R}$. We are given a training dataset $D = \{(x^{(i)}, y^{(i)})\}$ of input-target example pairs.
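For reference, the same model written element-wise, in the notation already defined above (nothing new is assumed); this is the form that the per-coordinate derivatives in the later parts act on:

$\hat{y}^{(i)} = w^\top x^{(i)} + b = \sum_{k=1}^{K} w_k x_k^{(i)} + b$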
(a) What is the loss function, $L$, for training a linear regression model? (Don’t forget the $\frac{1}{2}$.)
Your answer:
(b) Compute $\frac{\partial L}{\partial \hat{y}^{(i)}}$.
Your answer:
(c) Compute $\frac{\partial \hat{y}^{(i)}}{\partial w_k}$, where $w_k$ denotes the $k$th element of $w$.
Your answer:
(d) Putting the previous parts together, what is $\nabla_w L$?
Your answer:
(e) Compute $\frac{\partial L}{\partial b}$.
Your answer:
(f) For convenience, we group $w$ and $b$ together into $u$ and write $z = [x, 1]$, so that $\hat{y} = u^\top [x, 1] = w^\top x + b$. What are the optimal parameters $u^* = [w^*, b^*]$? Use the notation $Z \in \mathbb{R}^{|D| \times (K+1)}$ and $y \in \mathbb{R}^{|D|}$ in your answer, where each row of $Z$ and the corresponding entry of $y$ denote an example input-target pair in the dataset.
Your answer:
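As an illustration of the $z = [x, 1]$ construction in part (f), and not a substitute for the written answer, here is a minimal NumPy sketch on a synthetic dataset. The data, the "true" parameters, and all variable names are invented for illustration, and the fit is done numerically with a generic least-squares solver rather than with the closed-form expression the question asks for.

# Minimal NumPy sketch of the z = [x, 1] / Z reparameterization (synthetic data).
import numpy as np

rng = np.random.default_rng(0)
K, N = 3, 100                                   # N plays the role of |D|
X = rng.normal(size=(N, K))                     # row i is x^(i)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.3 + 0.1 * rng.normal(size=N)  # invented w, b

Z = np.hstack([X, np.ones((N, 1))])             # Z in R^{|D| x (K+1)}; row i is [x^(i), 1]
u, *_ = np.linalg.lstsq(Z, y, rcond=None)       # numerical least-squares fit, u = [w, b]
w_hat, b_hat = u[:K], u[K]
y_hat = Z @ u                                   # predictions y^ = u^T [x, 1] for all examples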
2. [2 points] Linear Regression Probabilistic Interpretation
Consider an input $x^{(i)} \in \mathbb{R}$ and target variable $y^{(i)} \in \mathbb{R}$ that have the following relationship:
$y^{(i)} = w \cdot x^{(i)} + \epsilon^{(i)}$
where $\epsilon^{(i)}$ is independently and identically distributed according to a Gaussian distribution with zero mean and unit variance.
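To make the noise assumption concrete, here is a small sampling sketch from this generative model; the value of $w$, the input distribution, and the sample size are arbitrary choices for illustration only.

# Sampling from y^(i) = w * x^(i) + eps^(i), with eps^(i) ~ N(0, 1) i.i.d. (illustrative values).
import numpy as np

rng = np.random.default_rng(0)
w_true = 2.0                                   # arbitrary illustrative weight
x = rng.uniform(-1.0, 1.0, size=50)            # x^(i) in R; any input distribution works here
eps = rng.normal(loc=0.0, scale=1.0, size=50)  # zero mean, unit variance, i.i.d.
y = w_true * x + eps                           # targets generated by the model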
(a) What is the conditional probability $p(y^{(i)} \mid x^{(i)}, w)$?
Your answer:
(b) Given a dataset $D = \{(x^{(i)}, y^{(i)})\}$, what is the negative log likelihood of the dataset according to our model? (Simplify.)
Your answer: