(50 pts) In this computer experiment, we will implement the gradient descent method and Newton's method. Let f(x, y) = − log(1 − x − y) − log x − log y with domain D = {(x, y) : x + y < 1, x > 0, y > 0}.
(a) Find the gradient and the Hessian of f on paper.
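As a sanity check for the hand derivation, direct differentiation gives (writing s = 1 − x − y for brevity):

```latex
\nabla f(x,y) =
\begin{pmatrix} \dfrac{1}{s} - \dfrac{1}{x} \\[4pt] \dfrac{1}{s} - \dfrac{1}{y} \end{pmatrix},
\qquad
\nabla^2 f(x,y) =
\begin{pmatrix}
\dfrac{1}{s^2} + \dfrac{1}{x^2} & \dfrac{1}{s^2} \\[4pt]
\dfrac{1}{s^2} & \dfrac{1}{s^2} + \dfrac{1}{y^2}
\end{pmatrix},
\qquad s = 1 - x - y .
```

Setting the gradient to zero gives x = y = 1 − x − y, i.e. the minimizer (1/3, 1/3), which is useful for checking the iterative methods below.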
(b) Begin with an initial point w0 ∈ D and η = 1, and estimate the global minimum of f using the gradient descent method, which will provide you with points w1, w2, . . .. Report the initial point w0 and the η of your choice. Draw a graph that shows the trajectory followed by the points at each iteration. Also, plot the energies f(w0), f(w1), . . . achieved by the points at each iteration. Note: during the iterations, your point may "jump" out of D, where f is undefined. If that happens, change your initial starting point and/or η.
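A minimal sketch of the gradient descent loop. The starting point w0 = (0.4, 0.4) and step size η = 0.05 are example choices, not prescribed by the problem (from this w0, η = 1 immediately jumps out of D, exactly as the note warns); plotting the trajectory and energies is left to matplotlib.

```python
import numpy as np

def f(w):
    # f(x, y) = -log(1 - x - y) - log(x) - log(y), defined on D
    x, y = w
    return -np.log(1 - x - y) - np.log(x) - np.log(y)

def grad_f(w):
    # Gradient from part (a): each entry is 1/(1 - x - y) minus 1/x or 1/y
    x, y = w
    g = 1.0 / (1 - x - y)
    return np.array([g - 1.0 / x, g - 1.0 / y])

w = np.array([0.4, 0.4])   # example w0 in D (an assumption, not prescribed)
eta = 0.05                 # example step size; eta = 1 leaves D from this w0
trajectory = [w.copy()]
energies = [f(w)]
for _ in range(100):
    w = w - eta * grad_f(w)          # gradient descent update
    trajectory.append(w.copy())
    energies.append(f(w))
# The iterates approach the minimizer (1/3, 1/3), where f = 3*log(3).
```

With these choices every iterate stays inside D; `trajectory` and `energies` hold exactly the data the two requested plots need.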
(c) Repeat part (b) using Newton's method.
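Newton's method replaces the scaled gradient step with a step through the inverse Hessian. A self-contained sketch under the same example starting point as before:

```python
import numpy as np

def grad_f(w):
    # Gradient of f from part (a)
    x, y = w
    g = 1.0 / (1 - x - y)
    return np.array([g - 1.0 / x, g - 1.0 / y])

def hess_f(w):
    # Hessian of f: 1/(1-x-y)^2 in every entry, plus 1/x^2 or 1/y^2 on the diagonal
    x, y = w
    h = 1.0 / (1 - x - y) ** 2
    return np.array([[h + 1.0 / x ** 2, h],
                     [h, h + 1.0 / y ** 2]])

w = np.array([0.4, 0.4])   # example starting point in D (an assumption)
for _ in range(20):
    # Newton step: solve H d = grad rather than forming the inverse explicitly
    w = w - np.linalg.solve(hess_f(w), grad_f(w))
# Quadratic convergence: w reaches (1/3, 1/3) to machine precision.
```

Solving the 2x2 linear system with `np.linalg.solve` is both cheaper and numerically safer than computing the matrix inverse.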
(d) Compare the speed of convergence of gradient descent and Newton's method, i.e., how fast does each method approach the estimated global minimum?
(50 pts) Perform the following steps:
(a) Let x_i = i, i = 1, . . . , 50.
(b) Let y_i = i + u_i, i = 1, . . . , 50, where each u_i should be chosen to be an arbitrary real number between −1 and 1.
(c) Find the linear least squares fit to (x_i, y_i), i = 1, . . . , 50 analytically using the matrix pseudoinverse. Note that the linear least squares fit is the line y = w0 + w1 x, where w0 and w1 should be chosen to minimize Σ_{i=1}^{50} (y_i − (w0 + w1 x_i))^2.
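Parts (a)-(c) can be sketched as follows; the random seed and the use of NumPy are assumptions, and any u_i in (−1, 1) would do. Stacking a column of ones next to the x column turns the problem into minimizing ||y − Aw||^2, which the pseudoinverse solves directly.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed (an assumption)
x = np.arange(1, 51, dtype=float)         # x_i = i, i = 1, ..., 50
u = rng.uniform(-1.0, 1.0, size=50)       # u_i between -1 and 1
y = x + u                                 # y_i = i + u_i

# Design matrix: a column of ones (intercept w0) next to x (slope w1).
# w = A^+ y minimizes the sum of squared residuals sum_i (y_i - (w0 + w1 x_i))^2.
A = np.column_stack([np.ones(50), x])
w0, w1 = np.linalg.pinv(A) @ y
# Since y is x plus bounded noise, w1 lands near 1 and w0 near 0.
```

For the plot in part (d), plotting `(x, y)` as points and `(x, w0 + w1 * x)` as a line suffices.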
(d) Plot the points (x_i, y_i), i = 1, . . . , 50 together with their linear least squares fit.
(e) Find (on paper) the gradient of Σ_{i=1}^{50} (y_i − (w0 + w1 x_i))^2 (derivatives with respect to w0 and w1).
(f) (Re)find the linear least squares fit numerically using the gradient descent algorithm. Compare with (c).
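Part (f) could look like the sketch below. The step size 2e-5 and the iteration count are example choices, not prescribed: for the sum-of-squares loss the Hessian is the constant matrix 2 AᵀA, whose largest eigenvalue is roughly 8.6e4 for these x_i, so the step must stay below about 2.3e-5 or the iteration diverges.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed (an assumption)
x = np.arange(1, 51, dtype=float)         # x_i = i
y = x + rng.uniform(-1.0, 1.0, size=50)   # y_i = i + u_i

def grad(w):
    # Gradient from part (e), with residuals r_i = y_i - (w0 + w1 x_i):
    #   d/dw0 = -2 * sum(r_i),   d/dw1 = -2 * sum(r_i * x_i)
    r = y - (w[0] + w[1] * x)
    return np.array([-2.0 * r.sum(), -2.0 * (r * x).sum()])

w = np.zeros(2)   # start from (w0, w1) = (0, 0)
eta = 2e-5        # example step size; must be < 2 / lambda_max(2 A^T A)
for _ in range(20000):
    w = w - eta * grad(w)
# w now matches the pseudoinverse solution from part (c) to high accuracy.
```

The poor conditioning of AᵀA (the intercept direction converges far more slowly than the slope) is why so many iterations are needed, which makes a nice contrast with the closed-form answer of part (c).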