1. Give an example of a sequence {x_k} of positive real numbers that decreases to zero Q-superlinearly but not Q-quadratically. (Refer to Appendix A.2 of Numerical Optimization, 2nd Ed on convergence rates.)
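(As a reminder of the definitions in Appendix A.2, specialized to a positive sequence with limit zero: {x_k} converges to zero Q-superlinearly if x_{k+1}/x_k → 0 as k → ∞, and Q-quadratically if x_{k+1} ≤ M x_k^2 for some constant M > 0 and all k sufficiently large.)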
2. Find positive values of the three constants appearing in Section 3.4 of the notes such that the Gauss-Southwell choice d_k = −[∇f(x_k)]_{i_k} e_{i_k}, where i_k = arg max_{i=1,2,...,n} |[∇f(x_k)]_i| (and e_{i_k} is the i_k-th coordinate vector), satisfies the conditions required on d_k in Section 3.4 of the notes.
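For concreteness, the Gauss-Southwell index and direction can be computed in Matlab along the following lines (a minimal sketch for the quadratic used in the next question; the variable names g, x, and d are illustrative, not prescribed):

g = A*x;                 % gradient of f(x) = (1/2) x'*A*x
[~, ik] = max(abs(g));   % Gauss-Southwell index: gradient component largest in magnitude
d = zeros(size(x));
d(ik) = -g(ik);          % step along the negative of that single component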
3. Write a Matlab code to apply various first-order methods to the convex quadratic function f(x) = (1/2) x^T A x, where the positive definite matrix A is generated by the Matlab code fragment below (modified where necessary) to have eigenvalues in a prescribed range [μ, L], with 0 < μ < L. (x = 0 is obviously a global minimizer.) In all cases, start from a point x0 generated by the Matlab command randn(n,1), and run until f(x_k) − f(x*) ≤ ε, where ε = 10^{-6}. Implement the following methods (a minimal sketch of the two steepest-descent variants appears after this list):
Steepest descent with fixed steplength α_k = 1/L.
Steepest descent with exact line search.
Nesterov’s optimal method from Chapter 4 of the notes.
Conjugate gradient method from p. 112 of Numerical Optimization, 2nd Ed.
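By way of illustration, here is a minimal sketch of the two steepest-descent variants for this quadratic (a sketch only, assuming A, x0, L, and a tolerance eps_tol = 10^{-6} are already defined; the function name and flag are illustrative, not prescribed):

% Steepest descent on f(x) = (1/2) x'*A*x, starting from x0.
% use_exact = false gives the fixed step alpha = 1/L;
% use_exact = true  gives the exact line-search step alpha = (g'*g)/(g'*A*g).
function k = sd_quadratic(A, x0, L, eps_tol, use_exact)
  x = x0; k = 0;
  while 0.5*(x'*A*x) > eps_tol        % f(x*) = 0, so f(x_k) - f(x*) = f(x_k)
    g = A*x;                          % gradient of the quadratic
    if use_exact
      alpha = (g'*g)/(g'*A*g);        % exact minimizer of f(x - alpha*g) over alpha
    else
      alpha = 1/L;                    % fixed steplength
    end
    x = x - alpha*g;
    k = k + 1;
  end
end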
(a) After you have debugged, make a version of your code called firstorder.m that does the 10 trial runs and averages the number of iterates required for convergence, that does not make a plot, and that prints nothing except four lines of output. Use these lines of code as the last four lines of your Matlab file, to generate the required output:
fprintf(1,' steepest descent - fixed steps : %7.1f\n', av_sd);
fprintf(1,' steepest descent - exact steps : %7.1f\n', av_sde);
fprintf(1,' Nesterov                       : %7.1f\n', av_nest);
fprintf(1,' conjugate gradient             : %7.1f\n', av_cg);
where av_sd, av_sde, av_nest, av_cg are the average numbers of iterates (over the 10 random trials) for each of the four algorithms, computed by your code.
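A possible skeleton for the trial loop (a sketch only, assuming each method is wrapped in a function that returns its iteration count; the helper names nesterov_quadratic and cg_quadratic are hypothetical, like sd_quadratic above):

ntrials = 10;
counts = zeros(ntrials, 4);                               % one column per method
for t = 1:ntrials
  x0 = randn(n,1);                                        % fresh starting point each trial
  counts(t,1) = sd_quadratic(A, x0, L, eps_tol, false);   % fixed steps
  counts(t,2) = sd_quadratic(A, x0, L, eps_tol, true);    % exact steps
  counts(t,3) = nesterov_quadratic(A, x0, mu, L, eps_tol);% hypothetical helper
  counts(t,4) = cg_quadratic(A, x0, eps_tol);             % hypothetical helper
end
avs = mean(counts, 1);
av_sd = avs(1); av_sde = avs(2); av_nest = avs(3); av_cg = avs(4);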
(b) Draw a plot of the convergence behavior on a typical run, plotting iteration number against log10(f(x_k) − f(x*)). Use a single plot, with different colors for the different algorithms. You can use the "legend" command in Matlab to indicate which line belongs to which algorithm, e.g. legend('SD:const','SD:exact','Nesterov','CG').
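One way to produce such a plot (a sketch, assuming each method stores its sequence of f(x_k) values in a vector; the names fvals_sd, fvals_sde, fvals_nest, fvals_cg are illustrative):

plot(log10(fvals_sd),   'b'); hold on;   % f(x*) = 0, so f(x_k) - f(x*) = f(x_k)
plot(log10(fvals_sde),  'r');
plot(log10(fvals_nest), 'g');
plot(log10(fvals_cg),   'k');
hold off;
xlabel('iteration k'); ylabel('log_{10}(f(x_k) - f(x^*))');
legend('SD:const','SD:exact','Nesterov','CG');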
(c) Discuss your results. In particular, say whether the observed convergence behavior is consistent with the rates that we proved in class for some of these methods.
Here's the Matlab code fragment to generate the matrix A.
mu=0.01; L=1; kappa=L/mu;                  % eigenvalue range [mu, L] and condition number
n=100;
A = randn(n,n); [Q,R]=qr(A);               % Q is a random orthogonal matrix
D=rand(n,1); D=10.^D; Dmin=min(D); Dmax=max(D);
D=(D-Dmin)/(Dmax-Dmin);                    % rescale the eigenvalues to [0,1] ...
D = mu + D*(L-mu);                         % ... and then to [mu, L]
A = Q'*diag(D)*Q;                          % symmetric positive definite, eigenvalues D
x0 = randn(n,1);                           % use a different x0 for each trial
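As a quick optional sanity check (not part of the required output), the spectrum of the generated A can be inspected:

fprintf('eigenvalue range: [%g, %g]\n', min(eig(A)), max(eig(A)));  % should lie in [mu, L]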
Hand-in Instructions:
Hand in your code firstorder.m, and its output (detailed in (a) and (b)).
Hand in your discussion in part (c) and the answers to the other questions as a pdf file, preferably typeset, but at least neatly written out.
Submit through Canvas.