The purpose of this homework is to see the relationship between the condition number and the numerical error when solving linear least-squares problems.
First, implement the following methods for least-squares, which you'll use in the next exercise (a sketch of both appears after the list).
1. Method of Normal Equations (uses the Cholesky factorization)
2. Method based on the Thin QR factorization
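One possible sketch of both solvers in Python, using NumPy and SciPy's Cholesky and triangular solvers; the function names solve_normal_equations and solve_qr are placeholders, not names required by the assignment.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve, solve_triangular

def solve_normal_equations(A, b):
    """Least squares via the normal equations A^T A x = A^T b,
    solved with a Cholesky factorization of A^T A."""
    AtA = A.T @ A
    Atb = A.T @ b
    c, low = cho_factor(AtA)                 # Cholesky factor of A^T A
    return cho_solve((c, low), Atb)          # triangular solves with the factor

def solve_qr(A, b):
    """Least squares via the thin QR factorization A = QR:
    solve R x = Q^T b by back substitution."""
    Q, R = np.linalg.qr(A, mode="reduced")   # thin (economy-size) QR
    return solve_triangular(R, Q.T @ b, lower=False)
```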
Next, load the given matrix (download from Canvas) into memory. Call the matrix A,
$$A = \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix}, \tag{1}$$
where $a_i \in \mathbb{R}^m$ is the $i$th column of $A$. Define the matrices $A_1, \ldots, A_n$ as
$$A_k = \begin{bmatrix} a_1 & \cdots & a_k \end{bmatrix}, \qquad k = 1, \ldots, n. \tag{2}$$
That is, $A_k$ contains the first $k$ columns of the matrix $A$ (that you loaded into memory).
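For example, if the matrix is stored as a plain text file, loading it and slicing out $A_k$ might look like the following (the file name "A.txt" is a placeholder for whatever you download from Canvas):

```python
import numpy as np

A = np.loadtxt("A.txt")   # hypothetical file name; use the matrix from Canvas
m, n = A.shape

def leading_columns(A, k):
    """Return A_k, the submatrix formed by the first k columns of A."""
    return A[:, :k]
```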
Now, generate the error data that you'll analyze (a sketch of the full loop follows this list). For $k$ from $k_{\min} = 40$ to $k_{\max} = 65$:
1. Report the size, rank, and condition number of $A_k$.
2. Generate 100 random vectors $b_i \in \mathbb{R}^m$. For each $b_i$,
(a) Use the built-in equation solver (numpy.linalg.lstsq) to compute the least-squares minimizer given $A_k$ and $b_i$. Call this vector $x_{\text{true}}$, because we're just gonna trust the software on this one.
(b) Use your Normal Equations solver to compute the least-squares minimizer, $x_{\text{NE}}$. Compute the relative error with respect to $x_{\text{true}}$.
(c) Use your QR solver to compute the least-squares minimizer, $x_{\text{QR}}$. Compute the relative error with respect to $x_{\text{true}}$.
3. For each of QR and the Normal Equations, compute the average error over all the $b_i$.
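A sketch of the data-generation loop, assuming A, m, and the two solvers from the earlier sketch are already defined, and drawing each $b_i$ from a standard normal distribution (the assignment only says "random"):

```python
import numpy as np

k_min, k_max, n_trials = 40, 65, 100
ks, avg_err_ne, avg_err_qr, conds = [], [], [], []

for k in range(k_min, k_max + 1):
    Ak = A[:, :k]
    # size, rank, and condition number of A_k
    print(k, Ak.shape, np.linalg.matrix_rank(Ak), np.linalg.cond(Ak))

    err_ne, err_qr = [], []
    for _ in range(n_trials):
        b = np.random.randn(m)
        x_true = np.linalg.lstsq(Ak, b, rcond=None)[0]   # reference solution
        x_ne = solve_normal_equations(Ak, b)
        x_qr = solve_qr(Ak, b)
        err_ne.append(np.linalg.norm(x_ne - x_true) / np.linalg.norm(x_true))
        err_qr.append(np.linalg.norm(x_qr - x_true) / np.linalg.norm(x_true))

    ks.append(k)
    avg_err_ne.append(np.mean(err_ne))
    avg_err_qr.append(np.mean(err_qr))
    conds.append(np.linalg.cond(Ak))
```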
Make two plots on a semilogy scale (a plotting sketch follows this list):
• the average error versus $k$ (how many columns in the matrix) for both QR and the Normal Equations,
• the condition number of $A_k$ versus $k$.
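A plotting sketch using matplotlib, assuming the lists ks, avg_err_ne, avg_err_qr, and conds collected in the loop above:

```python
import matplotlib.pyplot as plt

# Average relative error vs. k for both methods (semilog-y scale).
plt.figure()
plt.semilogy(ks, avg_err_ne, "o-", label="Normal Equations")
plt.semilogy(ks, avg_err_qr, "s-", label="Thin QR")
plt.xlabel("k (number of columns)")
plt.ylabel("average relative error")
plt.legend()

# Condition number of A_k vs. k (semilog-y scale).
plt.figure()
plt.semilogy(ks, conds, "d-")
plt.xlabel("k (number of columns)")
plt.ylabel("condition number of A_k")
plt.show()
```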
Now tell me what’s going on. More specifically:
1. What is the relationship between the error using QR versus the Normal Equations?
2. What is the relationship between the errors and the condition number of $A_k$?
3. Suppose your matrix $A$ is ill-conditioned. Which method is more favorable?