In this assignment, you will implement a Matlab function to decompose a matrix A into the product of two matrices A = QR, where Q has orthonormal columns and R is upper triangular. You will then use your decomposition to solve a shape fitting problem.
You will need the files back_sub.m for Part 4 and generate_data.m and visualize.m for Part 5. These can be found in the zip file provided on the Moodle assignment page.
1 Submission Guidelines
You will submit a zip file that contains the following .m files on Moodle:
• ortho_decomp.m
• my_qr.m
• least_squares.m
• my_pack.m
• my_unpack.m
• design_matrix.m
• affine_fit.m
as well as any helper functions that are needed by your implementation. As with previous assignments, please note the following rules:
1. Each file must be named as specified above.
2. Each function must return the specified value(s).
3. You may use Matlab’s array manipulation syntax, e.g. size(A), A(i,:), and [A,B], and basic operations like addition, multiplication, and transpose of matrices and vectors. High-level linear algebra functions such as inv, qr, and A\b are not allowed except where specified. Please contact the instructor with further questions.
4. This is an individual assignment, and no collaboration is allowed. Any in-person or online discussion must stop before it reaches the point of discussing or designing a solution.
2 Orthogonal decomposition
Suppose you have an orthonormal basis {u1, . . . , un} for a subspace H ⊆ R^m. Any vector v ∈ R^m can be decomposed into the sum of two orthogonal vectors, v̂ ∈ H and v̂⊥ ∈ H⊥, so that v = v̂ + v̂⊥. Furthermore, since v̂ ∈ H, it can be expressed as v̂ = c1 u1 + · · · + cn un for some weights c1, . . . , cn. Implement a function to perform this decomposition.
Specification:
function [c, v_perp] = ortho_decomp(U, v)
Input: an m × n matrix U = [u1 · · · un] with orthonormal columns, and a vector v ∈ R^m.
Output:
c: a vector c = [c1; . . . ; cn] such that v = c1 u1 + · · · + cn un + v̂⊥.
v_perp: the residual vector v̂⊥ such that ui · v̂⊥ = 0 for all i.
Implementation notes:
Here v̂ is simply the orthogonal projection of v onto the subspace H. Since we have an orthonormal basis of H, this should be easy to compute. In fact, since the basis is orthonormal, it is possible to compute the projection in just one line of code without any for loops. Hint: What are the entries of U^T v?
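Following the hint, one possible shape for this function is the sketch below (an illustration of the idea, not necessarily the only valid implementation): since the columns of U are orthonormal, each weight is ci = ui · v, so all of them at once are U'*v.

```matlab
function [c, v_perp] = ortho_decomp(U, v)
% Weights of the projection of v onto Col U: with orthonormal
% columns, c_i = u_i . v, so all weights at once are U'*v.
c = U' * v;
% The residual is what remains after subtracting the projection U*c.
v_perp = v - U * c;
end
```

Note that this also behaves sensibly when U has zero columns (an m × 0 matrix), which turns out to be convenient in Part 3.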
Test cases:
1. U = [1 0; 0 0; 0 1; 0 0], v = [2; 3; 4; 5] → c = [2; 4], v̂⊥ = [0; 3; 0; 5].

2. U = [0.6; 0.8; 0], v = [1; 1; 1] → c = 1.4, v̂⊥ = [0.16; −0.12; 1].
3. Generate a random orthonormal set of 5 vectors in R^10 using the command [U,~] = qr(randn(10,5),0), and generate another random integer vector v = randi([-2,2], 10,1). Compute your orthogonal decomposition, [c, v_perp] = ortho_decomp(U, v). Verify that U*c + v_perp is nearly the same as v, and U'*v_perp is nearly zero (both to within 10^-12).
3 QR decomposition
Suppose you have an m × n matrix A which is “tall and thin”, i.e. with m ≥ n, and the columns of A are linearly independent. The Gram-Schmidt process corresponds to a factorization A = QR, where Q is an m × n matrix with orthonormal columns, and R is an n × n upper triangular matrix.¹
Specification:
function [Q, R] = my_qr(A)
Input: an m × n matrix A with m ≥ n and linearly independent columns.
Output: an m × n matrix Q with orthonormal columns, and an n × n upper triangular matrix R, such that A = QR.
Implementation notes:
I recommend starting your implementation by first having your function compute
Q correctly. After that, you can modify your code to also compute R.
The columns of Q are essentially obtained by performing the Gram-Schmidt process with normalization on the columns of A:
q1 = a1 / ‖a1‖,
â2⊥ = a2 − proj_H1 a2,  where H1 = span{q1},
q2 = â2⊥ / ‖â2⊥‖,
â3⊥ = a3 − proj_H2 a3,  where H2 = span{q1, q2},
q3 = â3⊥ / ‖â3⊥‖,
⋮
You can carry this out using your orthogonal decomposition function. For each column i = 1, . . . , n, call ortho_decomp with an appropriate set of inputs, and use the resulting v̂⊥ (after normalizing it) to fill in the ith column of Q. In the end, the columns of Q should be an orthonormal set of vectors {q1, . . . , qn} which span Col A.

¹ Technically, what we are computing here is known as the “thin” or “reduced” QR decomposition. In the full QR decomposition, Q is square and orthogonal, and R is an m × n upper triangular matrix whose lower (m − n) rows are all zero. In practice, it is also not computed with the Gram-Schmidt process, but with other methods that are more numerically stable.
Once you can compute Q correctly, modify your function to compute R as well. In principle, since Q^T Q = I, you could obtain R simply via Q^T A = Q^T QR = R. However, this is more work than necessary, because you can actually fill in the entries of R as you compute each column of Q. When you compute the ith column of Q, the ortho_decomp call gives you both c and v̂⊥; can you use these to fill in the i nonzero entries in the ith column of R? Hint: Consider the case when A is already an upper triangular matrix, say A = [1 2 3; 0 4 5; 0 0 6], and work out by hand what c and v̂⊥ will be for each column.
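Putting the notes above together, the overall structure might look like the following sketch (one possible arrangement, assuming the one-line ortho_decomp, which handles an empty U gracefully since an m × 0 times 0 × 1 product in Matlab is an m × 1 zero vector):

```matlab
function [Q, R] = my_qr(A)
[m, n] = size(A);
Q = zeros(m, n);
R = zeros(n, n);
for i = 1:n
    % Decompose the ith column of A against the columns found so far.
    [c, v_perp] = ortho_decomp(Q(:, 1:i-1), A(:, i));
    % The weights c fill in the entries of R above the diagonal,
    % and the norm of the residual is the diagonal entry.
    R(1:i-1, i) = c;
    R(i, i) = norm(v_perp);
    % Normalize the residual to get the next orthonormal column.
    Q(:, i) = v_perp / R(i, i);
end
end
```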
Test cases:
3. Generate a random 10 × 5 integer matrix A = randi([-2,2], 10,5) and compute your QR decomposition, [Q, R] = my_qr(A). Verify that istriu(R) is true, Q'*Q is nearly the identity matrix, and Q*R is nearly the same as A (both to within 10^-12).
4 Least-squares problems
Use the QR decomposition to solve the least-squares problem Ax ≈ b.
Specification:
function x = least_squares(A, b)
Input: an m × n matrix A and a vector b ∈ R^m.
Output: a vector x ∈ R^n which is the least-squares solution of the overdetermined system Ax ≈ b, i.e. such that Ax − b ∈ (Col A)⊥.
Implementation notes:
Do not solve the normal equations AT Ax = AT b directly, as this is a very numerically unstable approach. Instead, compute the QR factorization of A and use it to solve the least-squares problem with only a matrix-vector multiplication and a back-substitution, as described in the textbook. Use the back_sub function provided with this assignment to perform the back-substitution.
If you cannot get your my_qr function to work, you may use Matlab’s built-in qr function here so you can still attempt Part 5. Call qr(A,0) to get a rectangular Q and square R as desired.
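The whole function can then be quite short. A sketch of one possible structure follows, assuming the provided back_sub takes the upper triangular matrix first and the right-hand side second (check the file for its actual signature):

```matlab
function x = least_squares(A, b)
% Factor A = QR. Then Ax - b ⊥ Col A implies Q'(Ax - b) = 0,
% i.e. Rx = Q'*b, which back-substitution can solve.
[Q, R] = my_qr(A);     % or [Q, R] = qr(A, 0) if Part 3 is unavailable
x = back_sub(R, Q' * b);
end
```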
Test cases:
3. Generate a random 10 × 5 integer matrix A = randi([-2,2], 10,5) and a random integer vector b = randi([-2,2], 10,1). Compute your least-squares solution, x = least_squares(A, b), and obtain the residual r = b - A*x. Verify that A'*r is nearly zero (to within 10^-12).
5 Best-fitting transformations
An affine transformation is a combination of a linear transformation and a translation (i.e. a displacement), while retaining parallelism, i.e., parallel lines remain parallel after the transformation. For example, a point [x; y] on A (orange square) in Figure 1 can be transformed to a point on B (red parallelogram) via a linear transform M = [m11 m12; m21 m22], i.e.,

[x; y] ↦ [m11 m12; m21 m22] [x; y].

Note that the parallel lines stay parallel. This transform can be combined with a translation, moving the red parallelogram to the blue parallelogram (C). This composite transformation can be written as:

[x̃; ỹ] = [m11 m12; m21 m22] [x; y] + [tx; ty] = M [x; y] + t,   (1)

where t = [tx; ty] is the translation vector.
Figure 1: Affine transform.
In sum, Equation (1) can be written as

[x̃; ỹ] = [m11 x + m12 y + tx; m21 x + m22 y + ty].   (2)
5.1 Linear Equation from a Single Correspondence
Your task is, given many correspondences² (xi, yi) ↔ (x̃i, ỹi), where i is the index of the correspondence, to compute the best affine transform parameters m11, m12, m21, m22, tx, ty (the unknowns). We can rewrite Equation (2) by arranging it with respect to the unknowns for, say, the first correspondence:
[x̃1; ỹ1] = [x1 y1 0 0 1 0; 0 0 x1 y1 0 1] [m11; m12; m21; m22; tx; ty],   (3)

where the left-hand side is denoted p̃1, the 2 × 6 coefficient matrix is denoted A1, and the vector of unknowns is denoted β,
or simply,

p̃1 = A1 β.   (4)
Note that A1 depends on the original position of the point, p1 = [x1; y1]. Each correspondence produces 2 linear equations while the number of unknowns is 6, so at least 3 correspondences are needed to uniquely determine the affine transform parameters.
Implement a function beta = my_pack(M, t) to obtain β from M = [m11 m12; m21 m22] and t = [tx; ty], and its inverse function [M, t] = my_unpack(beta).
function beta = my_pack(M, t)
Input: a 2 × 2 matrix M and a vector t.
Output: a vector β ∈ R^6 containing the entries of M and t.
function [M, t] = my_unpack(beta)
Input: a vector β ∈ R^6.
Output: a 2 × 2 matrix M and a vector t ∈ R^2 such that my_pack(M, t) = β.
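These two functions are straightforward bookkeeping; a sketch of one consistent choice follows (the ordering of β must match Equation (3); each function goes in its own .m file per the submission rules):

```matlab
function beta = my_pack(M, t)
% Stack the unknowns in the order used by Equation (3):
% beta = [m11; m12; m21; m22; tx; ty].
beta = [M(1,1); M(1,2); M(2,1); M(2,2); t(1); t(2)];
end
```

```matlab
function [M, t] = my_unpack(beta)
% Invert my_pack: rebuild M row by row, and t from the last two entries.
M = [beta(1), beta(2); beta(3), beta(4)];
t = [beta(5); beta(6)];
end
```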
Finally, construct the matrix Ai given pi .
function Ai = design_matrix(pi)
Input: a vector pi = [xi; yi] ∈ R^2.
Output: the matrix Ai given by Equation (3).
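A sketch of the corresponding function, reading the coefficient matrix straight off Equation (3):

```matlab
function Ai = design_matrix(pi)
% Build the 2x6 coefficient matrix of Equation (3) for one point.
xi = pi(1);
yi = pi(2);
Ai = [xi, yi, 0,  0,  1, 0;
      0,  0,  xi, yi, 0, 1];
end
```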
² A point (x, y) in A is transformed to a point (x̃, ỹ) in C. This pair of points forms a correspondence.
5.2 Solving Linear System from n Correspondences
Equation (3) can be extended to include n correspondences as follows:

[x̃1; ỹ1; ⋮; x̃n; ỹn] = [x1 y1 0 0 1 0; 0 0 x1 y1 0 1; ⋮; xn yn 0 0 1 0; 0 0 xn yn 0 1] [m11; m12; m21; m22; tx; ty],   (5)

or again simply,

[p̃1; ⋮; p̃n] = [A1; ⋮; An] β.   (6)
If all correspondences are noise-free, Equation (6) will always be satisfied. Due to correspondence noise in practice, it cannot be satisfied exactly, and therefore

[p̃1; ⋮; p̃n] ≈ [A1; ⋮; An] β.   (7)
Now we will compute the best affine transform parameters using Equation (7).
function [M, t] = affine_fit(P, P_tilde)
Input: 2 × k correspondence matrices P = [p1 · · · pk] and P̃ = [p̃1 · · · p̃k] containing the reference points and transformed points as columns.
Output: a 2 × 2 matrix M and a vector t ∈ R2 such that the affine transfor- mation Mx + t best matches the given data. To compute this, you will have to set up and solve the least-squares problem in Equation (7) to obtain β, then unpack it to get M and t out.
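One way this assembly could be structured, reusing the earlier parts (a sketch of the stacking in Equation (7), not necessarily the only valid arrangement):

```matlab
function [M, t] = affine_fit(P, P_tilde)
k = size(P, 2);
A = zeros(2*k, 6);
b = zeros(2*k, 1);
for i = 1:k
    % Each correspondence contributes the 2 rows of Equation (3)
    % to the stacked system of Equation (7).
    A(2*i-1:2*i, :) = design_matrix(P(:, i));
    b(2*i-1:2*i) = P_tilde(:, i);
end
beta = least_squares(A, b);
[M, t] = my_unpack(beta);
end
```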
Test cases:
For the first two tests, pick an arbitrary M and t, say M = [1 2; 3 4] and t = [5; 6].
1. Check that [M_new, t_new] = my_unpack(my_pack(M, t)) recovers the original values of M and t.
2. Verify that when x = [0; 0], design_matrix(x)*my_pack(M,t) gives back t. Similarly, x = [1; 0] and x = [0; 1] should give you m1 + t and m2 + t respectively, where m1 and m2 are the columns of M.
4. We have provided a function [P, P_tilde, M, t] = generate_data() that produces some random test data. It does so by filling random values in M and t, choosing random points pi, then setting each p̃i to M pi + t plus a small amount of random noise. You can visualize the data by calling the provided function visualize(P, P_tilde).

Call affine_fit(P, P_tilde) to obtain your own estimates of M and t. The result should be close to the “ground truth” transformation returned by generate_data, although due to the added noise it will not be exactly identical. Call visualize(P, P_tilde, M, t) to see what your fit looks like.
Note: If you want to test on a smaller data set, just use the first few columns of P and P̃, for example P = P(:, 1:10) and P_tilde = P_tilde(:, 1:10).