
QR Decomposition Solution


In this assignment, you will implement a Matlab function to decompose a matrix A into the product of two matrices A = QR, where Q has orthonormal columns and R is upper triangular. You will then use your decomposition to solve a shape fitting problem.

 

You will need the files back_sub.m for Part 4 and generate_data.m and visualize.m for Part 5. These can be found in the zip file provided on the Moodle assignment page.

 

 

1    Submission Guidelines

 

 

You will submit  a zip file that contains  the following .m files on Moodle:

 

•  ortho_decomp.m

 

•  my_qr.m

 

•  least_squares.m

 

•  my_pack.m

 

•  my_unpack.m

 

•  design_matrix.m

 

•  affine_fit.m

 

as well as any helper functions that are needed by your implementation. As with previous assignments, please note the following rules:

1.  Each  file must  be named  as specified above.

 

2.  Each  function  must  return the specified value(s).

 

3.  You may use Matlab's array manipulation syntax, e.g. size(A), A(i,:), and [A,B], and basic operations like addition, multiplication, and transpose of matrices and vectors. High-level linear algebra functions such as inv, qr, and A\b are not allowed except where specified. Please contact the instructor with further questions.

 

 

 

 

 

4.  This is an individual assignment, and no collaboration is allowed. Any in-person or online discussion should stop before you start discussing or designing a solution.

 

 

2    Orthogonal decomposition

 

 

Suppose you have an orthonormal basis {u1, . . . , un} for a subspace H ⊆ R^m. Any vector v ∈ R^m can be decomposed into the sum of two orthogonal vectors, v̂ ∈ H and v̂⊥ ∈ H⊥. Furthermore, since v̂ ∈ H, it can be expressed as v̂ = c1 u1 + · · · + cn un for some weights c1, . . . , cn. Implement a function to perform this decomposition.

 

 

Specification:

 

function [c, v_perp] = ortho_decomp(U, v)

Input: an m × n matrix U = [u1 · · · un] with orthonormal columns, and a vector v.

Output:

c: a vector c = [c1; . . . ; cn] such that v = c1 u1 + · · · + cn un + v̂⊥.

v_perp: the residual vector v̂⊥ such that ui · v̂⊥ = 0 for all i.

 

 

Implementation notes:

 

Here v̂ is simply the orthogonal projection of v onto the subspace H. Since we have an orthogonal basis of H, this should be easy to compute. In fact, since the basis is orthonormal, it is possible to compute the projection in just one line of code without any for loops. Hint: What are the entries of U^T v?
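To illustrate the hint, here is one possible realization (a sketch, not necessarily the reference solution): since the columns of U are orthonormal, the projection weights are exactly the entries of U^T v.

```matlab
function [c, v_perp] = ortho_decomp(U, v)
% Weights of the projection onto Col U: because the columns of U are
% orthonormal, c(i) = u_i . v, i.e. c = U'*v in one line.
c = U' * v;
% The projection itself is U*c; the residual is orthogonal to every u_i.
v_perp = v - U * c;
end
```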

 

Test cases:

1.  U = [1 0; 0 0; 0 1; 0 0], v = [2; 3; 4; 5]   →   c = [2; 4], v̂⊥ = [0; 3; 0; 5]

2.  U = [0.6; 0.8; 0], v = [1; 1; 1]   →   c = 1.4, v̂⊥ = [0.16; −0.12; 1]
 

 

 

 

 

3.  Generate a random orthonormal set of 5 vectors in R^10 using the command [U,~] = qr(randn(10,5),0), and generate another random integer vector v = randi([-2,2], 10,1). Compute your orthogonal decomposition, [c, v_perp] = ortho_decomp(U, v). Verify that U*c + v_perp is nearly the same as v, and U'*v_perp is nearly zero (both to within 10^−12).

 

 

3    QR decomposition

 

 

Suppose you have an m × n matrix A which is "tall and thin", i.e. with m ≥ n, and the columns of A are linearly independent. The Gram-Schmidt process corresponds to a factorization A = QR, where Q is an m × n matrix with orthonormal columns, and R is an n × n upper triangular matrix.¹

 

 

Specification:

 

function [Q, R] = my_qr(A)

Input: an m × n matrix A with m ≥ n.

Output: an m × n matrix Q with  orthonormal columns,  and  an n × n upper triangular matrix  R, such that A = QR.

 

 

Implementation notes:

 

I recommend starting your implementation by first having your function compute Q correctly. After that, you can modify your code to also compute R.

 

The columns of Q are essentially obtained by performing the Gram-Schmidt process with normalization on the columns of A:

q1 = a1 / ‖a1‖,

â2⊥ = a2 − proj_H1 a2,  where H1 = span{q1},
q2 = â2⊥ / ‖â2⊥‖,

â3⊥ = a3 − proj_H2 a3,  where H2 = span{q1, q2},
q3 = â3⊥ / ‖â3⊥‖,

and so on.

You can carry this out using your orthogonal decomposition function. For each column i = 1, . . . , n, call ortho_decomp with an appropriate set of inputs, and

 

¹ Technically, what we are computing here is known as the "thin" or "reduced" QR decomposition. In the full QR decomposition, Q is square and orthogonal, and R is an m × n upper triangular matrix whose lower (m − n) rows are all zero. In practice, it is also not computed with the Gram-Schmidt process, but with other methods that are more numerically stable.

 

 

 

 

 

use the resulting v̂⊥ to fill in the ith column of Q. In the end, the columns of Q should be an orthonormal set of vectors {q1, . . . , qn} which span Col A.

 

Once you can compute Q correctly, modify your function to compute R as well. In principle, since Q^T Q = I, you could obtain R simply via Q^T A = Q^T QR = R. However, this is more work than necessary, because you can actually fill in the entries of R as you compute each column of Q. When you compute the ith column of Q, the ortho_decomp call gives you both c and v̂⊥; can you use these to fill in the i nonzero entries in the ith column of R? Hint: Consider the case when A is already an upper triangular matrix, say A = [1 2 3; 0 4 5; 0 0 6], and work out by hand what c and v̂⊥ will be for each column.

 

Test cases:

1.  A = [0 3 0; 0 0 −4; −2 0 0]   →   Q = [0 1 0; 0 0 −1; −1 0 0], R = [2 0 0; 0 3 0; 0 0 4]

2.  A = [1 1 0; 1 0 0; 1 1 1; 1 0 1]   →   Q = [1/2 1/2 −1/2; 1/2 −1/2 −1/2; 1/2 1/2 1/2; 1/2 −1/2 1/2], R = [2 1 1; 0 1 0; 0 0 1]

3.  Generate a random 10 × 5 integer matrix A = randi([-2,2], 10,5) and compute your QR decomposition, [Q, R] = my_qr(A). Verify that istriu(R) is true, Q'*Q is nearly the identity matrix, and Q*R is nearly the same as A (both to within 10^−12).

 

 

 

 

 

4    Least-squares problems

 

 

Use the QR decomposition  to solve the least-squares problem  Ax  ≈ b.

 

 

Specification:

 

function x = least_squares(A, b)

Input: an m × n matrix  A and a vector  b ∈ Rm .

Output: a vector x ∈ R^n which is the least-squares solution of the overdetermined system Ax ≈ b, i.e. such that Ax − b ∈ (Col A)⊥.

 

 

Implementation notes:

 

Do not solve the normal equations A^T Ax = A^T b directly, as this is a very numerically unstable approach. Instead, compute the QR factorization of A and use it to solve the least-squares problem with only a matrix-vector multiplication and a back-substitution, as described in the textbook. Use the back_sub function provided with this assignment to perform the back-substitution.
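Putting the pieces together, the whole function can be as short as this sketch (assuming back_sub takes the upper triangular matrix first and the right-hand side second; check the provided file for its actual signature):

```matlab
function x = least_squares(A, b)
% A = QR with orthonormal Q, so Ax ~ b reduces to Rx = Q'*b,
% which back-substitution solves exactly.
[Q, R] = my_qr(A);        % or [Q, R] = qr(A, 0) if my_qr is not working
x = back_sub(R, Q' * b);
end
```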

 

If you cannot get your my_qr function to work, you may use Matlab’s built-in qr function here so you can still attempt Part  5. Call qr(A,0) to get a rectangular Q and square  R as desired.

 

Test cases:

1.  A = [1; 1; 0], b = [2; 3; 4]   →   x = 2.5

2.  A = [1 0 0; 1 1 1; 1 2 4; 1 3 9; 1 4 16], b = [0; 3; 4; 3; 0]   →   x = [0; 4; −1]

3.  Generate a random 10 × 5 integer matrix A = randi([-2,2], 10,5) and a random integer vector b = randi([-2,2], 10,1). Compute your least-squares solution, x = least_squares(A, b), and obtain the residual r = b - A*x. Verify that A'*r is nearly zero (to within 10^−12).

 

 

 

 

 

5    Best-fitting transformations

 

 

An affine transformation is a combination of a linear transformation and a translation (i.e. displacement) while retaining parallelism, i.e., parallel lines remain parallel after the transformation. For example, a point [x; y] on A (orange square) in Figure 1 can be transformed to a point [x'; y'] on B (red parallelogram) via a linear transform M = [m11 m12; m21 m22], i.e.,

[x'; y'] = [m11 m12; m21 m22] [x; y].

Note that the parallel lines stay parallel. This transform can be combined with a translation, moving the red parallelogram to the blue parallelogram (C). This composite transformation can be written as:

[x̃; ỹ] = [x'; y'] + [tx; ty] = [m11 m12; m21 m22] [x; y] + [tx; ty] = M [x; y] + t,             (1)

where t = [tx; ty] is the translation vector.

 

 

[figure omitted: the orange square A is mapped to the red parallelogram B by [x2; y2] = [m11 m12; m21 m22] [x1; y1], then translated by t to the blue parallelogram C]

Figure 1: Affine transform.

 

In sum, Equation (1) can be written as

[x̃; ỹ] = [m11 x + m12 y + tx; m21 x + m22 y + ty].                                       (2)

 

 

 

 

 

5.1     Linear Equation from a Single Correspondence

 



Your task is, given many correspondences² (xi, yi) ↔ (x̃i, ỹi), where i is the index for the correspondence, to compute the best affine transform parameters m11, m12, m21, m22, tx, ty (the unknowns). We can rewrite Equation (2) by arranging it with respect to the unknowns for, say, the first correspondence:

[x̃1; ỹ1] = [x1 y1 0 0 1 0; 0 0 x1 y1 0 1] [m11; m12; m21; m22; tx; ty],                           (3)

where the left-hand side is p̃1, the 2 × 6 coefficient matrix is A1, and the vector of unknowns is β; or simply,

p̃1 = A1 β.                                               (4)

 

 

Note that A1 depends on the original position of the point, p1 = [x1; y1]. The correspondence produces 2 linear equations while the number of unknowns is 6, so at least 3 correspondences are needed to uniquely determine the affine transform parameters.

 

Implement a function beta = my_pack(M, t) to obtain β from M = [m11 m12; m21 m22] and t = [tx; ty], and its inverse function [M, t] = my_unpack(beta).

 

function beta = my_pack(M, t)

Input: a 2 × 2 matrix  M  and a vector  t.

Output: a vector  β ∈ R6  containing  the entries  of M  and t.

 

function [M, t] = my_unpack(beta)

Input: a vector  β ∈ R6 .

Output: a 2 × 2 matrix  M  and a vector  t ∈ R2  such that my_pack(M, t) = β.
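These two functions are pure bookkeeping; here is a sketch, assuming β is ordered as in Equation (3) (each function goes in its own .m file):

```matlab
% (in my_pack.m)
function beta = my_pack(M, t)
% Stack the entries of M and t in the order used by Equation (3).
beta = [M(1,1); M(1,2); M(2,1); M(2,2); t(1); t(2)];
end

% (in my_unpack.m)
function [M, t] = my_unpack(beta)
% Inverse of my_pack: rebuild M row by row, then t.
M = [beta(1), beta(2); beta(3), beta(4)];
t = [beta(5); beta(6)];
end
```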

 

Finally,  construct the matrix  Ai given pi .

 

function Ai = design_matrix(pi)

Input: a vector pi = [xi; yi] ∈ R².

Output: the matrix Ai given by Equation (3).
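A direct transcription of Equation (3) into code might look like:

```matlab
function Ai = design_matrix(pi)
% Build the 2x6 coefficient matrix of Equation (3) for one point.
x = pi(1);
y = pi(2);
Ai = [x, y, 0, 0, 1, 0;
      0, 0, x, y, 0, 1];
end
```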

 



² A point (x, y) in A is transformed to a point (x̃, ỹ) in C. This pair of points forms a correspondence.

 

 

 

 

 

5.2     Solving Linear System from n Correspondences

 

Equation (3) can be extended to include n correspondences as follows:

[x̃1; ỹ1; . . . ; x̃n; ỹn] = [x1 y1 0 0 1 0; 0 0 x1 y1 0 1; . . . ; xn yn 0 0 1 0; 0 0 xn yn 0 1] [m11; m12; m21; m22; tx; ty],                         (5)

or again simply,

[p̃1; . . . ; p̃n] = [A1; . . . ; An] β.                                          (6)

If all correspondences are noise-free, Equation (6) will always be satisfied. Due to correspondence noise in practice, it cannot be satisfied exactly, and therefore

[p̃1; . . . ; p̃n] ≈ [A1; . . . ; An] β.                                          (7)

Now we will compute the best affine transform parameters using Equation (7).

 

function [M, t] = affine_fit(P, P_tilde)


 

Input: 2 × k correspondence matrices P = [p1 · · · pk] and P̃ = [p̃1 · · · p̃k] containing the reference points and transformed points as columns.

Output: a 2 × 2 matrix M and a vector t ∈ R² such that the affine transformation Mx + t best matches the given data. To compute this, you will have to set up and solve the least-squares problem in Equation (7) to obtain β, then unpack it to get M and t out.
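One way to assemble and solve the stacked system of Equation (7), sketched under the assumption that the other functions from this assignment are available:

```matlab
function [M, t] = affine_fit(P, P_tilde)
% Stack one 2x6 design matrix and one transformed point per
% correspondence, then solve the least-squares problem for beta.
k = size(P, 2);
A = zeros(2*k, 6);
b = zeros(2*k, 1);
for i = 1:k
    A(2*i-1:2*i, :) = design_matrix(P(:, i));
    b(2*i-1:2*i)    = P_tilde(:, i);
end
beta = least_squares(A, b);
[M, t] = my_unpack(beta);
end
```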

 

Test cases:

 

 

For the first two tests, pick an arbitrary M and t, say M = [1 2; 3 4] and t = [5; 6].
 

1.  Check that  [M_new, t_new] = my_unpack(my_pack(M, t)) recovers the original values of M  and t.

 

2.  Verify that when x = [0; 0], design_matrix(x)*my_pack(M,t) gives back t. Similarly, x = [1; 0] and x = [0; 1] should give you m1 + t and m2 + t respectively, where m1 and m2 are the columns of M.

 

 

 

 



 

3.  Applying affine_fit to P = [0 1 1 0; 0 0 1 1] and P̃ = [0.4 1 1.8 1.2; 0.8 0 0.6 1.4] should give M = [0.6 0.8; −0.8 0.6] and t = [0.4; 0.8]: a rotation and a translation.

 

 

 



4.  We have provided a function [P, P_tilde, M, t] = generate_data() that produces some random test data. It does so by filling random values in M and t, choosing random points pi, then setting each p̃i to M pi + t plus a small amount of random noise. You can visualize the data by calling the provided function visualize(P, P_tilde).

 

Call affine_fit(P, P_tilde) to obtain  your own estimates  of M  and t. The result  should be close to the “ground  truth” transformation returned by generate_data, although  due to the added  noise it will not be exactly identical.  Call visualize(P, P_tilde, M, t) to see what  your fit looks like.

 

Note: If you want to test on a smaller data set, just use the first few columns of P and P̃, for example P = P(:, 1:10) and P_tilde = P_tilde(:, 1:10).
