
Lab Session 5 Solution

MA-423 : Matrix Computations Lab













The purpose of this example is to illustrate that Householder QR factorization (or, for that matter, QR factorization obtained using Givens rotations) is backward stable, that is, the computed QR factors are the exact QR factors of a slightly perturbed matrix.



R = triu(randn(50)); % compute a 50-by-50 random upper triangular matrix.
[Q, X] = qr(randn(50)); % compute a 50-by-50 random unitary matrix.




A = Q*R; % A is a matrix with known QR factors.




[S, T] = qr(A); % Compute Householder QR factorization of A.







How accurate are S and T? Show that S and T are very far from the known QR factors Q and R of A. This illustrates that the Householder QR factorization algorithm is not what is called forward stable, that is, the computed factors need not be close to the exact factors. Compute the errors




[norm( Q-S ), norm( R-T )] % what do you observe?




Now test the backward stability of the Householder QR factorization. Setting E = S*T − A, we have A + E = S*T. Hence if ∥E∥2/∥A∥2 = O(u), where u is the unit roundoff, then clearly the algorithm qr is backward stable. Compute



norm(A-S*T) % what is your conclusion?
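Assuming a Python/NumPy rendition of the experiment is acceptable (np.linalg.qr also computes a Householder QR via LAPACK), the whole forward/backward comparison can be sketched as follows; the seed is an illustrative choice, not part of the lab:

```python
import numpy as np

rng = np.random.default_rng(0)
R = np.triu(rng.standard_normal((50, 50)))          # random upper triangular factor
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))  # random orthogonal factor
A = Q @ R                                           # matrix with known QR factors

S, T = np.linalg.qr(A)                              # recompute the factors

forward = max(np.linalg.norm(Q - S), np.linalg.norm(R - T))
backward = np.linalg.norm(A - S @ T) / np.linalg.norm(A)

# The recomputed factors can differ grossly from Q and R (large forward
# error, partly because QR factors are unique only up to column signs),
# yet S*T reproduces A to roughly unit roundoff (tiny backward error).
print(forward, backward)
```

The same contrast should appear in the matlab session: norm(Q-S) and norm(R-T) are large while norm(A-S*T) is at roundoff level.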




Comment: The numerical rank of A ∈ Rn×m is obtained in matlab by typing rank(A). matlab uses the SVD of A to obtain this value. More specifically, it computes a tolerance tol = ϵ max{n, m} ∥A∥2, where ϵ stands for machine epsilon, and then sets rank(A) to be the number of singular values of A which are greater than this value of tol. It is possible for the user to change this tol value to something else. Type help rank for details. The purpose of the next exercise is to illustrate that the above method computes the rank of a matrix quite reliably in the presence of rounding.
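The tolerance rule just described can be implemented in a few lines. A minimal sketch in Python/NumPy (np.linalg.matrix_rank applies the same default rule, so the two should agree):

```python
import numpy as np

def num_rank(A):
    """Numerical rank via the SVD tolerance rule tol = eps * max{n, m} * ||A||_2."""
    s = np.linalg.svd(A, compute_uv=False)             # singular values, descending
    tol = np.finfo(A.dtype).eps * max(A.shape) * s[0]  # s[0] equals ||A||_2
    return int(np.sum(s > tol))

# A 3-by-3 matrix whose third column is the sum of the first two: exact rank 2.
B = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 9.0],
              [7.0, 8.0, 15.0]])
print(num_rank(B), np.linalg.matrix_rank(B))
```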




Type A = randn(7, 4) and examine its rank by typing rank(A). Since random matrices are usually of full rank, its rank will be 4 in all probability. Add two more columns to A by typing A(:, [5, 6]) = A(:, [2, 3]) + A(:, [3, 4]); This will create a fifth column in A which is the sum of the 2nd and 3rd columns and a sixth column which is the sum of the 3rd and 4th columns. Theoretically, the rank of A should remain unchanged. What happens to the numerical rank? Type rank(A) to find out.
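An assumed Python/NumPy translation of the same experiment (0-based column indices replace matlab's 1-based ones): append two linearly dependent columns and check that the numerical rank stays 4 despite rounding.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((7, 4))
r_before = np.linalg.matrix_rank(A)   # almost surely 4 for a random matrix

# Columns 5 and 6 are sums of existing columns (matlab's 2nd+3rd and 3rd+4th),
# so the exact rank does not change; only rounding-level singular values appear.
A = np.hstack([A, A[:, [1, 2]] + A[:, [2, 3]]])
r_after = np.linalg.matrix_rank(A)
print(A.shape, r_before, r_after)
```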



Comment: The purpose of the next exercise is to illustrate that in the presence of rounding, the SVD is generally more efficient in determining the rank of a matrix than the rank revealing QR factorization.




The Kahan matrix Rn(θ) is an n × n upper triangular matrix depending on a parameter θ. Let c = cos(θ) and s = sin(θ). Then

    Rn(θ) := diag(1, s, s², . . . , sⁿ⁻¹) ·

        ⎡ 1  −c  −c  . . .  −c ⎤
        ⎢     1  −c  . . .  −c ⎥
        ⎢         1  . . .  −c ⎥
        ⎢            . . .   ⋮  ⎥
        ⎣                    1 ⎦

(the second factor is upper triangular with ones on the main diagonal and −c in every entry above it).

If θ and n are chosen so that s is close to 1 and n is modestly large, then none of the main diagonal entries is extremely small. The matrix thus appears to be far from rank deficient, which is actually not the case. Consider Rn(θ) when n = 90 and θ = 1.2 radians. Verify for yourself that the largest main diagonal entry of Rn(θ) is 1 and the smallest is about .002.
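As a cross-check outside matlab, the following Python/NumPy sketch (an assumed translation, not part of the lab) builds Rn(θ) directly from the definition above and compares the diagonal entries with the smallest singular value:

```python
import numpy as np

def kahan(n, theta):
    """Build R_n(theta) = diag(1, s, ..., s^(n-1)) * K, with K unit upper
    triangular and -c in every entry above the diagonal."""
    c, s = np.cos(theta), np.sin(theta)
    K = np.eye(n) + np.triu(-c * np.ones((n, n)), k=1)
    return np.diag(s ** np.arange(n)) @ K

A = kahan(90, 1.2)
d = np.abs(np.diag(A))
sigma = np.linalg.svd(A, compute_uv=False)

# The diagonal suggests the matrix is comfortably nonsingular (smallest
# entry about 2e-3), yet the smallest singular value is many orders of
# magnitude smaller: the matrix is numerically rank deficient.
print(d.max(), d.min(), sigma[-1])
```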




(a) To generate Rn(θ) in matlab and find its singular values type




A = gallery('kahan', 90, 1.2, 0); sig = svd(A)



Type format short e and examine σ1, σ89 and σ90. Type rank(A) to get matlab’s opinion of the numerical rank of A.




(b) Type A = gallery('kahan', 90, 1.2, 25) to get a slightly perturbed version of the Kahan matrix. (This produces the Kahan matrix with very small perturbations added to the diagonal entries. Type help private/kahan for more details.) Repeat part (a) for the perturbed matrix. Perform a QR decomposition with column pivoting on A by typing [Q, R, E] = qr(A). Verify that no pivoting was done in this case by examining the value of dif = norm(eye(90) − E). Examine R(90, 90) and infer that the rank-revealing QR decomposition failed to detect the numerical rank deficiency of A.



The purpose of this exercise is to test two text mining methods. Consider the documents



Doc. 1: The Google matrix G is a model of the Internet.

Doc. 2: Gij is nonzero if there is a link from web page j to i.




Doc. 3: The Google matrix G is used to rank all web pages.




Doc. 4: The ranking is done by solving a matrix eigenvalue problem.




Doc. 5: England dropped out of the top 10 in the FIFA ranking.




Next, consider the dictionary { eigenvalue, England, FIFA, Google, Internet, link, matrix, page, rank, web } and determine the term-document matrix A. The list of terms, created in alphabetic order, is called the index. The index is created after (a) eliminating all stop words (common words whose occurrence in a document does not distinguish it from other documents) and (b) performing stemming. Stemming is the process of reducing each word that is conjugated or has a suffix to its stem. For example, for information retrieval, no information is lost in the following reduction:




    { computable, computation, computing, computed, computational } −→ comput

Public domain stemming algorithms are available on the Internet.




Suppose that we want to find all documents that are relevant to the query ranking of web pages. Let v be the query vector (the term vector of the query, formed with the same index). Then the information retrieval task can be formulated as a mathematical problem.




Problem: Find the columns of A that are close to the query vector v.




Use the cosine distance measure (with a tolerance) to return the relevant documents. Your task is to determine the relevant documents (a) by comparing cosine distances from v to the columns of A and (b) by the latent semantic indexing (LSI) method.
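A sketch of both methods in Python/NumPy. The term-document matrix below is one reasonable reading of the five documents after stop-word removal and stemming (e.g. "ranking"/"rank", "pages"/"page"); it is an assumption, as are the rank k = 2 and the tolerance 0.5, not the unique correct answer.

```python
import numpy as np

index = ["eigenvalue", "england", "fifa", "google", "internet",
         "link", "matrix", "page", "rank", "web"]
docs = [
    ["google", "matrix", "internet"],             # Doc 1
    ["link", "web", "page"],                      # Doc 2
    ["google", "matrix", "rank", "web", "page"],  # Doc 3
    ["rank", "matrix", "eigenvalue"],             # Doc 4
    ["rank", "england", "fifa"],                  # Doc 5
]
A = np.array([[1.0 if t in d else 0.0 for d in docs] for t in index])
v = np.array([1.0 if t in ("rank", "web", "page") else 0.0 for t in index])

def cosines(M, v):
    """Cosine of the angle between v and each column of M."""
    return (M.T @ v) / (np.linalg.norm(M, axis=0) * np.linalg.norm(v))

# (a) compare the query directly against the columns of A
plain = cosines(A, v)

# (b) LSI: compare against a rank-k approximation of A built from the SVD
U, S, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
Ak = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]
lsi = cosines(Ak, v)

tol = 0.5
print("plain:", np.where(plain > tol)[0] + 1)   # document numbers above tolerance
print("lsi:  ", np.where(lsi > tol)[0] + 1)
```

With this A, direct cosine comparison flags Docs 2 and 3; it is instructive to see which documents the rank-2 LSI projection adds or removes.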




*** End ***






















































































































































