$29
Remember the honor code while submitting this (and every other) assignment. You may discuss broad ideas with other students or ask me for any di culties, but the code you implement and the answers you write must be your own. We will adopt a zero-tolerance policy against any violation.
Submission instructions: Follow the instructions for the submission format and the naming convention of your les from the submission guidelines le in the homework folder. However, please do not submit the face image databases in your zip le that you will upload on moodle. Please see assignment4 SVD FaceRecognition.rar. Upload the le on moodle before 11:55 pm on 6th November. We will not penalize submission of the les till 10 am on 7th November. No late assignments will be accepted after this time. Please preserve a copy of all your work until the end of the semester.
1. Given a matrix A of size m n, write a MATLAB routine called MySVD which takes this matrix as input and outputs the left and right singular vectors (i.e. column vectors of U and V under usual notation) and the singular values (i.e. diagonal entries of S) of A. You are not allowed to use the svd or svds functions of MATLAB directly. You should use only the eigenvalue decomposition routines eig or eigs for this task. Cross-check your answer by verifying that A = USV T based on your computation. [15 points]
2. Consider a set of N vectors X = fx1; x2; :::; xN g each in Rd, with average vector x. We have seen in class that
the direction e such that
iN=1 kxi x (e (xi
x))ek2 is minimized, is obtained by maximizing etCe, where
matrix of the vectors in
X
.
This vector e is the eigenvector of matrix C with the highest
C is the covariance
P
t
Cf is maximized, is the eigenvector of
eigenvalue. Prove that the direction f perpendicular to e for which f
C with the second highest eigenvalue. For simplicity, assume that all non-zero eigenvalues of C are distinct and that rank(C) > 2. [10 points]
3. Consider a matrix A of size m n; m n. De ne P = AT A and Q = AAT . (Note: all matrices, vectors and scalars involved in this question are real-valued).
(a) Prove that for any vector y with appropriate number of elements, we have ytP y 0. Similarly show that ztQz 0 for a vector z with appropriate number of elements. Why are the eigenvalues of P and Q non-negative?
(b) If u is an eigenvector of P with eigenvalue , show that Au is an eigenvector of Q with eigenvalue . If v is an eigenvector of Q with eigenvalue , show that AT v is an eigenvector of P with eigenvalue . What will be the number of elements in u and v?
(c)
If vi is an eigenvector of Q and we de ne ui ,
AT vi
. Then prove that there will exist some real,
kAT vik2
non-negative i such that Aui = ivi.
(d)
It can be shown that uiT uj = 0 for i 6= j and likewise viT vj = 0 for i 6= j
for correspondingly distinct
eigenvalues.1. Now, de ne U = [v1jv2jv3j:::jvm] and V = [u1ju2ju3j:::jum]. Now show that A = U V T
where is a diagonal matrix containing the non-negative values 1; 2; :::; m. With this, you have just
established the existence of the singular value decomposition of any matrix A. This is a key result in linear
algebra and it is widely used in image processing, computer vision, computer graphics, statistics, machine
learning, numerical analysis, natural language processing and data mining. [5 + 5 + 5 + 5 = 20 points]
1This follows because P and Q are symmetric matrices. Consider P u1
= 1u1 and P u2
= 2u2. Then u2T P u1 = 1u2T u1. But
u2T P u1 also equal to (P T u2)T u1 = (P u2)T u1 = ( 2u2)T u1 = 2u2T u1.
Hence 2uT u1 = 1uT u1
. Since 2 = 1, we must have
u2T u1 = 0.
2
2
6
1
4. In this part, you will implement a mini face recognition system. Download the ORL face database from the homework folder. It contains 40 sub-folders, one for each of the 40 subjects/persons. For each person, there are ten images in the appropriate folder named 1.pgm to 10.pgm. The images are of size 92 by 110 each. Each image is in the pgm format. You can view the images in this format, either through MATLAB or through image viewers like IrfanView on Windows, or xv/display/gimp on Unix. Though the face images are in di erent poses, expressions and facial accessories, they are all roughly aligned (the eyes are in roughly similar locations in all images). For the rst part of the assignment, you will work with the images of the rst 32 people. For each person, you will include the rst six images in the training set (that is the rst 6 images that appear in a directory listing as produced by the dir function of MATLAB) and the remaining four images in the testing set. You implement the recognition system by using the svd function of MATLAB on an appropriate data matrix. Record the recognition rate using squared di erence between the eigencoe cients while testing on all the images in the test set, for k 2 f1; 2; 3; 5; 10; 15; 20; 30; 50; 75; 100; 150; 170g. Plot the rates in your report in the form of a graph. Now modify the required few lines of the code but using the eig function of MATLAB (on the L matrix as de ned in class) instead of svd.
Repeat the same experiment (using just the svd routine) on the Yale Face database from the homework folder. This database contains about 64 images each of 38 individuals (labeled from 1 to 39, with number 14 miss-ing; some folders have slightly less than 64 images). Each image is in pgm format and has size 192 by 168. The images are taken under di erent lighting conditions but in the same pose. Take the rst 40 images of every person for training and test on the remaining 24 images (that is the rst 40 images that appear in a directory listing as produced by the dir function of MATLAB). Plot in your report the recognition rates for k 2 f1; 2; 3; 5; 10; 15; 20; 30; 50; 60; 65; 75; 100; 200; 300; 500; 1000g based on (a) the squared di erence between all the eigencoe cients and (b) the squared di erence between all except the three eigencoe cients corresponding to the eigenvectors with the three largest eigenvalues. [30 points]
5. Display in your report the reconstruction of any one face image from the ORL database using
k 2 f2; 10; 20; 50; 75; 100; 125; 150; 175g values. Plot the 25 eigenvectors (eigenfaces) corresponding to the 25 largest eigenvalues using the subplot or subimage commands in MATLAB. [10 points]
6. What will happen if you test your system on images of people which were not part of the training set? (i.e. the last 8 people from the ORL database). What mechanism will you use to report the fact that there is no matching identity? Work this out carefully and explain brie y in your report. Write code to test whatever you propose on all the 32 remaining images (i.e. 8 people times 4 images per person), as also the entire test set containing 6 images each of the rst 32 people. How many false positives/negatives did you get? [15 points]
2