1. Consider the vector dataset $\mathcal{D}$ given in the link https://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones# with $|\mathcal{D}| = N$ such that each $v \in \mathcal{D}$ is embedded in a suitable $\mathbb{R}^D$ of minimum possible dimension $D$. Construct a suitable subspace $S \subseteq \mathbb{R}^D$ of dimension at most $O\left(\frac{\ln(\sqrt{N}/0.05)}{0.01}\right)$ such that at least 95% of the pairwise distances between the points in $\mathcal{D}$ and their corresponding projections to $S$ do not differ by more than a factor of 0.1. Now produce the best-fit of $\mathcal{D}$ along this $S$.
2. Construct the top-$k$ SVD subspace $V_k$ for $\mathcal{D}$ such that the ratio of the fit of $\mathcal{D}$ along $V_k$ to the fit of $\mathcal{D}$ along $V$ (the full SVD subspace) does not fall below 0.1. Having obtained this $V_k$, compare this fit with the fit obtained in Part 1 above. Discuss the results.
3. Generate a dataset $\mathcal{D}'$ which has the same dimensions as the original dataset $\mathcal{D}$ such that each $v \in \mathcal{D}'$ is distributed $\mathcal{N}(0, \Sigma)$. Choose $\Sigma$ such that it is non-zero in all its elements. Now find the probability of the following events:
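$$P\left(\frac{\sigma_{\max}(\mathcal{D}')}{\sqrt{|\mathcal{D}|}} \ge 1.05\,\lambda_{\max}(\sqrt{\Sigma}) + \sqrt{\frac{\operatorname{tr}(\Sigma)}{n}}\right)$$
$$P\left(\frac{\sigma_{\min}(\mathcal{D}')}{\sqrt{|\mathcal{D}|}} \le 0.95\,\lambda_{\min}(\sqrt{\Sigma}) - \sqrt{\frac{\operatorname{tr}(\Sigma)}{n}}\right)$$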
by repeated generation of such a dataset under your same chosen $\Sigma$.
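
For Part 1, one common way to obtain such a subspace is a Gaussian random projection in the Johnson-Lindenstrauss style. The sketch below is illustrative only and rests on several assumptions: it uses synthetic data in place of the dataset, reads the dimension bound as ln(sqrt(N)/0.05)/0.01 with the O(.) constant set to 1, and takes the "fit along S" to mean the sum of squared lengths of the orthogonal projections onto S. The function names are hypothetical.

```python
import numpy as np

def jl_dimension(n_points, delta=0.05, eps_sq=0.01):
    # Assumed reading of the bound: ln(sqrt(N)/delta) / eps^2, O(.) constant taken as 1.
    return int(np.ceil(np.log(np.sqrt(n_points) / delta) / eps_sq))

def pairwise_dists(X):
    # Euclidean distances for all unordered pairs of rows, via the Gram matrix.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    iu = np.triu_indices(len(X), k=1)
    return np.sqrt(np.maximum(d2[iu], 0.0))

def random_subspace_report(X, k, tol=0.1, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Gaussian map scaled by 1/sqrt(k) so squared lengths are preserved in expectation.
    R = rng.normal(size=(d, k)) / np.sqrt(k)
    ratio = pairwise_dists(X @ R) / pairwise_dists(X)
    frac_ok = float(np.mean(np.abs(ratio - 1.0) <= tol))
    # Orthonormal basis of S = span(columns of R); fit = sum of squared projection lengths.
    Q, _ = np.linalg.qr(R)
    fit = float(np.sum((X @ Q) ** 2))
    return frac_ok, fit

# Synthetic stand-in data (assumption): 200 points in 1000 dimensions.
X = np.random.default_rng(1).normal(size=(200, 1000))
k = min(jl_dimension(len(X)), X.shape[1])
frac_ok, fit_S = random_subspace_report(X, k)
print(k, frac_ok, fit_S / np.sum(X ** 2))
```

The distance check uses the rescaled map rather than the plain orthogonal projection, since an orthogonal projection onto a lower-dimensional subspace can only shorten distances. If the computed bound exceeds the ambient dimension, the sketch caps it at D.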
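
For Part 2, the smallest qualifying k can be read off the cumulative squared singular values. The following is a minimal sketch, assuming the "fit" of the data along a subspace is the sum of squared projection lengths (so the fit along the top-k SVD subspace is the sum of the k largest squared singular values) and that the data is not mean-centered. The synthetic data and the function name are hypothetical stand-ins.

```python
import numpy as np

def top_k_svd_subspace(X, min_ratio=0.1):
    # Thin SVD; singular values come back in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    energy = np.cumsum(s ** 2)
    ratios = energy / energy[-1]
    # Smallest k whose cumulative ratio reaches min_ratio (capped at the rank bound).
    k = min(int(np.searchsorted(ratios, min_ratio)) + 1, len(s))
    Vk = Vt[:k].T                      # columns form an orthonormal basis of V_k
    fit_k = float(np.sum((X @ Vk) ** 2))
    fit_full = float(energy[-1])
    return k, Vk, fit_k, fit_full

# Synthetic stand-in data (assumption): low-rank signal plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5)) @ rng.normal(size=(5, 50)) + 0.1 * rng.normal(size=(300, 50))
k, Vk, fit_k, fit_full = top_k_svd_subspace(X, min_ratio=0.1)
print(k, fit_k / fit_full)
```

For the comparison with Part 1, the same squared-projection measure can be evaluated on the subspace S and divided by `fit_full`, which puts both fits on a common scale.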
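
For Part 3, the probabilities can be estimated by Monte Carlo simulation. The sketch below makes assumptions that should be checked against the intended problem statement: the first event is read as the largest singular value of the data matrix, divided by sqrt(n), being at least 1.05 times the largest eigenvalue of sqrt(Sigma) plus sqrt(tr(Sigma)/n); the second as the smallest singular value, divided by sqrt(n), being at most 0.95 times the smallest eigenvalue of sqrt(Sigma) minus sqrt(tr(Sigma)/n); and n is taken to be the number of rows. The sizes, the example covariance, and the function name are hypothetical.

```python
import numpy as np

def estimate_event_probs(n, d, Sigma, trials=200, delta=0.05, seed=0):
    rng = np.random.default_rng(seed)
    evals, evecs = np.linalg.eigh(Sigma)            # ascending eigenvalues of Sigma
    sqrt_evals = np.sqrt(np.clip(evals, 0.0, None)) # eigenvalues of sqrt(Sigma)
    slack = np.sqrt(np.trace(Sigma) / n)
    upper = (1 + delta) * sqrt_evals[-1] + slack    # assumed threshold for event 1
    lower = (1 - delta) * sqrt_evals[0] - slack     # assumed threshold for event 2
    A = evecs * sqrt_evals                          # A @ A.T equals Sigma
    hits_max = 0
    hits_min = 0
    for _ in range(trials):
        X = rng.standard_normal((n, d)) @ A.T       # each row ~ N(0, Sigma)
        s = np.linalg.svd(X, compute_uv=False)      # descending singular values
        hits_max += int(s[0] / np.sqrt(n) >= upper)
        hits_min += int(s[-1] / np.sqrt(n) <= lower)
    return hits_max / trials, hits_min / trials

# Example covariance with no zero entries (assumption): equicorrelation matrix.
d = 20
Sigma = 0.5 * np.eye(d) + 0.5 * np.ones((d, d))
p_max, p_min = estimate_event_probs(n=400, d=d, Sigma=Sigma, trials=50)
print(p_max, p_min)
```

The sketch assumes n is at least d, so that the last singular value returned is the d-th one. Increasing `trials` tightens the estimates, and small probabilities may need many repetitions before any event is observed.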