Starting from:
$30

$24

Machine Learning Homework 5 Solution




Problem 1 (Markov chains) – 30 points




In this problem, you will rank 763 college football teams based on the scores of every game in the 2017 season. The data provided in CFB2017 scores.csv contains the result of one game on each line in the format







Team A index, Team A points, Team B index, Team B points.




If Team A has more points than Team B, then Team A wins, and vice versa. The index of a team refers to the row of “TeamNames.txt” where that team’s name can be found.




Construct a 763 763 random walk matrix M on the college football teams. First construct the unnor-malized matrix Mc with values initialized to zeros. For one particular game, let i be the index of Team A and j the index of Team B. Then update

                       Mii
Mii
+ 1
Team A wins






+


pointsi
;






pointsi+pointsj


c
c


1f






g




pointsj


Mjj
Mjj +








Team B wins


+




;










pointsi+pointsj
c
c
+
1
f






g




pointsj
;
Mij
Mij


fTeam B winsg
+ pointsi+pointsj
c
c




























M
M
+ 1
f
Team A wins
g
+


pointsi


:
pointsi+pointsj
cji
cji














 





After processing all games, let M be the matrix formed by normalizing the rows of Mc so they sum to one. Let wt be the 1 763 state vector at step t. Set w0 to the uniform distribution. Therefore, wt is the marginal distribution on each state after t steps given that the starting state is chosen uniformly at random.




Use wt to rank the teams by sorting in decreasing value according to this vector. List the top 25 teams and their corresponding values in wt for t = 10; 100; 1000; 10000.



We saw that w1 is related to the first eigenvector of MT . That is, we can find w1 by getting the first eigenvector and eigenvalue of MT and post-processing:



h




i



MT u1 = 1u1; w1







= uT =1




P
This is because uT u1 = 1 by convention. Also, we observe that 1 = 1 for this specific matrix. 1










Problem 2 (Nonnegative matrix factorization) – 30 points




In this problem you will factorize an N M matrix X into a rank-K approximation W H, where W is N K, H is K M and all values in the matrices are nonnegative. Each value in W and H can be initialized randomly to a positive number, e.g., from a Uniform(1,2) distribution.




The data to be used for this problem consists of 8447 documents from The New York Times. (See below for how to process the data.) The vocabulary size is 3012 words. You will need to use this data to construct the matrix X, where Xij is the number of times word i appears in document j. Therefore, X is 3012 8447 and most values in X will equal zero.




Implement and run the NMF algorithm on this data using the divergence penalty. Set the rank to 25 and run for 100 iterations. This corresponds to learning 25 topics. Plot the objective as a function of iteration.



After running the algorithm, normalize the columns of W so they sum to one. For each column of W , list the 10 words having the largest weight and show the weight. The ith row of W corresponds to the ith word in the “dictionary” provided with the data. Organize these lists in a 5 5 table.



Comments about Problem 2:




When dividing, you may get 0/0 = NaN. This will cause the algorithm to return all NaN values. You can add a very small number (e.g., 10 16) to the denominator to avoid this.




Each row in nyt data.txt corresponds to a single document. It gives the index of words appearing in that document and the number of times they appear. It uses the format “idx:cnt” with commas separating each unique word in the document. Any index that doesn’t appear in a row has a count of zero for that word. The vocabulary word corresponding to each index is given in the corresponding row of nyt vocab.dat.


More products