COMS W4721: Machine Learning for Data Science (Solution)

Please read these instructions to ensure you receive full credit on your homework. Submit the written portion of your homework as a single PDF file through Courseworks (less than 5MB). In addition to your PDF write-up, submit all code written by you with its original file extension through Courseworks (e.g., .m, .r, .py, .c). Any coding language is acceptable, but do not submit notebooks, do not wrap your files in .rar, .zip, or .tar archives, and do not submit your write-up in .doc or any other file type. Your grade will be based on the contents of one PDF file and the original source code; additional files will be ignored. We will not run your code, so everything you are asked to show must appear in the PDF file. Show all work for full credit.
Problem 1 (Markov chains) – 30 points
Each game in the data is recorded as four values: Team A index, Team A points, Team B index, Team B points.
If Team A has more points than Team B, then Team A wins, and vice versa. The index of a team refers to the row of “TeamNames.txt” where that team’s name can be found.
With i the index of Team A and j the index of Team B, update the unnormalized matrix M̂ after each game as follows:

M̂_ii ← M̂_ii + 1{Team A wins} + points_i / (points_i + points_j),
M̂_jj ← M̂_jj + 1{Team B wins} + points_j / (points_i + points_j),
M̂_ij ← M̂_ij + 1{Team B wins} + points_j / (points_i + points_j),
M̂_ji ← M̂_ji + 1{Team A wins} + points_i / (points_i + points_j).
After processing all games, let M be the matrix formed by normalizing the rows of M̂ so they sum to one. Let wt be the 1×760 state vector at step t. Set w0 to the uniform distribution. Therefore, wt is the marginal distribution on each state after t steps given that the starting state is chosen uniformly at random.
a) Use wt to rank the teams by sorting in decreasing value according to this vector. List the top 25 teams and their corresponding values in wt for t = 10, 100, 1000, 10000.
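As a minimal sketch (not the required submission), the update rule and ranking above could be implemented as follows. The function name `rank_teams`, the `team_games` tuple format, and the 0-based indexing are my own assumptions; the homework's data file has its own 1-based format.

```python
import numpy as np

def rank_teams(team_games, n_teams, t_steps):
    # team_games: list of (i, pts_i, j, pts_j), 0-based team indices (assumed format).
    # Assumes every team appears in at least one game, so no row of M_hat is all zero.
    M_hat = np.zeros((n_teams, n_teams))
    for i, pi, j, pj in team_games:
        a_wins = float(pi > pj)      # indicator 1{Team A wins}
        b_wins = float(pj > pi)      # indicator 1{Team B wins}
        frac_i = pi / (pi + pj)      # points_i / (points_i + points_j)
        frac_j = pj / (pi + pj)
        M_hat[i, i] += a_wins + frac_i
        M_hat[j, j] += b_wins + frac_j
        M_hat[i, j] += b_wins + frac_j
        M_hat[j, i] += a_wins + frac_i
    M = M_hat / M_hat.sum(axis=1, keepdims=True)   # row-normalize to get M
    w = np.full(n_teams, 1.0 / n_teams)            # w0 uniform
    for _ in range(t_steps):
        w = w @ M                                   # w_t = w_{t-1} M
    return np.argsort(-w), w                        # indices in decreasing order of w_t
```

Sorting `-w` gives the decreasing order required for the top-25 listing.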
b) We saw that w∞ is related to the first eigenvector of Mᵀ. That is, we can find w∞ by getting the first eigenvector u1 and eigenvalue λ1 of Mᵀ and post-processing:

w∞ = u1ᵀ / (Σ_j u1(j)).

This is because, by convention, eigenvectors are returned with unit Euclidean norm rather than with entries summing to one, so u1 must be rescaled. Also, we observe that λ1 = 1 for this specific matrix. Plot ‖wt − w∞‖₁ as a function of t for t = 1, …, 10000.
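A sketch of this post-processing, assuming `M` is the row-stochastic matrix from part (a); the function names are mine:

```python
import numpy as np

def stationary_distribution(M):
    # Eigendecomposition of M^T; the eigenvector for the largest eigenvalue
    # (lambda_1 = 1 for this matrix) gives w_inf after rescaling.
    vals, vecs = np.linalg.eig(M.T)
    u1 = vecs[:, np.argmax(vals.real)].real
    # Rescale so entries sum to one (eigenvectors come back unit-norm,
    # possibly with flipped sign; dividing by the sum fixes both).
    return u1 / u1.sum()

def l1_curve(M, w0, t_max):
    # ||w_t - w_inf||_1 for t = 1, ..., t_max.
    w_inf = stationary_distribution(M)
    w = w0.copy()
    dists = []
    for _ in range(t_max):
        w = w @ M
        dists.append(np.abs(w - w_inf).sum())
    return dists
```

The returned list can be plotted directly against t for the required figure.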
Problem 2 (Nonnegative matrix factorization) – 30 points
In this problem you will factorize an N × M matrix X into a rank-K approximation WH, where W is N × K, H is K × M and all values in the matrices are nonnegative. Each value in W and H can be initialized randomly to a positive number, e.g., from a Uniform(1,2) distribution.
The data to be used for this problem consists of 8447 documents from The New York Times. (See below for how to process the data.) The vocabulary size is 3012 words. You will need to use this data to construct the matrix X, where Xij is the number of times word i appears in document j. Therefore, X is 3012×8447 and most values in X will equal zero.
a) Implement and run the NMF algorithm on this data using the divergence penalty. Set the rank to 25 and run for 100 iterations. This corresponds to learning 25 topics. Plot the objective as a function of iteration.
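A minimal sketch of NMF with the divergence penalty using the standard multiplicative updates; the function name, `eps` guard, and random seed handling are my own choices, while the Uniform(1,2) initialization follows the problem statement:

```python
import numpy as np

def nmf_divergence(X, K, n_iter, seed=0):
    # X: nonnegative N x M count matrix; returns rank-K factors W (N x K), H (K x M).
    rng = np.random.default_rng(seed)
    N, M = X.shape
    W = rng.uniform(1, 2, size=(N, K))   # positive random init per the assignment
    H = rng.uniform(1, 2, size=(K, M))
    eps = 1e-16                           # guard against division by zero
    objective = []
    for _ in range(n_iter):
        # H update: H <- H * (W^T (X / WH)) / (W^T 1)
        WH = W @ H + eps
        H *= (W.T @ (X / WH)) / (W.T.sum(axis=1, keepdims=True) + eps)
        # W update: W <- W * ((X / WH) H^T) / (1 H^T)
        WH = W @ H + eps
        W *= ((X / WH) @ H.T) / (H.sum(axis=1, keepdims=True).T + eps)
        # Divergence objective (up to X-only constants): sum_ij -X_ij log(WH)_ij + (WH)_ij
        WH = W @ H + eps
        objective.append(np.sum(-X * np.log(WH) + WH))
    return W, H, objective
```

The multiplicative updates keep W and H nonnegative and monotonically decrease the divergence objective, which is what the per-iteration plot should show.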
b) After running the algorithm, normalize the columns of W so they sum to one. For each column of W, list the 10 words having the largest weight and show the weight. The ith row of W corresponds to the ith word in the “dictionary” provided with the data. Organize these lists in a 5×5 table.
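The post-processing in part (b) could be sketched as follows, assuming `W` comes from the factorization above and `vocab` is a list mapping row index to word; both names are hypothetical:

```python
import numpy as np

def top_words(W, vocab, n_top=10):
    # Normalize columns of W to sum to one, then list the n_top
    # heaviest words (and their weights) for each of the K topics.
    Wn = W / W.sum(axis=0, keepdims=True)
    topics = []
    for k in range(Wn.shape[1]):
        order = np.argsort(-Wn[:, k])[:n_top]
        topics.append([(vocab[i], Wn[i, k]) for i in order])
    return topics
```

Each inner list gives one cell of the required 5×5 table of 25 topics.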
Comments about Problem 2:
• Each row in nytdata.txt corresponds to a single document. It gives the index of words appearing in that document and the number of times they appear. It uses the format “idx:cnt” with commas separating each unique word in the document. Any index that doesn’t appear in a row has a count of zero for that word. The vocabulary word corresponding to each index is given in the corresponding row of nytvocab.dat.
