Description
Sameer Singh http://sameersingh.org/courses/gml/fa18/
The submission for this homework, as for others so far, should be one stand-alone PDF file containing all of the relevant code, figures, and any text explaining your results. The code is not as important as your explanations of the trends and conclusions, so use the Markdown cells in Jupyter quite liberally. In this homework, we will be playing with a number of already implemented clustering and dimensionality reduction techniques.
Points: This homework adds up to a total of 100 points, as follows:
Problem 1: Clustering 45 points
Problem 2: EigenFaces 50 points
Statement of Collaboration 5 points
Be sure to re-download and replace the mltools package; it contains a number of new algorithms for you.
1 Clustering, 45 points
The code this week provides the three clustering algorithms we discussed: k-means, agglomerative clustering, and
EM for Gaussian mixture models; we will explore the first two here. (These functions are also provided in many
3rd party toolboxes; you are free to use those if you prefer.) In this problem, you’ll do some basic exploration of the clustering techniques.
1. Load the usual Iris data restricted to the first two features, and ignore the class / target variable. Plot the data and see for yourself how “clustered” you think it looks. Include the plot, and mention how many clusters you think exist (no wrong answer here). (5 points)
ml.plotClassify2D(None,X,z)
2. Run k-means on the data, for k = 2, k = 5, and k = 20. Try a few (at least 5 each) different initializations and check to see whether they find the same solution; if not, pick the one with the best score. For the chosen assignment for each k, include a plot with the data, colored by assignment, and the cluster centers. You can plot the points colored by assignments using, where z are the resulting cluster assignments of the data. You will have to additionally plot the centers yourself. (15 points)
ml.cluster.agglomerative from cluster.py
3. Run agglomerative clustering on the data, using single linkage and then again using complete linkage, each with 2, 5, and then 20 clusters (using). Again, plot with color the final assignment of the clusters. (This algorithm has no initialization issues; so you do not have to try multiple initializations.) (20 points)
4. Describe similarities and differences in the results from the agglomerative clustering and k-means. (5 points)
2 EigenFaces, 50 points
In class we discussed that PCA has been applied to faces, and showed some of the results. Here, you’ll explore this representation yourself. First, load the data and display a few faces to make sure you understand the data format:
X = np.genfromtxt(“data/faces.txt”, delimiter=None) # load face dataset plt.figure()
1
2
3
4
5
1. Subtract the mean of the face images (X0 = X −µ) to make your data zero-mean. (The mean should be of the same dimension as a face, 576 pixels.) Plot the mean face. (5 points)
scipy.linalg.svd
2. Useto take the SVD of the data, so that
X0 = U · diag(S)· Vh
W = U.dot( np.diag(S) )
Since the number of data is larger than the number of dimensions, there are at most 576 non-zero singular values; you can use full_matrices=False to avoid using a lot of memory. As in the slides, we suggest computingso that X0 ≈ W · Vh. Print the shapes of W and Vh. (10 points)
np.mean( (X0 Xˆ0)**2 )
3. For K = 1 . . . 10, compute the approximation to X0 given by the first K eigendirections, e.g., Xˆ0 = W[:, : K] · Vh[: K, :], and use them to compute the mean squared error in the SVD’s approximation,
− . Plot these MSE values as a function of K. (10 points)
2*np.median(np.abs(W[:,j]))
4. Display the first three principal directions of the data, by computing µ+α V[j,:] and µ-α V[j,:], where α is a scale factor (we suggest, for example,, to get a sense of the scale found in the data). These should be vectors of length 242 = 576, so you can reshape them and view them as “face images” just like the original data. They should be similar to the images in lecture. (10 points)
5. Choose any two faces and reconstruct them using the first K principal directions, for K = 5, 10, 50, 100. (5 points)
6. Methods like PCA are often called “latent space” methods, as the coefficients can be interpreted as a new geometric space in which the data are being described. To visualize this, choose a few faces (say 25), and display them as images with the coordinates given by their coefficients on the first two principal components:
idx = … # pick some data (randomly or otherwise); an array of integer indices
# compute where to place image (scaled W values) & size loc = (coord[i,0],coord[i,0]+0.5, coord[i,1],coord[i,1]+0.5)
img = np.reshape( X[i,:], (24,24) ) # reshape to square plt.imshow( img.T , cmap=”gray”, extent=loc ) # draw each image
plt.axis( (-2,2,-2,2) ) # set axis to a reasonable scale
1
2
3
4
5
6
7
8
9
10
11
This can often help you get a “feel” for what the latent representation is capturing. (10 points)
Statement of Collaboration (5 points)
It is mandatory to include a Statement of Collaboration in each submission, with respect to the guidelines below. Include the names of everyone involved in the discussions (especially in-person ones), and what was discussed.
Acknowledgements
This homework is adapted (with minor changes) from one made available by Alex Ihler’s machine learning course. Thanks, Alex!
Reviews
There are no reviews yet.