## Description

1. Consider the vector space Rn.

(a) Check that kxk∞ = max1≤i≤n |xi| is indeed a norm on Rn.

(b) Prove that: for any x ∈ Rn,

kxk∞ = lim kxkp.

p→∞

(c) Prove the equivalence

kxk∞ ≤ kxk1 ≤ nkxk∞, ∀x ∈ Rn.

2. For any A ∈ Rm×n, we have defined

kAxk2 kAk2 = sup .

x∈Rn, x6=0 kxk2

(a) Prove thatkAk2 = max kAxk2

x∈Rn, kxk2=1

(b) Prove that k · k2 is a norm on Rm×n.

(c) Prove that kAxk2 ≤ kAk2kxk2 for any A ∈ Rm×n and x ∈ Rn.

(d) Prove that kABk2 ≤ kAk2kBk2 for all A ∈ Rm×n and B ∈ Rn×p.

3. Let a1,a2,…,am be m given real numbers. Prove that a median of a1,a2,…,am minimizes

|a1 − b| + |a2 − b| + … + |am − b|

over all b ∈ R.

4. Suppose that the vectors x1,…,xN in Rn are clustered using the K-means algorithm, with group representatives z1,…,zk.

(a) Suppose the original vectors xi are nonnegative, i.e., their entries are nonnegative. Explain why the representatives zj output by the K-means algorithm are also nonnegative.

(b) Suppose the original vectors xi represent proportions, i.e., their entries are nonnegative and sum to one. (This is the case when xi are word count histograms, for example.) Explain why the representatives zj output by the K-means algorithm are also represent proportions (i.e., their entries are nonnegative and sum to one).

(c) Suppose the original vectors xi are Boolean, i.e., their entries are either 0 or 1. Give an interpretation of (zj)i, the i-th entry of the j group representative.

1

5. (You don’t need to answer anything for this question.) An interactive demonstration of K-means algorithm can be found at http://alekseynp.com/viz/k-means.html, where the K-means algorithm is also called Lloyd’s algorithm. Generate data by “random clustered”, and choose the same number of clusters in “Data Generation” and “K-means”. You will see that the K-means algorithm converges to a correct clustering in most of the test examples. There do exist some test examples for which the K-means algorithm converges to a wrong clustering.

2

## Reviews

There are no reviews yet.