CSCI_5521 – HW3 (Solution)

Arnab Dey
Student ID: 5563169
Email: dey00011@umn.edu
Solution 1
Definitions: $z^t$ is the vector of indicator variables $z_i^t$, where $z_i^t = 1$ if $x^t$ belongs to cluster $\mathcal{G}_i$. The total (observed) dataset is denoted as $X = \{x^t\}_{t=1}^{N}$ and the unobserved random variable dataset is $Z = \{z^t\}_{t=1}^{N}$.

Complete log-likelihood: The complete log-likelihood function is given by:

$$\mathcal{L}_c(\phi \mid X, Z) = \sum_{t=1}^{N} \sum_{i=1}^{k} z_i^t \left[ \ln \pi_i + \ln p_i(x^t \mid \phi) \right] \tag{1}$$
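As an illustration, the following is a minimal Python sketch (not part of the original solution; the function name, array shapes, and use of scipy are assumptions) of evaluating Eq. (1) for Gaussian components:

```python
# Hypothetical sketch of Eq. (1): complete log-likelihood with Gaussian components.
import numpy as np
from scipy.stats import multivariate_normal

def complete_log_likelihood(X, Z, pis, mus, Sigmas):
    """X: (N, d) data, Z: (N, k) one-hot indicators z_i^t,
    pis: (k,) priors, mus: (k, d) means, Sigmas: (k, d, d) covariances."""
    total = 0.0
    for i in range(Z.shape[1]):
        # ln pi_i + ln p_i(x^t | phi) for every sample, weighted by z_i^t
        log_pdf = multivariate_normal.logpdf(X, mean=mus[i], cov=Sigmas[i])
        total += np.sum(Z[:, i] * (np.log(pis[i]) + log_pdf))
    return total
```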
E-step: Here, we try to find the expectation of the complete log-likelihood given the observed dataset and the parameters from the previous iteration, $\phi^l$. Thus,

$$E(\phi \mid \phi^l) = E\left[\mathcal{L}_c(\phi \mid X, Z) \,\middle|\, X, \phi^l\right] \tag{2}$$

Now,

$$E\left[z_i^t \,\middle|\, X, \phi^l\right] = P\left(z_i^t = 1 \,\middle|\, x^t, \phi^l\right) = \frac{\pi_i \, p_i(x^t \mid \phi^l)}{\sum_j \pi_j \, p_j(x^t \mid \phi^l)} \equiv \gamma(z_i^t) \tag{3}$$
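A minimal sketch (hypothetical helper, not the graded code; shapes as in the sketch above) of computing the responsibilities of Eq. (3):

```python
# Hypothetical E-step: gamma[t, i] = P(z_i^t = 1 | x^t, phi^l) as in Eq. (3).
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, pis, mus, Sigmas):
    N, k = X.shape[0], len(pis)
    gamma = np.zeros((N, k))
    for i in range(k):
        # numerator of Eq. (3): pi_i * p_i(x^t | phi^l)
        gamma[:, i] = pis[i] * multivariate_normal.pdf(X, mean=mus[i], cov=Sigmas[i])
    gamma /= gamma.sum(axis=1, keepdims=True)  # normalize over components j
    return gamma
```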
Therefore, from Eq. (2), we can now formulate the maximization step as follows.

M-step: We try to maximize $E(\phi \mid \phi^l)$:

$$\phi^{l+1} = \arg\max_{\phi} E(\phi \mid \phi^l)$$

Now, substituting Eq. (3) into Eq. (2),

$$E(\phi \mid \phi^l) = \sum_{t=1}^{N} \sum_{i=1}^{k} \gamma(z_i^t) \left[ \ln \pi_i + \ln p_i(x^t \mid \phi) \right] \tag{4}$$
Maximization of priors, $\pi_i$: This is a constrained optimization with the constraint $\sum_i \pi_i = 1$. Therefore, we use the Lagrangian method to solve for $\pi_i$ as follows:

$$\frac{\partial}{\partial \pi_i} \left[ \sum_{t} \sum_{j} \gamma(z_j^t) \ln \pi_j - \lambda \left( \sum_{j} \pi_j - 1 \right) \right] = 0 \;\Rightarrow\; \pi_i = \frac{\sum_t \gamma(z_i^t)}{\lambda}$$

Also, summing over $i$ and applying the constraint,

$$\lambda = \sum_{t} \sum_{i} \gamma(z_i^t) = N$$

Therefore, the estimate of $\pi_i$ is:

$$\pi_i^{l+1} = \frac{\sum_{t=1}^{N} \gamma(z_i^t)}{N}$$
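A one-line sketch of this prior update (hypothetical helper, shapes as in the sketches above):

```python
# Hypothetical prior update: pi_i^{l+1} = sum_t gamma(z_i^t) / N.
import numpy as np

def update_priors(gamma):
    """gamma: (N, k) responsibilities from the E-step."""
    return gamma.sum(axis=0) / gamma.shape[0]
```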
Maximization of the parameters of the components: Here, it is given that,

$$p_i(x^t \mid \phi) = \mathcal{N}(\mu_i, \Sigma_i) = \frac{1}{(2\pi)^{d/2} \lvert \Sigma_i \rvert^{1/2}} \exp\left[ -\frac{1}{2} (x^t - \mu_i)^T \Sigma_i^{-1} (x^t - \mu_i) \right] \tag{5}$$

Therefore, from Eq. (4),

$$E(\phi \mid \phi^l) = \sum_{t} \sum_{i} \gamma(z_i^t) \left[ \ln \pi_i - \frac{d}{2} \ln(2\pi) - \frac{1}{2} \ln \lvert \Sigma_i \rvert - \frac{1}{2} (x^t - m_i)^T \Sigma_i^{-1} (x^t - m_i) \right] \tag{6}$$

where $m_i$ is the MLE of $\mu_i$, which is found as follows. To maximize Eq. (4) w.r.t. $\mu_i$, we first binarize $\gamma(z_i^t)$ as $b_i^t = 1$ if $i = \arg\max_j \gamma(z_j^t)$, and $b_i^t = 0$ otherwise. Then the MLE of $\mu_i$, $m_i$, is given by,

$$m_i = \frac{1}{n_i} \sum_{t} b_i^t x^t,$$

where,

$$n_i = \sum_{t} b_i^t$$

Once we find the MLEs of the parameters, we prepare the next iteration:

$$\mu_i^{l+1} = m_i, \qquad \Sigma_i^{l+1} = S_i = \frac{1}{n_i} \sum_{t} b_i^t (x^t - m_i)(x^t - m_i)^T$$
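A hypothetical sketch of this M-step (function name and shapes assumed; empty clusters are not handled): binarize the responsibilities, then compute $m_i$ and $S_i$ as the next iterates.

```python
# Hypothetical M-step: hard assignments b_i^t, then m_i and S_i as above.
import numpy as np

def m_step(X, gamma):
    N, d = X.shape
    k = gamma.shape[1]
    labels = np.argmax(gamma, axis=1)        # b_i^t = 1 iff i = argmax_j gamma(z_j^t)
    pis = gamma.sum(axis=0) / N              # pi_i^{l+1}
    mus = np.zeros((k, d))
    Sigmas = np.zeros((k, d, d))
    for i in range(k):
        Xi = X[labels == i]                  # points hard-assigned to cluster i
        mus[i] = Xi.mean(axis=0)             # m_i = (1/n_i) sum_t b_i^t x^t
        diff = Xi - mus[i]
        Sigmas[i] = diff.T @ diff / len(Xi)  # S_i = (1/n_i) sum_t b_i^t (x-m)(x-m)^T
    return pis, mus, Sigmas
```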

Solution 2.a

Figure 1: Q2.a: EM on ’stadium.jpg’ with k=4 and 200 iterations

Figure 2: Q2.a: EM on ’stadium.jpg’ with k=8 and 200 iterations

Figure 3: Q2.a: EM on ’stadium.jpg’ with k=12 and 200 iterations
Solution 2.b
It can be seen from Fig. 4 that, for each value of k, the expected value of the complete log-likelihood increases and, after a certain number of iterations, stops increasing, which indicates convergence. Moreover, as the value of k increases, the expected value of the complete log-likelihood also increases, meaning that with a higher value of k the compressed image becomes closer to the original. Although a larger k reduces the compression ratio, it gives a higher-quality picture.
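A hypothetical driver loop (reusing the e_step, m_step, and complete_log_likelihood sketches above; the stopping rule and tolerance are assumptions) that records the objective per iteration, matching the plateau behavior seen in Fig. 4:

```python
# Hypothetical EM driver: iterate E/M steps, track the objective, stop on plateau.
import numpy as np

def run_em(X, pis, mus, Sigmas, max_iter=200, tol=1e-6):
    history = []
    for _ in range(max_iter):
        gamma = e_step(X, pis, mus, Sigmas)             # responsibilities, Eq. (3)
        pis, mus, Sigmas = m_step(X, gamma)             # parameter updates
        Z = np.eye(len(pis))[np.argmax(gamma, axis=1)]  # one-hot indicators z^t
        history.append(complete_log_likelihood(X, Z, pis, mus, Sigmas))
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break                                       # objective has plateaued
    return pis, mus, Sigmas, history
```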
Solution 2.c
Trying to run the EM algorithm on ’goldy.jpg’ raises a runtime error while calculating $P(x^t \mid \phi)$ because the covariance matrix becomes singular. Therefore, as the inverse of the covariance matrix cannot be computed, we

Figure 4: Q2.b: Expected value of log-likelihood on ’stadium.jpg’ with different k
cannot calculate the Gaussian pdf value and cannot proceed further with the E-step. Fig. 5 shows the result after

Figure 5: Q2.c: K-means on ’goldy.jpg’
compression of ’goldy.jpg’ with the k-means algorithm. The behavior of k-means and the EM algorithm differs significantly: k-means does not use any parametric method, whereas EM takes a probabilistic approach and depends on finding the parameters of an assumed underlying mixture of probability density functions.
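For contrast, here is a minimal, numpy-only k-means sketch (hypothetical, not the original submission) of the fallback described above: it alternates hard assignments and centroid updates and needs no covariance matrices, so the singularity issue does not arise. Empty clusters are not handled in this sketch.

```python
# Hypothetical k-means fallback: no covariances, so no singularity problem.
import numpy as np

def kmeans(X, k, max_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random init
    for _ in range(max_iter):
        # assign each pixel to its nearest center (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break                                            # assignments stabilized
        centers = new_centers
    return labels, centers
```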
Solution 2.d
In the M-step, to find the estimate of $\Sigma_i$ with regularization, let us first write the terms of the expected complete log-likelihood function that depend on $\Sigma_i$, as the other terms will vanish when we take the derivative. We will also use the fact that

$$x^T A x = \mathrm{trace}\left[x^T A x\right] = \mathrm{trace}\left[x x^T A\right]$$

The expectation of the complete log-likelihood function, keeping the terms that depend on $\Sigma_i$ and including the regularization term $-\frac{\lambda}{2}\,\mathrm{trace}\left[\Sigma_i^{-1}\right]$, can be written as:

$$E(\phi \mid \phi^l) = \sum_{t} \sum_{i} \gamma(z_i^t) \left[ -\frac{d}{2} \ln(2\pi) + \frac{1}{2} \ln \left\lvert \Sigma_i^{-1} \right\rvert - \frac{1}{2}\, \mathrm{trace}\left[ (x^t - m_i)(x^t - m_i)^T \Sigma_i^{-1} \right] - \frac{\lambda}{2}\, \mathrm{trace}\left[ \Sigma_i^{-1} \right] \right] \tag{7}$$

Removing the $\ln(2\pi)$ term, as it is independent of $\Sigma_i$, the above can be written as:

$$E'(\phi \mid \phi^l) = \sum_{t} \sum_{i} \gamma(z_i^t) \left[ \frac{1}{2} \ln \left\lvert \Sigma_i^{-1} \right\rvert - \frac{1}{2}\, \mathrm{trace}\left[ (x^t - m_i)(x^t - m_i)^T \Sigma_i^{-1} \right] - \frac{\lambda}{2}\, \mathrm{trace}\left[ \Sigma_i^{-1} \right] \right] \tag{8}$$

To find the estimate of $\Sigma_i$, which we denote $S_i^{l+1}$, we can equivalently set the derivative of $E'(\phi \mid \phi^l)$ w.r.t. $\Sigma_i^{-1}$ to 0, i.e.,

$$\frac{\partial E'(\phi \mid \phi^l)}{\partial \Sigma_i^{-1}} = 0 \;\Rightarrow\; S_i^{l+1} = \frac{\sum_t \gamma(z_i^t)(x^t - m_i)(x^t - m_i)^T}{\sum_t \gamma(z_i^t)} + \lambda I \tag{9}$$
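A short sketch (hypothetical helper, assuming the same array shapes as the sketches above) of the regularized update in Eq. (9):

```python
# Hypothetical regularized covariance update, Eq. (9): soft estimate plus lambda*I.
import numpy as np

def update_covariances(X, gamma, mus, lam=1e-4):
    N, d = X.shape
    k = gamma.shape[1]
    Sigmas = np.zeros((k, d, d))
    for i in range(k):
        diff = X - mus[i]                    # (x^t - m_i), shape (N, d)
        weighted = gamma[:, i, None] * diff  # gamma(z_i^t) * (x^t - m_i)
        Sigmas[i] = weighted.T @ diff / gamma[:, i].sum()
        Sigmas[i] += lam * np.eye(d)         # lambda*I keeps the estimate nonsingular
    return Sigmas
```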
Solution 2.e

Figure 6: Q2.e: EM with regularization λ = 0.0001 on ’goldy.jpg’ and 200 iterations
When we add a regularization parameter $\lambda = 0.0001$, the MLE of the covariance matrix becomes nonsingular because the regularization term is added to the diagonal of the covariance matrix. Therefore, the regularized EM algorithm runs successfully on ’goldy.jpg’ and produces Fig. 6.
