CS247: Advanced Data Mining (Winter 2024) (Solution)


Assignment 4 (Part 2)
Instructions
• Submit your answer on Gradescope as a PDF file. Both typed and scanned handwritten answers are acceptable.
• Submit your solutions to Part 1 and Part 2 through GradeScope in BruinLearn separately.
Problems
In this part, we follow the notations below:
• Let $G = (V, E)$ be a simple (that is, no self- or multi-edges), undirected, connected graph with $n = |V|$ and $m = |E|$.
• $A$ is the adjacency matrix of the graph $G$, i.e., $A_{ij}$ is equal to 1 if $(i,j) \in E$ and equal to 0 otherwise.
• $D$ is the diagonal matrix of degrees: $D_{ii} = \sum_j A_{ij} = d_i$, where $d_i$ is the degree of node $i$.
• We define the graph Laplacian of $G$ by $L = D - A$.
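The notation above can be made concrete with a small sketch. The helper name `laplacian_from_edges` and the four-node path graph are illustrative assumptions, not part of the assignment.

```python
import numpy as np

def laplacian_from_edges(n, edges):
    """Return (A, D, L) for a simple undirected graph on nodes 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = 1.0
        A[j, i] = 1.0                 # undirected graph: A is symmetric
    D = np.diag(A.sum(axis=1))        # D_ii = sum_j A_ij = d_i
    L = D - A                         # graph Laplacian
    return A, D, L

# Hypothetical example graph: the path 0-1-2-3 (n = 4, m = 3).
edges = [(0, 1), (1, 2), (2, 3)]
A, D, L = laplacian_from_edges(4, edges)
print(np.diag(D))         # node degrees
print(L.sum(axis=1))      # each row of L sums to zero
```

Because every row of $L$ sums to zero, the all-ones vector lies in the null space of $L$.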
For a set of nodes $S \subset V$, we will measure the quality of $S$ as a cluster with a “cut” value and a “volume” value. We define the cut of the set $S$ to be the number of edges that have one end point in $S$ and one end point in the complement set $\bar{S} = V \setminus S$:
$$\mathrm{cut}(S) = \sum_{i \in S,\; j \in \bar{S}} A_{ij}. \tag{1}$$
Note that the cut is symmetric in the sense that $\mathrm{cut}(S) = \mathrm{cut}(\bar{S})$. The volume of $S$ is simply the sum of degrees of nodes in $S$:
$$\mathrm{vol}(S) = \sum_{i \in S} d_i. \tag{2}$$
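As a quick illustration of the cut and volume definitions, here is a minimal sketch. The function names `cut` and `vol` and the example graph (two triangles joined by a single edge) are assumptions made for this example.

```python
import numpy as np

def cut(A, S):
    """Number of edges with one end point in S and the other in its complement."""
    in_S = np.zeros(A.shape[0], dtype=bool)
    in_S[list(S)] = True
    return A[in_S][:, ~in_S].sum()

def vol(A, S):
    """Sum of the degrees of the nodes in S."""
    return A.sum(axis=1)[list(S)].sum()

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

S, S_bar = [0, 1, 2], [3, 4, 5]
print(cut(A, S), cut(A, S_bar))    # the cut is symmetric
print(vol(A, S) + vol(A, S_bar))   # volumes of S and its complement add up to 2m
```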
Problem 1: Normalized Cuts (15 points)
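Consider the problem of partitioning a graph $G$ into two subgraphs with similar sizes. Let $S \subset V$ and $\bar{S} = V \setminus S$ denote the disjoint node sets of the two clusters. The normalized cut between two clusters is defined as:
$$\mathrm{ncut}(S) = \frac{\mathrm{cut}(S)}{\mathrm{vol}(S)} + \frac{\mathrm{cut}(\bar{S})}{\mathrm{vol}(\bar{S})}. \tag{3}$$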
Assignment 4 (Part 2) CS247: Advanced Data Mining (Winter 2024) UCLA
Intuitively, a set S with a small normalized cut value must have few edges connecting to the rest of the graph (making the numerators small) as well as some balance in the size of the clusters (making the denominators large).
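The intuition can be checked numerically. The sketch below compares a balanced and an unbalanced partition of the same graph. The function name `ncut` and the example graph are assumptions for illustration.

```python
import numpy as np

def ncut(A, S):
    """Normalized cut: cut(S)/vol(S) + cut(S_bar)/vol(S_bar)."""
    in_S = np.zeros(A.shape[0], dtype=bool)
    in_S[list(S)] = True
    deg = A.sum(axis=1)
    c = A[in_S][:, ~in_S].sum()       # cut(S) equals cut(S_bar)
    return c / deg[in_S].sum() + c / deg[~in_S].sum()

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

balanced = ncut(A, [0, 1, 2])   # one crossing edge, equal volumes
lopsided = ncut(A, [0])         # two crossing edges, tiny vol(S)
print(balanced, lopsided)
```

In this example the balanced split scores lower than the single-node split, matching the description above.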
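Define the assignment vector $x$ for some set of nodes $S$ such that
$$x_i = \begin{cases} \sqrt{\dfrac{\mathrm{vol}(\bar{S})}{\mathrm{vol}(S)}}, & i \in S \\[1ex] -\sqrt{\dfrac{\mathrm{vol}(S)}{\mathrm{vol}(\bar{S})}}, & i \in \bar{S} \end{cases} \tag{4}$$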
Please prove the following properties.
1. $L = \sum_{(i,j) \in E} (\mathbf{1}_i - \mathbf{1}_j)(\mathbf{1}_i - \mathbf{1}_j)^T$, where $\mathbf{1}_k$ is an $n$-dimensional column vector with a 1 at position $k$ and 0's elsewhere. Note that we are not summing over the entire adjacency matrix and only count each edge once.
2. $x^T L x = \sum_{(i,j) \in E} (x_i - x_j)^2$.
3. $x^T L x = c \cdot \mathrm{ncut}(S)$ for some constant $c$. Hint: Rewrite the sum in terms of $S$ and $\bar{S}$.
4. $x^T D \mathbf{1} = 0$, where $\mathbf{1}$ is the vector of all ones.
5. $x^T D x = 2m$.
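Before writing the proofs, it can help to check the five properties numerically on a small instance. The sketch below is only a sanity check, not a proof. The example graph, the choice of S, and all variable names are assumptions. The assignment vector is taken to be the square root of vol(S̄)/vol(S) on S and the negative square root of vol(S)/vol(S̄) on S̄.

```python
import numpy as np

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n, m = 6, len(edges)

# Property 1: assemble L as a sum of rank-one terms, one per edge.
L = np.zeros((n, n))
for i, j in edges:
    e = np.zeros(n)
    e[i], e[j] = 1.0, -1.0               # e = 1_i - 1_j
    L += np.outer(e, e)

A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
deg = A.sum(axis=1)
D = np.diag(deg)

# An unbalanced choice S = {0, 1}, so the two values of x differ in magnitude.
in_S = np.array([True, True, False, False, False, False])
vol_S, vol_Sbar = deg[in_S].sum(), deg[~in_S].sum()
cut_S = A[in_S][:, ~in_S].sum()
ncut_S = cut_S / vol_S + cut_S / vol_Sbar

# Assignment vector as assumed in the lead-in.
x = np.where(in_S, np.sqrt(vol_Sbar / vol_S), -np.sqrt(vol_S / vol_Sbar))

xLx = x @ L @ x
edge_sum = sum((x[i] - x[j]) ** 2 for i, j in edges)

print(np.allclose(L, D - A))                # property 1
print(np.isclose(xLx, edge_sum))            # property 2
print(xLx / ncut_S)                         # property 3: the constant c for this S
print(np.isclose(x @ D @ np.ones(n), 0.0))  # property 4
print(np.isclose(x @ D @ x, 2 * m))         # property 5
```

Trying several different sets S and comparing the printed ratio is one way to guess the constant in property 3 before proving it.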
Answer:
Problem 2: Solution to Normalized Cut Minimization (15 points)
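Since $x^T D x$ is just a constant ($2m$), we can formulate the normalized cut minimization problem in the following way:
$$
\begin{aligned}
\underset{S \subset V,\; x \in \mathbb{R}^n}{\text{minimize}} \quad & \frac{x^T L x}{x^T D x} \\
\text{subject to} \quad & x^T D \mathbf{1} = 0, \\
& x^T D x = 2m, \\
& x \text{ is defined as in Equation (4).}
\end{aligned} \tag{5}
$$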
The constraint that x takes the form of Equation (4) makes the optimization problem NP-hard. We will instead use the “relax and round” technique, where we relax the problem to make the optimization problem tractable and then round the relaxed solution back to a feasible point for the original problem.
Our relaxation drops the constraint that x takes the form of Equation (4), which leads to the following relaxed problem:
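$$
\begin{aligned}
\underset{x \in \mathbb{R}^n}{\text{minimize}} \quad & \frac{x^T L x}{x^T D x} \\
\text{subject to} \quad & x^T D \mathbf{1} = 0, \\
& x^T D x = 2m.
\end{aligned} \tag{6}
$$
Please prove that the minimizer of Problem (6) is $D^{-1/2} v$, where $v$ is the eigenvector corresponding to the second smallest eigenvalue of the normalized graph Laplacian $\tilde{L} = D^{-1/2} L D^{-1/2}$.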
Finally, to round the solution back to a feasible point in the original problem, we can take the indices of all positive entries of the eigenvector to be the set $S$ and the indices of all negative entries to be $\bar{S}$.
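The relax-and-round procedure can be sketched with NumPy's symmetric eigensolver. The function name `spectral_bisection` and the example graph are assumptions, as are two choices the text leaves open: the relaxed solution is rescaled by the square root of 2m so that it satisfies the second constraint of the relaxed problem, and entries that are exactly zero are assigned to the complement.

```python
import numpy as np

def spectral_bisection(A):
    """Relax-and-round sketch: split nodes by the sign of D^{-1/2} v."""
    deg = A.sum(axis=1)
    L = np.diag(deg) - A
    d_inv_sqrt = 1.0 / np.sqrt(deg)             # connected graph, so every degree > 0
    # Normalized Laplacian D^{-1/2} L D^{-1/2}, built by row and column scaling.
    L_norm = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_norm)   # eigenvalues in ascending order
    v = eigvecs[:, 1]                           # second smallest eigenvalue
    x = np.sqrt(deg.sum()) * d_inv_sqrt * v     # scaled so that x^T D x = 2m
    S = np.flatnonzero(x > 0)
    S_bar = np.flatnonzero(x <= 0)              # assumption: zeros go to the complement
    return S, S_bar, x

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

S, S_bar, x = spectral_bisection(A)
print(sorted(S.tolist()), sorted(S_bar.tolist()))
```

The sign of an eigenvector is arbitrary, so which triangle ends up labeled as S can differ between runs or platforms. The partition itself is what matters.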
Hints:
• Make the substitution $z = D^{1/2} x$.
• Note that $\mathbf{1}$ is the eigenvector corresponding to the smallest eigenvalue of $L$.
Answer:
