Assignment 2: CS 763, Computer Vision (Solution)

1. We have defined the concept of the Shannon entropy in class. Given a discrete random variable $X$ having probability mass function $P(X) = (p_1, p_2, \ldots, p_N)$, prove that $H(X) \geq 0$, where $H(X)$ is the Shannon entropy of $X$. Recall that $H(X) = -\sum_{i=1}^{N} p_i \log p_i$, that $\sum_{i=1}^{N} p_i = 1$, and that $\forall i,\ 0 \leq p_i \leq 1$. Also prove that a uniform distribution ($p_i = 1/N$ for all $i$) maximizes the Shannon entropy. To this end, find a stationary point of $J(X) = H(X) - \lambda\left(\sum_{i=1}^{N} p_i - 1\right)$, where $\lambda$ is a Lagrange multiplier that imposes the hard constraint that the probabilities sum to 1. [2+4 = 6 points]
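As a quick numerical sanity check of both claims (not a proof), the following Python sketch evaluates the entropy of a few distributions. The function names `shannon_entropy` and `random_pmf` are illustrative and not part of the assignment; the natural logarithm is assumed, and $0 \log 0$ is taken to be 0.

```python
import math
import random


def shannon_entropy(p):
    """Shannon entropy (natural log) of a probability vector p.

    Terms with p_i == 0 are skipped, which matches the convention 0*log(0) = 0.
    """
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def random_pmf(n, rng):
    """Draw a random probability vector of length n by normalizing positive weights."""
    weights = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    rng = random.Random(0)
    n = 5
    print("uniform entropy:", shannon_entropy([1.0 / n] * n), "log N:", math.log(n))
    for _ in range(3):
        print("random pmf entropy:", shannon_entropy(random_pmf(n, rng)))
```

In such a check, every sampled distribution should give a value between 0 and $\log N$, with the uniform distribution attaining $\log N$.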
2. This is a straightforward exercise to make sure you understand the basic update equations in the Horn-Schunck algorithm for optical flow. As seen in class, we seek to minimize the quantity $J(\{(u_{i,j}, v_{i,j})\})$ w.r.t. the optical flow vectors $(u_{i,j}, v_{i,j})$ at all pixels $(i,j)$, where
$$J(\{(u_{i,j}, v_{i,j})\}) = \sum_{i=1}^{N}\sum_{j=1}^{N} \Big[ (I_{x;i,j}u_{i,j} + I_{y;i,j}v_{i,j} + I_{t;i,j})^2 + \lambda\big((u_{i,j+1}-u_{i,j})^2 + (u_{i+1,j}-u_{i,j})^2 + (v_{i,j+1}-v_{i,j})^2 + (v_{i+1,j}-v_{i,j})^2\big) \Big].$$
Setting the partial derivatives w.r.t. $u_{k,l}$ and $v_{k,l}$ to 0, prove that
(1)
(2)
where $\bar{u}_{k,l}$ and $\bar{v}_{k,l}$ are as defined in the lecture slides. Also verify the Jacobi update equations given by the following:
(3)
(4)
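A small numerical check can help when verifying update equations of this kind. The sketch below rests on assumptions that are not taken from the lecture slides: $\bar{u}_{k,l}$ and $\bar{v}_{k,l}$ are taken to be the means of the four nearest neighbours, so that for an interior pixel the stationarity conditions of the energy above read $I_x(I_x u + I_y v + I_t) + \alpha(u - \bar{u}) = 0$ and $I_y(I_x u + I_y v + I_t) + \alpha(v - \bar{v}) = 0$ with $\alpha = 4\lambda$. The slides may scale $\lambda$ differently, and the function names are hypothetical.

```python
def hs_pixel_update(Ix, Iy, It, u_bar, v_bar, alpha):
    """One Jacobi-style update at a single pixel, holding neighbour averages fixed.

    Solves the 2x2 linear system given by the two stationarity conditions
    in closed form (alpha = 4 * lambda under the assumptions stated above).
    """
    denom = alpha + Ix * Ix + Iy * Iy
    common = (Ix * u_bar + Iy * v_bar + It) / denom
    return u_bar - Ix * common, v_bar - Iy * common


def stationarity_residuals(Ix, Iy, It, u, v, u_bar, v_bar, alpha):
    """Left-hand sides of the two stationarity conditions; both should be ~0."""
    r = Ix * u + Iy * v + It
    return Ix * r + alpha * (u - u_bar), Iy * r + alpha * (v - v_bar)


if __name__ == "__main__":
    lam = 0.5
    args = dict(Ix=0.7, Iy=-1.3, It=0.4, u_bar=0.2, v_bar=-0.1, alpha=4 * lam)
    u, v = hs_pixel_update(**args)
    print(stationarity_residuals(args["Ix"], args["Iy"], args["It"], u, v,
                                 args["u_bar"], args["v_bar"], args["alpha"]))
```

If the closed-form update is consistent with the stationarity conditions, both residuals vanish up to floating-point error.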
3. You know that both the Horn-Schunck and Lucas-Kanade methods rely on the brightness constancy assumption. Given a pair of images, suppose that this assumption holds for most physically corresponding pixels, but not for some p% of the pixels. Briefly explain how you will modify the Horn-Schunck method and the Lucas-Kanade method to deal with this. [3+3 = 6 points]
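One possible direction for the Lucas-Kanade part, offered as an illustration and not necessarily the intended answer, is to replace the squared data term by a robust penalty and solve it with iteratively reweighted least squares. The sketch below uses Huber weights on the brightness constancy residual $I_x u + I_y v + I_t$ within a single patch. The function names, the threshold `k`, and the iteration count are all hypothetical choices.

```python
def solve_weighted_flow(rows, weights):
    """Solve the weighted 2x2 normal equations for the patch flow (u, v).

    Each row is (Ix, Iy, It) and contributes the constraint Ix*u + Iy*v + It = 0.
    """
    sxx = sxy = syy = bx = by = 0.0
    for (ix, iy, it), w in zip(rows, weights):
        sxx += w * ix * ix
        sxy += w * ix * iy
        syy += w * iy * iy
        bx -= w * ix * it
        by -= w * iy * it
    det = sxx * syy - sxy * sxy
    return (syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det


def huber_weight(residual, k):
    """IRLS weight for the Huber penalty: 1 inside the threshold, k/|r| outside."""
    r = abs(residual)
    return 1.0 if r <= k else k / r


def robust_flow(rows, k=1.0, iterations=20):
    """Iteratively reweighted least squares, started from plain least squares."""
    u, v = solve_weighted_flow(rows, [1.0] * len(rows))
    for _ in range(iterations):
        weights = [huber_weight(ix * u + iy * v + it, k) for ix, iy, it in rows]
        u, v = solve_weighted_flow(rows, weights)
    return u, v
```

Pixels that violate brightness constancy produce large residuals and are down-weighted, so they pull the estimate less than in ordinary least squares. An analogous robust penalty on the data term could be considered for Horn-Schunck.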
4. In the first camera calibration method we studied in class, it turns out that the estimate of the rotation matrix (let’s call it $\hat{R}$) is not orthonormal. The book by Trucco and Verri suggests the following procedure to ‘correct’ this issue: replace $\hat{R}$ by $\tilde{R} = UV^T$, where $\hat{R} = USV^T$ is the SVD of $\hat{R}$. Prove that $\tilde{R}$ as obtained by this procedure is given by $\tilde{R} = \arg\min_{Q} \|Q - \hat{R}\|_F^2$ subject to the constraint that $QQ^T = I$. This correction step also brings out a limitation of this camera calibration algorithm. State that limitation. [5+1 = 6 points]
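A sketch of one standard route to this result is given below; it is an outline under the assumption that $\hat{R}$ is $3 \times 3$, not a complete proof. The auxiliary matrix $Z = V^T Q^T U$ and the singular values $s_i$ (the diagonal entries of $S$) are notation introduced here for illustration.

```latex
\begin{align*}
\|Q - \hat{R}\|_F^2
  &= \operatorname{tr}(Q^T Q) - 2\operatorname{tr}(Q^T \hat{R}) + \|\hat{R}\|_F^2
   = 3 - 2\operatorname{tr}(Q^T U S V^T) + \|\hat{R}\|_F^2, \\
\operatorname{tr}(Q^T U S V^T)
  &= \operatorname{tr}(Z S) = \sum_{i=1}^{3} z_{ii}\, s_i \;\le\; \sum_{i=1}^{3} s_i,
  \qquad Z = V^T Q^T U .
\end{align*}
```

Since $Z$ is orthonormal, $|z_{ii}| \le 1$, and the singular values satisfy $s_i \ge 0$, so the bound is attained at $Z = I$, that is, at $Q = UV^T$. Minimizing the Frobenius distance is therefore equivalent to maximizing $\operatorname{tr}(ZS)$.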
5. The input to the Tomasi-Kanade factorization algorithm for structure from motion is the set of x and y coordinates $\{x_{ij}\}, \{y_{ij}\}$ of $n \geq 4$ points, corresponding to unknown non-coplanar 3D points on a rigidly moving object and tracked in $N \geq 3$ different images (or frames), with $1 \leq j \leq n$, $1 \leq i \leq N$. The images are acquired under orthographic projection. The algorithm proceeds by performing an SVD of the $2N \times n$ matrix $\tilde{W}$ of centroid-subtracted image coordinates. The output of the algorithm consists of (1) the 3D coordinates of the points, and (2) the 3D rotational motion of the object from one frame to another. Now, if the object motion were 3D affine instead of 3D rigid, can the point coordinates and the affine object motion still be unambiguously estimated by the algorithm? Why (not)? Write down all necessary equations. [6 points]
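The central equations for this question can be sketched as follows. The symbols $M$ (motion matrix), $S$ (shape matrix), $A$, and the row vectors $\mathbf{m}_i, \mathbf{n}_i$ (the two rows of $M$ belonging to frame $i$) are notation assumed here for illustration and may differ from the lecture slides.

```latex
\begin{align*}
\tilde{W} &= M S, \qquad M \in \mathbb{R}^{2N \times 3},\; S \in \mathbb{R}^{3 \times n}, \\
\tilde{W} &= (M A)(A^{-1} S) \quad \text{for every invertible } A \in \mathbb{R}^{3 \times 3}, \\
\text{rigid motion:} \quad & \|\mathbf{m}_i\| = \|\mathbf{n}_i\| = 1, \quad
  \mathbf{m}_i \cdot \mathbf{n}_i = 0 \qquad (1 \le i \le N).
\end{align*}
```

In the rigid case the orthonormality constraints on the rows of $M$ are what restrict the choice of $A$. Whether comparable constraints remain when the motion is affine is the point the question asks you to examine.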
7. In this task, we will register two pairs of images with each other: (1) the famous Barbara image (regarded as a fixed image) to be registered with its negative (regarded as the moving image), and (2) a flash image (regarded as a fixed image) and a no-flash image (regarded as the moving image) of a scene. We will use the joint entropy criterion we studied in class as the objective function to be minimized for alignment. Download all required images from http://www.cse.iitb.ac.in/~ajitvr/CS763_Spring2016/HW2/ImageReg. Convert all images to gray-scale (if they are in color). Note that the flash image and the no-flash image have different image intensities at many places, and the no-flash image is distinctly noisier.
For each of the two cases, rotate the moving image counter-clockwise by 28.5 degrees, translate it by -2 pixels in the X direction, and add Gaussian noise of standard deviation 10 (on a 0-255 scale). Note that the rotation must be applied about the center of the image. Set negative-valued pixels to 0 and pixels with value more than 255 to 255. Now perform a brute-force search to find the angle θ and translation tx to optimally align the modified moving image with the fixed image (in each case), so as to minimize the joint entropy. The range for θ should be between -60 and +60 in steps of 1 degree, and the range for tx should be between -12 and +12 in steps of 1. Compute the joint entropy using a bin-size of 10 for both intensities. Plot the joint entropy as a function of θ and tx using the surf and imshow commands of MATLAB. Comment on the difference (if any) between the quality of alignment for the first and second pair of images.
Also, determine a scenario (for the first pair of images) where the images are obviously misaligned but the joint entropy is (falsely and undesirably) lower than the ‘true’ minimum. Again, display the joint entropy as mentioned before. Include all plots in your report. [6+2+2 = 10 points]
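The joint entropy criterion itself can be sketched in a few lines. The assignment asks for MATLAB; the Python version below only illustrates the computation, and it makes several assumptions: images are passed as flat sequences of intensities in the range 0-255, the logarithm is base 2, and the name `joint_entropy` is hypothetical.

```python
import math
from collections import Counter


def joint_entropy(fixed, moving, bin_size=10):
    """Joint entropy (in bits) of two equally sized intensity sequences.

    Intensities are quantized into bins of width bin_size, the joint histogram
    of bin pairs is normalized to a probability table, and -sum p*log2(p) is
    returned over the non-empty cells.
    """
    if len(fixed) != len(moving):
        raise ValueError("images must have the same number of pixels")
    counts = Counter(
        (int(a) // bin_size, int(b) // bin_size) for a, b in zip(fixed, moving)
    )
    total = len(fixed)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    fixed = [0, 0, 100, 100]
    aligned_negative = [255 - v for v in fixed]
    shifted_negative = [255, 155, 255, 155]
    print(joint_entropy(fixed, aligned_negative))
    print(joint_entropy(fixed, shifted_negative))
```

In a brute-force search, this function would be evaluated once per candidate pair ($\theta$, $t_x$) on the pixels where the transformed moving image overlaps the fixed image. How pixels outside the overlap are handled is a design choice that affects the result.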
