Description
Question 1 [10 marks]
Specify which of the 4 Vs of big data are reflected in the text above, and explain your reasoning in detail. [10 marks]
Question 2 [15 marks]
Consider an imaginary web of 3 web pages, as shown in the figure below:
Assume that the initial PageRank of each web page is 1 and the damping factor is 0.5.
a) Calculate the PageRank values of A, B, and C for the first three iterations. Round the results to 3 decimal places. [5 marks]
b) If the rounded PageRank values no longer change from one iteration to the next, we consider the PageRank values to have converged. State the number of iterations required for convergence and give the final PageRank values for A, B, and C. (Programming is encouraged) [5 marks]
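A minimal Python sketch of the iteration asked for in parts a) and b) is given here as an illustration only. The link structure in EXAMPLE_LINKS is hypothetical, because the figure is not reproduced in this text. The sketch also assumes the non-normalised update PR(p) = (1 - d) + d * sum(PR(q) / L(q)) over the pages q that link to p, which matches an initial value of 1 per page, and it assumes that every page is updated from the previous iteration's values. The function names are illustrative.

```python
def pagerank_step(ranks, links, d=0.5):
    """One synchronous update: every new value is computed from the old ranks."""
    new_ranks = {}
    for page in ranks:
        # Each page linking here passes on its rank, split evenly over its out-links.
        incoming = sum(ranks[src] / len(outs)
                       for src, outs in links.items() if page in outs)
        new_ranks[page] = (1 - d) + d * incoming
    return new_ranks


def iterate_until_stable(links, d=0.5, places=3, max_iter=100):
    """Iterate until the rounded ranks repeat; return (iteration count, history)."""
    ranks = {page: 1.0 for page in links}  # initial PageRank of 1 per page
    history = []
    for i in range(1, max_iter + 1):
        ranks = pagerank_step(ranks, links, d)
        # Note: Python's round() rounds exact halves to the nearest even digit.
        history.append({p: round(v, places) for p, v in ranks.items()})
        if len(history) >= 2 and history[-1] == history[-2]:
            return i, history
    raise RuntimeError("no convergence within max_iter iterations")


# HYPOTHETICAL link structure (the real one is in the figure):
# A -> B, A -> C, B -> C, C -> A
EXAMPLE_LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

if __name__ == "__main__":
    n_iterations, history = iterate_until_stable(EXAMPLE_LINKS)
    for k, row in enumerate(history, start=1):
        print(k, row)
```

Full-precision values are carried between iterations and only the rounded values are compared. If the course expects rounding after every iteration or a different convergence test, the loop would need to be adjusted.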
c) The following graph illustrates one pass of the PageRank algorithm in the MapReduce framework. Calculate the intermediate results and show your working. [5 marks]
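One common way to express a single PageRank iteration as a map step and a reduce step is sketched below in Python. This is an assumption about the general technique, not a reconstruction of the graph referred to in part c), which is not reproduced in this text. The names pr_map, pr_reduce and run_iteration, the "links"/"share" tags, and the example state are all hypothetical.

```python
from collections import defaultdict


def pr_map(page, rank, outlinks):
    """Emit the page's link list plus an equal share of its rank per out-link."""
    yield page, ("links", outlinks)  # keeps the graph structure for the next pass
    for target in outlinks:
        yield target, ("share", rank / len(outlinks))


def pr_reduce(page, values, d=0.5):
    """Sum the incoming shares for one page and apply the damping factor."""
    outlinks, total = [], 0.0
    for kind, payload in values:
        if kind == "links":
            outlinks = payload
        else:
            total += payload
    return page, (1 - d) + d * total, outlinks


def run_iteration(state, d=0.5):
    """Simulate map, shuffle (group by key) and reduce for one iteration."""
    grouped = defaultdict(list)
    for page, (rank, outlinks) in state.items():
        for key, value in pr_map(page, rank, outlinks):
            grouped[key].append(value)  # these grouped values are the intermediate results
    new_state = {}
    for page in sorted(grouped):
        _, rank, outlinks = pr_reduce(page, grouped[page], d)
        new_state[page] = (rank, outlinks)
    return new_state


# HYPOTHETICAL state: page -> (current rank, out-links)
EXAMPLE_STATE = {"A": (1.0, ["B", "C"]), "B": (1.0, ["C"]), "C": (1.0, ["A"])}

if __name__ == "__main__":
    print(run_iteration(EXAMPLE_STATE))
```

The values collected in grouped correspond to the intermediate key-value pairs between the map and reduce phases. Feeding the returned state back into run_iteration gives the next iteration.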
Question 3 [25 marks]
Extracting part of the census data, we can get the following child-parent relationship table:
Child Parent
Tom Lucy
Tom Jack
Jone Lucy
Jone Jack
Lucy Mary
Lucy Ben
Jack Alice
Jack Jesse
Terry Alice
Terry Jesse
Philip Terry
Philip Alma
Mark Terry
Mark Alma
We need to use MapReduce to find the grandchild-grandparent relationships (for example, Tom-Mary) from this table.
a) Explain in pseudocode how you would implement the map and reduce functions, including the key-value pair definitions, and show the intermediate results produced by each mapper and the output of each reducer. Use 2 mappers and 2 reducers, and assume that the rank and shuffle module is predefined. [15 marks]
b) Implement the map and reduce functions in Python and upload the source code file. [10 marks]
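One possible approach to the grandchild-grandparent task is a reduce-side self-join, sketched below in Python under stated assumptions. The mapper emits each record twice, once keyed by the parent and once keyed by the child, so that a reducer sees both the children and the parents of the same person. The function names, the "child"/"parent" tags, the contiguous input split, and the partition function are hypothetical stand-ins; in particular, partition only imitates the predefined shuffle module so that the example runs on its own.

```python
from collections import defaultdict
from itertools import product

RECORDS = [
    ("Tom", "Lucy"), ("Tom", "Jack"), ("Jone", "Lucy"), ("Jone", "Jack"),
    ("Lucy", "Mary"), ("Lucy", "Ben"), ("Jack", "Alice"), ("Jack", "Jesse"),
    ("Terry", "Alice"), ("Terry", "Jesse"), ("Philip", "Terry"),
    ("Philip", "Alma"), ("Mark", "Terry"), ("Mark", "Alma"),
]


def gp_map(child, parent):
    """Emit the record under both names so the reducer can join on the middle person."""
    yield parent, ("child", child)   # key = parent: 'child' is one of their children
    yield child, ("parent", parent)  # key = child: 'parent' is one of their parents


def gp_reduce(person, values):
    """Pair every child of 'person' with every parent of 'person'."""
    children = [name for tag, name in values if tag == "child"]
    parents = [name for tag, name in values if tag == "parent"]
    for grandchild, grandparent in product(children, parents):
        yield grandchild, grandparent


def partition(key, num_reducers):
    """HYPOTHETICAL deterministic partitioner standing in for the predefined shuffle."""
    return sum(map(ord, key)) % num_reducers


def run_job(records, num_mappers=2, num_reducers=2):
    """Simulate the job; return (per-mapper outputs, per-reducer outputs)."""
    size = -(-len(records) // num_mappers)  # ceiling division
    splits = [records[i * size:(i + 1) * size] for i in range(num_mappers)]
    mapper_outputs = [[kv for rec in split for kv in gp_map(*rec)]
                      for split in splits]
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for output in mapper_outputs:
        for key, value in output:
            buckets[partition(key, num_reducers)][key].append(value)
    reducer_outputs = [[pair for key in sorted(bucket)
                        for pair in gp_reduce(key, bucket[key])]
                       for bucket in buckets]
    return mapper_outputs, reducer_outputs


if __name__ == "__main__":
    mapper_outputs, reducer_outputs = run_job(RECORDS)
    for i, out in enumerate(reducer_outputs):
        print("reducer", i, out)
```

A person who appears only as a child or only as a parent produces no output in gp_reduce, because one of the two lists is empty. Which keys land on which reducer depends entirely on the partitioner, so the per-reducer output shown by a real shuffle module may be distributed differently.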