Description
(Indexing and Query Execution)
1. [40 points] Consider the following B+tree for the search key “age. Suppose the degree d of the tree = 2, that is, each node (except for root) must have at least two keys and at most 4 keys. Note that sibling nodes are nodes with the same parent.
a. [10 points] Describe the process of finding keys for the query condition “age >= 35 and age <= 65”. How many blocks I/O’s are needed for the process?
6
b. [15 points] Draw the B+-tree after inserting 14, 15, and 16 into the tree. Only need to show the final tree after all insertions.
c. [15 points] Draw the tree after deleting 2 from the original tree.
2. [60 points] Consider natural-joining tables R(a, b) and S(a,c). Suppose we have the following scenario. i. R is a clustered relation with 5,000 blocks. ii. S is a clustered relation with 20,000 blocks. iii. 102 pages available in main memory for the join.
iv. Assume the output of join is given to the next operator in the query execution plan (instead of writing to the disk) and thus the cost of writing the output is ignored.
Describe the steps for each of the following join algorithms. For sorting and hashing-based algorithms, also indicate the sizes of output from each step. What is the total number of block I/O’s needed for each algorithm? Which algorithm is most efficient in terms of block’s I/O?
a. [10 points] (Block-based) nested-loop join with R as the outer relation.
steps: Read R once; Outer loop runs B(R)/(M-2) times, and each time need to read S total number of block I/O’s: B(R)+B(R)/(M-2)*B(S) = 5000+5000/100*20000 = 1005000
b. [10 points] (Block-based) nested-loop join with S as the outer relation.
steps: Read S once; Outer loop runs B(S)/(M-2) times, and each time need to read R total number of block I/O’s: B(S)+B(S)/(M-2)*B(R) = 20000+20000/100*5000 = 1020000
c. [20 points] Sort-merge join (assume only 100 pages are used for sorting and 101 pages for merging). Note that if join can not be done by using only a single merging pass, runs from one or both relations need to be further merged, in order to reduce the number of runs. Select the relation with a larger number of runs for further merging first if both have too many runs.
steps: sort – R => 50 runs => 2B(R); sort S => 200 + 2 runs => 4B(S); merge – B(R) + B(S); total number of block I/O’s: 3B(R) + 5B(S) = 3*5000+5*20000 = 115000
d. [20 points] Partitioned-hash join (assume 101 pages used in partitioning of relations and no hash table is used to lookup in joining tuples).
steps:
pass 1: hash R, S into 100 buckets, 50/200 blocks per bucket respectively => 2B(R)+2B(S) pass 2: join => B(R)+B(S)
total number of block I/O’s: 3B(R) + 3B(S) = 3*5000+3*20000 = 75000
Submit a pdf file: Lastname_Firstname.pdf
Reviews
There are no reviews yet.