Homework #4 Solution

Starting from:

$35

Home

1. [40 points] Consider the following B+tree for the search key “age. Suppose the degree d of the tree

◦ 2, that is, each node (except for root) must have at least two keys and at most 4 keys.

a. Describe the process of finding keys for the query condition “age >= 10 and age <= 50”. How many blocks I/O’s are needed for the process?
b. Draw the updated B+tree after inserting 14, 16, and 15 into the tree. Show the tree after all insertions. How many blocks I/O’s are needed for the process?

c. Draw the updated tree after deleting all odd ages (one by one) from the leaf nodes of the tree obtained in part b. Follow this order in deletion from left to right: 5, 7, … Show the tree after all deletions. How many blocks I/O’s are needed for the process?

2. [60 points] Consider natural-joining tables R(a, b) and S(a,c). Suppose we have the following scenario.
i. R is a clustered relation with 20,000 blocks and 200,000 tuples

ii. S is a clustered relation with 10,000 blocks and 200,000 tuples

iii. S has a clustered index on the join attribute a

iv. V(S, a) = 100 (recall that V(S, a) is the number of distinct values of a in S)

v. 102 pages available in main memory for the join.

vi. Assume the output of join is given to the next operator in the query execution plan (instead of writing to the disk) and hence the cost of writing the output is ignored.

Describe the steps (including input, output at each step, and their sizes) in each of the following join algorithms. What is the total number of block I/O’s needed for each algorithm? Which algorithm is most efficient?

a. Nested-loop join with R as the outer relation

b. Nested-loop join with S as the outer relation

c. Sort-merge join (assume only 100 pages used for sorting and 101 pages for merging). When the number of runs of a relation is too large for merging, the runs will be further merged first. Select the relation with larger number of runs for further merging if both have too many runs.

d. Simple sort-based join (same assumption as above)

1
INF 551 –

e. Partitioned-hash join (assume 101 pages used in partitioning of relations and no hash table used for lookup in joining buckets)
f. Index join (ignore the cost of index lookup)

Submission Criteria

• Submit a pdf file, named as <firstname>_<lastname>_hw4.pdf, with copy of your answers.

• Answer ALL the sub-questions asked in every question, please number or specify them properly.

• For B+ trees in question 1.b and 1.c, you can draw and insert them as photos in your file, but make sure they are clear enough (computer-drawn is preferred). For other questions, please type your answers.

2