$24
• Objectives
In this assignment, you are expected to implement a book index, which helps you to nd page number of a term in a book.
In an index, the terms are divided into groups according to their rst letter. These letters are ordered in alphabetical order. The terms in each letter group are in lexicographical order among themselves, as well. Your objective is simply to group terms whose rst letter are the same and list terms in each group with respect to lexicographical ordering. To achieve this, you are expected to implement some methods related to some functionalities of the index. An index example is provided below.
A:
AA-trees 1-5
Abstract data types 12 15
Algorithm design 13
D:
DAGs 45 89 134
N:
Nested loops 56 123-126
New line character 22-25 68 207
Each line beginning with a term has the page numbers of the occurrences of that term in the book. Note that page numbers can be a single number as well as an interval. For instance, \Nested loops" occurs in page 56 and also from page 123 to 126. In Figure 1, the binary search tree representation of above index is illustrated. Please note that, the rst letters of terms are arranged as binary search tree and each node for a letter also holds a separate binary search tree to store terms. The nodes in the binary search tree are ordered alphabetically. In the following description of the homework, the overall binary search tree in Figure 1 is referred to as Index; the nodes of this tree (which includes letters) are referred to as Index Node; the secondary tree stored in each node are referred to as TermTree; the nodes of the secondary tree as TermTreeNode.
1
Indexer
BinarySearchTree<TermTreeNodeData>
Abstract data
types
12 15
AA-trees
Algorithm
1-5
design
13
BinarySearchTree<IndexNodeData>
D
BinarySearchTree<TermTreeNodeData>
DAGs
45 89 134
BinarySearchTree<TermTreeNodeData>
A
N
New line
character
22-25 68 207
Nested loops
56 123-126
Figure 1: Binary Search Tree structure of Index.
• Implementation
You are given BinarySearchTree template class which is almost complete. In order to implement the Index tree and Term trees you will use this class template. The classes IndexNodeData and TermTreeNodeData are de ned in the following sections. Once you have those classes you can de ne Index and Term trees as follows:
BinarySearchTree<IndexNodeData> Index;
BinarySearchTree<TermTreeNodeData> tree;
Some remarks for the class de nitions are given in the following section. You can refer to C++ les for detailed information about class members. In the given les, some methods are left blank for you to implement according to the explanations commented in the method de nitions.
2.1 class IndexNodeData
This class describes nodes in Index tree. There are two private data members: a letter (A, B, C,
• : : ) and the term tree.
c l a s s IndexNodeData f
p r i v a t e : // data
c h a r letter ;
BinarySearchTree<TermTreeNodeData> tree ;
p u b l i c :
IndexNodeData methods /
/
p u b l i c
p r i v a t e :
u t i l i t y methods /
/
p r i v a t e
• ;
2
IndexNodeData class is completely implemented in IndexNodeData.h and IndexNodeData.cpp les.
Therefore, changing the content of these les is not allowed.
2.2 class TermTreeNodeData
This class describes nodes in Term tree. There are two private data members: a term which is a string and a list of page numbers.
c l a s s TermTreeNodeData f
p r i v a t e : // data
string term ;
vector<Pair> pagelist ;
p u b l i c :
/ p u b l i c TermTreeNodeData f u n c t i o n s /
class Pair is de ned to represent page intervals by two integers for start and end page. If start and end page numbers are equal then the interval is accepted as a single page and listed as a single number in the output. Otherwise, they are accepted as page interval and stated as <start>-<end> format in the output as in 1.
TermTreeNodeData and Pair classes are completely implemented in TermTreeNodeData.h and TermTreeNodeData.cpp les. Therefore, changing the content of these les is not allowed.
2.3 class TreeNode
This template class de nes the node of a binary tree. It is used by the template class Binary-SearchTree. The class has three private data members: a data variable and two pointers to left and right subtrees.
t e m p l a t e < c l a s s T>
c l a s s TreeNode f
p u b l i c :
T data ; // data s t o r e d a t t h i s node
TreeNode<T> left ; // r e f e r e n c e t o l e f t s u b t r e e
TreeNode<T> right ; // r e f e r e n c e t o r i g h t s u b t r e e
• C o n s t r u c t s a l e a f node with t h e g i v e n data . TreeNode<T>(T val=T ( ) ) f
data = val ; left = nullptr ; right = nullptr ;
g
• C o n s t r u c t s a branch node with t h e g i v e n data and l i n k s . TreeNode<T>(T val , TreeNode<T> lf , TreeNode<T> rt ) f
data = val ; left = lf ; right = rt ;
g
• ;
3
2.4 class BinarySearchTree
This is a template class to de ne a binary search tree abstract data type. Its only private data member is root, which is a TreeNode pointer to the root element of binary search tree. The class includes template functions to add, remove, search elements in a binary search tree in general. The class also includes copy constructor, destructor and assignment operator. The only empty method in the class is makeItBalanced. You need to to implement this method.
t e m p l a t e < c l a s s T>
c l a s s BinarySearchTree f
p u b l i c :
// = o p e r a t o r o v e r l o a d i n g
BinarySearchTree& o p e r a t o r =( c o n s t BinarySearchTree& t ) ;
//
Adds
t h e
g i v e n
v a l u e
t o
t h i s
BST
i n
s o r t e d o r d e r .
v o i d add ( T value ) ;
//
Removes t h e g i v e n v a l u e
from
t h i s
BST w h i l e
r e m a i n i n g
e l e m e n t s
// s t a y s o r t e d .
v o i d remove ( T value ) ;
//
Returns
i f
t h i s
t r e e
c o n t a i n s
v a l u e
v a l .
b o o l contains ( T val ) ;
//
Makes t h e
t r e e
b a l a n c e d .
v o i d makeItBalanced ( ) ;
/.../
p r i v a t e :
// P r i v a t e data member :
//
t h e
o v e r a l l r o o t o f
t h e
t r e e .
I t
i s
n u l l f o r
an
empty
t r e e
TreeNode<T> root ;
/ p r i v a t e u t i l i t y f u n c t i o n s /
v o i d makeItBalanced ( TreeNode<T> &r ,
vector<T> &A ,
i n t n ,
i n t &i ) ;
/
. . .
/
void makeItBalanced() restores tree to a minimum height tree, so it will be a balanced tree. It rst copies all values in the tree into a vector in inorder traversal. It then deletes all nodes in the tree (you can use the given deleteTree() method here), makes root null and calls the helper method makeItBalanced() (decribed below) to make root point to the root of the new minimum height tree.
void makeItBalanced(TreeNode<T> *&r, vector<T> &A, int n, int &i) inserts the values in the given vector into the tree in such a way that the middle value in the vector becomes the root of the new minimum height tree. The algorithm in pseudocode in Algorithm 1.
Note that balancing tree operation is implemented only within these two methods, mostly in void makeItBalanced(TreeNode<T> *&r, vector<T> &A, int n, int &i). In other words, ll the body of these two functions for implementing tree balancing on BinarySearchTree class.
4
You are expected to implement void makeItBalanced() and void makeItBalanced(TreeNode<T> *&r, vector<T> &A, int n, int &i) functions in BinarySearchTree.h le. The other member function de nitions are available in the le. You can de ne additional private utility functions for your solution.
Algorithm 1 makeItBalanced(TreeNode<T> *&r, vector <T> &A, int n, int &i)
if n > 0 then
// construct the left subtree
• pointer to new node with NULL child pointers makeItBalanced(r->leftChild, n/2)
// get the root
// Copy item from the ith position in the vector into r->data i i + 1
// construct the right subtree makeItBalanced (r->rightChild, (n-1)/2)
end if
2.5 class Indexer
Indexer is the class that implements the overall index structure. It has one private data member which is the Index tree. It includes methods to perform several operations on the index.
c l a s s Indexer f
p u b l i c :
v o i d addTerm ( string term , Pair p ) ;
v o i d removeTerm ( string term ) ;
v o i d removeLetter ( c h a r letter ) ;
v o i d makeIndexBalanced ( ) ;
v o i d makeLetterBalanced ( c h a r letter ) ;
/ ... /
p r i v a t e :
BinarySearchTree<IndexNodeData> Index ;
• ;
The class methods are explained in the following:
2.5.1 void addTerm(string term, Pair p)
This function inserts a term together with its page number interval to a Term tree.
The function rst checks if the rst letter of term is in the Index. If not, a new node with that rst letter is added to Index, a new Term tree is created for that letter and the given term is inserted to the new Term tree.
If the letter exists in the Index, this time, term is searched in the respective Term tree. If term is not found, then it is inserted as a new node in the Term tree together with the page interval. If
5
found, the pagelist is updated with the start and end page numbers.
After inserting page number into an existing TermTreeNodeData, there is no contiguous or inter-secting page intervals in the resulting index. For example, if page numbers 7 and 10 is inserted to a term with 5-8 11-15, the nal pagelist for the term should be 5-15 as 7-10 interval is intercepting with 5-8 and adjacent to 11-15. Adding new page interval into existing TermTreeNodeData element is handled by TermTreeNodeData& TermTreeNodeData::addPage(Pair p) function.
2.5.2 void removeTerm(string term)
The function removes TermTreeNodeData with term. If deleted TermTreeNodeData is the last term of its corresponding letter, remove that IndexNodeData node from Index, as well.
2.5.3 void removeLetter(char letter)
The function removes IndexNodeData with letter.
2.5.4 void makeIndexBalanced()
The function makes the overall Index a balanced tree. Note that balancing operation is only applied on tree with IndexNodeData elements. In other words, only the nodes with letters is balanced in this function.
2.5.5 void makeLetterBalanced(char letter)
The function makes Term tree of a given letter a balanced tree.
You are expected to implement copy constructor, = operator, and these 5 functions above namely, void addTerm(string term, Pair p), void removeTerm(string term), void removeLetter(char letter), void makeIndexBalanced(), and void makeLetterBalanced(char letter), in Indexer.cpp le. The other member function de nitions are available in the le. You can de ne additional private utility functions for your solution.
• Regulations
1. Programming Language: You will use C++.
2. External libraries other than those already included are not allowed.
3. Those who do the operations (insert, remove, search, print) without utilizing the tree will receive 0 grade.
4. You can change main.cpp le to test your implementation.
5. You are allowed to edit BinarySearchTree.h, Indexer.h, and Indexer.cpp les. Do not modify the content of the other homework les.
6. Those who modify already implemented functions and those who insert other data variables or public functions and those who change the prototype of given functions will receive 0 grade.
7. Options used for g++ are -ansi -Wall -pedantic-errors -std=c++0x -O0. They are al-ready included in the provided Make le.
8. You can add private member functions whenever it is explicitly allowed.
6
9. Late Submission: You have 5 days for late submissions in all assignments. Your assignment will not be accepted if you submit more than 5 days in total.
10. Cheating: We have zero tolerance policy for cheating. In case of cheating, all parts involved (source(s) and receiver(s)) get zero. People involved in cheating will be punished according to the university regulations. Remember that students of this course are bounded to code of honor and its violation is subject to severe punishment.
11. Newsgroup: You must follow odtuclass (odtuclass.metu.edu.tr) for discussions and possible updates on a daily basis.