$24
• Introduction
In this assignment, you are going to practice on hash table implementation for a book database, along with its accompanying hash function using C++. Remember that insertion, deletion, and retrieval operations are to run in expected constant time for hash tables.
Keywords: Hash table, Quadratic hashing, C++
• Problem De nition
In this homework, you will design a hash table for a book database. The application must store books (have some attributes) in the hash table according to their isbn. Note that, each book has a unique isbn. Your application need to provide the following requirements:
Inserting a book into the hash table Deleting a book from the hash table Getting the book with the given isbn
• Speci cations
In this hash table, for every possible key in the table, you are supposed to store at most 3 values, called a bucket. If the bucket for a key is used, then your hash table should use the standard open addressing (quadratic probing) to nd the next bucket for that key. Note that in a bucket, you should search the entries linearly, don’t iterate in the bucket quadratically.
Example
Note that, this example is for illustration purpose. In the assignment, all keys will be string and values will be a generic class. All (3, 11111) , (13, 11112) , (23, 11113) entries should be stored in the same bucket. However, (33, 11114) entry must be stored in another bucket since the bucket of the key 3 is full now:
1
You will be implementing a hash table class, named as HashTable. The bare header le, "HashTable.h", is given below. You are free to add any private variables/functions to it. However, your hash table should support at least the given functions.
t e m p l a t e < c l a s s T>
c l a s s H as hT a bl e f
s t r u c t Entry f
std : : string Key ; // t h e key o f t h e e n t r y
T Value ; // t h e v a l u e o f t h e e n t r y
b o o l Deleted ; // f l a g i n d i c a t i n g whether t h i s e n t r y i s d e l e t e d
b o o l Active ; // f l a g i n d i c a t i n g whether t h i s item i s c u r r e n t l y used
Entry ( ) : Key ( ) , Value ( ) , Deleted ( f a l s e ) , Active ( f a l s e ) fg
g;
s t r u c t Bucket f
Entry entries [ 3 ] ;
g;
i n t _ ca pa c it y ; // INDICATES THE SIZE OF THE TABLE
i n t _size ; // INDICATES THE NUMBER OF ITEMS IN THE TABLE
Bucket _table ; // HASH TABLE
// You can d e f i n e p r i v a t e methods and v a r i a b l e s
p u b l i c :
◦ TODO: IMPLEMENT THESE FUNCTIONS.
◦ CONSTRUCTORS, ASSIGNMENT OPERATOR, AND THE DESTRUCTOR H as hT a bl e ( ) ;
H as hT a bl e ( c o n s t HashTable<T>& rhs ) ;
HashTable<T>& o p e r a t o r =( c o n s t HashTable<T>& rhs ) ;
• H as hT a bl e ( ) ;
◦ TODO: IMPLEMENT THIS FUNCTION.
◦ INSERT THE ENTRY IN THE HASH TABLE WITH THE GIVEN KEY & VALUE
◦ IF THE GIVEN KEY ALREADY EXISTS , THE NEW VALUE OVERWRITES
◦ THE ALREADY EXISTING ONE.
◦ IF LOAD FACTOR OF THE TABLE IS BIGGER THAN 0 . 5 ,
◦ RESIZE THE TABLE WITH THE NEXT PRIME NUMBER.
v o i d Insert ( std : : string key , c o n s t T& value ) ;
• TODO: IMPLEMENT THIS FUNCTION.
• DELETE THE ENTRY WITH THE GIVEN KEY FROM THE TABLE
• IF THE GIVEN KEY DOES NOT EXIST IN THE TABLE, JUST RETURN FROM THE FUNCTION
• HINT: YOU SHOULD UPDATE ACTIVE & DELETED FIELDS OF THE DELETED ENTRY.
v o i d Delete ( std : : string key ) ;
• TODO: IMPLEMENT THIS FUNCTION.
• IT SHOULD RETURN THE VALUE THAT CORRESPONDS TO THE GIVEN KEY.
• IF THE KEY DOES NOT EXIST , THIS FUNCTION MUST RETURN T( )
• Get ( std : : string key ) c o n s t ;
◦ TODO: IMPLEMENT THIS FUNCTION.
◦ AFTER THIS FUNCTION IS EXECUTED THE TABLE CAPACITY MUST BE
◦ EQUAL TO newCapacity AND ALL THE EXISTING ITEMS MUST BE REHASHED
◦ ACCORDING TO THIS NEW CAPACITY.
// WHEN CHANGING THE SIZE , YOU MUST REHASH ALL OF THE ENTRIES FROM 0TH ENTRY TO LAST - ENTRY
v o i d Resize ( i n t n e w C a p a c i t y ) ;
// TODO: IMPLEMENT THIS FUNCTION.
2
• RETURNS THE AVERAGE NUMBER OF PROBES FOR SUCCESSFUL SEARCH d o u b l e g e t A v g S u c c e s s f u l P r o b e ( ) ;
• TODO: IMPLEMENT THIS FUNCTION.
• RETURNS THE AVERAGE NUMBER OF PROBES FOR UNSUCCESSFUL SEARCH d o u b l e g e t A v g U n s u c c e s s f u l P r o b e ( ) ;
• THE IMPLEMENTATION OF THESE FUNCTIONS ARE GIVEN TO YOU
• DO NOT MODIFY!
i n t Capacity ( ) c o n s t ;
i n t Size ( ) c o n s t ;
◦ ;
A book’s attributes will be represented with the Book class. It should not be modi ed in any case.
c l a s s Book f
p r i v a t e :
string isbn ;
string name ;
string category ;
string writer ;
string p ub l is he r ;
i n t f i r s t _ p u b _ d a t e ;
i n t p a g e _ c o u n t ;
p u b l i c :
// C o n s t r u c t o r s
fg
Book ( ) : isbn ( "" )
Book ( c o n s t string
isbn , c o n s t
string name , c o n s t
string category , c o n s t string writer ,
c o n s t string publisher , i n t
first_pub_date , i n t
p a g e _ c o u n t ) : isbn ( isbn ) , name ( name ) ,
category ( category ) , writer ( writer ) , p ub li s he r ( p ub li s he r ) , f i r s t _ p u b _ d a t e ( f i r s t _ p u b _ d a t e ) , p a g e _ c o u n t ( p a g e _ c o u n t ) fg
// G e t t e r s
and s e t t e r s
f
c o n s t
string &getIsbn ( ) c o n s t
r e t u r n isbn ;
g
v o i d setIsbn ( c o n s t string &isbn ) f
Book : : isbn = isbn ;
g
c o n s t
string &getName ( ) c o n s t
f
r e t u r n name ;
g
v o i d setName ( c o n s t string &name ) f
Book : : name = name ;
g
c o n s t
string &g e t C a t e g o r y ( ) c o n s t
f
r e t u r n category ;
g
v o i d s e t C a t e g o r y ( c o n s t string &category ) f
Book : : category = category ;
g
c o n s t
string &g et Wr i te r ( ) c o n s t f
r e t u r n writer ;
g
v o i d s et Wr i te r ( c o n s t string &writer ) f
Book : : writer = writer ;
g
c o n s t
string &g e t P u b l i s h e r ( ) c o n s t
f
r e t u r n p ub li sh e r ;
g
v o i d s e t P u b l i s h e r ( c o n s t string
&p ub li s he r ) f
3
Book : : p ub li sh e r = p ub li s he r ;
g
i n t g e t F i r s t _ p u b _ d a t e ( ) c o n s t f
r e t u r n f i r s t _ p u b _ d a t e ;
g
v o i d s e t F i r s t _ p u b _ d a t e ( i n t f i r s t _ p u b _ d a t e ) f
Book : : f i r s t _ p u b _ d a t e = f i r s t _ p u b _ d a t e ;
g
i n t g e t P a g e _ c o u n t ( ) c o n s t f
r e t u r n p a g e _ c o u n t ;
g
v o i d s e t P a g e _ c o u n t ( i n t p a g e _ c o u n t ) f
Book : : p a g e _ c o u n t = p a g e _ c o u n t ;
g
◦ ;
You are given a utility class, "HashUtils" which contains a hash function and a function for nding the next appropriate size for the hash table. It should not be modi ed in any case.
#ifndef _ _ H A S H U T I L S _ _
#define _ _ H A S H U T I L S _ _
#include <string>
•
Returns t h e hash v a l u e o f t h e g i v e n key
/
i n t Hash ( std : : string key ) ;
•
Finds t h e next a p p r o p r i a t e hash t a b l e s i z e from a g i v e n number
/
i n t N e x t C a p a c i t y ( i n t prev ) ;
#endif
You will implement functions (marked with "TODO") in the "HashTable.h", and must use open addressing with quadratic probing as a collision resolution strategy.
You can assume that isbn of each book will be unique.
A sample dataset was given to you, which includes name, category, writer, publisher, rst pub date and page count as well as their isbn. Test datasets will resemble the given dataset, however, they will also include new books.
You can de ne new functions, variables etc. to the HashTable class. However you are not allowed to modify or remove any of the given functions or variables.
In the default constructor, you can assume that size and capacity of the table is zero.
You have to call hash method in the "HashUtils.h" to calculate the hash value of an isbn.
"NextCapacity" function takes an unsigned integer and returns a bigger prime number. You have to call this method while resizing the table.
While calculating the total probe, do not consider traversal within a bucket. In other words, you should just take account of the number of quadratic probing while calculating the total number of probe.
While calculating the load factor, you should use 3 capacity as the size of the hash table.
3.1 Entry Struct
Each "Entry" object stores information regarding a single key-value pair in the hash table.
"Key" variable should store the original key variable given in the "Insert" method of the entry "Value" variables stores the value given in the "Insert" method.
"Active" variable denotes that the key-value pair stored in this entry is valid. For example, 0th entry of the 1st bucket in the example above is not valid, however 0th entries of the 3rd and 4th bucket are valid and have their "Active" as true. Initially, "Active" variable of each "Entry" object is set to false.
"Deleted" variable stores whether this entry has been deleted before. Initially, every "Entry" object has this variable as false.
4
• Dataset
You will be provided a book dataset where elds are seperated by j character. For the scope of this assignment you are given a function to read the books from the le. The le format is given below:
isbn j name j category j writer j p ub li sh e r j f i r s t _ p u b _ d a t e j p a g e _ c o u n t 9 7 8 6 0 5 1 8 5 3 1 5 4 j Son j Roman j Ayse Kulin j Everest Ya yi n la ri j2 0 1 8 j3 0 4
. . .
• Regulations
1. You will use C++.
2. You are not allowed to use vector or any other standard libraries.
3. You are not allowed to modify already implemented functions and variables.
4. You can de ne private methods and variables if necessary.
5. If your submission fails to follow the speci cations or does not compile, there will be a "signi cant" penalty of grade reduction.
6. Those who do the operations (insert, delete, resize, get) without utilizing the hashing will receive 0 grade.
7. Those who modify already implemented functions and those who insert other data variables or public functions and those who change the prototype of given functions will receive 0 grade.
8. Late Submission: Late submission policy is stated in the course syllabus.
9. Cheating: We have zero tolerance policy for cheating. In case of cheating, all parts involved (source(s) and receiver(s)) get zero. People involved in cheating will be punished according to the university regulations. Note that students of this course are bounded to code of honor and its violation is subject to severe punishment.
10. Newsgroup: You must follow the Forum (in Odtuclass) for discussions and possible updates on a daily basis.
• Submission
1. Submission will be done via Moodle (cengclass.ceng.metu.edu.tr).
2. Do not submit a main function in any of your source les.
3. A test environment will be ready in Moodle.
You can submit your source les to Moodle and test your work with a subset of evaluation inputs and outputs. Additional test cases will be used for evaluation of your nal grade. So, your actual grades may be di erent than
the ones you get in Moodle.
Only the last submission before the deadline will be graded.
5