$29
Introduction
This assignment is designed to use a hash table to count the number of unique addresses accessed by a program. You need to implement a program called count. The input of the count program is a trace consisting of 64-bit addresses and you are required to print out the number of unique address in the trace.
Problem speci cation: Use a hash table to count the number of unique addresses
2.1 Hash table with chaining
In this assignment, you will implement a hash table with chaining. An important part of a hash table is collision resolution. In this assignment, we want you to use chaining, which is di erent from Assignment 1. Figure 1 shows an example of using separate chaining in hash table. More information about chaining with linked list can be found at Wikipedia, http://en.wikipedia. org/wiki/Hash_table#Separate_chaining_with_linked_lists.
Figure 1: Example of a hash table with separate chaining. Use a linked list to store collisions.
2.2 Count number of unique addresses
Input format: This program takes a le name as argument from the command line. The le can have from a few lines to millions of lines. Each line contains an address in hexadecimal format, i.e. 0x7f1a91026b00, generated by pintool (http://pintool.org). Each address is represented as a 64-bit hexadecimal number.
Output format: Your program should print the number of unique addresses in the le. There should be no leading or trailing white spaces in the output. Your program should print \error" (and nothing else) if the le does not exist.
Example Execution:
Lets assume we have 3 text les with the following contents. \ le1.txt" is empty and,
le2.txt:
0x7f1a9804ae19
0x7f1a9804ae1c
0x7f1a9804ae1c
0x7f1a9804ae1c
le3.txt:
0x7f1a9804ae19
0x7f1a9804ae1c
0x7f1a9804ae1c
0x7f1a9804ae19
0x7f1a9804ae16
0x7f1a9814ae1c
./count le1.txt
0
./count le2.txt
2
./count le3.txt
4
./thirdcount le4.txt
error
2.3 Scalability
In this assignment, we will test your program with millions of lines of addresses. You can initialize the hash table with 1000 entries. The largest test case contains 5 million lines with about 15,000 unique addresses. In this case, the average number of nodes in each linked list is 15.
Submission
You have to e-submit the assignment using Sakai. Your submission should be a tar le named pa2.tar. To create this le, put everything that you are submitting into a directory (folder) named pa2. Then, cd into the directory containing pa2 (that is, pa2’s parent directory) and run the following command:
tar cvf pa2.tar pa2
To check that you have correctly created the tar le, you should copy it (pa2.tar) into an empty directory and run the following command:
tar xvf pa2.tar
This should create a directory named pa2 in the (previously) empty directory.
The pa2 directory contain folder count. The count folder must contain c source les, header les and a Make le. For example, after typing make, your program must generate an executable le called count.
Makefile: there should be at least two rules in your Make le:
count: build your count executable.
clean: prepare for rebuilding from scratch.
source code: all source code les necessary for building count.
Grading Guidelines
We will build a binary using the Make le and source code that you submitted, and then test the binary for correct functionality against a set of inputs. Thus:
You should make sure that we can build your program by just running make.
You should test your code as thoroughly as you can. In particular, your code should be adept at handling exceptional cases. For example, programs should not crash if the argument is not a proper number or le does not exist.
Your program should produce the output following the example format shown in previous sections. Any variation in the output format can result up to 100% penalty. There should be no additional information, e.g. debugging messages.
Be careful to follow all instructions. If something doesn’t seem right, ask.
3