$29
Algorithmic theory provides a useful framework for understanding how the time complexity of an algorithm increases with input size. In many cases, these techniques generate solid expectations for which of two algorithms will be better to solve a problem. In practice, however, growth rate alone is not enough; we can never be sure of an algorithm’s speed on a particular problem size until we try it out.
For the first three problems below, you will experimentally test how the speed of algorithms vary across input sizes; each question is worth 200 points toward your homework grade for a total of 600 points. For the remaining 400 points there are two bit-magic coding probelms.
This homework assignment is long; please do not wait until the last minute to look at the problems.
The first three problems:
Merge sort vs Insertion sort
Hybrid sorting
Comparing dictionary structures
For each problem you will upload a write-up (as a pdf) with the following five sections:
Hypothesis: Briefly describe what results you believe you will find. Write your hypothesis before you start conducting experiments as a way of acknowledging your initial expectations. Note: you will not lose points for your hypothesis being incorrect; in fact, getting a result different from your hypothesis will often be more exciting and more of a learning experience.
Methods: Describe step-by-step the experiments that you conduct. Provide the source code that you use, and details about which compiler you use, how you compile it (e.g., optimization flags), and the range of inputs that you feed into your program. You may provide a link to a repository rather than copying the code into your file. Your methods should accurately reflect how you actually generated your data such that, someone reading them can replicate your experiment.
Results: Present the data that your experiments produced . In most cases this will be a graph (as described in each question) and a brief explanation to make sure the reader understands the data in that graph. Graphs should be clearly labeled (axis labels, etc)!
Discussion: Provide a brief discussion of your data. Did anything about it surprise you? Were there any unexpected challenges in collecting the data? This section is where you will answer specific questions posed in each of these problems.
Conclusions: Present a concise take-away for this experiment. For example, “Under the conditions tested, data structure A produces a faster algorithm for n < 1000, while data structure B is faster for n > 1500. For n between 1000 and 1500 the two data structures are indistinguishable.”
Some advice:
If you have trouble measuring short run times, put a loop around an experiment to do it 1000 times in a row (or more) so that you can tell the difference. For example, if Quicksort appears to take 0 seconds when n=10, you might find that running it 10,000 times takes 0.13 seconds. Then you can estimate that each iteration took 0.000013 seconds.
Make sure that you are using appropriate input data for each test. For example, if you test two sorting algorithms on the same array, the first one will have sorted the data and thus the second one will have a qualitatively different input -- already sorted. One way to help ensure that you are not accidentally altering data is to swap the order that you test two algorithms in or to do them in independent runs of the code.
Most languages have a simple way to get the current time in milliseconds, so you can run this before and after the test and subtract to get the total time elapsed. For example, in C++ you can do:
#include <ctime>
std::clock_t start_time = std::clock();
the_function_you_want_to_time();
std::clock_t tot_time = std::clock() - start_time;
std::cout << "Time: "
<< ((double) tot_time) / (double) CLOCKS_PER_SEC
<< " seconds" << std::endl;
Our main goal is to give you experience empirically testing algorithms or data structures against one another. If you have ideas for how to adjust the experiments below, you should feel free to do so. For example, you may use a different language than C++ or Python, but you should make sure that you are confident about which data structures underly the types that you are using. The differnece between a dictionary based on a hash table verus one using a binary tree, for example, can be very meaningful. Furthermore, feel free to come up with your own extensions to these projects – especially thorough or clever studies will warrant extra credit points.
Remember, you may discuss these problems in detail with your classmates, either directly or on Slack, just do the final work that you turn in yourself.
The last two problems:
You may solve these programming problems in either Python or C++. To use Python, name the file containing your code code.py. To use C++, name the file code.cc or code.cpp. You can write your code directly on Mimir or by using your favorite local editor and copying it onto this page. If you're writing your code here, remember to save frequently so you don't lose you progress.
At any time, you can choose to run the tests that will be used to grade your solution by clicking the "run tests" button on the code editing screen. Debugging output will be printed to tell you which test cases you passed, which ones you failed, and what your final score would be if you submittted your current code. If a test case failed, the output will specify a failure mode (e.g. timeout, compiler error, wrong answer, etc.).