$29
For this project you will build an application that uses your list implementations from lab. To support this application, you will add some additional functionality to your list implementations. Once the implementation has been completed, you will perform a set of experiments to measure the run-time behavior of your list implementations. In addition to submitting the source code for your solution, you will also submit a short paper that outlines your experiments (including discussion of implementation variations) and displays the results of the experiments in various graphs.
The behavior of the client application is explained first and then a small set of required list operations is discussed. You are free to add additional list operations, but only if each is added to both implementations (i.e., to linked list and array list); your application code must not rely directly on the internal implementation of a given list data structure.
Note: For this assignment and every assignment in the class, every function must come with a signature and a purpose statement. You do not need to provide test cases for functions that produce output or consume input, but you must test (directly or indirectly) all functions that do not involve I/O (Input/Output), and you should make an attempt to design (or refactor) your code so that the functions that involve I/O are as small as possible.
1 Music Catalog
For this project you will implement a music catalog program with functionality similar to various media players (e.g., iTunes). Of course, this program will provide much less functionality than an actual media player (i.e., playing music is not supported, there is no graphical interface, playlists are not supported, etc.), but it will serve as a framework for some consideration of the design and implementation of such a system, it will demonstrate some uses for lists, and it will require that you consider operations beyond the initial set provided by your lists.
1.1 Input
Your program must take a single command-line argument. This argument is expected to specify the name of a file from which your program will load the contents of the catalog. Each line of this file is meant to contain information for a single song, specifically, the song’s title, artist, and album (examples given below).
Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits
Grazed Knees--Snow Patrol--Final Straw
The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming
Blah
Stockholm Syndrome--Muse--Absolution
Dust in the Wind--Kansas--Kansas
North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip
Nothingman--Pearl Jam--Vitalogy
Not a Song--
Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1)
The Lowering--The Avett Brothers--Four Thieves Gone
Though one might expect that a catalog file would always be properly formatted (since it is likely to be generated by the program itself), it is better practice to verify such. Your program should skip blank lines (ignore them entirely) and report any other issues with the file by printing to the screen (but still load all properly formatted entries). For instance, loading this file might result in the following output.
Catalog input errors:
line 4: malformed song information
line 10: malformed song information
1.2 Menu
Tip: sys.stdin.readline() is a useful technique for reading from the "user" (standard input) and it works in both Python 2.7 and Python 3.
After loading the catalog, the user must be presented with the following menu of options. Each of the first four options is discussed in the following parts. If the user chooses to quit (or enters end-of-file), then your program should terminate. After each option (aside from quitting) is processed, the user is again presented with the menu.
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection:
Invalid Input: If the user enters anything other than one of the supported options, your program should note the error and prompt again as demonstrated below.
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: burrito
Invalid option
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection:
1.3 Print catalog
Selecting the "Print Catalog" option should result in the song information for each item in the catalog being printed to the screen in a style similar to that of the input file, but prefixed by the song number and in an order as determined by the most recent sort (see below). Each song is assigned a number as it is added to the list; on load, these numbers correspond to position within the input file (but accounting for blank lines and poorly formatted entries). For the sample file above, one would see the following.
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 1
0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits
1--Grazed Knees--Snow Patrol--Final Straw
2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming
3--Stockholm Syndrome--Muse--Absolution
4--Dust in the Wind--Kansas--Kansas
5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip
6--Nothingman--Pearl Jam--Vitalogy
7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1)
8--The Lowering--The Avett Brothers--Four Thieves Gone
1.4 Song Information
Selecting the "Song Information" option should result in an additional prompt for the song number. After the song number is given, the program should print the attributes of the selected song as below. An invalid song number should generate an appropriate error message.
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 2
Enter song number: 3
Song information ...
Number: 3
Title: Stockholm Syndrome
Artist: Muse
Album: Absolution
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 2
Enter song number: 8
Song information ...
Number: 8
Title: The Lowering
Artist: The Avett Brothers
Album: Four Thieves Gone
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 2
Enter song number: 99
... Invalid song number
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection:
1.5 Sort
Selecting the "Sort" option should prompt the user for the primary sort key and then, if a valid option is given, sort the list using that key (see the discussion of sorting under the required list operations below). The results of the sort will be visible only when the catalog is printed.
Note: Sorting the catalog list itself will complicate the implementation of the "Song Information" menu option (at the very least it will be detrimental to performance). Consider using a separate list for sorting/displaying the catalog.
Tip: Do not write a separate sort for each key. Instead, generalize your sort to take a comparison function as an argument (some, like those working in Java, refer to this as "comparator"). You can then pass the appropriate function to the sort based on the user’s choice and use this function to compare two songs within the sort implementation.
1.5.1 Sort Key Comparison Precedence
Of course, many songs have the same title (crazy, right?) and, obviously, some are by the same artist and on the same album (are there multiple songs with the same title by the same artist on the same album?), so the order of some songs with an equal sort key may vary depending on the sort algorithm used. To reduce this variance, and to make your sort slightly more interesting, we will define a key comparison precedence of artist -> album -> title -> number.
What this means is that, for two songs being compared, if the artists match, then the songs are ordered by the albums. If the albums match, then the songs are ordered by the song titles. The song number is used to break the extraordinarily rare tie by using a known unique value. When the user specifies the primary key to use, that key is moved to the front of the key comparison precedence (i.e., sorting by album will give a comparison precedence of album -> artist -> title -> number).
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 3
Sort songs
0) By Number
1) By Title
2) By Artist
3) By Album
Sort by: 2
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 1
2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming
4--Dust in the Wind--Kansas--Kansas
5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip
3--Stockholm Syndrome--Muse--Absolution
7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1)
6--Nothingman--Pearl Jam--Vitalogy
0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits
1--Grazed Knees--Snow Patrol--Final Straw
8--The Lowering--The Avett Brothers--Four Thieves Gone
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 3
Sort songs
0) By Number
1) By Title
2) By Artist
3) By Album
Sort by: title
... Invalid sort option
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection:
1.6 Add Songs
While a static catalog can be useful, the ability to extend the catalog will make your application more similar to an actual media player. A media player application might support adding songs (or general media) by downloading, by ripping physical media, or by adding existing files from a directory. For this program, you will implement a simpler approach that still mimics the behavior required in terms of your catalog (and display) lists.
Selecting the "Add Songs" option should result in an additional prompt for the name of a song listing file. This file is expected to be formatted the same as the initial catalog file. Your program must read the contents of this file (reporting errors as before) and add each valid song to the catalog (the song numbers for these new song must be contiguous, must be in order of entry in the file, and must continue from the last used number in the existing catalog). The addition of new songs must preserve the sort order most recently selected by the user (i.e., if the user sorted by title before adding new songs, then printing after adding will display the songs in order by title).
If the provided file cannot be opened, print an appropriate error message and return to the main menu.
Note: there are multiple approaches to implementing this feature. You are encouraged to explore different approaches with consideration for run-time complexity. One approach that is explicitly discouraged is reading through the file multiple times (to, for instance, determine the number of entries).
The example that follows adds the following songs. Notice that the song numbers are based on the order of the entries in the file, but that printing the catalog continues to display the songs in order by song title (the most recent sort).
Oceanside--The Decemberists--5 Songs
Shiny--The Decemberists--5 Songs
My Mother Was A Chinese Trapese Artist--The Decemberists--5 Songs
Angel, Won't You Call Me?--The Decemberists--5 Songs
I Don't Mind--The Decemberists--5 Songs
Apology Song--The Decemberists--5 Songs
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 3
Sort songs
0) By Number
1) By Title
2) By Artist
3) By Album
Sort by: 1
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 1
0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits
4--Dust in the Wind--Kansas--Kansas
1--Grazed Knees--Snow Patrol--Final Straw
5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip
6--Nothingman--Pearl Jam--Vitalogy
3--Stockholm Syndrome--Muse--Absolution
2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming
8--The Lowering--The Avett Brothers--Four Thieves Gone
7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1)
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 4
Enter name of file to load: new_songs
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 1
12--Angel, Won't You Call Me?--The Decemberists--5 Songs
14--Apology Song--The Decemberists--5 Songs
0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits
4--Dust in the Wind--Kansas--Kansas
1--Grazed Knees--Snow Patrol--Final Straw
13--I Don't Mind--The Decemberists--5 Songs
11--My Mother Was A Chinese Trapese Artist--The Decemberists--5 Songs
5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip
6--Nothingman--Pearl Jam--Vitalogy
9--Oceanside--The Decemberists--5 Songs
10--Shiny--The Decemberists--5 Songs
3--Stockholm Syndrome--Muse--Absolution
2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming
8--The Lowering--The Avett Brothers--Four Thieves Gone
7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1)
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection: 4
Enter name of file to load: fe
'fe': No such file or directory
Song Catalog
1) Print Catalog
2) Song Information
3) Sort
4) Add Songs
0) Quit
Enter selection:
2 Required List Operations
The following operations must be added as functions to your existing list implementations. You must test these operations separately from your music catalog application (i.e., you must write unit tests for these operations). These tests must be placed in the files named linked_list_tests.py and array_list_tests.py.
2.1 foreach
You will discover, or may have already discovered, that accessing and "processing" (e.g., printing) every element in a list is a relatively common operation, but doing so via the get function is particularly inefficient for long linked lists. To alleviate this inefficiency, you will add a foreach function to your list implementations.
This function takes a List and a function as arguments and applies the provided function to the value at each position in the List (from left-to-right). This function does not return a meaningful value (i.e., in Python it returns None).
2.2 sort
This function takes a List and a "less-than" function as arguments and sorts the List such that the elements are in ascending order as determined by the "less-than" function.
This function must return the sorted list. Mutation of the original list is as specified by instructor.
Requirements: The sort for your array list implementation must be either insertion sort or selection sort (you cannot use Python’s builtin sort functions). The sort for your linked list implementation must be an insertion sort into a new list (at least internally; again, mutation is as specified by instructor).
3 Empirical Study
This part of the project requires that you measure the performance of various implementations of list operations. Such measurements could be performed by timing their use in the catalog application, but doing so can skew the results (i.e., measuring the time to build the catalog list will also include the time to read from the catalog file). Instead, you will use the timeit library (an example is given below).
List Lengths: the execution time of each of your experiments will depend on the length of the corresponding lists. You must sample a range of list lengths appropriate to the experiment. For example, you might measure the performance of foreach on lists from length 10 to length 10000 in increments of 100.
Note that in order to allow a recursion depth greater than 1000, you will need to add code (just once) that instructs python to enable this. For instance:
import sys
sys.setrecursionlimit(10100)
The following shows an example use of timeit to measure the time for 10000 executions of the foreach function over a list of 1000 elements.
import timeit
import array_list
import random
# print out the timing results
def print_timing(desc, iterations, seconds):
print('{}\n\t{} iterations in {} seconds'.format(desc, iterations, seconds))
# build a list for this test ... the type of list is passed as module
def build_list(n, module, max=10000):
list = module.empty_list()
for pos in range(n):
list = module.add(list, pos, random.randrange(max))
return list
# the function passed to foreach must take an argument, but we
# don't really want to do anything with it for this experiment
def noop(value):
pass
# timeit expects that the function passed will take no arguments, so
# this function gathers the arguments and returns a new function that
# uses them, but that itself does not take any arguments
def build_operation(list):
def run_foreach():
array_list.foreach(list, noop)
return run_foreach
def run_one_experiment(num_elements, num_iterations, module):
list = build_list(num_elements, module)
to_run = build_operation(list)
seconds = timeit.timeit(to_run, number=num_iterations)
print_timing('{}.foreach identity: {} elements'.format(module.__name__,
num_elements), num_iterations, seconds)
# let's try just one experiment for now
run_one_experiment(1000, 10000, array_list)
4 Measurements
4.1 Linked List
• Measure the use of foreach vs. get to access each element in the linked list. Nothing should be done with the elements (it’s the access time that you want to measure). Vary the list sizes as discussed previously.
4.2 Array List
• Measure the time to build an array list under three different growth strategies. One strategy must be to increase the "array" size by one on each add. The other two must amortize the cost of adds by allocating, as needed, more space than necessary to enable the add. Do not trim the "array" to the needed size after building the list and do not pre-allocate the known needed amount.
More specifically, the list should be built by adding new elements at the end of the list. Vary the size of the list to be built.
Your report should include a graph of both the execution time and the total allocated space for the "array" after the list has been built. Your report must explicitly discuss the different growth strategies used.
4.2.1 Array List vs. Linked List
Consider each of these experiments as framed in the context of the music catalog application, but without a requirement that your lists contain songs (i.e., the content type of your list is somewhat irrelevant).
• Compare "Load" times — compare the time to build lists of each type (plot the time vs. the length for each list type). Use the "best" approach to building an array list based on the earlier experiment. You can build each list in whichever order you feel appropriate (i.e., adding at the front or adding at the end, since it turns out that reversing a list, if needed, is relatively quick (do not include such a reverse in your experiment)).
• Compare "Sort" times.
Note that sorting an already sorted list can affect the performance of some sorting algorithms. As such, you will not want to run multiple iterations of sort on the same list. You can either rebuild the list for each iteration (which will, itself, skew the timing because the building time is now measured) or you can rerun the entire experiment multiple times (with the same random seed) and then average the results.
• Compare "Song Information" times — compare the time required to access various indices (i.e., random access via get) for each type of list. Be sure to use the same access pattern for each list; for instance, if you access 50 elements randomly, then use the same seed for the random number generator for each type of list in your experiments.
• Compare "Add Songs" times — for a reasonably long initial list (e.g., 50,000 elements), add (in sorted order as in the music catalog application) a new list of songs (varying the lengths over the set of experiments).
◦ Measure for a "realistic" list of new songs (i.e., where the new songs are to be added at arbitrary positions within the list)
◦ Measure with a more "contrived" example where the new songs will be added at the beginning of the list
5 Handin
You must submit the following. All but the last (study.pdf) are due on the given deadline. The study portion is due on the following Wednesday, by adding a file to your repository.
• linked_list.py — Linked List implementation
• linked_list_tests.py — Linked List test cases
• array_list.py — Array List implementation
• array_list_tests.py — Array List test cases
• music_catalog.py — entry point for Music Catalog application
• study.pdf — empirical study report
6 Split Deadline
This assignment has a split deadline. The code for the music application, including all of its libraries, is due on the stated deadline. However, the study (study.pdf) is due on the following Wednesday. Submit it by pushing your repo with the extra file.