$24
Task:
This is for the parallel version of the Floyd’s all-pairs shortest path program, using a 2D block decomposition!
Write the following programs:
make-graph -n 100 -r 100 -p 150 -o output_file
This program writes to output_file a binary file with an n by n matrix representing a graph of n nodes, with directional edge weights 0 to r and a probability of an edge of r/p. Non-edges are represented with -1. Data are integers and the number n is stored as the first word of the file. Hint: for each edge, generate a random number u = 0..p if u <= r then u is the edge weight, otherwise it is -1 (no edge).
print-graph input_file
Reads a graph file created by program in step 1) and prints in ascii.
floyd-serial input_file output_file
Reads a graph file, computes Floyd's shortest pairs, writes resulting shortest path graph, to be stored in the same format.
mpirun -np __ floyd-parallel input_file output_file
Reads a graph file, computes Floyd's shortest pairs, writes resulting shortest path graph. It does so using a 2D checkerboard decomposition. You will need to use row and column MPI communicators, and most likely want to use an MPI Cartesian topology to set up this algorithm.
You should have a file named graph.c with the following functions:
read_graph(char *file_name, int *n, int ***A); write_graph (char *file_name, int n, int **A); print_graph(int n, int **A);
All of your programs should use this (it is ok if you use modified versions of the author’s read, print and write matrix, to make it happen in 2D checkerboard parallel). Your matrix should be represented in memory as presented in class and in the book with an array of int and an array of pointers to int. (The strategy discussed in the textbook)
Source files:
make-graph.c – includes main.c, argument processing, and graph creation, calls write_graph() print-graph.c – includes main.c argument processing, calls read_graph() and print_graph() floyd-serial.c – includes main.c, Floyd serial function, calls read_graph() and write_graph() floyd-parallel.c – includes main.c Floyd parallel function, calls read_graph() and write_graph() graph.c – listed above plus any other helper functions needed by more than one program graph.h – headers for contents of graph.c
+author’s sources if needed
All your source files should be in a single directory. The name of that directory should be first_last_HW07. Inside that directory, you can have the files called what you want, but the graph.c and graph.h have to be there, along with all of the source and a Makefile to compile down to the executables mentioned above. What I’m saying is that there should be a Makefile, where I can simply type make, and everything will be compiled and linked down to what is specified. Also, all of your source files (mains, source, and headers) should have a comment block at the top that has your information. Also, all programs must be documented well.
The zip (or .tar.gz) should have the same name at the underlying directory. Upload on BB by the specified time.
Project Report:
You’ll need to submit (in your zip file) a report that addresses various performance metrics of the code. I expect that you run your program on 1 (the sequential program), 2, 4, 8, 16, and 32 CPUs at least and try to get timings for 64 if possible. You should have two sets of timings, one for the entire execution time of the program (i.e. a start and end time at the very beginning and ending of the program) and also one that focuses just on the time taken in the Floyd’s functions call that does not include reading/writing files. You should choose n (number of nodes in the directed graph) to be fairly large, so that when you run the program on the largest number of CPUs, it still take at least 10 or 15 seconds, because otherwise it will run too quickly, and the timing measurements will not be trustworthy. Also, on the sequential version, you should run this on palmetto as well, using a PBS script (chunks=1:cpus=1), so that you can compare apples to apples (if you run it on the head node, other people are using that, and the timing measurement will not be accurate).
DONT WAIT UNTIL THE LAST MINUTE TO RUN THIS ON PALMETTO AND EXPECT TO GET IT ALL DONE IN A FEW HOURS!!!!!
After you collect these two sets of data, you should process it in a spreadsheet and create some graphs of, execution times, speedups, and efficiencies across the ranges of processors you used. For example, if you choose N=1000, 2000, and 3000, then you should have 2 sets of 3 curves on the speedup plot, 1 set of 3 for that uses the total execution times, and 1 set of 3 that uses just the floyd function call timings. If you need clarification on this, please let me know ASAP.
The file format should either be Word DOC, OpenOffice, or PDF. (do not submit the spreadsheet, just a doc with the plots in it. Also in the report, note your observations.
IMPORTANT:
This whole project has to be immaculate, i.e. all the error checking should be done, it needs to be structured programming as much as possible. No warnings,. It is important that I be able to easily compile the whole thing in one command. Neat, organized, and consistent to the interface provided above, and for the executables requested.
Command-line arguments should be parsed using getopt() as was the case with HW04.
If you have questions, please let me know ASAP.
Example execution:
$ ./make- graph -n 5 -r 100 -p 150 -o default-make-graph-file.dat writting graph to file default-make-graph-file.dat
$ ./print-graph default-make-graph-file.dat
reading graph from file default-make-graph-file.dat
Array is a 5 x 5 matrix
|
0
1
2
3
4
0
|-------------------------
|
0
19
-1
68
79
1
|
32
0
-1
-1
-1
2
|
57
77
0
5
8
3
|
79
100
1
0
10
4
|
62
-1
88
-1
0
$ ./floyd-serial default-make-graph-file.dat default-make-graph-file.seq
reading graph from file default-make-graph-file.dat
writting graph to file
default-make-graph-file.seq
floyd-serial execution
time:
n = 5
nodes
p = 1
cpus
(sec)
ptime
= 1.000000
ftime
= 0.010000
(sec)
$ ./print-graph ./default-make-graph-file.seq
reading graph from file ./default-make-graph-file.seq Array is a 5 x 5 matrix
|
0
1
2
3
4
0
|-------------------------
|
0
19
69
68
77
1
|
32
0
101
100
109
2
|
57
76
0
5
8
3
|
58
77
1
0
9
4
|
62
81
88
93
0
$