$29
Shared memory programming (Pthreads & openMP)
In this problem you are asked to write two parallel algorithms (one in Pthreads and the other in openMP) for solving a dense linear equations of the form A*x=b, where A is an n*n matrix and b is a vector. You will use Gaussian elimination without pivoting.
The algorithm has two parts:
Gaussian Elimination: the original system of equation is reduced to an upper triangular form Ux=y, where U is a matrix of size n*n in which all elements below the diagonal are zeros which are the modified values of the A matrix, and the diagonal elements have the value 1. The vector y is the modified version of the b vector when you do the updates on the matrix and the vector in the Gaussian elimination stage.
Back Substitution: the new system of equations is solved to obtain the values of x.
Normalization
0
Update
Values
The Gaussian elimination stage of the algorithm comprises (n-1) steps. In the algorithm, the ith step eliminates nonzero subdiagonal elements in column i by subtracting the ith row from row j in the range [i+1,n], in each case scaling the ith row by the factor Aji/ Aii so as to make the element Aji zero. See the figure.
Hence the algorithm sweeps down the matrix from the top left corner to the bottom right corner, leaving subdiagonal elements behind it.
The whole point of parallel programming is performance, so you will be graded partially on the efficiency of your algorithm. Therefore, once you achieve a correct solution, you should perform some experimentation with your programs to try to optimize its performance. You should hand in two working and documented programs. You should provide a basic correctness argument for your solution (arguing that the relevant dependencies and other issues are taken care
1
of by proper synchronization). In addition, you should describe some of the differ ent versions of your program that you tried, w hat performance results you got out of them, and why you think your current version is reaso nably efficient (or what more you could do to make it more efficient).
Suggestions:
Gaussian elimination in volves O(n3) operations. The back substitution stag e requires only O(n2) operations, so yo u are not expected to parallelize back substitution.
The algorithm should s cale, assuming n is much larger than the number of processors.
There should be clear comments at the beginning of the code explaining your algorithm.
Note on cheating: This is an individual assignment. Copying problem solutio ns or code is cheating. Both the person copy ing and the person giving the answers will be equally penalized. Make sure you do your own wo rk.
Requirements for uploading your assignment: Please put your source cod e (.c file) and document (.pdf file) into one .z ip or .tar file, and upload it on the Blackboard. Th e name of .zip or .tar should follow the format: CS546_LastName_FirstName_HW2. Also, please take a SCREENSHOT of your windo w and make sure it includes your username and hos tname on your terminal (see the picture below: username@hostname:~/Documents/CS546$, username and hostname are in the red box) , when you test your programs and put the scree nshot in your document.
2