$29
This assignment makes you implement slightly bigger programs compared to PA1 with the C pro-gramming language. There are three parts to this assignment. Part 2 is extra credit.
Part 1: Hexadoku (50 points)
In Part 1 of this assignment, you will write a C program that implements a simple Hexadoku solver (in the second part you can implement a complex Hexadoku solver for extra credit). Hexadoku is an extension of the logic puzzle Sudoku. The objective in Hexadoku is to ll a 16x16 gride of cells with each cell containing a hexadecimal digit (a number in the set 0-F) while satisfying the following constraints:
Each number is unique in its row and column.
Each number is unique in its subgrid where a subgrid is de ned by cutting the 16x16 grid into 16 non-overlapping 4x4 grids.
Each row, column, and subgrid have all numbers from a hexadecimal digit(0-F) present.
A Hexadoku is de ned as solved when all of its cells in the 16x16 grid are lled with numbers with each cell satisfying the constraints above. A Hexadoku is de ned as unsolvable when there is no con guration where all the cells can be lled without breaking the constraints. A Hexadoku will start partially completed with some numbers placed in the 16x16 grid. This will allow the solver to determine where to place new numbers in order to nd the solution for the given Hexadoku. For this part, we will only be looking at Hexadoku grids with one unique solution.
In the rst part, we will look at Hexadoku grids where there will always be a cell that you can ll with 100% certanity with an unique number according to the constraints described above.
Part 2: Extra Credit (25 points)
It is not always possible to ll at least one element with 100% certainity in each step for many Hexadoku grids. Sometimes the next step to be taken cannot be known and instead, a guess must be taken in order to move forward, such as perhaps lling in a cell randomly. For extra credit on this assignment, you can implement an algorithm that solves these more complex Hexadoku grids. A possible algorithm to do so would be to solve as much of the Hexadoku as can be done so con dently. Once no more nodes can be lled with 100% certainty, continue by taking the rst unsolved cell (let’s call this cell A) and lling it with a number that satis es the constraints and continue solving as usual from there. If a solution is found, then you are done. Otherwise, if no moves can be taken and the board has not yet been lled, then you must backtrack to the state where you guessed the value of cell A and either guess a new value for cell A or perhaps choose a new cell altogether. This is a simple backtracking algorithm. There are many algorithms to solve these complex Hexadoku. You can implement any algorithm that produces a solution.
1
Input format for Part 1 and Part 2
Your program will read in the data from a le given to the program as an argument from the command line. The le format will be a 16x16 grid with each number in the same row separated by a tab. Blank cells will be represented with a ‘-’.
An example Hexadoku grid (lets call it test1.text) is given below:
E -
5
2
6
1
A B
3
7
D F
- C
0
-
F 4
0
C
9
7
D 3
2
1
B
5
6
A
8
E
A 7
6
- 5
E
F 4
8
-
0
C
2
-
-
B
- 1
3
B 0
8
C 2
A
4
E
-
7
5
F
D
4
0
-9B3
8
- E -
F A
1
6
7
2
B E 2
- 7
F
4
9
1
3
6
0
C 8
-
-
3
5
F
1
E
-
6
-
7
8
C
2
B 4
9
A
- 6
C
7
A
5
2
1
D
B
4
9
F 3 E
0
C D
8
0
F
B
- E
9
6
A
4
- 7
-
-
7
A
E
3
-
6
9
8
B
C 5
D
0
F
4
1
6
F
9
4
D
-
3
5
0
2
-
-
- B
-
8
1
-
B
5
C
4
7
0
F
E
3
8
A D 6
9
2
B
4
F
8
D
5
7
-
0
9
1
3
E
A
6
- 3
7
6
1
2
0
F
4
A
8
E
D 9 B
C
0
9
A
E
4
C
B 6
5
D
7
3
8
-
1
F
D C
1
8
3
9
- A
6
F 2
-
4
0
5
-
You can assume that all inputs will be valid meaning the spacing will be correct and only the ‘-’ character and numbers 0-F will be present in the test les. The preset numbers in the test may not satisfy the constraints of Hexadoku resulting in an unsolvable Hexadoku.
Output format:
For a Hexadoku grid that has a solution, you should print out the 16x16 grid with each number in the same row separated by a tab and with each row on a new line. There should be no trailing spaces or tabs on any line of output.
A sample execution is given below:
./first test1.txt
E 8
5
2
6
1
AB37DF
9
C
0
4
F 4
0
C
9
7
D 3
2
1
B 5
6
A 8
E
A 7
6
D
5
E
F4890C
2
1
3
B
9
1
3
B
0
8
C2A4E6
7
5
F
D
4
0
D
9
B
3
8
CE5FA
1
6
7
2
B E
2
A
7
F
4
9
1
3
6
0
C 8
D
5
3
5
F
1
E
0
6
D78C2
B 4
9
A
8
6
C
7
A
5
2
1
DB49
F 3
E
0
C D
8
0
F
B
1
E96A4
5
7
2
3
7
A
E
3
2
6
9
8
BC5D
0
F
4
1
6
F
9
4
D
A
3
5
0
2
1
7
E B
C
8
1
2
B
5
C
4
7
0
FE38
A D
6
9
2
B4F8D
5
7
C
0
9
1
3
E A 6
5
3
7
6
1
2
0
F4A8E
D 9
B
C
0
9
A
E
4
C
B65D73
8
2
1
F
D C
1
8
3
9
EA6F2
B
4
0
5
7
For a Hexadoku grid that has no solution, your program should print \no-solution" and nothing else. For part 1, Hexadoku grids that have a solution but cannot be found without non-deterministic guesses are considered unsolvable.
2
Part 3: One-Shot Learning (50 points)
In the third part, you will write a C program that implements simple \one-shot" machine learning algorithm for predicting house prices in your area.
There is signi cant hype and excitement around arti cial intelligence (AI) and machine learning. CS 211 students will get a glimpse of AI/ML by implementing a simple machine learning algorithm to predict house prices based on historical data.
For example, the price of the house (y) can depend on certain attributes of the house: number of bedrooms (x1), total size of the house (x2), number of baths (x3), and the year the house was built (x4). Then, the price of the house can be computed by the following equation:
y = w0 + w1:x1 + w2:x2 + w3:x3 + w4:x4
(1)
Given a house, we know the attributes of the house (i.e., x1, x2, x3, x4). However, we don’t know the weights for these attributes: w0, w1, w2, w3 and w4. The goal of the machine learning algorithm in our context is to learn the weights for the attributes of the house from lots of training data.
Let’s say we have N examples in your training data set that provide the values of the attributes and the price. Let’s say there are K attributes. We can represent the attributes from all the examples in the training data as a Nx(K + 1) matrix as follows, which we call X:
[
1, x0;1 , x0;2 , x0;3 , x0;4
1, x1;1 , x1;2 , x1;3 , x1;4
1, x2;1 , x2;2 , x2;3 , x2;4
1, x3;1 , x3;2 , x3;3 , x3;4
..
1, xn;1 , xn;2 , xn;3 , xn;4
]
where n is N 1. We can represent the prices of the house from the examples in the training data as a Nx1 matrix, which we call Y .
[
y0
y1
..
yn
]
Similarly, we can represent the weights for each attribute as a (K + 1)x1 matrix, which we call W .
[
w0
w1
..
wk
]
The goal of our machine learning algorithm is to learn this matrix from the training data.
Now in the matrix notation, entire learning process can be represented by the following equation, where X, Y , and W are matrices as described above.
X:W =Y
(2)
Using the training data, we can learn the weights using the below equation:
W = (XT :X) 1:XT :Y
(3)
where XT is the transpose of the matrix X, (XT :X) 1 is the inverse of the matrix XT :X.
Your main task in this part to implement a program to read the training data and learn the weights for each of the attributes. You have to implement functions to multiply matrices,
3
transpose matrices, and compute the inverses of the matrix. You will use the learned weights to predict the house prices for the examples in the test data set.
Want to learn more about One-shot Learning? The theory behind this learning is not important for the purposes of this class. The algorithm you are implementing is known as linear regression with least square error as the error measure. The matrix ((XT :X) 1:X T ) is also known as the pseudo-inverse of matrix X. If you are curious, you can learn more about this algorithm at https://www.youtube. com/watch?v=FIbVs5GbBlQ&hd=1.
Computing the Inverse using Gauss-Jordan Elimination
To compute the weights above, your program has to compute the inverse of matrix. There are numerous methods to compute the inverse of a matrix. We want you to implement a speci c method for computing the inverse of a matrix known as Guass-Jordan elimination, which is described below. If you compute inverse using any other method, you will risk losing all points for this part.
An inverse of a square matrix A is another square matrix B, such that A:B = B:A = I, where I is the identity matrix.
Gauss-Jordan Elimination for computing inverses
Below, we give a sketch of Gauss-Jordan elimination method. Given a matrix A whose inverse needs to be computed, you create a new matrix Aaug, which is called the augmented matrix of A, by concatenating identity matrix with A as shown below.
Let say matrix A, whose inverse you want to compute is shown below:
[
1 2 4
1 6 7
1 3 2
]
The augmented matrix (Aaug) of A is:
[
1
2
4
1
0
0
1
6
7
0
1
0
1
3
2
0
0
1
]
The augmented matrix essentially has the original matrix and the identity matrix. Next, we perform row operations on the augmented matrix so that the original matrix part of the augmented matrix turns into an identity matrix.
The valid row operations to compute the inverse (for this assignment) are:
You can divide the entire row by a constant You can subtract a row by another row
You can subtract a row by another row multiplied by a constant
However, you are not allowed to swap the rows. In the traditional Gauss-Jordan elimination method you are allowed to swap the rows. For simplicity, we do not allow you to swap the rows.
Let’s see this method with the above augmented matrix Aaug.
Our goal is to transform A part of the augmented matrix into an identity matrix.
Since Aaug[1][0]! = 0, we will subtract the rst row from the second row because we want to make Aaug[1][0] = 0. Hence, we perform the operation R1 = R1 R0, where R1 and R0 represents the second and rst row of the augmented matrix. Augmented matrix Aaug after R1 = R1 R0
4
[
1
2
4
1
0
0
0
4
3
1
1
0
1
3
2
0
0
1
]
Now we want to make Aaug[1][1] = 1. Hence, we perform the operation R1 = R1=4. The augmented matrix Aaug after R1 = R1=4 is:
[
1
2
4
1
0
0
0
1
3
1
1
0
4
4
4
1
3
2
0
0
1
]
Next, we want to make Aaug[2][0] = 0.
Hence, we perform the operation R2 = R2
R0. The
augmented matrix Aaug after R2 = R2
R0 is:
[
1
2
4
1
0
0
0
1
3
1
1
0
4
4
4
0
1
-2
-1
0
1
]
Next, we want to make Aaug[2][1] = 0.
Hence, we perform the operation R2 = R2
R1. The
augmented matrix Aaug after R2 = R2
R1 is:
[
1
2
4
1
0
0
0
1
3
1
1
0
4
4
4
0
0
11
3
1
1
4
4
4
]
Now, we want to make Aaug[2][2] = 1, Hence, we perform the operation R3 = R3 114 . Then, Aaug is:
[
1
2
4
1
0
0
0
1
3
1
1
0
4
4
4
0
0
1
3
1
4
11
11
11
]
Next, we want to make Aaug[1; 2] = 0, Hence, we perform the operation R1 = R1
3
R2. Then,
4
Aaug is:
[
1
2
4
1
0
0
0
1
0
5
2
3
11
11
11
0
0
1
3
1
4
11
11
11
]
Next, we want to make Aaug[0; 2] = 0, Hence, we perform the operation R0 = R0
4 R2. Then,
Aaug is:
[
1
2
0
1
4
16
11
11
11
0
1
0
5
2
3
11
11
11
0
0
1
3
1
4
11
11
11
]
5
Next, we want to make Aaug[0; 1] = 0, Hence, we perform the operation R0 = R0 2 R1. Then,
Aaug is:
[
1
0
0
9
8
10
11
11
11
0
1
0
5
2
3
11
11
11
0
0
1
3
1
4
11
11
11
]
At this time, the A part of the augmented matrix is an identity matrix. Hence, the inverse of A matrix is:
[
9
8
10
11
11
11
5
2
3
11
11
11
3
1
4
11
11
11
]
Your goal is to write a program to compute the inverse of a matrix to perform one-shot learning.
Input/Output speci cation
Usage interface
Your program for this part will be executed as follows:
./third <train-data-file-name <test-data-file-name
where <train-data-file-name is the name of the training data le with attributes and price of the house. You can assume that the training data le will exist and that it is well structured. The <test-data-file-name is the name of the test data le with attributes of the house. You have to predict the price of the house for each entry in the test data le.
Input speci cation
The input to the program will be a training data le and a test data le.
Structure of the training data le
The rst line in the training le will be an integer that provides the number of attributes (K) in the training set. The second line in the training data le will be an integer (N) providing the number of training examples in the training data set. The next N lines represent the N training examples. Each line for the example will be a list of comma-separated double precision oating point values. The rst K double precision values represent the values for the attributes of the house. The last double precision value in the line represents the price of the house.
An example training data le (train1.txt) is shown below:
4
7
3.000000,1.000000,1180.000000,1955.000000,221900.000000
3.000000,2.250000,2570.000000,1951.000000,538000.000000
2.000000,1.000000,770.000000,1933.000000,180000.000000
4.000000,3.000000,1960.000000,1965.000000,604000.000000
3.000000,2.000000,1680.000000,1987.000000,510000.000000
4.000000,4.500000,5420.000000,2001.000000,1230000.000000
3.000000,2.250000,1715.000000,1995.000000,257500.000000
6
In the example above, there are 4 attributes and 7 training data examples. Each example has values for the attributes and last value is the price of the house. To illustrate, consider the training example below
3.000000,1.000000,1180.000000,1955.000000,221900.000000
The rst attribute has value 3.000000, the second attribute has value 1.000000, third attribute has value 1180.000000, and the fourth attribute has value 1955.000000. The price of the house for these set of attributes is provided as the last value in the line: 221900.000000
Structure of the test data le
The rst line in the training le will be an integer (M) that provides the number of test data points in the le. Each line will have K attributes. The value of K is de ned in the training data le. Your goal is predict the price of house for each line in the test data le. The next M lines represent the M test points for which you have to predict the price of the house. Each line will be a list of comma-separated double precision oating point values. There will be K double precision values that represent the values for the attributes of the house.
An example test data le (test1.txt) is shown below:
2
3.000000,2.500000,3560.000000,1965.000000
2.000000,1.000000,1160.000000,1942.000000
It indicates that you have to predict the price of the house using your training data for 2 houses. The attributes of each house is listed in the subsequent lines.
Output speci cation
Your program should print the price of the house for each line in the test data le. Your program should not produce any additional output. If the price of the house is a fractional value, then your program should round it to the nearest integer, which you can accomplish with the following printf statement:
printf("%0.0lf\n", value);
where value is the price of the house and its type is double in C.
Your program should predict the price of the entry in the test data le by substituting the attributes and the weights (learned from the training data set) in Equation (1).
A sample output of the execution when you execute your program as shown below,
./third train1.txt test1.txt
should be
737861
203060
Structure of your submission folder
All les must be included in the pa2 folder. The pa2 directory in your tar le must contain 2 directories or 3 directories (if decide to do the extra credit part). The name of the directories should be named rst through third (in lower case). Each directory should contain a c source le, a header le(if you use it), and a Make le. For example, the subdirectory rst will contain, rst.c, rst.h (if you create one), and Make le (the names are case sensitive).
7
Hints and suggestions
You are allowed to use functions from standard libraries but you cannot use third-party libraries downloaded from the Internet (or from anywhere else). If you are unsure whether you can use something, ask us.
We will compile and test your program on the iLab machines so you should make sure that your program compiles and runs correctly on these machines. You must compile all C code using the gcc compiler with the -Wall -Werror -fsanitize=address -std=c11 ags.
You should test your program with the autograder provided with the assignment.
Submission
You have to e-submit the assignment using Sakai. Your submission should be a tar le named pa2.tar. To create this le, put everything (three folders) that you are submitting into a directory named pa2. Then, cd into the directory containing pa2 (that is, pa2’s parent directory) and run the following com-mand:
tar cvf pa2.tar pa2
To check that you have correctly created the tar le, you should copy it (pa2.tar) into an empty directory and run the following command:
tar xvf pa2.tar
This should create a directory named pa2 in the (previously) empty directory.
Grading guidelines
The grading will be automatically graded using the autograder.
Automated grading phase
This phase will be based on programmatic checking of your program using the autograder. We will build a binary using the Make le and source code that you submit, and then test the binary for correct functionality and e ciency against a set of inputs.
We should be able build your program by just running make.
Your program should follow the format speci ed above for both both the parts.
Your program should strictly follow the input and output speci cations mentioned. Note: This is perhaps the most important guideline: failing to follow it might result in you losing all or most of your points for this assignment. Make sure your program’s output format is exactly as speci ed. Any deviation will cause the automated grader to mark your output as \incorrect". REQUESTS FOR RE-EVALUATIONS OF PROGRAMS REJECTED DUE TO IMPROPER FORMAT WILL NOT BE ENTERTAINED.
We will check all solutions pair-wise from all sections of this course to detect cheating using moss software and related tools. If two submissions are found to be similar, they will instantly be awarded zero points and reported to o ce of student conduct. See Rutgers CS’s aca-demic integrity policy at: https://www.cs.rutgers.edu/academic-integrity/introduction.
8
Autograder
We provide the AutoGrader to test your assignment. We provide public testcases and expected solution in the autograder for rst and third parts. We provide only the public test cases for the second part, which is the extra credit part of the problem. Multiple solutions are possible for the second part.
AutoGrader is provided as pa2 autograder.tar. Executing the following command will create the pa2 autograder folder.
tar xvf pa2_autograder.tar
There are two modes available for testing your assignment with the PA2 AutoGrader.
First mode
Testing when you are writing code with a pa2 folder
Lets say you have a pa2 folder with the directory structure as described in the assignment.
Copy the folder to the directory of the pa2 autograder
Run the pa2 autograder with the following command python pa2 autograder.py
It will run the test cases and print your scores.
Second mode
This mode is to test your nal submission (i.e, pa2.tar)
Copy pa2.tar to the pa2 autograder directory
Run the pa2 autograder.py with pa2.tar as the argument. The command line is: python pa2 autograder.py pa2.tar
9