2 Linear Regression Model Fitting (Programming) [24 points]
There are TWO portions to this section.
2.1 Coding
First, you must fill out the template code provided. This code will perform linear regression on a data file named "regression-data.txt", which is also provided.
(BTW: This coding assignment does not include validation or testing, which is bad practice, but we will do them later in the semester.)
Please submit your Python code as "linearRegression.py" and use Python 3.
Using the "numpy" arrays and functions from http://www.numpy.org is required. (BTW: numpy arrays and functions perform mathematical operations faster because they allow for vectorization and use optimized libraries. For this dataset size it is irrelevant, but for larger datasets it means the difference between waiting 1 hour and 4 days. So we are forcing you to use them as practice.)
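As a rough illustration of why vectorization matters (a minimal sketch, not part of the required template; the arrays and parameter names here are made up for the example):

    import numpy as np

    # Hypothetical data, just for this illustration.
    x = np.random.rand(100000)
    y = np.random.rand(100000)
    theta0, theta1 = 0.5, 2.0

    # Loop version: one Python-level operation per sample.
    loss_loop = 0.0
    for i in range(len(x)):
        loss_loop += (theta0 + theta1 * x[i] - y[i]) ** 2
    loss_loop /= len(x)

    # Vectorized version: the same MSE in a single numpy expression.
    loss_vec = np.mean((theta0 + theta1 * x - y) ** 2)

    print(loss_loop, loss_vec)  # the two values agree up to floating-point error

The vectorized line runs in optimized C inside numpy, while the loop pays Python interpreter overhead on every sample.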
2.2 Written Portion
Second, you must complete a written portion and turn it in as part of the PDF with the rest of the assignment. In the written portion you must describe what happens to the loss function per epoch as the learning rate changes, for both gradient descent and stochastic gradient descent, and explain why.
The function load_data_set() should also output a figure plotting the data; please submit the plot in the written part of the homework.
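A minimal sketch of what load_data_set() might look like, assuming regression-data.txt holds one whitespace-separated "x y" pair per line (your template's signature and the file's delimiter may differ):

    import numpy as np
    import matplotlib.pyplot as plt

    def load_data_set(filename="regression-data.txt"):
        # Assumes two columns: feature x and target y.
        data = np.loadtxt(filename)
        x, y = data[:, 0], data[:, 1]

        # Scatter plot of the raw data, to include in the written part.
        plt.scatter(x, y, s=10)
        plt.xlabel("x")
        plt.ylabel("y")
        plt.title("regression-data.txt")
        plt.show()

        return x, y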
For each optimization method you implement to learn the best LR line, please submit a figure showing the data samples and also draw the best-fit line that has just been learned. Please also include the concrete value of the derived theta in the written part of the homework.
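For example, once an optimizer has produced a parameter vector theta (assumed here to be [intercept, slope]; the helper name is illustrative, not from the template), the fit can be drawn over the data along these lines:

    import numpy as np
    import matplotlib.pyplot as plt

    def plot_fit(x, y, theta, title):
        # theta[0] is assumed to be the intercept, theta[1] the slope.
        plt.scatter(x, y, s=10, label="data")
        xs = np.linspace(x.min(), x.max(), 100)
        plt.plot(xs, theta[0] + theta[1] * xs, color="red", label="best-fit line")
        plt.title(title + ", theta = " + str(theta))
        plt.legend()
        plt.show()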
For each optimization method you implement to learn the best LR line, i.e., in your GD, SGD, or mini-batch SGD implementation (a rough sketch of the epoch loop appears after this list):
– The functions should output a figure with the x-axis showing the epoch number (the updating iteration t, where t counts the total number of times the entire training set has been iterated over) and the y-axis showing the mean training loss at that epoch.
– The functions should use mean squared error (MSE) as the loss function.
– The functions should stop when t reaches a predefined value t_max = 100;
– In the written part of the submission, you should discuss the behavior of the loss function when varying the value of the learning rate (for instance over the values {0.001, 0.005, 0.01, 0.05, 0.1, 0.3}).
– It is good practice to perform a random shuffle of your training samples before each epoch of (mini-batch) SGD;
– In mini-batch SGD, you should try to observe the convergence behavior by varying the size B you use for the mini-batch.
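The sketch below shows one way the epoch loop, the MSE loss tracking, and the per-epoch shuffle can fit together; it is only an illustration under the assumption that theta is [intercept, slope], and the template's actual function names and signatures may differ:

    import numpy as np

    def gradient_descent(x, y, lr=0.01, t_max=100):
        # Batch gradient descent for y ~ theta[0] + theta[1] * x with MSE loss.
        n = len(x)
        X = np.column_stack([np.ones(n), x])   # add a bias column
        theta = np.zeros(2)
        losses = []                            # mean training loss per epoch

        for t in range(t_max):                 # stop at the predefined t_max
            pred = X @ theta
            losses.append(np.mean((pred - y) ** 2))     # MSE at epoch t
            grad = 2.0 / n * X.T @ (pred - y)           # gradient of the MSE
            theta -= lr * grad

        return theta, losses

    def minibatch_sgd(x, y, lr=0.01, t_max=100, B=16):
        # Mini-batch SGD; B = len(x) recovers GD, B = 1 gives plain SGD.
        n = len(x)
        X = np.column_stack([np.ones(n), x])
        theta = np.zeros(2)
        losses = []

        for t in range(t_max):
            order = np.random.permutation(n)   # reshuffle before each epoch
            for start in range(0, n, B):
                idx = order[start:start + B]
                pred_b = X[idx] @ theta
                grad = 2.0 / len(idx) * X[idx].T @ (pred_b - y[idx])
                theta -= lr * grad
            losses.append(np.mean((X @ theta - y) ** 2))  # mean loss after the epoch
        return theta, losses

Plotting the returned losses against the epoch index, once per learning rate in the set above, produces the curves the written discussion asks about.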
2.3 Recommendations
Implement the code in the order that it is given in the template main function.
Shuffle the x and y before using stochastic gradient descent. (Make sure to shuffle them together.)
Zeros will not work as hyperparameters for the learning rate or the number of iterations.
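One common way to shuffle x and y together is to draw a single permutation and apply it to both arrays (a sketch; the array names are placeholders):

    import numpy as np

    # Hypothetical arrays standing in for the loaded data.
    x = np.arange(5, dtype=float)
    y = 2.0 * x + 1.0

    perm = np.random.permutation(len(x))        # one random ordering...
    x_shuffled, y_shuffled = x[perm], y[perm]   # ...applied to both arrays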
We will run "python3 linearRegression.py" and it should work!
A few useful links for using numpy arrays:
basics (more than we need): https://docs.scipy.org/doc/numpy/user/quickstart.html