HW05: Decision Tree Regression Solution

In this homework, you will implement a decision tree regression algorithm in R, Matlab, or Python. Here are the steps you need to follow:




You are given a univariate regression data set, which contains 133 data points, in the file named hw05_data_set.csv. Divide the data set into two parts by assigning the first 100 data points to the training set and the remaining 33 data points to the test set.
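The split described above can be sketched in Python as follows. This is a minimal sketch, not a required implementation: the function names `load_xy` and `split_train_test` are hypothetical, and it assumes the CSV file has a header row followed by two numeric columns (x first, then y), which the assignment text does not state.

```python
import csv


def load_xy(path, has_header=True):
    """Read a two-column CSV into parallel lists of floats.

    Assumption: column 0 is x and column 1 is y; the header row is
    skipped when has_header is True.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader)
        rows = [(float(r[0]), float(r[1])) for r in reader if r]
    return [x for x, _ in rows], [y for _, y in rows]


def split_train_test(xs, ys, n_train=100):
    """Assign the first n_train points to training and the rest to test."""
    return (xs[:n_train], ys[:n_train]), (xs[n_train:], ys[n_train:])


# Typical use (not executed here because it needs the data file):
# xs, ys = load_xy("hw05_data_set.csv")
# (x_train, y_train), (x_test, y_test) = split_train_test(xs, ys)
```

The split is positional rather than random because the assignment asks for the first 100 points as training data.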



Implement a decision tree regression algorithm using the following pre-pruning rule: If a node has P or fewer data points, convert this node into a terminal node and do not split further, where P is a user-defined parameter.
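One way to realize this pre-pruning rule is sketched below in plain Python. Several choices here are assumptions that the assignment does not specify: candidate splits are midpoints between consecutive distinct x values, the best split minimizes the summed squared error of the two children, a terminal node predicts the mean of its y values, and the tree is stored as nested dictionaries. The names `fit_tree` and `predict` are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def _sse(values):
    """Sum of squared deviations from the mean."""
    m = _mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(xs, ys, P):
    """Grow a regression tree on univariate data with pre-pruning parameter P."""
    # Pre-pruning rule: a node with P or fewer points becomes terminal.
    if len(ys) <= P:
        return {"value": _mean(ys)}

    # Assumption: candidate splits are midpoints of consecutive distinct x values.
    uniq = sorted(set(xs))
    best_score, best_split = None, None
    for lo, hi in zip(uniq, uniq[1:]):
        split = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= split]
        right = [y for x, y in zip(xs, ys) if x > split]
        if not right:  # guards against a midpoint that rounds up to hi
            continue
        score = _sse(left) + _sse(right)
        if best_score is None or score < best_score:
            best_score, best_split = score, split

    # No usable split (all x values identical): make the node terminal.
    if best_split is None:
        return {"value": _mean(ys)}

    left_pts = [(x, y) for x, y in zip(xs, ys) if x <= best_split]
    right_pts = [(x, y) for x, y in zip(xs, ys) if x > best_split]
    return {
        "split": best_split,
        "left": fit_tree([x for x, _ in left_pts], [y for _, y in left_pts], P),
        "right": fit_tree([x for x, _ in right_pts], [y for _, y in right_pts], P),
    }


def predict(tree, x):
    """Walk from the root to a terminal node and return its mean."""
    while "value" not in tree:
        tree = tree["left"] if x <= tree["split"] else tree["right"]
    return tree["value"]
```

The exhaustive split search is quadratic in the node size, which is acceptable for 100 training points; a larger data set would call for sorting once and updating the error incrementally.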



Learn a decision tree by setting the pre-pruning parameter P to 10. Draw the training data points, the test data points, and your fit in the same figure. Your figure should be similar to the following figure.



[Figure: training data points, test data points, and the fitted decision tree for P = 10; x axis from 0 to 60, y axis from −100 to 50]




Calculate the root mean squared error (RMSE) for test data points. The formula for RMSE can be written as:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N_{test}} (y_i - \hat{y}_i)^2}{N_{test}}}$$




Your output should be similar to the following sentence.




RMSE is 27.6841 when P is 10
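The RMSE computation and the output sentence can be sketched as follows. The helper names `rmse` and `report` are hypothetical; the four-decimal formatting is inferred from the sample sentence above.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def report(value, P):
    """Format the result in the style of the sample output sentence."""
    return f"RMSE is {value:.4f} when P is {P}"
```

For the homework, `y_true` would hold the 33 test targets and `y_pred` the tree's predictions at the corresponding test inputs.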




Learn decision trees by setting the pre-pruning parameter P to 1, 2, 3, …, 20. Draw RMSE for test data points as a function of P. Your figure should be similar to the following figure.




























[Figure: RMSE for test data points as a function of P; P axis from 5 to 20, RMSE axis from 26 to 33]
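The sweep over P = 1, 2, 3, …, 20 can be sketched as a loop that refits the tree for each value and records the test RMSE. To keep the sketch independent of any particular tree implementation, it takes two hypothetical callables as arguments: `fit(x_train, y_train, P)` returning a model, and `predict(model, x)` returning a prediction for a single input. Both names and signatures are assumptions made for illustration.

```python
import math


def rmse_curve(fit, predict, x_train, y_train, x_test, y_test,
               P_values=range(1, 21)):
    """Return a list of (P, test RMSE) pairs, one per pre-pruning value."""
    curve = []
    for P in P_values:
        model = fit(x_train, y_train, P)  # learn a tree with this P
        preds = [predict(model, x) for x in x_test]
        squared = sum((y - p) ** 2 for y, p in zip(y_test, preds))
        curve.append((P, math.sqrt(squared / len(y_test))))
    return curve
```

The resulting pairs can be handed to any plotting routine, with P on the horizontal axis and RMSE on the vertical axis.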



What to submit: You need to submit your source code in a single file (.R file if you are using R, .m file if you are using Matlab, or .py file if you are using Python) and a short report explaining your approach (.doc, .docx, or .pdf file). Put these two files in a single zip file named STUDENTID.zip, where STUDENTID should be replaced with your 7-digit student number.




How to submit: E-mail the zip file you created to aghanem15@ku.edu.tr with the subject line Intro2MachineLearningHW05. Please use exactly this subject line, and do not send a zip file literally named STUDENTID.zip; replace STUDENTID with your student number. Submissions that do not follow these guidelines will not be graded.




Late submission policy: Late submissions will not be graded.




Cheating policy: Very similar submissions will not be graded.
