Lab. 4 Airline Data (Big) Regression Analysis Solution

1. In this lab practical we will work with a large airline data set. Download two files from the lecture folder: (1) 2008.csv.bz2 and (2) Airline.desc. Decompress 2008.csv.bz2 to obtain the .csv file used in the experiments below.

    1. Compute the correlation coefficient between two variables from the csv file, taking X as Distance and Y as Airtime. Next, fit the simple regression line Y = β0 + β1 X: find the intercept β0 and the slope (coefficient) β1. Then find the RMSE between the original values y and the predicted values ŷ obtained from the derived β0 and β1.

    2. Compute 95% confidence intervals for the slope β1 and for the mean value of y at x0 = 1200.

    3. Use the bi-weighted robust least squares method to compute a more reliable intercept β0 and slope β1, which should be more robust than the previous values. Find the RMSE using the newly computed parameters. In bi-weighted robust least squares each data point is weighted by a weight w_i, where w_i = (1 - u_i^2)^2 when |u_i| <= 1 and w_i = 0 otherwise. Here u_i = d_i / (3s), where s is the interquartile range of the d_i and d_i = y_i - ŷ_i.
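
Parts 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not a definitive solution: the helper names fit_line and intervals are hypothetical, the data are synthetic stand-ins for Distance and Airtime, and the intervals use a normal quantile in place of the t quantile on the assumption that the sample is large.

```python
import numpy as np
from statistics import NormalDist


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x, plus correlation and RMSE."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    sxx = np.sum((x - x_mean) ** 2)
    syy = np.sum((y - y_mean) ** 2)
    sxy = np.sum((x - x_mean) * (y - y_mean))
    b1 = sxy / sxx                      # slope
    b0 = y_mean - b1 * x_mean           # intercept
    resid = y - (b0 + b1 * x)
    return {
        "r": sxy / np.sqrt(sxx * syy),  # Pearson correlation coefficient
        "b0": b0,
        "b1": b1,
        "rmse": np.sqrt(np.mean(resid ** 2)),
        "n": x.size,
        "x_mean": x_mean,
        "sxx": sxx,
        "sse": np.sum(resid ** 2),
    }


def intervals(fit, x0, level=0.95):
    """Confidence intervals for the slope and for the mean of y at x0."""
    n = fit["n"]
    s = np.sqrt(fit["sse"] / (n - 2))   # residual standard error
    # Assumption: n is large, so the normal quantile stands in for t(n - 2).
    q = NormalDist().inv_cdf(0.5 + level / 2)
    se_b1 = s / np.sqrt(fit["sxx"])
    y0 = fit["b0"] + fit["b1"] * x0
    se_y0 = s * np.sqrt(1 / n + (x0 - fit["x_mean"]) ** 2 / fit["sxx"])
    slope_ci = (fit["b1"] - q * se_b1, fit["b1"] + q * se_b1)
    mean_ci = (y0 - q * se_y0, y0 + q * se_y0)
    return slope_ci, mean_ci


# Synthetic stand-in for (Distance, Airtime); these numbers are made up.
rng = np.random.default_rng(0)
x = rng.uniform(100, 2500, size=5000)
y = 18 + 0.12 * x + rng.normal(0, 10, size=5000)

fit = fit_line(x, y)
slope_ci, mean_ci = intervals(fit, x0=1200)
print(fit["r"], fit["b0"], fit["b1"], fit["rmse"])
print(slope_ci, mean_ci)
```

To run this on the lab data, the two columns could be loaded with something like pd.read_csv("2008.csv", usecols=["Distance", "AirTime"]).dropna() from pandas. The column names here are an assumption and should be checked against Airline.desc. For small samples, the normal quantile should be replaced by the t quantile with n - 2 degrees of freedom.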

You may use Python or R for this exercise.
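
Part 3 can be sketched as an iteratively reweighted least squares loop. The assignment gives the weight formula but not the iteration scheme, so the starting point (ordinary least squares), the iteration cap, and the stopping tolerance below are assumptions. The function names are hypothetical and the data are synthetic, with a few artificial outliers added.

```python
import numpy as np


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = b0 + b1 * x."""
    sw = w.sum()
    x_mean = np.sum(w * x) / sw
    y_mean = np.sum(w * y) / sw
    b1 = np.sum(w * (x - x_mean) * (y - y_mean)) / np.sum(w * (x - x_mean) ** 2)
    b0 = y_mean - b1 * x_mean
    return b0, b1


def biweight_fit(x, y, n_iter=20, tol=1e-8):
    """Robust line fit with weights w_i = (1 - u_i^2)^2 for |u_i| <= 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Assumption: start from the ordinary (equal-weight) least-squares line.
    b0, b1 = weighted_line(x, y, np.ones_like(x))
    for _ in range(n_iter):
        d = y - (b0 + b1 * x)                   # residuals d_i
        q75, q25 = np.percentile(d, [75, 25])
        s = q75 - q25                           # interquartile range of d_i
        if s == 0:
            break
        u = d / (3 * s)
        w = np.where(np.abs(u) <= 1, (1 - u ** 2) ** 2, 0.0)
        if w.sum() == 0:
            break
        new_b0, new_b1 = weighted_line(x, y, w)
        converged = abs(new_b0 - b0) < tol and abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


def rmse(x, y, b0, b1):
    """Root mean squared error of the line over all points."""
    return np.sqrt(np.mean((y - (b0 + b1 * x)) ** 2))


# Synthetic data: a clean linear trend with every 20th point shifted upward.
rng = np.random.default_rng(1)
x = rng.uniform(100, 2500, size=2000)
y = 18 + 0.12 * x + rng.normal(0, 5, size=2000)
y[::20] += 300                                  # artificial outliers

ols_b0, ols_b1 = weighted_line(x, y, np.ones_like(x))
rob_b0, rob_b1 = biweight_fit(x, y)
print("OLS:   ", ols_b0, ols_b1, rmse(x, y, ols_b0, ols_b1))
print("Robust:", rob_b0, rob_b1, rmse(x, y, rob_b0, rob_b1))
```

Note that the RMSE over all points is never lower for the robust fit than for ordinary least squares, because ordinary least squares minimizes exactly that quantity. The gain shows up in parameters that track the bulk of the data rather than the outliers.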




















