Assigned Project 1 Solution

Before running this project, unzip Baseline.zip, train_data.zip, and test_data.zip into the directory that contains the code.
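If you prefer to do the extraction from Python, the following is a minimal sketch. It assumes the three archives sit directly in the code directory; the helper name `unzip_all` is hypothetical and is not part of the project.

```python
import zipfile
from pathlib import Path

# Archive names taken from the instructions above.
ARCHIVES = ["Baseline.zip", "train_data.zip", "test_data.zip"]


def unzip_all(code_dir, archives=ARCHIVES):
    """Extract each archive into code_dir; return the names of archives not found."""
    code_dir = Path(code_dir)
    missing = []
    for name in archives:
        archive = code_dir / name
        if not archive.is_file():
            missing.append(name)  # report instead of failing on the first gap
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(code_dir)
    return missing
```

Calling `unzip_all(".")` from the code directory extracts whatever is present and returns a list of any archives that still need to be downloaded.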

To reproduce the results:  
1. Impute the data using random forest. (This takes about 4 hours. The output is written to test_output/iterative_imputer_extratrees_iter40.)
```python3 random_forest_test.py```

2. Impute the data using 3D-mice. (This takes about 14 hours.)  
Run ```./prepare.sh``` to reindex all the test files so that they start from 1 and move them to the 'train_with_missing' folder.
```cd``` to the 3D-mice directory (possibly Baseline/code). Change the value of dnsroot in 'mimicConfig_release.R', then type ```R``` to enter the R environment and run
```
set.seed(100)
source('mimicMICEGPParamEvalTr_release.R')
``` 
to start running 3D-mice. 

After the program stops, type ```q()``` to quit the R environment. 
```cd``` to the 3D-mice directory (possibly Baseline/code) and type ```R``` to enter the R environment again, then export the results to CSV files:
```
load("tr_res_iter2.RData")
for (i in seq_len(8267)) {
  write.csv(t(res$t.imp[[i]]), sprintf("%d.csv", i), row.names = FALSE)
}
```
You will get a set of CSV files named 1.csv through 8267.csv. Then run ```./finish.sh``` to reindex the last 2267 files and move them to the 'test_output/3d-mice-test' folder.
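For readers who want to see what this reindexing amounts to, here is a rough Python sketch. It is not the contents of finish.sh. It assumes that "the last 2267 files" are 6001.csv through 8267.csv and that reindexing renames them to 1.csv through 2267.csv; the function name `move_test_outputs` is hypothetical.

```python
import shutil
from pathlib import Path


def move_test_outputs(src_dir, dst_dir, total=8267, n_test=2267):
    """Move the last n_test CSVs into dst_dir, renumbered from 1 to n_test."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    offset = total - n_test  # assumed: the first 6000 files are not test files
    moved = 0
    for i in range(offset + 1, total + 1):
        src = src_dir / f"{i}.csv"
        if src.is_file():
            # e.g. 6001.csv becomes 1.csv under the stated assumption
            shutil.move(str(src), str(dst_dir / f"{i - offset}.csv"))
            moved += 1
    return moved
```

The return value is the number of files actually moved, which should equal 2267 if the export step completed.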

3. Mix the outputs of the two models.
Run ```python3 mix_output.py``` to combine the outputs of the two models above. The result is stored in 'test_output/mice_extratrees_40'.
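This README does not say how mix_output.py combines the two outputs. Purely as an illustration of one possible way to blend two imputations, the sketch below takes an element-wise weighted average of two CSV files with the same shape. The function `mix_csv`, the `weight_a` parameter, and the assumption of a single header row followed by numeric cells are all hypothetical.

```python
import csv


def mix_csv(path_a, path_b, out_path, weight_a=0.5):
    """Write the element-wise weighted average of two same-shaped numeric CSVs."""
    with open(path_a, newline="") as fa, open(path_b, newline="") as fb:
        rows_a, rows_b = list(csv.reader(fa)), list(csv.reader(fb))
    if not rows_a or len(rows_a) != len(rows_b):
        raise ValueError("files are empty or have different row counts")
    header = rows_a[0]  # assumed: first row is a header, kept from file A
    mixed = []
    for ra, rb in zip(rows_a[1:], rows_b[1:]):
        if len(ra) != len(rb):
            raise ValueError("column counts differ")
        mixed.append(
            [weight_a * float(x) + (1 - weight_a) * float(y) for x, y in zip(ra, rb)]
        )
    with open(out_path, "w", newline="") as fo:
        writer = csv.writer(fo)
        writer.writerow(header)
        writer.writerows(mixed)
```

With `weight_a=0.5` both models contribute equally; the actual script may use a different rule, for example choosing one model per variable.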
