CS 513 HW Final Exam

Problem 1 - (20 points)

The “AL_NJ_Income_pct” CSV dataset on CANVAS categorizes the tax returns of families in the states of Alabama and New Jersey into six categories (Returns_pct1 to Returns_pct6). Use these six categories and Euclidean distance to perform the following analysis:

• Use the k-means clustering method to create two clusters for the “AL_NJ_Income_pct” dataset.

• Show the cross tabulation of the clusters versus the State feature.

• Use the hierarchical clustering method with single linkage to create 4 clusters for the “AL_NJ_Income_pct” dataset.

• Identify the outliers (if any).
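The clustering steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the eight rows below are made-up stand-ins for the Returns_pct1 to Returns_pct6 columns (the real values live in the CSV), the first k rows seed the k-means centroids so the run is deterministic, and a cluster containing a single row is treated as an outlier candidate. The helper names `kmeans` and `single_linkage` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, max_iter=100):
    """Lloyd's k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def single_linkage(points, n_clusters):
    """Agglomerative clustering; cluster distance = distance of the closest pair."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: min(dist(points[i], points[j])
                               for i in clusters[ab[0]] for j in clusters[ab[1]]),
        )
        clusters[a] += clusters[b]  # a < b, so deleting b keeps a valid
        del clusters[b]
    return clusters


# Made-up rows (NOT taken from AL_NJ_Income_pct.csv); the last two are extreme on purpose.
states = ["AL", "NJ", "AL", "NJ", "AL", "NJ", "NJ", "AL"]
rows = [
    (0.45, 0.25, 0.12, 0.08, 0.08, 0.02),
    (0.30, 0.20, 0.13, 0.11, 0.19, 0.07),
    (0.42, 0.26, 0.13, 0.08, 0.09, 0.02),
    (0.28, 0.19, 0.14, 0.12, 0.20, 0.07),
    (0.47, 0.24, 0.12, 0.07, 0.08, 0.02),
    (0.33, 0.21, 0.13, 0.10, 0.17, 0.06),
    (0.10, 0.10, 0.10, 0.10, 0.30, 0.30),
    (0.70, 0.15, 0.06, 0.04, 0.04, 0.01),
]

labels = kmeans(rows, k=2)
crosstab = Counter(zip(states, labels))  # (state, cluster) -> row count
clusters = single_linkage(rows, n_clusters=4)
outliers = sorted(c[0] for c in clusters if len(c) == 1)  # singleton clusters

print(crosstab)
print(clusters, outliers)
```

With real data, the rows and the State column would come from the CSV instead, and a library routine with several random starts would normally replace the fixed seeding used here.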

Problem 2 - (20 points)

Use the Random Forest methodology to develop a classification model for the “State” (target), using the Returns_pct1 to Returns_pct6 features in the “AL_NJ_Income_pct” dataset.

• Show the cross tabulation of the classification.

• What is the accuracy of your model?

• What is the precision of the model?

• What is the recall of the model?

• What is the F1 of the model?
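The four metrics asked for above all follow from the 2x2 cross tabulation of predicted versus actual State. A minimal sketch, assuming one state is treated as the "positive" class; the counts below are hypothetical and are not results from any model, and `classification_metrics` is an illustrative name.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from the four cells of a 2x2 cross tabulation."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return accuracy, precision, recall, f1


# Hypothetical counts: tp = positive predicted positive, fp = negative predicted positive,
# fn = positive predicted negative, tn = negative predicted negative.
acc, prec, rec, f1 = classification_metrics(tp=40, fp=10, fn=5, tn=45)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} F1={f1:.3f}")
```

The same function applies unchanged to the cross tabulations produced in the later classification problems.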

Problem 3 - (20 points)

Use the C5.0 methodology to develop a classification model for the “State” (target), using the Returns_pct1 to Returns_pct6 features in the “AL_NJ_Income_pct” dataset.

• Show the cross tabulation of the classification.

• What is the accuracy of your model?

• What is the precision of the model?

• What is the recall of the model?

• What is the F1 of the model?
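C5.0 picks its splits with entropy-based criteria. As a rough illustration of that idea (not the C5.0 implementation itself, which adds refinements such as pruning), the sketch below computes the information gain of one threshold split on a single feature. The feature values are made up, and `entropy` and `information_gain` are illustrative helper names.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the rows at value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - remainder


# Made-up Returns_pct1 values (NOT from the CSV).
pct1 = [0.45, 0.42, 0.47, 0.30, 0.28, 0.33]
state = ["AL", "AL", "AL", "NJ", "NJ", "NJ"]

gain_good = information_gain(pct1, state, threshold=0.40)  # separates the states
gain_poor = information_gain(pct1, state, threshold=0.29)  # isolates one NJ row
print(gain_good, gain_poor)
```

A threshold that separates the two states cleanly removes all the uncertainty in the labels, which is why it scores higher than one that isolates a single row.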

Problem 4 - (20 points)

Use the CART methodology to develop a classification model for the “State” (target), using the Returns_pct1 to Returns_pct6 features in the “AL_NJ_Income_pct” dataset.

• Show the cross tabulation of the classification.

• What is the accuracy of your model?

• What is the precision of the model?

• What is the recall of the model?

• What is the F1 of the model?
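For classification, CART grows its tree by choosing the binary split with the lowest weighted Gini impurity. The sketch below searches one feature for that split. It is a simplified illustration under assumptions: the values are made up, only midpoints between sorted values are tried, and `gini` and `best_split` are hypothetical helper names rather than a library API.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best midpoint split on one feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # no split yet: impurity of the whole node
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # equal values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up Returns_pct5 values (NOT from the CSV).
pct5 = [0.08, 0.09, 0.08, 0.19, 0.20, 0.17]
state = ["AL", "AL", "AL", "NJ", "NJ", "NJ"]

threshold, impurity = best_split(pct5, state)
print(threshold, impurity)
```

A full CART model repeats this search over every feature at every node and then prunes the resulting tree.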

Problem 5 - (20 points)

Using the data in the table below, construct a neural network with one output layer (node z) and one hidden layer (two nodes, A and B). Calculate the predicted outcome when the inputs to the input nodes are Node 1 = 0.4, Node 2 = 0.7, Node 3 = 0.7, and Node 4 = 0.2.

Use the actual value of 0.75 and a learning factor of 0.1 to adjust the weight from xx to z.

From      To    Weight
X         A     0.5
Node 1    A     0.6
Node 2    A     0.8
Node 3    A     0.6
Node 4    A     0.2
x         B     0.7
Node 1    B     0.9
Node 2    B     0.8
Node 3    B     0.4
Node 4    B     0.2
xx        z     0.5
A         z     0.9
B         z     0.9
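The forward pass and the single weight update can be sketched as follows. Several choices here are assumptions, since the problem statement does not spell them out: the rows labelled X, x and xx are treated as constant (bias) inputs equal to 1, every hidden and output node uses the sigmoid activation, and the update follows the usual back-propagation rule for an output node, error = output * (1 - output) * (actual - output).

```python
from math import exp


def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))


inputs = [0.4, 0.7, 0.7, 0.2]  # Node 1 .. Node 4

# Weights from the table; the bias rows (X, x, xx) are ASSUMED to carry a constant input of 1.
bias_a, w_a = 0.5, [0.6, 0.8, 0.6, 0.2]
bias_b, w_b = 0.7, [0.9, 0.8, 0.4, 0.2]
bias_z, w_az, w_bz = 0.5, 0.9, 0.9

# Forward pass through the hidden layer (sigmoid activation is an assumption).
net_a = bias_a + sum(w * x for w, x in zip(w_a, inputs))
net_b = bias_b + sum(w * x for w, x in zip(w_b, inputs))
out_a, out_b = sigmoid(net_a), sigmoid(net_b)

# Output node z: the predicted outcome.
net_z = bias_z + w_az * out_a + w_bz * out_b
out_z = sigmoid(net_z)

# Back-propagation for the xx -> z weight.
actual, learning_factor = 0.75, 0.1
error_z = out_z * (1 - out_z) * (actual - out_z)
new_bias_z = bias_z + learning_factor * error_z * 1  # input on this link is the constant 1

print(f"net_A={net_a:.2f} net_B={net_b:.2f} prediction={out_z:.4f}")
print(f"adjusted xx->z weight={new_bias_z:.5f}")
```

Because the prediction comes out above the actual value of 0.75 under these assumptions, the error term is negative and the adjusted xx to z weight ends up slightly below 0.5.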

Datasets: AL_NJ_Income_pct.csv
