MATH 208 Final Exam Solved

Question 1 [50 points]

library(ggplot2)  # provides the midwest dataset
library(dplyr)    # provides %>% and select()

data(midwest)

midwest_modified <- midwest %>%
  select(county, state, popdensity,
         popwhite, popblack,
         popamerindian, popasian,
         popother, inmetro)

The data for this question come from a modified version of the midwest dataset in the ggplot2 package.

str(midwest_modified)

tbl_df [437 x 9] (S3: tbl_df/tbl/data.frame)
 $ county       : chr [1:437] "ADAMS" "ALEXANDER" "BOND" "BOONE" ...
 $ state        : chr [1:437] "IL" "IL" "IL" "IL" ...
 $ popdensity   : num [1:437] 1271 759 681 1812 324 ...
 $ popwhite     : int [1:437] 63917 7054 14477 29344 5264 35157 5298 16519 13384 146506 ...
 $ popblack     : int [1:437] 1702 3496 429 127 547 50 1 111 16 16559 ...
 $ popamerindian: int [1:437] 98 19 35 46 14 65 8 30 8 331 ...
 $ popasian     : int [1:437] 249 48 16 150 5 195 15 61 23 8033 ...
 $ popother     : int [1:437] 124 9 34 1139 6 221 0 84 6 1596 ...
 $ inmetro      : int [1:437] 0 0 0 1 0 0 0 0 0 1 ...

midwest_modified %>% slice(1:5) %>%
  select(county:popblack)

# A tibble: 5 x 5
  county    state popdensity popwhite popblack
  <chr>     <chr>      <dbl>    <int>    <int>
1 ADAMS     IL         1271.    63917     1702
2 ALEXANDER IL          759      7054     3496
3 BOND      IL          681.    14477      429
4 BOONE     IL         1812.    29344      127
5 BROWN     IL          324.     5264      547

midwest_modified %>% slice(1:5) %>%
  select(county, popamerindian:popother)

# A tibble: 5 x 4
  county    popamerindian popasian popother
  <chr>             <int>    <int>    <int>
1 ADAMS                98      249      124
2 ALEXANDER            19       48        9
3 BOND                 35       16       34
4 BOONE                46      150     1139
5 BROWN                14        5        6

The dataset contains population data from midwest counties in five states in the United States from an unspecified year. There are identifying variables for both the county (the name) and the state (the postal abbreviation). The variable popdensity is a measure of density (population per unspecified area units). The variable inmetro is equal to 1 if the county is classified as a metropolitan area and 0 otherwise. The other variables contain counts of population size within self-identified racial classifications.

(a) [5 pts] Write a line of code that will generate the following tibble (or data.frame) containing the highest population density from each state:

# A tibble: 5 x 2
  state Highest_Pop_Den
  <chr>           <dbl>
1 IL             88018.
2 IN             34659.
3 MI             60334.
4 OH             54313.
5 WI             63952.
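One possible approach to part (a) is a grouped summary. The sketch below assumes the dplyr and ggplot2 packages are installed and rebuilds midwest_modified so that it runs on its own; the name highest_by_state is illustrative and not part of the exam.

```r
library(dplyr)
library(ggplot2)  # assumption: ggplot2 is installed; it ships the midwest data

# Rebuild the exam's starting tibble so this block is self-contained.
midwest_modified <- midwest %>%
  select(county, state, popdensity, popwhite, popblack,
         popamerindian, popasian, popother, inmetro)

# One row per state, holding that state's largest popdensity value.
highest_by_state <- midwest_modified %>%
  group_by(state) %>%
  summarise(Highest_Pop_Den = max(popdensity))

highest_by_state
```

Grouping first and summarising second keeps the answer to a single pipeline, and the column name is set inside summarise().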

(b) [5 pts] Write a line of code that adds a new column to the midwest_modified tibble called Metro where the elements of that column are equal to a string “Metro” if inmetro is equal to 1 and “NonMetro” if inmetro is equal to 0. The first five rows are given below for the county, state, inmetro and Metro columns:

# A tibble: 5 x 4
  county    state inmetro Metro
  <chr>     <chr>   <int> <chr>
1 ADAMS     IL          0 NonMetro
2 ALEXANDER IL          0 NonMetro
3 BOND      IL          0 NonMetro
4 BOONE     IL          1 Metro
5 BROWN     IL          0 NonMetro
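A minimal sketch of the recoding asked for in part (b), run on a small stand-in data frame built from the five rows printed above so that only dplyr is needed. The name first_rows is hypothetical; on the exam the same mutate() call would be applied to midwest_modified.

```r
library(dplyr)

# Stand-in for midwest_modified: the five rows printed in the question.
first_rows <- data.frame(
  county  = c("ADAMS", "ALEXANDER", "BOND", "BOONE", "BROWN"),
  state   = "IL",
  inmetro = c(0L, 0L, 0L, 1L, 0L)
)

# Recode the 0/1 indicator into a character label.
first_rows <- first_rows %>%
  mutate(Metro = if_else(inmetro == 1, "Metro", "NonMetro"))

first_rows
```

Base R's ifelse() would work equally well here; if_else() is stricter about the two branches having the same type.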

(c) [5 pts] Write a line of code that will generate the following tibble (or data.frame) containing the highest population density from each state for metropolitan and non-metropolitan counties separately, using the modified tibble from part (b).

dens_table

# A tibble: 10 x 3
# Groups:   state [5]
   state Metro    Highest_Pop_Den
   <chr> <chr>              <dbl>
 1 IL    Metro             88018.
 2 IL    NonMetro           2309.
 3 IN    Metro             34659.
 4 IN    NonMetro           3090.
 5 MI    Metro             60334.
 6 MI    NonMetro           2251.
 7 OH    Metro             54313.
 8 OH    NonMetro           5484.
 9 WI    Metro             63952.
10 WI    NonMetro           2344.
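Part (c) can be approached by grouping on two variables at once. The sketch below assumes dplyr (version 1.0 or later, for the .groups argument) and ggplot2 are installed, and it recreates the Metro column from part (b) so that it runs on its own.

```r
library(dplyr)
library(ggplot2)  # assumption: ggplot2 is installed; it ships the midwest data

# Rebuild the tibble from part (b) so this block is self-contained.
midwest_modified <- midwest %>%
  select(county, state, popdensity, inmetro) %>%
  mutate(Metro = if_else(inmetro == 1, "Metro", "NonMetro"))

# Largest popdensity within each state-by-Metro combination.
# .groups = "drop_last" leaves the result grouped by state only.
dens_table <- midwest_modified %>%
  group_by(state, Metro) %>%
  summarise(Highest_Pop_Den = max(popdensity), .groups = "drop_last")

dens_table
```

Keeping the state grouping is consistent with the "Groups: state [5]" line in the printed output.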

MATH 208 Final Exam December 18th – 21st,

(d) [5 pts] Assume the tibble from part (c) is called dens_table as above. Now write a line of code that produces a tibble which arranges the data above so that we have separate columns for “Metro” and “NonMetro”, as below:

# A tibble: 5 x 3
# Groups:   state [5]
  state  Metro NonMetro
  <chr>  <dbl>    <dbl>
1 IL    88018.    2309.
2 IN    34659.    3090.
3 MI    60334.    2251.
4 OH    54313.    5484.
5 WI    63952.    2344.
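The reshaping in part (d) is a long-to-wide pivot. As a sketch, assuming the tidyr package is installed, the block below rebuilds dens_table as a plain data frame from the rounded values printed in part (c) and widens it; dens_wide is an illustrative name, and because this stand-in is ungrouped the "Groups" line of the exam output is not reproduced.

```r
library(tidyr)

# Stand-in for dens_table, using the rounded values printed in part (c).
dens_table <- data.frame(
  state = rep(c("IL", "IN", "MI", "OH", "WI"), each = 2),
  Metro = rep(c("Metro", "NonMetro"), times = 5),
  Highest_Pop_Den = c(88018, 2309, 34659, 3090, 60334,
                      2251, 54313, 5484, 63952, 2344)
)

# Spread the Metro labels into columns, one row per state.
dens_wide <- pivot_wider(dens_table,
                         names_from = Metro,
                         values_from = Highest_Pop_Den)

dens_wide
```

The new columns take their names from the values of Metro, in order of first appearance.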

Now we will work with only a modified version of the population counts for each county.

(e) [5 pts] Write a line of code to add a new variable to the data frame named HighDens which is equal to “High” if the population density for the county is higher than 1500 and “NotHigh” if the population density for the county is lower than 1500. Below are the first 5 rows of the data for the county, popdensity and HighDens columns:

# A tibble: 5 x 3
  county    popdensity HighDens
  <chr>          <dbl> <chr>
1 ADAMS          1271. NotHigh
2 ALEXANDER       759  NotHigh
3 BOND            681. NotHigh
4 BOONE          1812. High
5 BROWN           324. NotHigh
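A minimal sketch for part (e), again on a stand-in data frame built from the five printed rows (popdensity values are the rounded ones shown, and first_rows is a hypothetical name). The question does not say how a density of exactly 1500 should be labelled; this sketch assumes it falls under "NotHigh".

```r
library(dplyr)

# Stand-in for midwest_modified: the five rows printed in the question.
first_rows <- data.frame(
  county     = c("ADAMS", "ALEXANDER", "BOND", "BOONE", "BROWN"),
  popdensity = c(1271, 759, 681, 1812, 324)
)

# Label counties by whether popdensity exceeds the 1500 threshold.
# Assumption: a value of exactly 1500 is treated as "NotHigh".
first_rows <- first_rows %>%
  mutate(HighDens = if_else(popdensity > 1500, "High", "NotHigh"))

first_rows
```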

Then we will compute the total number of people in each combination of state, Metro and HighDens using the code below:

pop_xtabs <- xtabs(
  I(popwhite + popblack + popamerindian + popasian + popother) ~
    state + Metro + HighDens, data = midwest_modified)

pop_xtabs

, , HighDens = High

     Metro
state   Metro NonMetro
   IL 9323624   405933
   IN 3728008   689565
   MI 7697643   354081
   OH 8811604  1078957
   WI 3004347   386892

, , HighDens = NotHigh

     Metro
state   Metro NonMetro
   IL  250175  1450870
   IN  234438   892148
   MI       0  1243573
   OH   98555   857999
   WI  326825  1173705


(f) [5 pts] What will the code pop_xtabs["IL",1,2] return as output?
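To see how three-way indexing works in part (f), the sketch below rebuilds the printed counts as a plain base R array (an assumption made so the block runs without the data; the exam object is an xtabs table, which is indexed the same way). The indices follow the dimension order state, Metro, HighDens.

```r
# Counts copied from the printed pop_xtabs output. Arrays fill with the
# first dimension (state) varying fastest, then Metro, then HighDens.
pop_xtabs <- array(
  c(9323624, 3728008, 7697643, 8811604, 3004347,   # High,    Metro
    405933,  689565,  354081,  1078957, 386892,    # High,    NonMetro
    250175,  234438,  0,       98555,   326825,    # NotHigh, Metro
    1450870, 892148,  1243573, 857999,  1173705),  # NotHigh, NonMetro
  dim = c(5, 2, 2),
  dimnames = list(state    = c("IL", "IN", "MI", "OH", "WI"),
                  Metro    = c("Metro", "NonMetro"),
                  HighDens = c("High", "NotHigh"))
)

# Row "IL", first Metro level ("Metro"), second HighDens level ("NotHigh").
by_position <- pop_xtabs["IL", 1, 2]
by_name     <- pop_xtabs["IL", "Metro", "NotHigh"]

by_position  # 250175
```

Positional and named indices select the same cell, so the expression picks out the Illinois metropolitan population living in counties that are not high density.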

(g) [5 pts] Using only the pop_xtabs object above, write a line of code to find the total number of people in high-density areas (i.e. HighDens is “High”) and in areas that are not high density, as below:

    High  NotHigh
35480654  6528288

(h) [10 pts] Using only the pop_xtabs object above, write a line of code that computes the total population in the combination of State and HighDens to return the output below:

     HighDens
state    High NotHigh
   IL 9729557 1701045
   IN 4417573 1126586
   MI 8051724 1243573
   OH 9890561  956554
   WI 3391239 1500530
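Parts (g) and (h) both collapse the three-way table over some of its dimensions. One way to sketch this is with apply(), here run on a plain array rebuilt from the printed counts (an assumption for self-containment; margin.table() on the original xtabs object is another option). The names totals_by_dens and state_by_dens are illustrative.

```r
# Counts copied from the printed pop_xtabs output
# (dimension order: state, Metro, HighDens).
pop_xtabs <- array(
  c(9323624, 3728008, 7697643, 8811604, 3004347,   # High,    Metro
    405933,  689565,  354081,  1078957, 386892,    # High,    NonMetro
    250175,  234438,  0,       98555,   326825,    # NotHigh, Metro
    1450870, 892148,  1243573, 857999,  1173705),  # NotHigh, NonMetro
  dim = c(5, 2, 2),
  dimnames = list(state    = c("IL", "IN", "MI", "OH", "WI"),
                  Metro    = c("Metro", "NonMetro"),
                  HighDens = c("High", "NotHigh"))
)

# (g) Keep only dimension 3 (HighDens); sum over state and Metro.
totals_by_dens <- apply(pop_xtabs, 3, sum)

# (h) Keep dimensions 1 and 3 (state, HighDens); sum over Metro.
state_by_dens <- apply(pop_xtabs, c(1, 3), sum)

totals_by_dens
state_by_dens
```

The MARGIN argument names the dimensions that survive; everything else is summed away.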

(i) [5 pts] Using only the pop_xtabs object above, write a line of code (or multiple lines of code) that computes the percentage of individuals in High and NotHigh density areas in each state as below:

     HighDens
state     High   NotHigh
   IL 85.11850 14.881500
   IN 79.67977 20.320233
   MI 86.62148 13.378518
   OH 91.18149  8.818511
   WI 69.32541 30.674588
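Row percentages like those in part (i) can be sketched with prop.table(). To stay self-contained, the block below starts from the state-by-HighDens totals printed in part (h) rather than from pop_xtabs; on the exam that matrix would first be obtained from pop_xtabs, for example with apply(pop_xtabs, c(1, 3), sum). The names state_by_dens and pct_by_state are illustrative.

```r
# State-by-HighDens totals copied from the output printed in part (h).
state_by_dens <- matrix(
  c(9729557, 4417573, 8051724, 9890561, 3391239,   # High
    1701045, 1126586, 1243573, 956554,  1500530),  # NotHigh
  nrow = 5,
  dimnames = list(state    = c("IL", "IN", "MI", "OH", "WI"),
                  HighDens = c("High", "NotHigh"))
)

# margin = 1 divides each cell by its row (state) total;
# multiplying by 100 converts proportions to percentages.
pct_by_state <- prop.table(state_by_dens, margin = 1) * 100

pct_by_state
```

Each row of the result sums to 100, which is a quick check that the proportions were taken within states rather than over the whole table.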

END OF QUESTION 1
