Price Elasticity of Fish Demand (55 points)
Use the data set in FISHEXAM.DTA, which comes from Graddy (1995). The data set contains 97 daily observations on fish prices and quantities at the Fulton Fish Market in New York City. Please check the definitions of the variables carefully and also use the "browse" command to see how they are recorded in the data. You are going to use this data to analyze the determinants of fish prices and to estimate a demand function for fish. In some of the questions, you need to generate new variables.
Part 1: Determinants of Fish Prices
1) (10 points) Estimate an empirical model to analyze the determinants of fish price. The model needs to answer how price varies (in percentage terms) across different days of the week and over time (use a quadratic time trend). Interpret and discuss your findings (coefficients, their significance, and the explanatory power of the model). Is there evidence of systematic variation in price within a week? What do the coefficients on the quadratic time trend tell us?
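One possible way to write the specification described in question (1) is sketched below. It is only an illustration: the variable names (mon, tues, wed, thurs, t) and the choice of Friday as the omitted base day are assumptions, so check them against the actual variable definitions in the data.

```latex
% Sketch of a log-price model with day-of-week dummies and a quadratic trend.
% Assumed names: mon, tues, wed, thurs (day dummies), t (time trend); Friday is the assumed base day.
\log(\mathit{price}_t) = \beta_0 + \beta_1\,\mathit{mon}_t + \beta_2\,\mathit{tues}_t
  + \beta_3\,\mathit{wed}_t + \beta_4\,\mathit{thurs}_t
  + \gamma_1\, t + \gamma_2\, t^2 + u_t
```

Because the dependent variable is in logs, 100 times a day-dummy coefficient is approximately the percentage difference in price relative to the omitted day when the coefficient is small, and the implied trend effect of one more day is approximately \gamma_1 + 2\gamma_2 t.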
2) (5 points) Now, add the variables "wave2" and "wave3" to the above model. These are measures of wave heights over the past several days. Interpret the coefficients of these new variables. Are these variables individually significant? Explain why stormy seas would increase the price of fish. Explain why these variables can be assumed to be exogenous (not correlated with the error term).
3) (5 points) Now, re-estimate the model in question (2) using the daily growth rate in fish price as the dependent variable. Interpret the size of the coefficients that are significant at the 0.10 significance level. Is there a significant time trend? How can we explain the different results that we obtained for the time trend in questions (2) and (3)?
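As a reminder of how the dependent variable in question (3) can be constructed, the sketch below uses the standard log-difference approximation to a growth rate. The symbol p_t for the price on day t is notation introduced here, not a variable name from the data.

```latex
% Daily growth rate of price and its log-difference approximation (p_t is assumed notation).
g_t = \frac{p_t - p_{t-1}}{p_{t-1}} \approx \log(p_t) - \log(p_{t-1}) = \Delta \log(p_t)
```

The approximation is close when daily price changes are small, and the first observation is lost because it has no previous day.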
Part 2: Demand Function for Fish
4) (10 points) Now, you are expected to estimate the price elasticity of fish demand. Again, you need to control for daily seasonality and the quadratic time trend in your demand function. Discuss your findings. Interpret the size of the coefficients that are significant at the 0.10 significance level. Discuss how your results might be affected if there is random measurement error in the "demand" variable. Discuss how your results might be affected if there is random measurement error in the "price" variable.
5) (10 points) The variables "wave2" and "wave3" are measures of ocean wave heights over the past several days. What assumptions do we need to make in order to use "wave2" and "wave3" as instrumental variables for fish price in estimating the demand equation? Discuss whether these assumptions are valid. Explain what your results in question (2) indicate about the validity of one of these assumptions.
6) (10 points) Now, estimate the model in (4) by the 2SLS approach using "wave2" and "wave3" as instruments (here, you are expected to implement the two-stage procedure manually in STATA). Next, estimate this 2SLS model with correct standard errors (using the "ivreg" command). What is your conclusion about the price elasticity of fish demand? Based on this result, is the demand for fish price elastic or inelastic (check the definition of "elastic demand")? How can you explain the difference between the elasticity estimates obtained in questions (4) and (6)? What is the main methodological problem with the model estimated in (4)? Discuss the potential reason for bias.
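The mechanics of the two-stage procedure in question (6) can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the numbers are invented (not taken from FISHEXAM.DTA), there is a single instrument `z` rather than two, the controls are left out, and the helper `ols` is a hypothetical function written for this example.

```python
def mean(values):
    return sum(values) / len(values)

def ols(x, y):
    """Simple regression of y on x; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Invented toy data (assumption): z is a supply shifter such as wave height,
# u is an unobserved demand shock that is uncorrelated with z in this sample.
z = [-1.0, 1.0, -1.0, 1.0]
u = [-1.0, -1.0, 1.0, 1.0]
log_p = [zi + ui for zi, ui in zip(z, u)]                    # price responds to both z and u
log_q = [8.0 - 1.0 * p + 2.0 * ui for p, ui in zip(log_p, u)]  # true elasticity set to -1

# OLS of log quantity on log price: biased because log_p is correlated with u.
_, b_ols = ols(log_p, log_q)

# Stage 1: regress the endogenous price on the instrument and form fitted values.
a1, b1 = ols(z, log_p)
p_hat = [a1 + b1 * zi for zi in z]

# Stage 2: regress log quantity on the fitted price.
_, b_2sls = ols(p_hat, log_q)

print("OLS slope:", b_ols, "2SLS slope:", b_2sls)
```

In this constructed sample the second-stage slope recovers the elasticity that was built into the data, while the OLS slope does not. The standard errors from a manually run second stage are not valid, which is why the question also asks for the "ivreg" estimate.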
7) (5 points) Now, re-estimate the demand equation in question (6), this time eliminating the outlier observations for the "wave2" and "wave3" variables (do not include the days when "wave2" or "wave3" is larger than 10). How did your price elasticity estimate change compared to question (6)? What other approach might eliminate the impact of these outliers on your result? (Hint: You can check the scatterplot showing the relationship between wave height and prices.)
Determinants of Crime Rate (35 points)
Cornwell and Trumbull (1994) used data on 90 counties in North Carolina, for the years 1981 through 1987, to analyze crime rates. The data are contained in CRIMEEXAM.DTA. The crime rate is the number of crimes per person, "prbarr" is the estimated probability of arrest, "prbconv" is the estimated probability of conviction (given an arrest), "prbpris" is the probability of serving time in prison (given a conviction), "avgsen" is the average sentence length served, and "polpc" is the number of police officers per capita.
8) (10 points) Discuss the summary statistics for the variables in the data (for instance, what is the average crime rate across the counties?). Do this for all variables that you use in your model in the next question.
9) (10 points) Estimate a model analyzing the determinants of the crime rate in the counties. Include both the crime-related variables ("prbarr", "prbconv", "prbpris", "avgsen", "polpc") and the control variables in your model. Here you are expected to build the best model, one that reduces the risk of bias for the crime-related variables. Interpret the coefficients that are significant at the 0.10 significance level. Are the signs of these significant coefficients (especially those of the crime-related variables) in line with your expectations? If not, what might be the reason for the surprising findings (explain clearly and provide an example reason for potential bias)?
10) (5 points) Now estimate the model in question (9) with the fixed effects method. Discuss the main differences in your findings compared to the results in question (9). What is the benefit of this method compared to your estimation in question (9)? Explain. Why are some variables omitted from the regression? Explain.
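The within (demeaning) transformation behind the fixed effects method in question (10) can be sketched in plain Python as follows. The two-county panel, the variable names (`x`, `y`, `west`), and the helper `demean_by_group` are all invented for illustration and are not taken from CRIMEEXAM.DTA.

```python
def demean_by_group(groups, values):
    """Subtract each group's own mean from its observations (within transformation)."""
    totals, counts = {}, {}
    for g, v in zip(groups, values):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for g, v in zip(groups, values)]

# Invented toy panel (assumption): two counties observed for three years each.
county = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 2.0, 4.0, 9.0]          # time-varying regressor
west = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]       # time-constant county characteristic
effect = {"A": 10.0, "B": -5.0}             # unobserved county effects
y = [2.0 * xi + effect[c] for xi, c in zip(x, county)]  # true slope set to 2

x_dm = demean_by_group(county, x)
y_dm = demean_by_group(county, y)
west_dm = demean_by_group(county, west)

# Regression through the origin on demeaned data gives the within estimator.
beta_fe = sum(a * b for a, b in zip(x_dm, y_dm)) / sum(a * a for a in x_dm)

print("within estimate:", beta_fe)
print("demeaned time-constant variable:", west_dm)
```

Demeaning removes the county effect from `y`, so the slope on `x` is recovered even though the effect is unobserved. The same step turns any time-constant variable into a column of zeros, so its coefficient cannot be estimated.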
11) (10 points) Your concern is that the variable "polpc" is endogenous. What might be the reason for this concern? In order to deal with this problem, find an instrumental variable (IV) for "polpc" from the variables available in the data. Justify your choice. Discuss (and, if possible, test) the validity of the IV assumptions for the variable that you choose. Now, estimate the same model as in question (9), this time with the IV approach. Is there any difference in the estimated effect of "polpc" compared to your finding in (9)? Provide an example of a better IV candidate for "polpc" (assuming you could find data on it).
Questioning the Reported Corona Death Numbers in Turkey (10 points)
12) (10 points) Since the corona outbreak, there has been discussion about the accuracy of the corona-related death numbers reported by governments. This is also a debate in Turkey. Some people argue that the number of corona-related deaths reported by the government is lower than the real number. In this question, using city-level monthly data, you are asked to build an empirical model (regression equation) that can help you test whether this argument is true or not. The variables that you are going to use are: (1) the official (reported by the government) number of total deaths for each city in Turkey for each month between March 2019 and June 2020; (2) the official (reported by the government) number of corona-related deaths for each city in Turkey for each month between March 2019 and June 2020 (this variable takes the value zero for the months before March 2020). For governments, it is not easy to manipulate the total death numbers, as these can be easily observed by society, but it might be possible to manipulate the number of deaths attributed to corona (by changing the medical reports). Using these two variables and some additional time controls, describe a regression model that can help you test this argument. Which coefficient in this model is the coefficient of interest? If you want to test this argument, what does the null hypothesis for that coefficient need to be?