## Final Exam practice questions

1. The Frisch-Waugh-Lovell theorem shows how the OLS estimate of the coefficient on $X_{1i}$ in a regression of $Y_i$ on $X_{1i}$ and $X_{2i}$ can be obtained from two separate regressions, each of which involves only a single regressor. List the three steps this entails.
2. Using data on a sample of workers between the ages of 18 and 65 who were employed in 2015, I estimated the following regression:

where $\text{Employed}_i$ is a binary variable equal to 1 if worker $i$ was employed in 2016 and 0 if not, and $\text{Age}_i$ is the age in years of worker $i$. Provide a clear and valid interpretation of the slope estimate in this regression.

3. Using the same sample of workers between the ages of 18 and 65 who were employed in 2015, I estimated the following regression:

where $\text{Employed}_i$ is a binary variable equal to 1 if worker $i$ was employed in 2016 and 0 if not, and $\text{Age}_i$ is the age in years of worker $i$. What is the partial effect of age on employment according to this regression?

4. Consider the regression model $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$.

a. Write down three assumptions under which the OLS estimator $\hat{\beta}_1$ is unbiased and consistent.

b. If $X_{2i}$ is not included in the regression, the OLS estimator for $\beta_1$ may be biased. Fill in the missing parts of the following formula for the omitted variable bias.
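As a numerical check on the Frisch-Waugh-Lovell question and the omitted-variable-bias question above, the following is a minimal sketch on simulated data. The sample size, the coefficients, and the helper `ols` are all illustrative assumptions, not part of the exam. It verifies two in-sample facts: partialling $X_{2i}$ out of $X_{1i}$ reproduces the multiple-regression coefficient on $X_{1i}$, and the short-regression coefficient differs from it by the coefficient on $X_{2i}$ times the slope from regressing $X_{2i}$ on $X_{1i}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Simulated data: every number here is made up for illustration.
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)            # X1 is correlated with X2
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)


def ols(dep, *regressors):
    """Hypothetical helper: OLS of dep on a constant and the regressors."""
    X = np.column_stack([np.ones(len(dep)), *regressors])
    coef, *_ = np.linalg.lstsq(X, dep, rcond=None)
    return coef


# Long (multiple) regression of Y on X1 and X2.
b_full = ols(y, x1, x2)

# Frisch-Waugh-Lovell: regress X1 on X2, keep the residual,
# then regress Y on that residual.
a = ols(x1, x2)
x1_tilde = x1 - (a[0] + a[1] * x2)
b1_fwl = ols(y, x1_tilde)[1]

# Short regression that omits X2, and the implied bias term.
b1_short = ols(y, x1)[1]
delta = ols(x2, x1)[1]            # slope from regressing X2 on X1
ovb = b_full[2] * delta

print(np.isclose(b1_fwl, b_full[1]))
print(np.isclose(b1_short, b_full[1] + ovb))
```

Both checks hold exactly in any sample (up to floating-point error), because they are algebraic properties of OLS rather than statements about the simulated population.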

1. The following table summarizes results from the RAND Health Insurance Experiment discussed in Chapter 1 of Mastering 'Metrics. In this study, participants were randomly assigned to an insurance plan. The table below uses results only for those participants assigned to the free comprehensive coverage plan and the catastrophic plan that only paid 5% of health care costs up to a per family

i. Were the treatment and control groups statistically significantly different on average at baseline?

ii. Is there a statistically significant difference in health outcomes between the treatment and control groups?

iii. Suppose the researchers are worried that, contrary to the design of the study, lower income individuals were more likely to end up with the comprehensive

a. Would the selection bias/omitted variable bias be positive or negative?

b. If the researchers know the family income for each patient in their dataset, what regression could they estimate to control for this? Write down the regression model. Be specific.

iv. Suppose that 40% of the patients in the catastrophic group actually obtained comprehensive insurance through other means, but no one assigned to the comprehensive insurance group turned it down. Compute a simple instrumental variables estimate, without controls, of the effect of comprehensive health insurance.
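The simple instrumental variables estimate in part iv is a ratio: the difference in mean outcomes between the two assignment groups divided by the difference in the share actually holding comprehensive insurance. A minimal sketch follows. The function name `wald_iv` and the outcome difference `hypothetical_itt` are assumptions made purely for illustration; the actual outcome difference must be read from the table.

```python
def wald_iv(reduced_form, takeup_assigned, takeup_not_assigned):
    """Wald (simple IV) estimate: reduced form divided by first stage.

    reduced_form: difference in mean outcomes by random assignment.
    takeup_*: share holding comprehensive insurance in each assignment group.
    """
    first_stage = takeup_assigned - takeup_not_assigned
    if first_stage == 0:
        raise ValueError("assignment does not shift insurance take-up")
    return reduced_form / first_stage


# From the question: everyone assigned to comprehensive coverage has it,
# and 40% of the catastrophic group obtained it by other means.
first_stage = 1.0 - 0.4

# HYPOTHETICAL outcome difference, not taken from the table.
hypothetical_itt = 0.03

estimate = wald_iv(hypothetical_itt, 1.0, 0.4)
print(f"first stage = {first_stage:.1f}")
print(f"IV estimate = {estimate:.3f}")
```

Because the first stage is below one, the IV estimate is larger in magnitude than the raw difference in means between the assignment groups.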

### 2. This problem is based on a regression analysis of a random sample of 532 home sales in Fairfax County in Virginia over the period 1997-2011.

The table below reports the results of 4 different regressions. The regressors are, in order, a dummy for whether the home has central air conditioning, the number of bathrooms, the number of bedrooms, a dummy for whether the home is a detached house (rather than a townhome or condo), a dummy for whether the home is newly constructed, and a dummy for whether the home has a fireplace.

i. The second column is my baseline specification. Provide a clear interpretation of each of the coefficient estimates in this column.

ii. Why do you think I included dummy variables for the year in the baseline specification?

iii. What is the difference between the coefficient on the ac dummy in the first and second columns? Provide an explanation for why the estimate changed so much.

iv. Installing central air conditioning in my home would cost around \$10, Based on this regression analysis (the whole table), would you say that this is a wise investment in the value of my home?

v. For each of the following, determine whether I am describing a threat to internal or external validity. Very briefly provide one possible solution to the problem.

a. Homes with central air conditioning are on average newer (that is, more recently built) than homes without. As a result, the coefficients are biased.

b. I would like to use these results to determine whether I should install central air conditioning in my home in Florida. However, I am worried that these results may not generalize from Virginia to Florida.

c. It turns out that adding central air conditioning improves the value of a new home much more than it improves the value of an old home.

### 3. This problem is based on a forthcoming article in the Journal of Health Economics by Jenna Stearns.

In 1978, states with Temporary Disability Insurance (TDI) were required to start providing paid leave to pregnant women before and after

i. California was one of the states which had TDI and was therefore affected by the new law. So starting in 1978, pregnant women working in California benefited from this new law. Oregon did not have TDI and therefore women working in Oregon were not affected by this new law. Suppose that 5% of babies in California were born with low birth weight in 1977 and that this decreased to 4.7% in 1978. In Oregon, 4.3% of babies were born with low birth weight in 1977 and that decreased to 4.2% in 1978. Using this data, what is the difference-in-differences estimate of the effect of providing extended paid leave?

ii. How could data for each year from 1972 to 1984 from California and Oregon be used to calculate a difference-in-differences estimate of the effect of extended paid leave on the percent of babies born with low birth weight? (Hint: Write down the regression model that would be used and make sure to specify which coefficient or coefficients gives the desired effect.)

iii. The main assumption maintained is the common trend assumption. Explain what this means in this example. Be specific.

iv. How can the common trend assumption be further relaxed here? Provide at least one specific example.
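The two-state, two-year comparison in part i can be sketched as simple arithmetic, and the same number comes out as the interaction coefficient in a regression on a treated-state dummy, a post-period dummy, and their product. The snippet below is only an illustration using the four percentages given in the question. The variable names are assumptions, and the regression is the 2x2 special case rather than the multi-year model asked for in part ii.

```python
import numpy as np

# Percent of babies born with low birth weight, as given in the question.
ca_1977, ca_1978 = 5.0, 4.7     # California (had TDI, affected by the law)
or_1977, or_1978 = 4.3, 4.2     # Oregon (no TDI, not affected)

# Difference-in-differences: change in California minus change in Oregon.
did = (ca_1978 - ca_1977) - (or_1978 - or_1977)
print(round(did, 2))

# Same estimate from a saturated regression on the four cell means:
# y = b0 + b1*treat + b2*post + b3*(treat*post)
treat = np.array([1, 1, 0, 0])  # 1 = California
post = np.array([0, 1, 0, 1])   # 1 = 1978
y = np.array([ca_1977, ca_1978, or_1977, or_1978])
X = np.column_stack([np.ones(4), treat, post, treat * post])
coef = np.linalg.solve(X, y)

print(round(coef[3], 2))        # interaction coefficient equals the DiD
```

Both lines print the same value, about -0.2 percentage points: low birth weight fell by 0.3 points in California and by 0.1 points in Oregon.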