Problem Set 8: Statistics and Regression


The purpose of this problem set is to assess your understanding of methods of quantitative data analysis, especially regression, and the statistical assumptions underlying those methods.


You will use the Quality of Government dataset to provide a quantitative description of the quality of democratic institutions across the world. Note: You may use the “Standard” dataset (which is based on a single cross-section of data) or the “Time Series” dataset (which has country-year observations).

Your Task

  1. Consider the following research question: “How healthy are democratic countries compared to less democratic countries?” Thinking descriptively, develop a hypothesis or expectation about the causal relationship between democracy and health (i.e., decide which precedes which and assert an expected size and direction of the relationship).

  2. Define the concepts of democracy and health and select one or more appropriate indicators of each concept from the Quality of Government dataset, justifying your choice of each.

  3. Using an appropriate statistical test (such as regression), evaluate the size of the relationship between democracy and health, assess whether that relationship appears to be statistically different from no relationship, and interpret the resulting measures of uncertainty for the test. Are the data consistent with your expectations?

  4. Consider the issue of causality. Are the relationships observed in these data reflective of a causal relationship? What criteria would have to be satisfied for any relationship between democracy and health to be causal? What potential sources of confounding would need to be controlled for?

  5. Using any appropriate means of data visualization, describe the relationship between democracy and health in a way that demonstrates whether a third variable might be a source of confounding. Include the visualization in your paper.

  6. Using multiple regression, control for any identified sources of confounding in order to re-estimate the relationship between democracy and health. Is the re-estimated effect substantively large? Is it statistically significantly different from zero?

  7. In a few sentences, reflect on the most challenging and most rewarding aspects of this assignment. State any uncertainties you currently have about the material covered by this assignment and reflect on what you will do to address those uncertainties.
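
As a rough illustration of the workflow in tasks 3 and 6, the following R sketch fits a bivariate regression and then a multiple regression that adds one control. It runs on simulated data only: the variable names `democracy`, `health`, and `gdp` are hypothetical stand-ins, not actual Quality of Government variable names, and the simulated effect sizes are arbitrary assumptions chosen to show how an omitted confounder can inflate a bivariate estimate. It is a sketch of the mechanics, not a model answer.

```r
# Simulated data for illustration only. All variable names are hypothetical
# stand-ins and do not correspond to Quality of Government variables.
set.seed(42)
n <- 150
gdp <- rnorm(n)                                   # hypothetical confounder
democracy <- 0.6 * gdp + rnorm(n)                 # hypothetical democracy index
health <- 0.2 * democracy + 0.8 * gdp + rnorm(n)  # hypothetical health outcome
d <- data.frame(democracy, health, gdp)

# Task 3: bivariate regression of health on democracy
m1 <- lm(health ~ democracy, data = d)
print(summary(m1)$coefficients)  # estimate, standard error, t value, p value
print(confint(m1))               # 95% confidence intervals

# Task 6: multiple regression controlling for the hypothetical confounder
m2 <- lm(health ~ democracy + gdp, data = d)
print(summary(m2)$coefficients)
print(confint(m2))
```

Comparing the `democracy` coefficient in `m1` and `m2` shows how much of the bivariate relationship is attributable to the control variable in this simulated setting. With real data, the choice of indicators and controls should follow from your answers to tasks 2 and 4.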

Submission Instructions

Your assignment should be no more than 4 pages of double-spaced A4, Times New Roman font size 12, including any visualization.

As an appendix to your submission, include all relevant R code used to generate your results and figures. This should be typed in Courier New font size 10.

Submit the assignment via Moodle by Tuesday, March 13.


You will receive feedback within two weeks.

Except where noted, this website is licensed under a Creative Commons Attribution 4.0 International License.