Practice Problem Set 2: Data Visualization

  1. Run the following code to load the data. Glimpse the data and learn about the variables:

    For the following questions, please create a graph using ggplot to answer:

  2. Create a plot to examine the distribution of the re78 variable, which records the participants' income in 1978. (Note: in the webR interface, code that spans several lines must be highlighted in full before you run it.)

  3. Now compare the distribution of that variable between the treatment and control groups.

  4. Think of another way to visualize the difference in the distribution of re78 between the treatment and control groups. Which do you prefer - the plot from Q3 or the one from Q4 - and why?

  5. Create graphs showing distributions of the other variables in the data for treatment versus control groups.

  6. Examine the relationship between age and educ.

  7. Does the relationship between age and education differ between the treatment and control groups?

  8. Examine the relationship between race and income in 1978.

  9. Does the relationship between race and income in 1978 differ between the treatment and control groups?

  10. Challenge Problem: Examine the trend in income across the three years given (1974, 1975, 1978). What conclusions can you draw about the trends?
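
As a starting point for the questions above, here is a minimal sketch of the ggplot2 building blocks they tend to call for. It is not a solution: it runs on a simulated data frame named `toy` (a hypothetical stand-in with random values), and it assumes the real data has columns called `treat`, `age`, `educ`, `re74`, `re75`, and `re78`. Adjust the names to whatever `glimpse()` shows for the data you loaded.

```r
library(ggplot2)

# Simulated stand-in data. The column names mirror those mentioned in the
# problem set (an assumption), but the values are random and meaningless.
set.seed(1)
n <- 200
toy <- data.frame(
  treat = factor(rep(c("control", "treated"), each = n / 2)),
  age   = sample(18:55, n, replace = TRUE),
  educ  = sample(6:16, n, replace = TRUE),
  re74  = rexp(n, rate = 1 / 4000),
  re75  = rexp(n, rate = 1 / 4500),
  re78  = rexp(n, rate = 1 / 6000)
)

# Distribution of one numeric variable: a histogram.
p_hist <- ggplot(toy, aes(x = re78)) +
  geom_histogram(bins = 30)

# Same distribution split by group: one panel per group.
p_facet <- p_hist + facet_wrap(~ treat)

# An alternative group comparison: side-by-side boxplots.
p_box <- ggplot(toy, aes(x = treat, y = re78)) +
  geom_boxplot()

# Relationship between two numeric variables, coloured by group,
# with a straight-line fit per group.
p_scatter <- ggplot(toy, aes(x = age, y = educ, colour = treat)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE)

# Trend over time: stack the three income columns into long format
# (base R only), then plot the mean income per year and group.
long <- data.frame(
  treat  = rep(toy$treat, times = 3),
  year   = rep(c(1974, 1975, 1978), each = n),
  income = c(toy$re74, toy$re75, toy$re78)
)
p_trend <- ggplot(long, aes(x = year, y = income, colour = treat)) +
  stat_summary(fun = mean, geom = "line") +
  stat_summary(fun = mean, geom = "point")

# Printing a plot object draws it; swap in any of the objects above.
print(p_hist)
```

Storing each plot in an object and printing it separately makes it easy to compare alternatives, for example `p_facet` against `p_box` when deciding which group comparison reads better. Other geoms such as `geom_density()` or `geom_violin()` slot into the same pattern.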