Assignment 2
Tasks
For this assignment, you will submit SIX plots: a ‘rough draft’, an ‘alternative’, and a ‘final’ version for each of two plot types. The final versions are described directly below; the drafts and alternatives are described under Components.
Final plots
Complete two plots, each using at least two, if not three, variables from your dataset: one plot needs continuous-y data and the other should involve categorical data. For example, you might showcase continuous data with a box plot, and add a categorical variable to split the continuous data into groups, with one box per group. Similarly, you could make a scatterplot where you color-code the points by a secondary variable. For categorical data, you could use a bar plot that shows how the proportion of each category varies with a secondary variable. You could also use a box plot that focuses on the categories, i.e. where the key emphasis is on the categorical variable. You need two different types of plots for this assignment.
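As a rough illustration of the categorical case above, the sketch below computes the share of each category within each level of a secondary variable, which is the quantity a proportional bar plot would display. It is only a sketch under stated assumptions: Python is used for illustration (the assignment does not prescribe a language), and the records, the names `records` and `proportions_by_group`, and the category labels are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (primary category, secondary variable) pairs.
# Replace these with values drawn from your own dataset.
records = [
    ("sedan", "urban"), ("sedan", "urban"), ("truck", "urban"), ("suv", "urban"),
    ("truck", "rural"), ("truck", "rural"), ("suv", "rural"), ("sedan", "rural"),
]


def proportions_by_group(rows):
    """Return {group: {category: share}}; shares within each group sum to 1."""
    counts = defaultdict(Counter)
    for category, group in rows:
        counts[group][category] += 1
    result = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        result[group] = {cat: n / total for cat, n in counter.items()}
    return result


shares = proportions_by_group(records)

# Print one line per (group, category) pair, in a stable order.
for group in sorted(shares):
    for category in sorted(shares[group]):
        print(f"{group:>5} {category:<5} {shares[group][category]:.2f}")
```

Plotting shares rather than raw counts keeps groups of different sizes comparable, which is usually what a reader needs in order to see how the categories align with the secondary variable.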
You can use any data for this assignment, but both plots should use the same dataset and there should be some connection between them, i.e. a shared narrative. This assignment should be able to stand alone, meaning that all information needed to understand and interpret the plots is provided by you in your writeup. Keep the writing to approximately 500-750 words. The final plots should be professional quality: something we could expect to see in a final report, thesis, etc.
Components
First, describe your data and summarize the relevant variables, explaining what each one represents. Be sure to include your source for the data. BE WISE in how you select your data.
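One possible way to produce the variable summary described above is sketched below. It assumes Python and only the standard library; the rows, the column names `price` and `region`, and the helper functions are hypothetical placeholders, not part of the assignment.

```python
import statistics
from collections import Counter

# Hypothetical dataset: one continuous and one categorical variable per row.
rows = [
    {"price": 12.5, "region": "north"},
    {"price": 15.0, "region": "south"},
    {"price": 9.5, "region": "north"},
    {"price": 20.0, "region": "east"},
    {"price": 13.0, "region": "south"},
]


def summarize_continuous(values):
    """Return count, min, median, mean, and max for a list of numbers."""
    ordered = sorted(values)
    return {
        "n": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "mean": statistics.mean(ordered),
        "max": ordered[-1],
    }


def summarize_categorical(values):
    """Return the number of observations in each category."""
    return dict(Counter(values))


price_summary = summarize_continuous([row["price"] for row in rows])
region_summary = summarize_categorical([row["region"] for row in rows])

print(price_summary)
print(region_summary)
```

A short table built from summaries like these, together with the data source, gives readers enough context to interpret the plots without opening the dataset.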
For each data type, you will provide three plots:
- The final plot. This is the plot that you selected to represent the data most effectively.
- The rough draft / initial plot. This is the initial version of your final plot with all ‘defaults’, i.e. not customized at all.
- The alternative plot. This is a second plot option you tried before committing to the final form. For example, maybe you were debating between histograms, dot plots, and box plots. Include one version you tried.
Continuous-y plot
Describe the plot type you selected and what it tells us about the data. Your plot should be appropriate for the data (e.g. don’t use a plot designed for nominal data to show continuous data), and be customized to showcase some finding. Effective graphs are ones that make your intended takeaway obvious.
Secondly, describe the plot you chose, alternatives you considered, and why this was the selected form.
Finally, describe the customizations (scales, labels, colors, titles, etc.) that you made to improve the graph from the original baseline.
Categorical plot
Describe the plot type you selected and what it tells us about the data. Your plot should be appropriate for the data (e.g. don’t use a plot designed for continuous data to show categorical data), and be customized to showcase some finding. Effective graphs are ones that make your intended takeaway obvious.
Secondly, describe the plot you chose, alternatives you considered, and why this was the selected form.
Finally, describe the customizations (scales, labels, colors, titles, etc.) that you made to improve the graph from the original baseline.
Assessment
Your plots will be assessed on the following criteria (see note 1):
- Is it truthful?
- Is it functional?
- Is it beautiful?
- Is it insightful?
- Is it enlightening?
Getting started
All work will be performed inside a version-controlled GitHub repo. Create your project repo by going to this link on GitHub Classroom.
Note 1: Drawn from chapter 2 of The Truthful Art: Data, Charts, and Maps for Communication by Alberto Cairo.