class: center, middle, inverse, title-slide .title[ # Optimizing color spaces ] .author[ ### MACS 40700
University of Chicago
]

---

# Agenda / heads up:

```r
# install if necessary
install.packages(c("tidyverse", "here", "colorspace", "scales", "ggthemes", "usethis", "cowplot"))
usethis::use_course("MACS40700/choosing-colors")
```

FYI: some plots rely on packages that are not installed by the command above; you may need to find and install those separately, e.g.: https://github.com/UrbanInstitute/urbnmapr

---

# Uses of color in data visualization

--

<table style = "border: none; line-height: 2.5;"> <tr style = "background: white;"> <td style = "text-align: left; width: 50%;"> 1. Distinguish categories (qualitative) </td> <td> <img src = "images/qualitative.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> </table>

---

# Qualitative scale example

<img src="index_files/figure-html/popgrowth-vs-popsize-colored-1.png" width="70%" style="display: block; margin: auto;" />

Palette name: Okabe-Ito

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

# Qualitative scale example

<img src="index_files/figure-html/popgrowth-vs-popsize-colored2-1.png" width="70%" style="display: block; margin: auto;" />

Palette name: ColorBrewer Set1

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

# Qualitative scale example

<img src="index_files/figure-html/popgrowth-vs-popsize-colored3-1.png" width="70%" style="display: block; margin: auto;" />

Palette name: ColorBrewer Set3

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

# Uses of color in data visualization

<table style = "border: none; line-height: 2.5;"> <tr style = "background: white;"> <td style = "text-align: left; width: 50%;"> 1. 
Distinguish categories (qualitative) </td> <td> <img src = "images/qualitative.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> <tr style = "background: white;"> <td style = "text-align: left;"> 2. Represent numeric values (sequential) </td> <td> <img src = "images/sequential.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> </table> --- # Sequential scale example <img src="index_files/figure-html/four-locations-temps-by-month-1.png" width="80%" style="display: block; margin: auto;" /> Palette name: Viridis ??? Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz) --- # Sequential scale example <img src="index_files/figure-html/four-locations-temps-by-month2-1.png" width="80%" style="display: block; margin: auto;" /> Palette name: Inferno ??? Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz) --- # Sequential scale example <img src="index_files/figure-html/four-locations-temps-by-month3-1.png" width="80%" style="display: block; margin: auto;" /> Palette name: Cividis ??? Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz) --- # Uses of color in data visualization <table style = "border: none; line-height: 2.5;"> <tr style = "background: white;"> <td style = "text-align: left; width: 50%;"> 1. Distinguish categories (qualitative) </td> <td> <img src = "images/qualitative.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> <tr style = "background: white;"> <td style = "text-align: left;"> 2. Represent numeric values (sequential) </td> <td> <img src = "images/sequential.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> <tr style = "background: white;"> <td style = "text-align: left;"> 3. 
Represent numeric values (diverging) </td> <td> <img src = "images/diverging.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> </table> --- # Diverging scale example <img src="index_files/figure-html/forensic-correlations1-1.png" width="40%" style="display: block; margin: auto;" /> Palette name: ColorBrewer PiYG ??? Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz) --- # Diverging scale example <img src="index_files/figure-html/forensic-correlations2-1.png" width="40%" style="display: block; margin: auto;" /> Palette name: Carto Earth ??? Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz) --- # Diverging scale example <img src="index_files/figure-html/forensic-correlations3-1.png" width="40%" style="display: block; margin: auto;" /> Palette name: Blue-Red ??? Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz) --- # Uses of color in data visualization <table style = "border: none; line-height: 2.5;"> <tr style = "background: white;"> <td style = "text-align: left; width: 50%;"> 1. Distinguish categories (qualitative) </td> <td> <img src = "images/qualitative.png" width = 100% style = "text-align: right; vertical-align: middle;"></img> </td> </tr> <tr style = "background: white;"> <td style = "text-align: left;"> 2. Represent numeric values (sequential) </td> <td> <img src = "images/sequential.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> <tr style = "background: white;"> <td style = "text-align: left;"> 3. Represent numeric values (diverging) </td> <td> <img src = "images/diverging.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> <tr style = "background: white;"> <td style = "text-align: left;"> 4. 
Highlight </td> <td> <img src = "images/highlight.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> </table> --- # Highlight example .panelset[ .panel[.panel-name[Gray] <img src="index_files/figure-html/Aus-athletes-track-1.png" width="70%" style="display: block; margin: auto;" /> Palette name: Grays with accents ??? Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz) ] .panel[.panel-name[Gray2] <img src="index_files/figure-html/grays-1.png" width="70%" style="display: block; margin: auto;" /> Palette name: Grays with accents ??? Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz) ] .panel[.panel-name[OkabeIto] <img src="index_files/figure-html/Aus-athletes-track2-1.png" width="70%" style="display: block; margin: auto;" /> Palette name: Okabe-Ito accent ??? Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz) ] .panel[.panel-name[Accent] <img src="index_files/figure-html/Aus-athletes-track3-1.png" width="70%" style="display: block; margin: auto;" /> Palette name: ColorBrewer accent ??? Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz) ]] --- # Uses of color in data visualization <table style = "border: none; line-height: 2.5;"> <tr style = "background: white;"> <td style = "text-align: left; width: 50%;"> 1. Distinguish categories (qualitative) </td> <td> <img src = "images/qualitative.png" width = 100% style = "text-align: right; vertical-align: middle;"></img> </td> </tr> <tr style = "background: white;"> <td style = "text-align: left;"> 2. 
Represent numeric values (sequential) </td> <td> <img src = "images/sequential.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> <tr style = "background: white;"> <td style = "text-align: left;"> 3. Represent numeric values (diverging) </td> <td> <img src = "images/diverging.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> <tr style = "background: white;"> <td style = "text-align: left;"> 4. Highlight </td> <td> <img src = "images/highlight.png" width = 100% style = "text-align: right; vertical-align: middle"></img> </td> </tr> </table>

---

class: inverse, middle

# Choosing a color scale

---

# Choosing a color scale

- Emphasis on interpretability and accessibility
- Default palettes are often suboptimal
- Variables may require transformations

---

# Default palette in `ggplot2`

<img src="index_files/figure-html/unnamed-chunk-1-1.png" width="80%" style="display: block; margin: auto;" />

---

# Suboptimal default choices

<img src="index_files/figure-html/unnamed-chunk-2-1.png" width="80%" style="display: block; margin: auto;" />

---

# Common forms of color vision deficiency

### Red-green

- Deuteranomaly
- Protanomaly
- Protanopia and deuteranopia

### Blue-yellow

- Tritanomaly
- Tritanopia

### Complete color vision deficiency

- Monochromacy

---

# Inspecting for color vision deficiency

<img src="index_files/figure-html/unnamed-chunk-3-1.png" width="80%" style="display: block; margin: auto;" />

---

# Inspecting for color vision deficiency

``` r
library(colorblindr) # https://www.rdocumentation.org/packages/colorblindr/versions/0.1.0
cvd_grid(plot = pen_fig)
```

<img src="index_files/figure-html/unnamed-chunk-5-1.png" width="70%" style="display: block; margin: auto;" />

---

# Inspecting for color vision deficiency

<img src="index_files/figure-html/unnamed-chunk-6-1.png" width="80%" style="display: block; margin: auto;" />

---

# Inspecting for color vision deficiency

<img 
src="index_files/figure-html/unnamed-chunk-7-1.png" width="80%" style="display: block; margin: auto;" />

---

# When to use quantitative or qualitative color scales?

<img src="images/quant-qual.png" width="80%" style="display: block; margin: auto;" />

.footnote[Source: [Lisa Charlotte Muth](https://blog.datawrapper.de/category/color-in-data-vis/)]

---

# Use qualitative for nominal variables

<img src="images/unordered.png" width="80%" style="display: block; margin: auto;" />

.footnote[Source: [Lisa Charlotte Muth](https://blog.datawrapper.de/category/color-in-data-vis/)]

---

# Use quantitative for numerical variables

<img src="images/unemp-best.png" width="80%" style="display: block; margin: auto;" />

.footnote[Source: [Lisa Charlotte Muth](https://blog.datawrapper.de/category/color-in-data-vis/)]

---

# Quantitative `\(\neq\)` continuous

<img src="images/likert.png" width="80%" style="display: block; margin: auto;" />

.footnote[Source: [Lisa Charlotte Muth](https://blog.datawrapper.de/category/color-in-data-vis/)]

---

# Shades to emphasize order

<img src="images/treemap.png" width="80%" style="display: block; margin: auto;" />

.footnote[Source: [Lisa Charlotte Muth](https://blog.datawrapper.de/category/color-in-data-vis/)]

---

# Double-encoded line chart

<img src="images/double-encode-lines.png" width="80%" style="display: block; margin: auto;" />

.footnote[Source: [Lisa Charlotte Muth](https://blog.datawrapper.de/category/color-in-data-vis/)]

---

class: inverse, middle

# Implementing optimal color palettes in R

---

## **ggplot2** color scale functions

--

.small.center[

Scale function | Aesthetic | Data type | Palette type
:----------- | :---------- | :------------ | :------------
`scale_color_hue()` | `color` | discrete | qualitative

]

---

## **ggplot2** color scale functions are a bit of a mess

.small.center[

Scale function | Aesthetic | Data type | Palette type
:----------- | :---------- | :------------ | :------------
`scale_color_hue()` | `color` | discrete | qualitative
`scale_fill_hue()` | `fill` | discrete | qualitative

]

---

## **ggplot2** color scale functions are a bit of a mess

.small.center[

Scale function | Aesthetic | Data type | Palette type
:----------- | :---------- | :------------ | :------------
`scale_color_hue()` | `color` | discrete | qualitative
`scale_fill_hue()` | `fill` | discrete | qualitative
`scale_color_gradient()` | `color` | continuous | sequential

]

---

## **ggplot2** color scale functions are a bit of a mess

.small.center[

Scale function | Aesthetic | Data type | Palette type
:----------- | :---------- | :------------ | :------------
`scale_color_hue()` | `color` | discrete | qualitative
`scale_fill_hue()` | `fill` | discrete | qualitative
`scale_color_gradient()` | `color` | continuous | sequential
`scale_color_gradient2()` | `color` | continuous | diverging

]

---

## **ggplot2** color scale functions are a bit of a mess

.small.center[

Scale function | Aesthetic | Data type | Palette type
:----------- | :---------- | :------------ | :------------
`scale_color_hue()` | `color` | discrete | qualitative
`scale_fill_hue()` | `fill` | discrete | qualitative
`scale_color_gradient()` | `color` | continuous | sequential
`scale_color_gradient2()` | `color` | continuous | diverging
`scale_fill_viridis_c()` | `fill` | continuous | sequential
`scale_fill_viridis_d()` | `fill` | discrete | sequential
`scale_color_brewer()` | `color` | discrete | qualitative, diverging, sequential
`scale_fill_brewer()` | `fill` | discrete | qualitative, diverging, sequential
`scale_color_distiller()` | `color` | continuous | qualitative, diverging, sequential

]

... 
and there are many many more

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/temps-tiles1-out-1.png" width="80%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_fixed(expand = FALSE) +
  theme_classic()
  # no fill scale defined, default is scale_fill_gradient()
```

]
]

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/temps-tiles2-out-1.png" width="80%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_fixed(expand = FALSE) +
  theme_classic() +
* scale_fill_gradient()
```

]
]

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/temps-tiles3-out-1.png" width="80%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_fixed(expand = FALSE) +
  theme_classic() +
* scale_fill_viridis_c()
```

]
]

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/temps-tiles4-out-1.png" width="80%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_fixed(expand = FALSE) +
  theme_classic() +
* scale_fill_viridis_c(option = "B", begin = 0.15)
```

]
]

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/temps-tiles5-out-1.png" width="80%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_fixed(expand = FALSE) +
  theme_classic() +
* scale_fill_distiller(palette = "YlGnBu")
```

]
]

---

## The `colorspace` package creates some order

Scale name: `scale_<aesthetic>_<datatype>_<colorscale>()`

--

- `<aesthetic>`: name of the aesthetic (`fill`, `color`, `colour`)
- `<datatype>`: type of variable plotted (`discrete`, `continuous`, `binned`)
- `<colorscale>`: type of the color scale (`qualitative`, `sequential`, `diverging`, `divergingx`)

--

Scale function | Aesthetic | Data type | Palette type
:----------- | :-------- | :--------- | :------------
`scale_color_discrete_qualitative()` | `color` | discrete | qualitative
`scale_fill_continuous_sequential()` | `fill` | continuous | sequential
`scale_colour_continuous_divergingx()` | `colour` | continuous | diverging

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/temps-tiles6-out-1.png" width="80%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_fixed(expand = FALSE) +
  theme_classic() +
* scale_fill_continuous_sequential(palette = "YlGnBu")
```

]
]

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/temps-tiles7-out-1.png" width="80%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_fixed(expand = FALSE) +
  theme_classic() +
* scale_fill_continuous_sequential(palette = "Viridis")
```

]
]

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/temps-tiles8-out-1.png" width="80%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(temps_months, aes(x = month, y = location, fill = mean)) +
  geom_tile(width = 0.95, height = 0.95) +
  coord_fixed(expand = FALSE) +
  theme_classic() +
* scale_fill_continuous_sequential(palette = "Inferno", begin = 0.15)
```

]
]
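---

## Examples

The palette names used in the `colorspace` scales can also be inspected outside of a plot. The following is a minimal sketch, assuming `colorspace` is installed: `sequential_hcl()` returns the hex codes behind a named palette, and the scale name is assembled from the `scale_<aesthetic>_<datatype>_<colorscale>()` pattern. The object names `pal` and `scale_name` are made up for this illustration.

```r
library(colorspace)

# Five hex codes from the "YlGnBu" palette used in the tile plots
pal <- sequential_hcl(n = 5, palette = "YlGnBu")
print(pal)

# Assemble a scale name from its three parts and check that
# colorspace exports a function with that name
scale_name <- paste("scale", "fill", "continuous", "sequential", sep = "_")
print(scale_name)
print(scale_name %in% getNamespaceExports("colorspace"))
```

Looking at the hex codes directly can help when the same colors are needed somewhere a ggplot2 scale is not available, such as a manual legend or a table.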
---

``` r
colorspace::hcl_palettes(type = "sequential", plot = TRUE) # all sequential palettes
```

<img src="index_files/figure-html/colorspace-palettes-seq-1.png" width="80%" style="display: block; margin: auto;" />

---

``` r
colorspace::hcl_palettes(type = "diverging", plot = TRUE, n = 9) # all diverging palettes
```

<img src="index_files/figure-html/colorspace-palettes-div-1.png" width="80%" style="display: block; margin: auto;" />

---

``` r
colorspace::divergingx_palettes(plot = TRUE, n = 9) # all divergingx palettes
```

<img src="index_files/figure-html/colorspace-palettes-divx-1.png" width="80%" style="display: block; margin: auto;" />

---

class: inverse, middle

# Setting colors for discrete, qualitative scales

---

``` r
colorspace::hcl_palettes(type = "qualitative", plot = TRUE) # all qualitative palettes
```

<img src="index_files/figure-html/colorspace-palettes-qual-1.png" width="80%" style="display: block; margin: auto;" />

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/qual-scales-example1-out-1.png" width="70%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(popgrowth, aes(x = pop2000, y = popgrowth, color = region)) +
  geom_point(size = 4) +
  scale_x_log10()
  # no color scale defined, default is scale_color_hue()
```

]
]

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/qual-scales-example2-out-1.png" width="70%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(popgrowth, aes(x = pop2000, y = popgrowth, color = region)) +
  geom_point(size = 4) +
  scale_x_log10() +
* scale_color_hue()
```

]
]

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/qual-scales-example3a-out-1.png" width="70%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(popgrowth, aes(x = pop2000, y = popgrowth, color = region)) +
  geom_point(size = 4) +
  scale_x_log10() +
  # uses the Dark 2 palette
* scale_color_discrete_qualitative(palette = "Dark 2")
```

]
]

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/qual-scales-example3-out-1.png" width="70%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
library(ggthemes) # for scale_color_colorblind()

ggplot(popgrowth, aes(x = pop2000, y = popgrowth, color = region)) +
  geom_point(size = 4) +
  scale_x_log10() +
  # uses Okabe-Ito colors
* scale_color_colorblind()
```

]
]

---

## Examples

.panelset[
.panel[.panel-name[Output]

<img src="index_files/figure-html/qual-scales-example4-out-1.png" width="70%" style="display: block; margin: auto;" />

]
.panel[.panel-name[Code]

``` r
ggplot(popgrowth, aes(x = pop2000, y = popgrowth, color = region)) +
  geom_point(size = 4) +
  scale_x_log10() +
* scale_color_manual(
*   values = c(
*     West = "#E69F00", South = "#56B4E9",
*     Midwest = "#009E73", Northeast = "#F0E442"
*   )
* )
```

]
]

---

## Okabe-Ito RGB codes

.center[
<img src = "https://clauswilke.com/dataviz/pitfalls_of_color_use_files/figure-html/palette-Okabe-Ito-1.png" width = 100%></img>
]

.tiny-font[

Name | Hex code | R, G, B (0-255)
:---------- | :------- | :--------
orange | #E69F00 | 230, 159, 0
sky blue | #56B4E9 | 86, 180, 233
bluish green | #009E73 | 0, 158, 115
yellow | #F0E442 | 240, 228, 66
blue | #0072B2 | 0, 114, 178
vermilion | #D55E00 | 213, 94, 0
reddish purple | #CC79A7 | 204, 121, 167
black | #000000 | 0, 0, 0

]

???

Figure from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Apply it -- work in groups through the exercises here

```r
# install if necessary
install.packages(c("tidyverse", "here", "colorspace", "scales", "ggthemes", "usethis", "cowplot"))
usethis::use_course("MACS40700/choosing-colors")
```

---

# Recap

* Consider the palettes: balance saturation
* Match the color scale to the variable:
    * Qualitative (categories): distinct hues
    * Quantitative (numeric): sequential or diverging scale
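
---

## Inspecting a palette for color vision deficiency

A palette can be checked for color vision deficiency before it is used in a plot. The sketch below assumes `colorspace` is installed and uses its `deutan()` and `swatchplot()` functions; it is one possible check, not the method used to produce the figures in these slides. The hex codes are the Okabe-Ito values from the table on the earlier slide, and the object names are made up for this illustration.

```r
library(colorspace)

# Okabe-Ito hex codes, copied from the table on the earlier slide
okabe_ito <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442",
               "#0072B2", "#D55E00", "#CC79A7", "#000000")

# Simulate complete deuteranopia (severity = 1 is the strongest setting)
okabe_ito_deutan <- deutan(okabe_ito, severity = 1)

# Draw the original and simulated colors as two rows of swatches
swatchplot("Original" = okabe_ito, "Deuteranopia" = okabe_ito_deutan)
```

If two swatches in the simulated row are hard to tell apart, the categories they encode will likely be hard to distinguish for some readers.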