Data Visualization

This lesson shows how to choose an appropriate graphic for your data and research question, inspired by Choosing a good chart, and then how to build that graphic using ggplot2.


Learners must understand basic data management and manipulation. They have likely worked with spreadsheets and interactive statistics tools (SAS, R), and are familiar with tabular data structure, filtering and summarizing data by group, and basic statistics for comparing groups (mean, standard deviation, linear models).

Learners should have some limited programming experience in R, including the content of Lessons 00-04 of Data Carpentry’s analyzing ecological data in R. Specifically, this includes loading a CSV file (read.csv()) and using basic dplyr functions (filter(), summarize(), etc.).

Learners must install R and download the gapminder data before class starts: please see the setup instructions for details.


00:00  Introduction
       How do I read, analyze, and visualize a tabular data set?
       How do I manipulate, summarize, and analyze data to answer a research question?
       How do I choose the best chart to answer my research question?
       How do I generate the best chart to answer my research question?
       How do I produce a publication quality version of my chart?
00:15  Data Management in R
       How do I import data into the R environment?
       What is the Gapminder data structure?
00:30  Data Structures
       What are data variables and categories?
       What are data values and replicates?
       What is the difference between absolute and relative values?
00:40  dplyr Basics
       What are the basic dplyr functions?
       How do I implement the basic dplyr functions?
00:55  Scientific Questions and Hypotheses
       What makes a good scientific question or hypothesis?
01:10  Choosing a Good Chart
       How do I choose a good chart to visualize my data?
01:25  ggplot2 Basics
       What is the grammar of graphics established by ggplot2?
       How does ggplot2 generate graphics through layers?
01:40  Generating a Line Histogram
       What kind of scientific question does a line histogram address?
       How do I generate a line histogram with ggplot2?
02:00  Making Publication Quality Figures
       What makes up a publication quality figure?
       What customization features are in ggplot2?
02:10  Coffee Break
02:25  Bubble Charts
       What kind of scientific question does a bubble chart address?
       How do I generate a bubble chart with ggplot2?
02:45  Tidy Data Structure
       What is tidy data structure?
       How do I use tidyr to restructure messy data?
02:55  Faceted Table of Histograms
       What kind of scientific question does a faceted table of charts address?
       How do I generate a faceted table with ggplot2?
03:15  Stacked Area Chart
       What kind of scientific question does a stacked area chart address?
       How do I generate a stacked area chart with ggplot2?
03:35  Wrap-up
       What have I learned?
       Where can I get more information about data visualization?
       What is one thing that was good about the lesson, and one thing that could be improved?
03:45  Finish