Fundamental of Data Science for EESS





R session 05 - Data viz

Daniel Vaulot

2021-02-15


Outline

  • Graph types
  • Grammar of graphics
  • Playing with ggplot2
  • Multiple graphs
  • ggplot2 syntax
  • Your turn

Installation and Resources


Data visualization

Graph purposes

  • Analysis graphs
    • designed to reveal patterns and trends
    • aid the process of data description
    • aid interpretation
  • Presentation graphs
    • designed to attract attention
    • make a point
    • illustrate a conclusion

Source: Michael Friendly - http://datavis.ca/courses/RGraphics/

Graph types

Jitter

  • Two numerical variables

Graph types

Bubble

  • Two numerical variables
  • Add a third numerical variable as bubble size

Graph types

Animate

  • Two numerical variables
  • One more numerical variable
  • One categorical variable
  • Animate along another variable

Graph types

Time series

  • Line graph

Graph types

Bargraphs

  • One categorical variable
  • One numerical variable

Graph types

Bargraphs

  • Rotate


Graph types

Bargraphs

  • Two categorical variables
  • One numerical variable

Graph types

Boxplots

  • One categorical variable
  • One numerical variable with many values

Graph types

Treemaps

  • One categorical variable
  • One numerical variable
  • Much better than pie charts

Graph types

3D

  • Three numerical variables
  • Avoid unless the shape is simple

Graph types

Contours

  • Three numerical variables
  • Better than 3D

Graph types

Many...


Wooclap - Quiz on data wrangling

ggplot2

@allison_horst


Initialize

Load necessary libraries

library("readxl") # Import the data from Excel file
library("dplyr") # filter and reformat data frames
library("ggplot2") # graphics

Initialize

Read the data

samples <- readxl::read_excel("data/CARBOM data.xlsx",
                              sheet = "Samples_boat") %>%
  tidyr::fill(station)
sample number  transect  station  date        time                 depth  level  latitude  longitude  picoeuks  nanoeuks  phosphates  nitrates  temperature  salinity
10             1         81       2013-11-13  1899-12-31 01:00:00  140    Deep   -27.42    -44.72     3278      1232      0.20        0.26      17.3         35.9
11             1         85       2013-11-13  1899-12-31 13:30:00  110    Deep   -26.80    -45.30     16312     1615      0.29        0.22      21.3         36.5
120            2         96       2013-11-18  1899-12-31 23:50:00  5      Surf   -27.39    -47.82     1150      75        0.43        0.19      23.1         33.5
121            2         96       2013-11-18  1899-12-31 23:50:00  30     Deep   -27.39    -47.82     1737      218       0.43        0.23      22.6         33.7
122            2         96       2013-11-18  1899-12-31 23:50:00  50     Deep   -27.39    -47.82     853       234       0.56        0.21      20.3         35.9
125            2         98       2013-11-18  1899-12-31 05:00:00  5      Surf   -27.59    -47.39     3086      1300      0.29        0.25      23.1         35.7
126            2         98       2013-11-18  1899-12-31 05:00:00  50     Deep   -27.59    -47.39     1217      782       0.25        0.20      23.7         37.2
127            2         98       2013-11-18  1899-12-31 05:00:00  85     Deep   -27.59    -47.39     3420      226       0.25        0.47      22.9         37.0
13             1         86       2013-11-13  1899-12-31 17:00:00  105    Deep   -26.33    -45.41     6366      1007      0.34        0.15      20.9         36.3
140            2         101      2013-11-18  1899-12-31 12:00:00  5      Surf   -27.79    -46.96     500       366       0.29        0.14      23.5         36.5

ggplot2

A simple plot

  • Choose the data set
  • Choose the geometric representation
  • Choose the aesthetics: x, y, color, shape, etc.
ggplot(data=samples) +
  geom_point(mapping = aes(x=phosphates,
                           y=nitrates))
  • All functions are from the ggplot2 package unless otherwise specified

ggplot2

The grammar of graphics

Every graph can be described as a combination of independent building blocks:

  • data: a data frame with quantitative or categorical variables, from a local file or a database query
  • aesthetic mapping of variables to visual properties: size, color, x, y
  • geometric objects (“geom”): points, lines, areas, arrows, …
  • coordinate system (“coord”): Cartesian, log, polar, map
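
The building blocks listed above can be seen in a single call. This is a minimal sketch, assuming the built-in mtcars data set rather than the CARBOM samples (the columns wt, mpg and cyl belong to mtcars, not to the course data):

```r
library(ggplot2)

# data: a data frame (built-in mtcars, used here only for illustration)
p <- ggplot(data = mtcars) +
  # geom + aesthetic mapping: points, with x, y and color taken from columns
  geom_point(mapping = aes(x = wt, y = mpg, color = factor(cyl))) +
  # coordinate system: Cartesian (the default, written out explicitly)
  coord_cartesian()

p
```

Swapping coord_cartesian() for another coord changes the coordinate system without touching the data, the mapping or the geom.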

ggplot2

Syntax

ggplot(data=samples) +
  geom_point(mapping = aes(x=phosphates,
                           y=nitrates))

ggplot2

Alternatively

ggplot(data=samples,
       mapping = aes(x=phosphates,
                     y=nitrates)) +
  geom_point()
  • If different geometries originate from different datasets or use different mappings, the dataset or the mapping must be specified inside the geom function.
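
A possible illustration of that last point, using two hypothetical data frames (obs and ref are made up for this sketch and are not part of the CARBOM data):

```r
library(ggplot2)

# Two hypothetical data frames: individual observations and a reference line
obs <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8))
ref <- data.frame(x = c(1, 4), y_ref = c(2, 8))

p <- ggplot() +
  # the points come from `obs`
  geom_point(data = obs, mapping = aes(x = x, y = y)) +
  # the line comes from `ref` and maps a different column to y
  geom_line(data = ref, mapping = aes(x = x, y = y_ref))

p
```

Here ggplot() is called empty, so each geom has to carry its own data and mapping.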


ggplot2

Alternatively

ggplot(samples,
       aes(x=phosphates,
           y=nitrates)) +
  geom_point()

ggplot2

Make dot size bigger

ggplot(samples,
       aes(x=phosphates,
           y=nitrates)) +
  geom_point(size=5)
  • Add: size=5 outside the aes() function, because it is a fixed value rather than a variable

ggplot2

Color according to depth level (discrete)

ggplot(samples,
       aes(x=phosphates,
           y=nitrates,
           color=level)) +
  geom_point(size=5)
  • Aesthetics mapped to a variable must be arguments of the aes() function
  • geom_point(color=level, size=5) will generate an error, because level is then not looked up in the data frame
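
The difference between setting a fixed value and mapping a variable can be sketched as follows. The toy data frame stands in for samples (its values are made up), and the last step assumes that no object named level exists in the workspace:

```r
library(ggplot2)

# Toy data standing in for `samples` (values are made up)
toy <- data.frame(phosphates = c(0.2, 0.3, 0.4),
                  nitrates   = c(0.26, 0.22, 0.19),
                  level      = c("Deep", "Deep", "Surf"))

# Setting: a constant value, given outside aes()
p_set <- ggplot(toy, aes(x = phosphates, y = nitrates)) +
  geom_point(color = "blue", size = 5)

# Mapping: a column of the data, given inside aes()
p_map <- ggplot(toy, aes(x = phosphates, y = nitrates, color = level)) +
  geom_point(size = 5)

# Outside aes(), `level` is looked up as an ordinary R object, not as a column.
# Assuming no such object exists, the call fails and the error is caught here.
res <- tryCatch(
  ggplot(toy, aes(x = phosphates, y = nitrates)) +
    geom_point(color = level, size = 5),
  error = function(e) e
)
inherits(res, "error")
```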


ggplot2

Color according to depth (continuous)

ggplot(samples,
       aes(x=phosphates,
           y=nitrates,
           color=depth)) +
  geom_point(size=5)
  • Add: color=depth

ggplot2

Symbol according to transect (continuous)

ggplot(samples,
       aes(x=phosphates,
           y=nitrates,
           color=depth,
           shape=transect)) +
  geom_point(size=5)
  • Add: shape=transect
Error: A continuous variable can not be mapped to shape

ggplot2

Symbol according to transect (converted to discrete)

ggplot(samples,
       aes(x=phosphates,
           y=nitrates,
           color=depth,
           shape=as.character(transect))) +
  geom_point(size=5)
  • Add: shape=as.character(transect)
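
A small sketch of why the conversion is needed, on made-up data. It uses factor() as an alternative to as.character(); both turn the numeric column into a discrete variable:

```r
library(ggplot2)

# Toy data standing in for `samples` (values are made up)
toy <- data.frame(phosphates = c(0.2, 0.3, 0.4, 0.5),
                  nitrates   = c(0.26, 0.22, 0.19, 0.23),
                  transect   = c(1, 1, 2, 2))

# transect is stored as a number, so ggplot2 treats it as continuous
is.numeric(toy$transect)

# factor() makes it discrete, which shape accepts
p <- ggplot(toy, aes(x = phosphates, y = nitrates, shape = factor(transect))) +
  geom_point(size = 5)

# The continuous version fails when the plot is built, not when it is defined
bad <- ggplot(toy, aes(x = phosphates, y = nitrates, shape = transect)) +
  geom_point(size = 5)
res <- tryCatch(ggplot_build(bad), error = function(e) e)
inherits(res, "error")
```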


ggplot2

Panels depending on one variable

ggplot(samples,
       aes(x=phosphates,
           y=nitrates)) +
  geom_point() +
  facet_wrap(~ level)

ggplot2

Adding a regression line

ggplot(samples,
       aes(x=phosphates,
           y=nitrates,
           color=level)) +
  geom_point(size=5) +
  geom_smooth(mapping = aes(x=phosphates,
                            y=nitrates),
              method="lm")
  • Add: geom_smooth()
  • You can choose the type of smoothing: "lm" stands for linear model

ggplot2

Adding a regression line

ggplot(samples,
       aes(x=phosphates,
           y=nitrates)) +
  geom_point(aes(color=level),
             size=5) +
  geom_smooth(mapping = aes(x=phosphates,
                            y=nitrates),
              method="lm")
  • A mapping given in the ggplot() function applies to all geoms; a mapping given inside a geom applies to that geom only
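
The effect of where the color mapping is placed can be checked by counting the fitted lines. This sketch uses made-up data in place of samples, and formula = y ~ x is written out only to silence the default message:

```r
library(ggplot2)

# Toy data standing in for `samples` (values are made up)
toy <- data.frame(
  phosphates = c(0.2, 0.3, 0.4, 0.5, 0.25, 0.35, 0.45, 0.55),
  nitrates   = c(0.26, 0.22, 0.19, 0.23, 0.15, 0.14, 0.25, 0.20),
  level      = rep(c("Deep", "Surf"), each = 4)
)

# color mapped in ggplot(): inherited by every geom, so one line per level
p_global <- ggplot(toy, aes(x = phosphates, y = nitrates, color = level)) +
  geom_point(size = 5) +
  geom_smooth(method = "lm", formula = y ~ x)

# color mapped in geom_point() only: geom_smooth() fits a single line
p_local <- ggplot(toy, aes(x = phosphates, y = nitrates)) +
  geom_point(aes(color = level), size = 5) +
  geom_smooth(method = "lm", formula = y ~ x)

# Number of fitted lines in the smoothing layer (layer 2) of each plot
n_global <- length(unique(layer_data(p_global, 2)$group))
n_local  <- length(unique(layer_data(p_local, 2)$group))
c(n_global, n_local)
```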


ggplot2

Finalizing the graph

ggplot(samples) +
  geom_point(mapping = aes(x=phosphates,
                           y=nitrates,
                           color=level),
             size=5) +
  geom_smooth(mapping = aes(x=phosphates,
                            y=nitrates),
              method="lm") +
  xlab("Phosphates") +
  ylab("Nitrates") +
  ggtitle("CARBOM cruise")
  • Add: xlab(), ylab() and ggtitle() to label the axes and title the graph

Putting several graphs together

First graph

g1 <- ggplot(samples) +
  geom_point(mapping = aes(x=phosphates,
                           y=nitrates,
                           color=level),
             size=5) +
  geom_smooth(mapping = aes(x=phosphates,
                            y=nitrates),
              method="lm") +
  xlab("Phosphates") +
  ylab("Nitrates")
g1

Putting several graphs together

Second graph

g2 <- ggplot(samples) +
  geom_point(mapping = aes(x=nanoeuks,
                           y=picoeuks,
                           color=level),
             size=5) +
  geom_smooth(mapping = aes(x=nanoeuks,
                            y=picoeuks),
              method="lm") +
  xlab("Nano-eukaryotes") +   # x is nanoeuks
  ylab("Pico-eukaryotes")     # y is picoeuks
g2

Putting several graphs together

Package patchwork

library(patchwork)
(g1 + g2)/g1

See also the packages:

  • gridExtra
  • cowplot

Putting several graphs together

Package patchwork

  • Adding annotation
  • Collecting legends
g1 / g2 +
  plot_annotation(tag_levels = 'A') +
  plot_layout(guides = 'collect')

ggplot2 syntax

Anatomy of a plot


ggplot2 syntax

Geometries


ggplot2 syntax

Continuous x and y


ggplot2 syntax

Plotting error


ggplot2 syntax

Discrete x - Continuous y


ggplot2 syntax

Continuous x


ggplot2 syntax

3D


ggplot2 syntax

Modifying axis and scales


ggplot2 syntax

Palettes


ggplot2 syntax

Palettes

  • Use color-blind-friendly palettes
    • viridis (e.g. scale_colour_viridis_c)
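
A minimal sketch of the viridis scales, on made-up data in place of samples. scale_colour_viridis_c() is for a continuous variable, and its discrete counterpart scale_colour_viridis_d() is shown for comparison:

```r
library(ggplot2)

# Toy data standing in for `samples` (values are made up)
toy <- data.frame(phosphates = c(0.2, 0.3, 0.4, 0.5),
                  nitrates   = c(0.26, 0.22, 0.19, 0.23),
                  depth      = c(5, 30, 50, 140),
                  level      = c("Surf", "Deep", "Deep", "Deep"))

# Continuous variable: scale_colour_viridis_c()
p_cont <- ggplot(toy, aes(x = phosphates, y = nitrates, color = depth)) +
  geom_point(size = 5) +
  scale_colour_viridis_c()

# Discrete variable: scale_colour_viridis_d()
p_disc <- ggplot(toy, aes(x = phosphates, y = nitrates, color = level)) +
  geom_point(size = 5) +
  scale_colour_viridis_d()

p_cont
p_disc
```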

ggplot2 syntax

Themes


Extensions


Recap

  • Exploratory vs. final
  • Decide what element is fixed and what varies
  • It takes time to get what you want...

Next time: Create maps

What you will learn:

  • Create simple maps
  • Create interactive maps
  • Create thematic maps

Install

  • rworldmap
  • leaflet
  • sf
  • raster
  • spData
  • tmap
  • ggplot2

Reading list

