GRAPHICS WITH GGPLOT2
L3 - R3
Julie Scholler - B246
November 2019
Graphics with ggplot2
Why?
• elegant, versatile
• mature and complete graphics system
• very flexible
• carefully chosen default behaviour
• theme system for polishing plot appearance
How?
• grammar of graphics (Wilkinson, 2005)
The Grammar Of Graphics
The basic idea for building a plot
• specify blocks/layers
• combine them
• get any kind of graphic
Blocks/layers
• data
• aesthetic mapping
• geometric object
• statistical transformations
• scales
• coordinate system
• position adjustments
• faceting
Syntax
ggplot(data=...) + aes(x=..., y=...) + geom_...()
• Data: what is being visualized
• Aesthetic Mappings: mappings between variables in the data and components of the chart
• Geometric Objects: geometric objects that are used to display the data, such as points, lines, or shapes
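The three-part syntax above can be sketched on a small example. The data frame `scores` and its columns `hours` and `score` are hypothetical, simulated here only so that the code runs on its own; they are not the course data.

```r
library(ggplot2)

# Hypothetical data: simulated study hours and exam scores (not the course dataset)
set.seed(1)
scores <- data.frame(hours = runif(50, min = 0, max = 10))
scores$score <- 40 + 5 * scores$hours + rnorm(50, sd = 8)

# data + aesthetic mapping + geometric object
p <- ggplot(data = scores) +
  aes(x = hours, y = score) +
  geom_point()

p  # printing the object draws the plot
```

Each `+` adds one block to the plot, so the same object `p` can be extended later with labels, themes or facets.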
First try
ggplot(data)
Aesthetic Mapping
In ggplot: aesthetic = “something you can see”
Examples
• position (on the x and y axes)
• color (“outside” color)
• fill (“inside” color)
• shape (of points)
• linetype
• size
Aesthetic mappings are set with the aes() function.
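A short sketch of the difference between mapping an aesthetic to a variable inside aes() and setting it to a constant in the geom. The data frame `students` is hypothetical and simulated for illustration.

```r
library(ggplot2)

# Hypothetical data, simulated for illustration only
set.seed(2)
students <- data.frame(
  hours = runif(60, min = 0, max = 10),
  score = rnorm(60, mean = 60, sd = 15),
  year  = rep(c("L1", "L2", "L3"), each = 20)
)

# Mapped: colour depends on the variable `year`, and a legend is generated
mapped <- ggplot(students) +
  aes(x = hours, y = score, color = year) +
  geom_point()

# Set: one constant colour given to the geom, outside aes(); no legend
fixed <- ggplot(students) +
  aes(x = hours, y = score) +
  geom_point(color = "aquamarine3")
```

Anything placed inside aes() is read as a variable of the data; a constant such as "aquamarine3" belongs in the geom call.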
Second try
ggplot(data) + aes(x = note_totale)
[Figure: empty plot panel; x axis: note_totale]
Geometric Objects (geom)
Examples
• points: geom_point
• lines: geom_line
• bar: geom_bar
• histogram: geom_histogram
• boxplot: geom_boxplot
List of available geometric objects
• Reference list
• help.search("geom_", package = "ggplot2")
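Because the geom is a separate block, the same data and mapping can be drawn with different geometric objects, or with several at once. A minimal sketch, assuming a hypothetical simulated data frame `students`:

```r
library(ggplot2)

# Hypothetical data, simulated for illustration only
set.seed(3)
students <- data.frame(
  score = rnorm(90, mean = 60, sd = 15),
  year  = rep(c("L1", "L2", "L3"), each = 30)
)

# Data and mapping are defined once...
base <- ggplot(students) + aes(x = year, y = score)

# ...and reused with different geoms
boxes  <- base + geom_boxplot()
points <- base + geom_point()

# Geoms can also be stacked as successive layers
both <- base + geom_boxplot() + geom_point()
```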
Histogram
ggplot(data) + aes(x = note_totale) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
[Figure: histogram of note_totale with the default 30 bins; x axis: note_totale, y axis: count]
Create Good and Effective Graphics
• Labels
+ labs(title=..., subtitle=..., caption=..., x=..., y=..., color=..., etc.)
• Annotations
+ geom_text()
+ geom_text_repel()
• Coordinate
+ coord_flip()
• Scales, Guides, Themes
• Interactivity
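The sketch below combines three of the items above (labels, an annotation layer, and a flipped coordinate system) on a small hypothetical table of counts. Only ggplot2 functions are used; geom_text_repel() comes from the separate ggrepel package and is not shown here.

```r
library(ggplot2)

# Hypothetical counts, made up for illustration only
counts <- data.frame(
  year = c("L1", "L2", "L3"),
  n    = c(60, 45, 30)
)

p <- ggplot(counts) +
  aes(x = year, y = n) +
  geom_col(fill = "aquamarine3") +
  geom_text(aes(label = n), hjust = -0.3) +  # annotation: value next to each bar
  coord_flip() +                             # horizontal bars
  labs(title = "Students per year", x = "Year", y = "Count")
```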
ggplot(data = data) + aes(x = note_totale) +
  geom_histogram(bins = 15, fill = "aquamarine3", col = "white") +
  labs(title = "Distribution des notes au QCM",
       x = "Note", y = "Effectif") +
  theme_minimal()
[Figure: histogram "Distribution des notes au QCM"; x axis: Note, y axis: Effectif]
Themes
The ggplot2 theme system handles non-data plot elements such as
• Axis labels
• Plot background
• Facet label background
• Legend appearance
Built-in themes include:
• theme_gray() (default)
• theme_bw()
• theme_classic()
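A brief sketch of how a complete theme and individual theme() settings can be applied to an existing plot. The data frame `students` is hypothetical and simulated for illustration.

```r
library(ggplot2)

# Hypothetical data, simulated for illustration only
set.seed(4)
students <- data.frame(score = rnorm(100, mean = 60, sd = 15))

p <- ggplot(students) +
  aes(x = score) +
  geom_histogram(bins = 15) +
  labs(title = "Scores")

# A complete built-in theme replaces the default theme_gray()
p_bw      <- p + theme_bw()
p_classic <- p + theme_classic()

# theme() then adjusts individual non-data elements on top of it
p_custom <- p + theme_minimal() +
  theme(
    plot.title      = element_text(face = "bold"),
    legend.position = "bottom"
  )
```

Themes change only the appearance of the plot; the data drawn by the histogram layer are the same in all four objects.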
Multivariate
ggplot(data = data) + aes(x = note_totale, fill = annee) +
  geom_histogram(bins = 15, col = "white", alpha = 0.6) +
  labs(title = "Distribution des notes au QCM",
       x = "Note", y = "Effectif") +
  theme_minimal()
[Figure: histogram "Distribution des notes au QCM" filled by annee (L1, L2, L3); x axis: Note, y axis: Effectif]
Faceting
• Creates separate graphs for subsets of data
• Two solutions
1. facet_wrap(): subsets as the levels of a single grouping variable
2. facet_grid(): subsets as the crossing of two grouping variables
• Facilitates comparison among plots
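The two faceting functions above can be compared on a small example. The data frame `students` and its columns `year` and `group` are hypothetical and simulated for illustration.

```r
library(ggplot2)

# Hypothetical data, simulated for illustration only
set.seed(5)
students <- data.frame(
  score = rnorm(120, mean = 60, sd = 15),
  year  = rep(c("L1", "L2", "L3"), times = 40),
  group = rep(c("A", "B"), each = 60)
)

base <- ggplot(students) +
  aes(x = score) +
  geom_histogram(bins = 10)

# One grouping variable: one panel per level of `year`
by_year <- base + facet_wrap(~ year)

# Two grouping variables: rows are `year`, columns are `group`
by_year_group <- base + facet_grid(year ~ group)
```

In the facet_grid() formula, the variable on the left of `~` defines the rows and the variable on the right defines the columns.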
Syntax
ggplot(data=...) + aes(x=..., y=..., fill=..., color=..., group=...) +
  geom_...() + facet_...(...) +
  labs(...) + theme_minimal()
• Data: what is being visualized
• Aesthetic Mappings: mappings between variables in the data and components of the chart
• Geometric Objects: geometric objects that are used to display the data, such as points, lines, or shapes
• Statistical Transformations: applied to the data to summarize it
• Facets: describe how the data is partitioned into subsets and how these different subsets are plotted
Base histogram
gg <- ggplot(data = data) +
  aes(x = note_totale, fill = annee) +
  geom_histogram(bins = 15, alpha = 0.6, col = "white") +
  labs(title = "Distribution des notes au QCM",
       x = "Note", y = "Effectif") +
  theme_minimal()
facet_wrap()
gg + facet_wrap(~annee)
[Figure: "Distribution des notes au QCM" in three side-by-side panels (L1, L2, L3); x axis: Note, y axis: Effectif; legend annee on the right]
Legend position
gg + facet_wrap(~annee) +
  theme(legend.position = "bottom")
[Figure: "Distribution des notes au QCM" in three side-by-side panels (L1, L2, L3); x axis: Note, y axis: Effectif; legend annee below the plot]
Other use of facet_wrap()
gg + facet_wrap(~annee, ncol=2)
[Figure: "Distribution des notes au QCM" in panels arranged on two columns (L1 and L2 on the first row, L3 on the second); x axis: Note, y axis: Effectif; legend annee]
Use of facet_grid()
gg + facet_grid(annee~sexe)
[Figure: "Distribution des notes au QCM" in a grid of panels; rows: annee (L1, L2, L3), columns: sexe (Un homme, Une femme); x axis: Note, y axis: Effectif; legend annee]
Density chart
ggplot(data = data) + aes(x = note_totale) +
  geom_density(fill = "aquamarine3", color = "white", alpha = 0.6) +
  labs(title = "Distribution des notes au QCM",
       x = "Note", y = "") +
  theme_minimal()
[Figure: density curve "Distribution des notes au QCM"; x axis: Note]
Density charts
ggplot(data = data) +
  aes(x = note_totale, fill = annee, color = annee) +
  geom_density(alpha = 0.6) +
  labs(title = "Distribution des notes au QCM",
       x = "Note", y = "") +
  theme_minimal()
[Figure: overlaid density curves "Distribution des notes au QCM", one per annee (L1, L2, L3); x axis: Note]
With ridgelines
library(ggridges)
ggplot(data = data) +
  aes(x = note_totale, fill = annee, col = annee, y = annee) +
  geom_density_ridges(alpha = 0.6, scale = 3) +
  labs(title = "Distribution des notes au QCM",
       x = "Note", y = "") +
  theme_minimal()
[Figure: ridgeline densities "Distribution des notes au QCM", one ridge per annee (L1, L2, L3); x axis: Note]
Bar charts
ggplot(data) + aes(x = annee) +
  geom_bar(fill = "aquamarine3") +
  theme_minimal()
[Figure: bar chart of counts by annee (L1, L2, L3); x axis: annee, y axis: count]
Bar charts
ggplot(data) + aes(x = annee) +
  geom_bar(fill = "aquamarine3", width = 0.5) +
  theme_minimal()
[Figure: bar chart of counts by annee with narrower bars; x axis: annee, y axis: count]
Bar charts
ggplot(data) + aes(x = annee, fill = bac) +
  geom_bar(width = 0.5) +
  theme_minimal()
[Figure: stacked bar chart of counts by annee, filled by bac (Bac ES, Bac S, Bac L, Bac STMG, Bac professionnel); x axis: annee, y axis: count]
Bar charts
ggplot(data) + aes(x = annee, fill = bac) +
  geom_bar(width = 0.5, position = "fill") +
  theme_minimal()
[Figure: bar chart by annee with bars filled to 100% by bac (Bac ES, Bac S, Bac L, Bac STMG, Bac professionnel); x axis: annee, y axis labelled count, running from 0 to 1]
Bar charts
ggplot(data) + aes(x = annee, fill = bac) +
  geom_bar(width = 0.5, position = "dodge") +
  theme_minimal()
[Figure: side-by-side bar chart of counts by annee and bac (Bac ES, Bac S, Bac L, Bac STMG, Bac professionnel); x axis: annee, y axis: count]
Position adjustment
Inside geom
• identity
• stack
• fill
• dodge: side by side
• jitter: useful for points (geom_jitter())
• nudge: shift points
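The effect of the stack, fill and dodge adjustments listed above can be checked on a tiny example. The data frame `students` and its column `track` are hypothetical, built so that each year contains two students per track.

```r
library(ggplot2)

# Hypothetical data: 2 years x 3 tracks, 2 students in each combination
students <- data.frame(
  year  = rep(c("L1", "L2"), each = 6),
  track = rep(c("ES", "S", "L"), times = 4)
)

base <- ggplot(students) + aes(x = year, fill = track)

# stack (default for geom_bar): bars piled up, total height = count per year
stacked <- base + geom_bar(position = "stack")

# fill: bars piled up and rescaled so that each year sums to 1
filled <- base + geom_bar(position = "fill")

# dodge: one bar per track, placed side by side within each year
dodged <- base + geom_bar(position = "dodge")
```

The position argument is given inside the geom, which is why the same `base` object yields three different charts.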
Draw multiple plots within one figure
density <- ggplot(data = data) +
  aes(x = note_totale, fill = annee, col = annee) +
  geom_density(alpha = 0.6) +
  labs(title = "Notes au QCM",
       subtitle = "Les L2 sont très moyens.",
       x = "Note", y = "") +
  theme_minimal()
barplot <- ggplot(data) + aes(x = annee, fill = bac) +
  geom_bar(width = 0.5) +
  labs(title = "Séries de baccalauréat par année de Licence",
       subtitle = "Les filières ES et S sont très majoritaires.",
       x = "Note", y = "") +
  theme_minimal()
Draw multiple plots within one figure
library(ggpubr)
ggarrange(density, barplot, align = "h")
[Figure: two plots side by side. Left: density curves "Notes au QCM" (subtitle "Les L2 sont très moyens."), one per annee (L1, L2, L3); x axis: Note. Right: stacked bar chart "Séries de baccalauréat par année de Licence" (subtitle "Les filières ES et S sont très majoritaires."), bars L1, L2, L3 filled by bac (Bac ES, Bac S, Bac L, Bac STMG, Bac professionnel); x axis labelled Note]