How to create a Sankey plot in R?
To produce a Sankey diagram in ggplot2, you must install the ggsankey package and reshape your dataset into long format with the package’s make_long function.
The reshaped data must contain four columns: x (the current stage), next_x (the next stage), node (the current node), and next_node (the following node).
Keep in mind that rows for the final stage hold NA in next_x and next_node, because no stage follows them.
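To make the expected layout concrete, here is a minimal base R sketch that builds the four columns by hand for a tiny made-up data set. The data frame wide and its column names stage1 to stage3 are hypothetical and only serve as an illustration; this is not how make_long is implemented, just a picture of the shape it is expected to return.

```r
# Hypothetical two-row data set with three stages (names are illustrative only)
wide <- data.frame(
  stage1 = c("A", "B"),
  stage2 = c("C", "C"),
  stage3 = c("D", "E"),
  stringsAsFactors = FALSE
)

stages <- names(wide)

# For every row, pair each stage/node with the stage/node that follows it.
# The last stage has no successor, so next_x and next_node are NA there.
long <- do.call(rbind, lapply(seq_len(nrow(wide)), function(i) {
  data.frame(
    x         = stages,
    node      = unlist(wide[i, stages], use.names = FALSE),
    next_x    = c(stages[-1], NA),
    next_node = c(unlist(wide[i, stages[-1]], use.names = FALSE), NA),
    stringsAsFactors = FALSE
  )
}))

long
```

Each original row becomes one row per stage, so the two-row example yields six rows, and the stage3 rows carry NA in both next_x and next_node.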
Let’s install the remotes package first:
install.packages("remotes")
Now we can install the ggsankey package from GitHub:
remotes::install_github("davidsjoberg/ggsankey")
library(ggsankey)
Load Data
We can use the built-in mtcars data set in R.
library(dplyr)  # provides the %>% pipe used below
df <- mtcars %>%
  make_long(cyl, vs, am, gear, carb)
df

The first rows of the result look like this:

  x     node next_x next_node
1 cyl      6 vs             0
2 vs       0 am             1
3 am       1 gear           4
4 gear     4 carb           4
5 carb     4 <NA>          NA
6 cyl      6 vs             0
Sankey plot with ggsankey
To construct Sankey diagrams in ggplot2, the ggsankey package includes a geom called geom_sankey.
Keep in mind that you must pass a factor as the fill colour when mapping the variables in aes. The package also provides its own theme, theme_sankey.
Let’s load ggplot2 for graph generation
library(ggplot2)
library(dplyr)

ggplot(df, aes(x = x,
               next_x = next_x,
               node = node,
               next_node = next_node,
               fill = factor(node))) +
  geom_sankey() +
  theme_sankey(base_size = 16)
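The reason for wrapping node in factor() can be checked with base R alone. In this data set the node values are numbers, and ggplot2 maps a numeric variable to a continuous colour scale by default, whereas a factor gives one discrete colour per category. The short sketch below inspects the five mtcars columns used above; the helper names stage_cols, node_values, and node_levels are introduced here purely for illustration.

```r
# Columns passed to make_long() in this tutorial
stage_cols <- c("cyl", "vs", "am", "gear", "carb")

# All node values that can occur across the five stages
node_values <- unlist(mtcars[stage_cols], use.names = FALSE)

# The raw values are numeric, which ggplot2 treats as continuous by default
is.numeric(node_values)

# Converting to a factor yields a fixed set of discrete categories,
# one per distinct node value, each of which gets its own fill colour
node_levels <- levels(factor(node_values))
node_levels
```

The number of factor levels is also the number of entries you should expect in the fill legend.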
How to add labels in Sankey Plot
The package’s geom_sankey_label function lets you add labels to Sankey diagrams.
Remember to map the variable you want to display to label inside aes.
ggplot(df, aes(x = x,
               next_x = next_x,
               node = node,
               next_node = next_node,
               fill = factor(node),
               label = node)) +
  geom_sankey() +
  geom_sankey_label() +
  theme_sankey(base_size = 16)
How to do Color customization in Sankey Plot
To alter how the Sankey diagram appears in R, a variety of arguments can be changed. The author of the package produced the following pictures as examples.
[Figures: geom_sankey aesthetics; geom_sankey geometries]
Color and fill of the Sankey plot
For instance, by adjusting the fill color palette and a few of the arguments to the geom_sankey function, we can produce something like this.
ggplot(df, aes(x = x,
               next_x = next_x,
               node = node,
               next_node = next_node,
               fill = factor(node),
               label = node)) +
  geom_sankey(flow.alpha = 0.5, node.color = 1) +
  geom_sankey_label(size = 3.5, color = 1, fill = "white") +
  scale_fill_viridis_d(option = "A", alpha = 0.95) +
  theme_sankey(base_size = 16)
Changing the title of the legend
As with other ggplot2 charts, you can change the title of the legend. One option is to pass the new title to guide_legend inside guides.
ggplot(df, aes(x = x,
               next_x = next_x,
               node = node,
               next_node = next_node,
               fill = factor(node),
               label = node)) +
  geom_sankey(flow.alpha = 0.5, node.color = 1) +
  geom_sankey_label(size = 3.5, color = 1, fill = "white") +
  scale_fill_viridis_d() +
  theme_sankey(base_size = 16) +
  guides(fill = guide_legend(title = "Title"))
Removing the legend
Finally, if you want to remove the legend from the Sankey plot, set the legend position to “none”.
ggplot(df, aes(x = x,
               next_x = next_x,
               node = node,
               next_node = next_node,
               fill = factor(node),
               label = node)) +
  geom_sankey(flow.alpha = 0.5, node.color = 1) +
  geom_sankey_label(size = 3.5, color = 1, fill = "white") +
  scale_fill_viridis_d() +
  theme_sankey(base_size = 16) +
  theme(legend.position = "none")