Spider and parallel charts in R with the ggvanced package


An R package for effective visualization of multiple variables

A pretty spider chart. Image by Author.

During one of my data analysis projects, I found myself in need of an effective way to compare groups across several variables at once. Of course, bar charts came to mind first, but I wanted something more eye-catching, something more interesting. After browsing the web a bit, I settled on two prime candidates — a spider chart and a parallel chart.

At this point, I would normally just look for a dedicated R package that produces the visualization I need, but this time that approach left me empty-handed.

A LIE, more experienced R users might say! Such visualizations can already be obtained using packages such as fmsb and ggradar for radar charts and GGally for parallel plots.

However, aside from just performing the ranked comparison of groups across variables, I also wanted to simultaneously display the range of values for each variable. And, you guessed it, none of the aforementioned packages offered this. So, I decided to build my own 🙂

ggvanced is an R package for creating advanced multivariable plots such as spider/radar charts and parallel plots. The visualizations are created on top of the ggplot2 package. The beauty of the ggplot2 package is the underlying grammar of graphics, allowing for creation of graphs by stacking multiple layers on top of one another. This powerful concept lets us create essentially any visualization, as long as we know how to code it.
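
To make the layering idea concrete, here is a minimal sketch in plain ggplot2 (not ggvanced). Each `+` stacks one more layer on top of the previous ones. The choice of mtcars variables and of a linear trend line is only for illustration.

```r
library(ggplot2)

# Start with the data and the aesthetic mapping...
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # ...add a layer of points...
  geom_point() +
  # ...stack a linear trend line on top...
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # ...and finish with axis labels.
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

p
```

Removing or reordering any of the layers changes the chart without touching the rest of the code, which is what makes more elaborate charts practical to build this way.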

A schematic depiction of how ggplot2 charts are made. Image by Zvonimir Boban.
Creating visualizations using the grammar of graphics approach. Image by Author.

The package is currently available on GitHub and can be installed by typing the devtools::install_github("Ringomed/ggvanced") command in R and calling library(ggvanced) afterwards.

If you are interested in the details of chart construction, in a recent post I showed the logic behind constructing a spider chart from scratch, so check out the story below or the detailed documentation on GitHub.

For the rest of you, below are some examples detailing what the package functions can do.

The ggspider() function creates spider charts with either a single shared axis scaled to a [0,1] range, or a separate axis showing real values for every displayed category. Let’s test the function on a couple of examples. First, we have to format the data so that the first column contains the group identifier and the remaining columns contain the descriptive variables. We will use the built-in mtcars and iris datasets.

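library(tidyverse)
library(ggvanced)

# Keep the last three cars, moving the row names into a "group" column
mtcars_summary <- mtcars %>%
  tibble::rownames_to_column(var = "group") %>%
  tibble::as_tibble() %>%
  tail(3)

# Mean of every measurement per species; Species serves as the group column
iris_summary <- iris %>%
  dplyr::group_by(Species) %>%
  dplyr::summarise(across(everything(), mean))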
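
Before plotting, it can help to confirm that a data frame follows the layout described above. The sketch below uses a hypothetical helper, `check_spider_input()`, which is not part of ggvanced. It also assumes that every column after the first should be numeric, which seems reasonable for values drawn on an axis but is my assumption rather than a documented requirement.

```r
library(dplyr)

# Hypothetical helper (not part of ggvanced): TRUE when the first column is a
# character or factor group identifier and every remaining column is numeric.
check_spider_input <- function(df) {
  first_is_label <- is.character(df[[1]]) || is.factor(df[[1]])
  rest_is_numeric <- all(vapply(df[-1], is.numeric, logical(1)))
  first_is_label && rest_is_numeric
}

iris_means <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), mean))

check_spider_input(iris_means)  # group column first, numeric columns after
check_spider_input(iris)        # raw iris keeps Species in the last column
```

The first call passes, while the raw iris data frame fails the check because its group column comes last.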

Comparing car properties

ggspider(mtcars_summary)
Image by Author.

The key differences between the cars immediately stand out. As expected, compared to racing cars such as the Ferrari and Maserati, the Volvo has much less horsepower (hp) and takes much longer to cover a quarter of a mile (qsec), but is also much more economical in terms of miles per gallon (mpg).
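
The reading of the chart can be cross-checked against the raw numbers. This short base R sketch pulls the three variables mentioned above for the same three cars. The variable name `cars_check` is just an illustrative choice.

```r
# The last three rows of mtcars are the three cars shown in the chart
cars_check <- tail(mtcars, 3)[, c("hp", "qsec", "mpg")]
cars_check

# Which car has the least power, the slowest quarter mile, the best mileage?
lowest_hp    <- rownames(cars_check)[which.min(cars_check$hp)]
slowest_qsec <- rownames(cars_check)[which.max(cars_check$qsec)]
best_mpg     <- rownames(cars_check)[which.max(cars_check$mpg)]

c(lowest_hp, slowest_qsec, best_mpg)
```

All three lookups point to the Volvo 142E, in line with what the spider chart shows.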

Visualizing differences between the Iris species

ggspider(iris_summary)
Image by Author.

Just like with the cars example, the spider chart is very effective for determining the differences between the iris species. We can immediately see that the Versicolor and Virginica species are much more similar, having essentially the same ratios of petal and sepal lengths and widths and only differing in the total flower size. Conversely, the Setosa species has a much larger sepal width.
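
The ratios mentioned above can also be computed directly from the species means. This is a small dplyr sketch. The two ratio columns, `petal_to_sepal_length` and `sepal_width_to_length`, are names I made up for illustration, and other ratios could be used just as well.

```r
library(dplyr)

iris_ratios <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), mean)) %>%
  mutate(
    # how long the petals are relative to the sepals
    petal_to_sepal_length = Petal.Length / Sepal.Length,
    # how wide the sepals are relative to their length
    sepal_width_to_length = Sepal.Width / Sepal.Length
  ) %>%
  select(Species, petal_to_sepal_length, sepal_width_to_length)

iris_ratios
```

In both columns, the Versicolor and Virginica values lie close together, while Setosa stands apart with relatively short petals and relatively wide sepals.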

Radar charts

The function also allows for creation of traditional radar charts with a single common scaled axis by specifying the argument scaled = TRUE and switching to a round shape using polygon = FALSE.

ggspider(iris_summary, scaled = TRUE, polygon = FALSE)
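
For intuition about what a shared [0,1] axis means, one common way to get there is min-max rescaling of each variable. The sketch below shows that idea on the iris species means. Whether ggvanced uses exactly this transformation internally is an assumption on my part, and `rescale01()` is a hypothetical helper, not a package function.

```r
library(dplyr)

# Min-max rescaling: the smallest value in a column maps to 0, the largest to 1.
# Note that a column with identical values would give NaN (division by zero).
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

iris_scaled <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), mean)) %>%
  mutate(across(-Species, rescale01))

iris_scaled
```

After rescaling, every variable runs from 0 to 1, so all of them can share one axis, at the cost of hiding the real value ranges.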

The other function arguments are more aesthetic in nature, and cover aspects such as font size, position of the labels and so on. For more details, refer to the function documentation.

Although I prefer spider charts from an aesthetic viewpoint, parallel charts can make it easier to spot trends across variables. This is especially true when there are many variables or observations in the dataset.

ggparallel(mtcars_summary)
Image by Author.
ggparallel(iris_summary)
Image by Author.
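
To show the basic idea behind a parallel chart, here is a rough sketch built from scratch with tidyr and ggplot2. It is not how ggparallel() is implemented: it draws all variables on one shared axis (workable here only because all iris measurements are in centimetres) and does not show a separate value range per variable.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

iris_long <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), mean)) %>%
  # reshape to one row per species-variable pair
  pivot_longer(-Species, names_to = "variable", values_to = "value")

# One line per species, connecting its values across the variables
p_parallel <- ggplot(iris_long,
                     aes(x = variable, y = value,
                         colour = Species, group = Species)) +
  geom_line() +
  geom_point()

p_parallel
```

The long format is what makes this work: each variable becomes a position on the x axis, and the `group` aesthetic tells ggplot2 which points to connect.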

The above charts are just bare-bones versions. Of course, they can be “pimped up” just like any other ggplot2 chart. Below is an example of a ggvanced spider chart after a couple of alterations.

Image by Author.

And of course, the accompanying code. Enjoy! 🙂

library(tidyverse)
library(ggvanced)
library(sysfonts)
library(showtext)

sysfonts::font_add_google("Roboto Condensed")
showtext_auto()

mtcars_gr <- mtcars %>%
  tibble::rownames_to_column(var = "group") %>%
  tibble::as_tibble() %>%
  tail(3) %>%
  rename("Miles per Gallon" = mpg, "Cylinders" = cyl,
         "Displacement" = disp, "Horsepower" = hp,
         "Rear axle ratio" = drat, "Weight" = wt) %>%
  dplyr::select(1:7)

ggspider(mtcars_gr, axis_name_offset = 0.15, background_color = "beige", fill_opacity = 0.15) +
  labs(col = "Car name", title = "Comparing Car Properties") +
  theme(plot.title = element_text(hjust = 0.475, face = "bold"),
        legend.title = element_text(face = "bold"),
        text = element_text(family = "Roboto Condensed", face = "bold"))

In this post, I covered key functions and options of ggvanced — a package I made in response to a need for more advanced spider and parallel charts in R.

The text goes through a couple of examples for each function and then shows what the final result can look like after some additional customization.

I hope that the package will be as useful to you as it is to me. If you have requests for any more custom visualizations to be implemented in R, please drop a comment, and I will do my best to create a separate function for it. 🙂


