Traditional vs. uncommon approaches and when to use them


While the most popular way of representing categorical data is using a bar plot, there are some other visualization types suitable for this purpose. They all have their pros and cons, as well as limits of their applicability.

In this article, we'll compare several such graphs, all displaying the same data: the continents by area. The data come from Wikipedia and are given in mln km2, rounded to avoid unnecessary precision.

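import pandas as pd

# Continent areas in mln km2, listed in ascending order
dct = {'Oceania': 8.5, 'Europe': 10, 'Antarctica': 14.2, 'South America': 17.8,
       'North America': 24.2, 'Africa': 30.4, 'Asia': 44.6}
continents = list(dct.keys())
# Note: despite its name, `populations` holds the areas (the name is reused by the plotting code)
populations = list(dct.values())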

Let’s start with the most classical way of displaying categorical data: a bar plot that doesn’t even need an introduction.

import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(10,4))
# Horizontal bars, one per continent
plt.barh(continents, populations, color='slateblue')
plt.title('Continents by area', fontsize=27)
plt.xlabel('Area, mln km2', fontsize=19)
plt.xticks(fontsize=16)
plt.yticks(fontsize=16)
# Declutter: hide the y-axis tick marks and the left, top and right spines
plt.tick_params(left=False)
sns.despine(left=True)
plt.show()

From the graph above, we clearly see the hierarchy of the continents by area both qualitatively and quantitatively. No wonder: bar plots hardly have any cons. They are multi-purpose, highly customizable, visually compelling, easy to interpret, familiar to a wide audience, and can be created with any dataviz library. The only thing we have to keep in mind when generating them is to follow good practices: data ordering, selecting appropriate colors, bar orientation, adding annotations, labels, decluttering, etc.
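Two of the good practices listed above, data ordering and annotations, can be sketched roughly as follows. This is only an illustration, not the author's code: the names `areas` and `annotated_barh` are hypothetical, and the label offset of 0.5 is an arbitrary choice that happens to suit this data.

```python
import matplotlib.pyplot as plt

# Continent areas in mln km2 (same values as the article's dictionary)
areas = {'Oceania': 8.5, 'Europe': 10, 'Antarctica': 14.2, 'South America': 17.8,
         'North America': 24.2, 'Africa': 30.4, 'Asia': 44.6}

def annotated_barh(data):
    """Draw a horizontal bar plot sorted by value, with a value label per bar."""
    # Sort ascending so that the largest bar ends up at the top of the plot
    items = sorted(data.items(), key=lambda kv: kv[1])
    names = [name for name, _ in items]
    values = [value for _, value in items]

    fig, ax = plt.subplots(figsize=(10, 4))
    bars = ax.barh(names, values, color='slateblue')

    # Annotate each bar with its value, slightly to the right of the bar end
    for bar, value in zip(bars, values):
        ax.text(value + 0.5, bar.get_y() + bar.get_height() / 2,
                f'{value:g}', va='center')

    ax.set_title('Continents by area')
    ax.set_xlabel('Area, mln km2')
    # Declutter: no y tick marks, no top/right/left spines
    ax.tick_params(left=False)
    for side in ('top', 'right', 'left'):
        ax.spines[side].set_visible(False)
    return fig, ax
```

Calling `annotated_barh(areas)` followed by `plt.show()` displays the figure. Sorting inside the function means the plot stays ordered even if the input dictionary is not.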

A stem plot is very similar to a bar plot and can even have an advantage over it: it maximizes the data-ink ratio and looks less cluttered. To create a stem plot, we only need the matplotlib library. For a horizontal stem plot (one with a horizontal baseline and vertical stems), we can use either vlines() in combination with plot(), or the stem() function directly. In the first case, vlines() draws the stems and plot() draws the end points. For a vertical stem plot (one with a vertical baseline and horizontal stems), we combine hlines() with plot() instead. Older matplotlib versions of stem() cannot draw this variant; since matplotlib 3.4, stem() also accepts an orientation parameter.
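The two routes to a stem plot with a horizontal baseline, vlines() plus plot() and stem() on its own, can be sketched as below. This is a minimal illustration rather than the author's listing: the function names and the `areas` dictionary are hypothetical, and numeric positions with explicit tick labels are an assumption made here to keep both variants on the same footing.

```python
import matplotlib.pyplot as plt

# Continent areas in mln km2 (same values as the article's dictionary)
areas = {'Oceania': 8.5, 'Europe': 10, 'Antarctica': 14.2, 'South America': 17.8,
         'North America': 24.2, 'Africa': 30.4, 'Asia': 44.6}
names = list(areas.keys())
values = list(areas.values())

def stem_via_vlines(names, values):
    """Stem plot built from vlines() for the stems and plot() for the end points."""
    positions = list(range(len(names)))
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.vlines(x=positions, ymin=0, ymax=values, color='slateblue')  # the stems
    ax.plot(positions, values, 'o', color='slateblue')              # the end points
    ax.set_xticks(positions)
    ax.set_xticklabels(names)
    ax.set_ylabel('Area, mln km2')
    return fig, ax

def stem_via_stem(names, values):
    """The same kind of plot drawn with stem() alone."""
    positions = list(range(len(names)))
    fig, ax = plt.subplots(figsize=(10, 4))
    container = ax.stem(positions, values)  # draws markers, stems and a baseline
    ax.set_xticks(positions)
    ax.set_xticklabels(names)
    ax.set_ylabel('Area, mln km2')
    return fig, ax, container
```

Call either function with `names` and `values`, then `plt.show()`. The vlines() route gives direct control over each element, while stem() is shorter and draws a baseline by default.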

Let’s create a vertical stem plot for our data:

plt.figure(figsize=(10,4))
plt.hlines(y=continents, xmin=0, xmax=populations, color='slateblue')
plt.plot(populations,...

Continue reading: https://towardsdatascience.com/comparing-different-ways-of-displaying-categorical-data-in-python-ed8fabfb6661?source=rss—-7f60cf5620c9—4

Source: towardsdatascience.com