Stacked Column Chart and Clustered Column Chart in R with ggplot2

Stacked Column Chart

A stacked column chart, also known as a stacked bar chart, is a type of bar graph that stacks the categories of a group on top of each other. Each segment (category) of a stack is usually shown in a different color, and the height of each segment is proportional to the percentage of that category within its group.

The following R code generates a simple stacked column chart from the built-in R dataset mtcars, using the dplyr and ggplot2 packages. The percentages are calculated first and then used to draw the bar graph.

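#load dplyr (group_by, summarize, mutate, %>%) and ggplot2 (plotting)
library(dplyr)
library(ggplot2)

#calculate the N and % of cars for each number of carburetors within each number of cylinders
#summarize() drops the last grouping level (carb), so sum(num) is the total per cyl
barg01 <- mtcars %>%
  group_by(cyl, carb) %>%
  summarize(num = n()) %>%
  mutate(perc = num / sum(num) * 100)

#create the basic stacked column chart in R
#geom_bar(stat = "identity") draws a column or bar chart from the y values as given
s <- ggplot(data = barg01, aes(x = factor(cyl), y = perc, fill = factor(carb))) +
  geom_bar(stat = "identity", width = 0.4)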
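As a quick sanity check on the percentages, the same table can be rebuilt with base R alone. This is only a sketch for cross-checking, not part of the original workflow; the names counts and pct are introduced here for illustration.

```r
# Cross-tabulate cylinders (rows) by carburetors (columns) using base R only
counts <- table(cyl = mtcars$cyl, carb = mtcars$carb)

# Row-wise percentages: margin = 1 divides each count by its row (cylinder) total
pct <- prop.table(counts, margin = 1) * 100

# Each row should add up to 100, matching the perc column within each cyl
print(round(pct, 1))
print(rowSums(pct))
```

If the row sums are all 100, the percentages are being computed within each cylinder group, which is what the stacked chart assumes.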

The R code above stores the default graph in s; printing s displays it. The following functions customize the graph to make it more aesthetically pleasing.

[Figure: default stacked column chart in R]

#stacked bar graph
s +
  #axis labels and legend title
  labs(y = "Percentage", x = "Number of Cylinders", fill = "# Carburetors") +

  #specify the color for each carburetor category (six values, one per level of carb)
  scale_fill_manual(values = c("#FFFF80", "#FFFF00", "#FFBF00", "#FF8000", "#FF4000", "#FF0000")) +

  #choose a theme - a change not related to the input data
  theme_classic()

[Figure: stacked bar graph]

Clustered Column Chart

A clustered column chart, also known as a clustered bar chart or dodged bar graph, is a type of bar graph that presents the categories of a group next to each other, each in a different color.

The following R code converts the stacked column chart above to a clustered column chart.

#clustered bar graph
ggplot(data = barg01, aes(x = factor(cyl), y = perc, fill = factor(carb))) +
  labs(y = "Percentage", x = "Number of Cylinders", fill = "# Carburetors") +

  #the position= argument places the bars side by side; the rest of the code is the same as above
  geom_bar(stat = "identity", width = 0.4, position = position_dodge()) +

  scale_fill_manual(values = c("#FFFF80", "#FFFF00", "#FFBF00", "#FF8000", "#FF4000", "#FF0000")) +
  theme_classic()

[Figure: clustered column chart]

The R code that creates the clustered column chart is similar to the code used above for the stacked column chart. The only difference is the addition of the argument position=position_dodge() in the geom_bar() function.
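One side effect of dodging is worth a sketch. In mtcars, the cylinder groups do not all have the same number of carburetor categories, so with the default position_dodge() the bars in a group with fewer categories are drawn wider. Assuming a ggplot2 version that supports the preserve argument of position_dodge(), the variant below keeps every bar the same width. The object name p_single is hypothetical, and the block rebuilds barg01 so that it runs on its own.

```r
library(dplyr)
library(ggplot2)

# Same summary table as above: % of each carb value within each cyl
barg01 <- mtcars %>%
  group_by(cyl, carb) %>%
  summarize(num = n()) %>%
  mutate(perc = num / sum(num) * 100)

# preserve = "single" keeps the width of a single bar constant across groups,
# instead of splitting the total group width among however many bars exist
p_single <- ggplot(data = barg01, aes(x = factor(cyl), y = perc, fill = factor(carb))) +
  geom_bar(stat = "identity", width = 0.4,
           position = position_dodge(preserve = "single")) +
  labs(y = "Percentage", x = "Number of Cylinders", fill = "# Carburetors") +
  scale_fill_manual(values = c("#FFFF80", "#FFFF00", "#FFBF00", "#FF8000", "#FF4000", "#FF0000")) +
  theme_classic()

print(p_single)
```

Whether uniform bar widths look better is a matter of taste; the default behavior is equally valid.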

2 Comments

  1. How can we create a clustered column graph using two variables on the x-axis, e.g. number of cylinders by week?

    1. Hi Panos, in the last example above, the number of carburetors is shown per number of cylinders on the x-axis, so two variables are already in use. If you need to introduce a third variable, you can present it as separate panels: in your example, a separate plot for each week. You can use the facet_wrap() function for this. See how facet_wrap() was used in this post: Mean Profile Plot in R
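
The faceting approach described in the reply can be sketched as follows. mtcars has no week variable, so this example uses gear purely as a stand-in for the third variable; the names barg02 and p_facet are hypothetical. Percentages are computed within each combination of gear and cyl.

```r
library(dplyr)
library(ggplot2)

# % of each carb value within each gear-by-cyl combination
# (gear is an assumed stand-in for a third variable such as week)
barg02 <- mtcars %>%
  group_by(gear, cyl, carb) %>%
  summarize(num = n()) %>%
  mutate(perc = num / sum(num) * 100)

# Clustered column chart with one panel per value of gear
p_facet <- ggplot(data = barg02, aes(x = factor(cyl), y = perc, fill = factor(carb))) +
  geom_bar(stat = "identity", width = 0.4, position = position_dodge()) +
  facet_wrap(~ gear) +
  labs(y = "Percentage", x = "Number of Cylinders", fill = "# Carburetors") +
  theme_classic()

print(p_facet)
```

With real data, replacing gear with the week column in both group_by() and facet_wrap() should give one clustered panel per week.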