Stacked Column Chart and Clustered Column Chart in R with ggplot2
Stacked Column Chart
A stacked column chart, also known as a stacked bar chart, is a type of bar graph that stacks the categories of a group on top of each other. Each segment (category) is usually shown in a different color, and the height of each segment is proportional to that category's percentage within the group.
The following R code generates a simple stacked column chart from the built-in R dataset mtcars, using the dplyr and ggplot2 packages. First the percentages are calculated, and then they are used to draw the bar graph.
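#load dplyr (for %>%, group_by(), summarize(), mutate()) and ggplot2 (for plotting)
library(dplyr)
library(ggplot2)
#count the cars for each combination of cylinders and carburetors,
#then convert the counts to percentages within each cylinder group
barg01 <- mtcars %>%
  group_by(cyl, carb) %>%
  #summarize() drops the last grouping level (carb), so the result stays grouped by cyl
  summarize(num = n()) %>%
  mutate(perc = num / sum(num) * 100)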
#create the basic stacked column chart in R
#the function geom_bar() creates a column or bar chart; stat="identity" uses the perc values as the bar heights
s <- ggplot(data=barg01, aes(x=factor(cyl), y=perc, fill=factor(carb))) +
  geom_bar(stat="identity", width=0.4)
#print the plot object to display the chart
s
The code above produces the default graph. The following functions customize the graph to make it more aesthetically pleasing.
#stacked bar graph
s +
  #axes labels and legend title
  labs(y = "Percentage", x = "Number of Cylinders", fill = "# Carburetors") +
  #specify the color for each carburetor category (one color per level, six in total)
  scale_fill_manual(values=c("#FFFF80","#FFFF00","#FFBF00","#FF8000","#FF4000","#FF0000")) +
  #choose theme - a change not related to the input data
  theme_classic()
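Because the percentages are calculated within each cylinder group, every column in the stacked chart should reach 100. The sketch below is one way to check this; the object name perc_check is only an illustrative choice, and the summary table is rebuilt so that the snippet runs on its own.

```r
library(dplyr)

#rebuild the summary table used for the chart
barg01 <- mtcars %>%
  group_by(cyl, carb) %>%
  summarize(num = n()) %>%
  mutate(perc = num / sum(num) * 100)

#add up the percentages within each cylinder group
perc_check <- barg01 %>%
  group_by(cyl) %>%
  summarize(total = sum(perc))

#one row per cylinder group; each total should be 100
perc_check
```

If any total differs from 100, the percentages were most likely computed over the wrong grouping.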
Clustered Column Chart
A clustered column chart, also known as a clustered bar chart or dodged bar graph, is a type of bar graph that presents the categories of a group next to each other, each in a different color.
The following R code converts the stacked column chart above into a clustered column chart.
#clustered bar graph
ggplot(data=barg01, aes(x=factor(cyl), y=perc, fill=factor(carb))) +
  labs(y = "Percentage", x = "Number of Cylinders", fill = "# Carburetors") +
  #the position= argument generates a clustered bar graph; the rest of the code is the same as above
  geom_bar(stat="identity", width=0.4, position=position_dodge()) +
  scale_fill_manual(values=c("#FFFF80","#FFFF00","#FFBF00","#FF8000","#FF4000","#FF0000")) +
  theme_classic()
The R code for the clustered column chart is almost identical to the code for the stacked column chart. The only difference is the addition of the argument position=position_dodge() in the geom_bar() function.
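In mtcars, not every cylinder group contains every carburetor value, so with the default dodge the bars in groups with fewer categories are drawn wider than the bars in fuller groups. One possible adjustment, sketched here on the assumption that the installed ggplot2 version supports the preserve argument of position_dodge(), is to ask for a constant bar width. The object name p is only an illustrative choice.

```r
library(dplyr)
library(ggplot2)

#rebuild the summary table so the snippet runs on its own
barg01 <- mtcars %>%
  group_by(cyl, carb) %>%
  summarize(num = n()) %>%
  mutate(perc = num / sum(num) * 100)

#preserve = "single" keeps every bar the same width, even when a
#cylinder group has fewer carburetor categories than the others
p <- ggplot(data = barg01, aes(x = factor(cyl), y = perc, fill = factor(carb))) +
  geom_bar(stat = "identity", width = 0.4,
           position = position_dodge(preserve = "single")) +
  labs(y = "Percentage", x = "Number of Cylinders", fill = "# Carburetors") +
  theme_classic()

#display the chart
p
```

Whether equal-width bars read better than bars that fill each cluster is a matter of taste; the data shown is the same in both versions.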