
How to Create a Bar Graph in R

Bar graphs, also known as bar charts, are one of the most commonly used methods of visualizing grouped data. The length of each bar represents the magnitude of the value it displays. Bar graphs can be plotted horizontally or vertically.

Bar graphs can easily be created in R using the ggplot2 package and its geom_bar() function. The stat= parameter specifies whether the dataset contains individual data points that need to be summarized before plotting (stat="count", the default) or already-summarized values that should be plotted directly (stat="identity").

Plotting a Bar Graph Using Summary Data

Let’s create a bar graph directly from the summary data. We will use the mtcars dataset in R. First, we calculate the percentage of cars with three, four, and five gears. Then we feed these percentages directly into ggplot2 and geom_bar() to generate the plot.

In the first example we calculate the percentages beforehand and use geom_bar() to present the results in a bar graph. In the second example, we let geom_bar() summarize the data for us and then plot the result.

#Load the packages that provide %>%, group_by(), summarize(), mutate() and ggplot()
library(dplyr)
library(ggplot2)

#Calculate the counts and percentages per gear from the mtcars dataset using summarize().
barg01 <- mtcars %>%
  group_by(gear) %>%
  summarize(num = n()) %>%
  mutate(perc = num / sum(num) * 100)

#Plot the results from the summarize function.
b1 <- ggplot(data = barg01, aes(x = gear, y = perc)) +
  geom_bar(stat = "identity")
b1

We specify stat="identity" so that the results in the dataset barg01 are plotted directly without any further calculations.

We have also specified that gear should be on the x-axis and perc (percentages) should be on the y-axis. We get a vertical bar graph by default. Generating a horizontal bar graph in R is discussed below.

gear   num   perc
3      15    46.875
4      12    37.500
5       5    15.625
The result from the summarize() function

Vertical bar graph
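
As a side note, ggplot2 also provides geom_col(), which behaves like geom_bar(stat="identity"). The sketch below rebuilds the first example with it. The names b1_col and heights_col are illustrative and not part of the original code.

```r
library(dplyr)
library(ggplot2)

# Same summary table as in the first example
barg01 <- mtcars %>%
  group_by(gear) %>%
  summarize(num = n()) %>%
  mutate(perc = num / sum(num) * 100)

# geom_col() plots the y values as given, like geom_bar(stat = "identity")
b1_col <- ggplot(data = barg01, aes(x = gear, y = perc)) +
  geom_col()

# Bar heights as ggplot2 will draw them, one value per bar
heights_col <- layer_data(b1_col)$y
heights_col
```

Printing b1_col should draw the same vertical bar graph as b1.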

Plotting a Bar Graph Using Individual Data Points

#example 2: letting geom_bar() do the calculations
b2<- ggplot(data=mtcars, aes(x=factor(gear)))+
     geom_bar()
b2

The bars have the same relative heights as in the first example, but the y-axis now shows counts instead of percentages, because geom_bar() counts the rows in each group by default. The first approach may be preferred because we can summarize the data beforehand using whichever method we like, before using geom_bar() to plot the result.

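
If percentages are wanted without a separate summary step, one option is to compute them from the counts inside geom_bar() using after_stat(). This is a minimal sketch that assumes ggplot2 3.3.0 or later and a plot without facets or extra grouping. The names b2_perc and perc_from_stat are illustrative.

```r
library(ggplot2)

# after_stat(count) refers to the counts that geom_bar() computes per gear;
# dividing by their sum turns them into percentages of all cars
b2_perc <- ggplot(data = mtcars, aes(x = factor(gear))) +
  geom_bar(aes(y = after_stat(count / sum(count) * 100))) +
  labs(x = "Gear", y = "%")

# Bar heights as ggplot2 will draw them, one value per gear level
perc_from_stat <- layer_data(b2_perc)$y
perc_from_stat
```

The heights should match the perc column of barg01 from the first example.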

Horizontal Bar Graph in R

Horizontal bar graphs can be created from the above examples as follows:

b1 +
  coord_flip()

coord_flip() flips the axes to create a horizontal bar graph

In the first example, where both the x- and y-axes were specified, we add coord_flip() to flip the axes and obtain a horizontal bar graph.

ggplot(data=mtcars, aes(y=factor(gear)))+
  geom_bar()
Specify only the y-axis to get a horizontal bar graph

In the second example where we previously specified only the x-axis, we now change the code to specify only the y-axis.
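
Mapping the variable to y only works in versions of ggplot2 whose geoms detect orientation automatically (3.3.0 or later, as far as we know). The sketch below compares that approach with coord_flip() applied to the count-based plot; the object names are illustrative.

```r
library(ggplot2)

# Option 1: map the variable to y (assumes ggplot2 3.3.0 or later)
h_direct <- ggplot(data = mtcars, aes(y = factor(gear))) +
  geom_bar()

# Option 2: build the vertical plot, then flip the coordinates
h_flip <- ggplot(data = mtcars, aes(x = factor(gear))) +
  geom_bar() +
  coord_flip()

# Both layers compute the same counts per gear
counts_direct <- layer_data(h_direct)$count
counts_flip <- layer_data(h_flip)$count
```

Either object can be printed to draw the horizontal bar graph.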

Customizing the Bar Graph

Time to add some colors, labels, and a theme to make the bar graph presentable.


ggplot(data=barg01, aes(x=factor(gear), y=perc)) +

#adding a different color for the outline of the bars, and reducing the width of the bars
  geom_bar(stat="identity", color="black", fill="#c5bf6f", width=0.5) +
  
#adding a label and a theme to make the graph presentation quality
  labs(y = "%", x = "Gear") +
  theme_classic()

We have now added a black outline to the bars and a hexadecimal color value for the fill color. The width of the bars was reduced to 0.5, relative to the x-axis scale.

vertical bar graph with color created in R

Labels were added to both axes, and a theme was applied to improve the overall appearance of the figure. In the next example, we specify only the fill color, without a color for the outline of the bars, and flip the graph.

ggplot(data=barg01, aes(x=factor(gear), y=perc)) +

  #Specifying only the fill color and flipping the graph
  geom_bar(stat="identity", fill="#8d6c9f", width=0.5) +  coord_flip() +

  #label and theme
  labs(y = "Percentage", x = "Gears") +
  theme_classic()

You can also specify a different color for each bar, as illustrated in the next example.


The parameter fill= inside aes() is used below to specify a different color for each level of the variable gear, so each bar gets a different color in this example. Default colors are assigned automatically. However, the functions scale_fill_manual() and scale_color_manual() can be used to pick colors manually if preferred. The function labs() is used here to change the title of the legend from “factor(gear)”, as specified in the R code, to “Gears” as desired.


#specifying different color for each bar
ggplot(data=barg01, aes(x=factor(gear), y=perc, fill=factor(gear) )) +
  
  #specify the bar width and flip graph
  geom_bar(stat="identity", width=0.5) +
  coord_flip() +
  
  #pick the colors if preferred
  scale_fill_manual(values=c("#c5bf6f", "#8d6c9f", "#46afd7")) + 
  
  #axes label and theme
  #fill= is used to customize legend title
  labs(y="Percentage", x="Gears", fill="Gears") +
  theme_classic()

simple bar graph
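
The colors passed to scale_fill_manual() above are matched to the levels of gear by position. A named vector makes the mapping explicit, which can help if the levels change order. This is a sketch under that assumption; gear_colors, p_named and fills_used are illustrative names.

```r
library(dplyr)
library(ggplot2)

# Same summary table as in the first example
barg01 <- mtcars %>%
  group_by(gear) %>%
  summarize(num = n()) %>%
  mutate(perc = num / sum(num) * 100)

# Names are the factor levels of gear, values are the fill colors
gear_colors <- c("3" = "#c5bf6f", "4" = "#8d6c9f", "5" = "#46afd7")

p_named <- ggplot(data = barg01,
                  aes(x = factor(gear), y = perc, fill = factor(gear))) +
  geom_bar(stat = "identity", width = 0.5) +
  scale_fill_manual(values = gear_colors) +
  labs(y = "Percentage", x = "Gears", fill = "Gears") +
  theme_classic()

# Fill color actually assigned to each bar, ordered by position on the x-axis
ld <- layer_data(p_named)
fills_used <- as.character(ld$fill[order(ld$x)])
fills_used
```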

Stacked Bar Graphs and Dodged Bar Graphs in R

As the names suggest, stacked bar graphs have bars from different categories stacked on top of each other while in dodged bar graphs these categories are presented side by side.

#For each gear category, calculate N and % per engine type
barg02 <- mtcars %>%
  group_by(gear, vs)%>%
  summarize(num=n())%>%
  mutate(perc=num/sum(num)*100)

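First, we count the cars per gear and engine type (vs, where 0 is a V-shaped engine and 1 is a straight engine) using the summarize() function. Because summarize() drops the last grouping variable, the result is still grouped by gear, so mutate() calculates the percentages within each gear category. The result is saved in the dataset barg02, which is then used as the input dataset in ggplot2 to generate the stacked and dodged bar graphs as shown below.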
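
To confirm that the percentages are calculated within each gear category, the sketch below rebuilds barg02 and checks that they add up to 100 per gear. It assumes dplyr 1.0.0 or later for the .groups argument, which only makes the default grouping behavior explicit. The trailing ungroup() and the name check are additions for illustration.

```r
library(dplyr)

barg02 <- mtcars %>%
  group_by(gear, vs) %>%
  # drop_last keeps the grouping by gear only, which is also the default
  summarize(num = n(), .groups = "drop_last") %>%
  # sum(num) is therefore the number of cars within each gear category
  mutate(perc = num / sum(num) * 100) %>%
  ungroup()

# Percentages should total 100 within every gear category
check <- barg02 %>%
  group_by(gear) %>%
  summarize(total = sum(perc))
check
```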

#Stacked bar graph
#fill=factor(vs) is used to specify a different color for each stack or level of vs
ggplot(data=barg02, aes(x=factor(gear), y=perc, fill=factor(vs))) +
  geom_bar(stat="identity", width=0.5) +

  #fill= in labs() sets the legend title: Engine will be displayed instead of factor(vs)
  labs(y = "Percentage", x = "Gears", fill = "Engine") +

  scale_fill_manual(values=c("#f05c5c", "#34bde1")) +
  theme_classic()

Stacked bar graph

#Dodged bar graph
#The code is the same as for the stacked bar graph above.
#The only difference is the addition of the parameter position=position_dodge()
ggplot(data=barg02, aes(x=factor(gear), y=perc, fill=factor(vs))) +

  geom_bar(stat="identity", width=0.5, position=position_dodge()) +
  labs(y = "Percentage", x = "Gears", fill = "Engine") +

  scale_fill_manual(values=c("#f05c5c", "#34bde1")) + 
  theme_classic()

Dodged bar graph
