Summarise categorical data for bar plot


I ^ have a dataset that looks something like this:

    typestudy   dloop cytb coi  other microsat  SNP
    methods     no  no  no  no  yes no
    methods     yes no  no  no  no  yes
    methods     no  no  no  no  yes no
    methods     no  no  no  no  yes no
    wildcrime   no  no  no  yes no  no
    taxonomy    no  no  no  no  yes no
    methods     yes no  no  no  no  no
    methods     no  no  no  no  yes no
    taxonomy    no  no  no  no  yes no
    wildcrime   yes no  no  no  no  no
    methods     yes no  no  no  no  no
    taxonomy    no  no  no  no  yes yes
    taxonomy    no  no  no  no  yes no

Except it has 10 columns of yes/no corresponding to further genetic elements and there are over 200 rows.

In Excel the graphical summary option gives me a great stacked bar plot, but I need to be able to recreate it in R to meet university standards for my report.

    > summary(dframe1$type.of.study)
         methods development                        other 
                          49                            5 
population genetic structure                     taxonomy 
                          91                           86 
              wildlife crime 
                           6 
    > barplot(as.matrix(dframe1))
     There were 11 warnings (use warnings() to see them)
    > warnings()
    Warning messages:
    1: In apply(height, 2L, cumsum) : NAs introduced by coercion
    2: In apply(height, 2L, cumsum) : NAs introduced by coercion
    3: In apply(height, 2L, cumsum) : NAs introduced by coercion
    4: In apply(height, 2L, cumsum) : NAs introduced by coercion
    5: In apply(height, 2L, cumsum) : NAs introduced by coercion
    6: In apply(height, 2L, cumsum) : NAs introduced by coercion
    7: In apply(height, 2L, cumsum) : NAs introduced by coercion
    8: In apply(height, 2L, cumsum) : NAs introduced by coercion
    9: In apply(height, 2L, cumsum) : NAs introduced by coercion
    10: In apply(height, 2L, cumsum) : NAs introduced by coercion
    11: In apply(height, 2L, cumsum) : NAs introduced by coercion

which gives me this

and I've also managed to produce this but can't find the script I used for it

My aim is something similar to this:

It's pretty pathetic, but it's taken me almost a week of troubleshooting based on online resources and other questions on here to get to this point. I can't figure out how to count the types of study so that the tallies make up the heights of the stacked bars for the different genetic markers. I know this is far too vague for Stack Overflow's standards, but I'm desperate, so I'm leaving this up in case anyone has any suggestions.


^ (I'm going to be as concise as I can but as you'll see my fluency in R is atrocious, I wouldn't ask for help but I've spent DAYS grappling with this data and I'm petrified I'll never find a solution and there's nobody to ask for help on my research placement)

We can tabulate each yes/no column by looping over the columns with sapply, and then pass the resulting matrix of counts to barplot:

barplot(sapply(df1[-1], function(x) table(factor(x, 
     levels = c("yes", "no")))), col = c("red", "blue"))
legend("topright", legend = c("yes", "no"), fill = c("red", "blue"))
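If the stacks should be broken down by study type rather than by yes/no, the same sapply idea can tally the study types among the "yes" rows of each marker. This is only a sketch under assumptions: the small df1 below is a made-up stand-in for the real data, and it assumes the study type is the first column and is called typestudy.

```r
# Hypothetical stand-in for the asker's data (first column = study type).
df1 <- data.frame(
  typestudy = c("methods", "methods", "wildcrime", "taxonomy", "taxonomy"),
  dloop     = c("no", "yes", "yes", "no", "no"),
  microsat  = c("yes", "no", "no", "yes", "yes"),
  SNP       = c("no", "yes", "no", "no", "yes"),
  stringsAsFactors = FALSE
)

# A factor keeps every study type as a level, so zero counts are retained
# and each marker yields a table of the same length.
study <- factor(df1$typestudy)

# Rows = study types, columns = markers; each cell counts the "yes" rows.
counts <- sapply(df1[-1], function(x) table(study[x == "yes"]))

# barplot() stacks the rows of a matrix within each column's bar.
barplot(counts, col = seq_len(nrow(counts)), legend.text = rownames(counts))
```

Because barplot stacks matrix rows, each bar's total height is the number of studies using that marker, and the segments show how that total splits across study types.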


Just in case you want something that looks like the example (kind of):

df <- read.table(text = "typestudy   dloop cytb coi  other microsat  SNP
    methods     no  no  no  no  yes no
                 methods     yes no  no  no  no  yes
                 methods     no  no  no  no  yes no
                 methods     no  no  no  no  yes no
                 wildcrime   no  no  no  yes no  no
                 taxonomy    no  no  no  no  yes no
                 methods     yes no  no  no  no  no
                 methods     no  no  no  no  yes no
                 taxonomy    no  no  no  no  yes no
                 wildcrime   yes no  no  no  no  no
                 methods     yes no  no  no  no  no
                 taxonomy    no  no  no  no  yes yes
                 taxonomy    no  no  no  no  yes no", 
                 header = TRUE, stringsAsFactors = FALSE)

library(tidyr)
library(ggplot2)
library(dplyr)
df %>% gather(key = key, value = value, -typestudy) %>% 
  filter(value == "yes") %>% 
  ggplot(aes(x = key, fill = typestudy)) +
  geom_bar() + 
  coord_flip() + 
  theme_minimal() +
  theme(legend.position = "bottom",
        panel.grid.minor = element_blank(),
        panel.grid.major.y = element_blank()) +
  xlab(NULL) +
  ylab(NULL)
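A note for newer tidyr versions: gather() still works but is superseded by pivot_longer(). A rough equivalent of the reshaping step might look like the sketch below. The df_small data frame and the column names marker and used are made up for illustration and are not from the question.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Hypothetical miniature version of the data, for illustration only.
df_small <- data.frame(
  typestudy = c("methods", "methods", "wildcrime", "taxonomy", "taxonomy"),
  dloop     = c("no", "yes", "yes", "no", "no"),
  microsat  = c("yes", "no", "no", "yes", "yes"),
  SNP       = c("no", "yes", "no", "no", "yes"),
  stringsAsFactors = FALSE
)

# One row per (study, marker) pair, keeping only the markers that were used.
long <- df_small %>%
  pivot_longer(cols = -typestudy, names_to = "marker", values_to = "used") %>%
  filter(used == "yes")

# geom_bar() counts rows itself, so no manual tally is needed.
p <- ggplot(long, aes(x = marker, fill = typestudy)) +
  geom_bar() +
  coord_flip()
print(p)
```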


I don't know if you are only after the yeses. The version below counts every combination of study type, marker, and response first, so the "no" counts are available too if you want bar plots by type of response (yes/no); the filter step then keeps just the yeses.

df %>%
  gather(var, value, -typestudy) %>%
  group_by(typestudy, var, value) %>%
  count() %>%
  filter(value == "yes") %>%
  ggplot(aes(var, n, group = typestudy, fill = typestudy)) +
  geom_bar(stat = "identity") +
  scale_fill_brewer(palette = "Dark2", direction = -1) +
  coord_flip() +
  theme(
    axis.title.x=element_blank(),
    axis.title.y=element_blank(),
    legend.position = "bottom",
    panel.grid.minor = element_blank(),
    panel.grid.major.y = element_blank(),
    legend.title=element_blank())
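If you do want the "no" responses shown as well, one option is to drop the filter and facet by the response instead. This is a sketch, assuming a made-up df_small and the illustrative column names marker and used; it uses pivot_longer(), the newer tidyr replacement for gather().

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Hypothetical miniature version of the data, for illustration only.
df_small <- data.frame(
  typestudy = c("methods", "methods", "wildcrime", "taxonomy", "taxonomy"),
  dloop     = c("no", "yes", "yes", "no", "no"),
  microsat  = c("yes", "no", "no", "yes", "yes"),
  SNP       = c("no", "yes", "no", "no", "yes"),
  stringsAsFactors = FALSE
)

# Tally every study type / marker / response combination that occurs.
counts_long <- df_small %>%
  pivot_longer(cols = -typestudy, names_to = "marker", values_to = "used") %>%
  count(typestudy, marker, used)

# One panel per response; geom_col() plots the precomputed counts in n.
p <- ggplot(counts_long, aes(x = marker, y = n, fill = typestudy)) +
  geom_col() +
  facet_wrap(~ used) +
  coord_flip()
print(p)
```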

Data
df <- structure(list(typestudy = c("methods", "methods", "methods", 
"methods", "wildcrime", "taxonomy", "methods", "methods", "taxonomy", 
"wildcrime", "methods", "taxonomy", "taxonomy"), dloop = c("no", 
"yes", "no", "no", "no", "no", "yes", "no", "no", "yes", "yes", 
"no", "no"), cytb = c("no", "no", "no", "no", "no", "no", "no", 
"no", "no", "no", "no", "no", "no"), coi = c("no", "no", "no", 
"no", "no", "no", "no", "no", "no", "no", "no", "no", "no"), 
    other = c("no", "no", "no", "no", "yes", "no", "no", "no", 
    "no", "no", "no", "no", "no"), microsat = c("yes", "no", 
    "yes", "yes", "no", "yes", "no", "yes", "yes", "no", "no", 
    "yes", "yes"), SNP = c("no", "yes", "no", "no", "no", "no", 
    "no", "no", "no", "no", "no", "yes", "no")), .Names = c("typestudy", 
"dloop", "cytb", "coi", "other", "microsat", "SNP"), class = "data.frame", row.names = c(NA, 
-13L))


Comments
  • Do you need barplot(sapply(df1[-1], function(x) table(factor(x, levels = c("yes", "no")))))?
  • @akrun you beautiful soul thank you so much for your time! I don't want to ask too much and you've helped me massively already but I'm wondering how to make the bar plots proportional to each other rather than as a percentage?
  • Is that what you wanted or different?
  • @akrun oh wow I was so excited but I've realised this still leaves me with some questions about how to incorporate the studytype column into the stacks to make them look like the excel barplot
  • Are you just counting the number of yeses?
  • you absolute dream of a human being thank you so much!! this is wonderful
  • It's a shame that my vote on your answer doesn't show up because I'm new to the site. You well and truly saved my bacon, I'm so grateful