group_by() and summarize() do not produce one output row per group

by Prabhakar   Last Updated August 14, 2019 03:19 AM

I am learning the basics of the {dplyr} package in R and working with the summarize() function. When I create groups with group_by(), the output does not contain a summary row for each group.

The data frame is:

head(titanic_df)

  Class    Sex   Age Survived Freq
1   1st   Male Child       No    0
2   2nd   Male Child       No    0
3   3rd   Male Child       No   35
4  Crew   Male Child       No    0
5   1st Female Child       No    0

Now I group the df by Class:

head(titanic_df %>% group_by(Class))

# A tibble: 6 x 5
# Groups:   Class [4]
  Class Sex    Age   Survived  Freq
  <fct> <fct>  <fct> <fct>    <dbl>
1 1st   Male   Child No           0
2 2nd   Male   Child No           0
3 3rd   Male   Child No          35
4 Crew  Male   Child No           0
5 1st   Female Child No           0

This is fine. But when I try to summarize the variable Freq in this grouped df, the output is not grouped by Class.

titanic_df %>% 
  group_by(Class) %>% 
  summarize(num_passg_class = sum(Freq))

  num_passg_class
1            2201

The output is a single row holding the sum of the entire Freq column, not one sum per Class. Am I missing something?

I am expecting the following result:

tapply(titanic_df$Freq, titanic_df$Class, sum)

 1st  2nd  3rd Crew 
 325  285  706  885 

PS: This is my first attempt at posting formatted R code on this forum; I hope I haven't been terrible at it.

Tags : data-cleaning


Answers 1


How about

titanic_df %>% 
  group_by(Class) %>% 
  summarise(num_passg_class = sum(Freq))

which will produce something like this (I made up the numbers as I couldn't find the dataset):

# A tibble: 5 x 2
  Class num_passg_class
  <chr>           <dbl>
1 1st                 4
2 2nd                 4
3 3rd                 6
4 Child              10
5 Crew                8
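
A possible explanation for the single-row result in the question, which the question itself does not confirm: if {plyr} is attached after {dplyr}, its summarize() masks the {dplyr} version and ignores the grouping. The sketch below calls the {dplyr} function with an explicit namespace to rule that out. It assumes titanic_df was built from R's built-in Titanic table with as.data.frame(); that is a guess based on the column names and totals shown in the question, and the name class_totals is only illustrative.

```r
library(dplyr)

# Assumption: titanic_df comes from R's built-in Titanic contingency table
titanic_df <- as.data.frame(Titanic)

# Which package does summarise() currently resolve to?
# "dplyr" is expected; "plyr" would indicate masking.
environmentName(environment(summarise))

# The explicit dplyr:: prefix bypasses any masking by another attached package
class_totals <- titanic_df %>%
  group_by(Class) %>%
  dplyr::summarise(num_passg_class = sum(Freq))

class_totals
```

Under that assumption, class_totals has one row per Class, with the same totals as the tapply() call in the question.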

Or, with the result spread into wide format (one column per Class):

titanic_df %>% 
  group_by(Class) %>% 
  summarise(num_passg_class = sum(Freq)) %>% 
  spread(Class, num_passg_class)

which gives:

# A tibble: 1 x 5
  `1st` `2nd` `3rd` Child  Crew
  <dbl> <dbl> <dbl> <dbl> <dbl>
1     4     4     6    10     8
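
In {tidyr} 1.0.0 and later, spread() is superseded by pivot_wider(). A minimal sketch of the same reshaping with that function follows, again assuming titanic_df comes from as.data.frame(Titanic) (an assumption, not something stated in the question); wide_totals is an illustrative name.

```r
library(dplyr)
library(tidyr)

# Assumption: titanic_df comes from R's built-in Titanic contingency table
titanic_df <- as.data.frame(Titanic)

# One row in total, one column per Class
wide_totals <- titanic_df %>%
  group_by(Class) %>%
  dplyr::summarise(num_passg_class = sum(Freq)) %>%
  pivot_wider(names_from = Class, values_from = num_passg_class)

wide_totals
```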
william3031
August 14, 2019 03:08 AM
