Summing Multiple Groups of Columns

excel sum group by multiple columns
sum by group excel
excel sum values with same label
excel macro to group by column and sum value
group summation in excel
sumif
excel group by count
sum by group in r

I have a situation where my data frame contains the results of image analysis where the columns are the proportion of a particular class present in the image, such that an example dataframe class_df would look like:

id    A    B    C    D    E    F
 1 0.20 0.30 0.10 0.15 0.25 0.00 
 2 0.05 0.10 0.05 0.30 0.10 0.40
 3 0.10 0.10 0.10 0.20 0.20 0.30

Each of these classes belongs to a functional group and I want to create new columns where the proportions of each functional group are calculated from the classes. An example mapping class_fg

class         fg
    A          Z
    B          Z
    C          Z
    D          Y
    E          Y
    F          X

and the desired result would be (line added to show the desired new columns):

id    A    B    C    D    E    F |    X    Y    Z
 1 0.20 0.30 0.10 0.15 0.25 0.00 | 0.00 0.40 0.60
 2 0.05 0.10 0.05 0.30 0.10 0.40 | 0.40 0.40 0.20
 3 0.10 0.10 0.10 0.20 0.20 0.30 | 0.30 0.40 0.30

And I can do it one functional group at a time using

first_fg <- class_fg %>%
  filter(fg == "Z") %>%
  select(class) %>%
  unlist()

class_df <- class_df %>%
  mutate(Z = rowSums(select(., one_of(first_fg))))

Surely there is a better way to do this where I can calculate the row sum for each functional group without having to just repeat this code for each group? Maybe using purrr?


We could split the 'class_df' by 'class', loop through the list elements with map, select the columns of 'class_df' and get the rowSums

library(tidyverse)
class_fg %>%
    split(.$fg) %>% 
    map_df(~ class_df %>%
                select(one_of(.x$class)) %>% 
                rowSums) %>%
    bind_cols(class_df, .)
#  id    A   B    C    D    E   F   X   Y   Z
#1  1 0.20 0.3 0.10 0.15 0.25 0.0 0.0 0.4 0.6
#2  2 0.05 0.1 0.05 0.30 0.10 0.4 0.4 0.4 0.2
#3  3 0.10 0.1 0.10 0.20 0.20 0.3 0.3 0.4 0.3

Or do a group by nesting, and then do the rowSums by mapping over the list elements

class_fg %>% 
   group_by(fg) %>%
   nest %>%
   mutate(out = map(data, ~  class_df %>%
                               select(one_of(.x$class)) %>% 
                               rowSums)) %>% 
   select(-data)  %>%
   unnest %>% 
   unstack(., out ~ fg) %>% 
   bind_cols(class_df, .)

Excel formula: Sum by group, Note: data must be sorted by the grouping column to get sensible results. How this formula works. The framework of this formula is based on IF, which tests each​  Using the SUM/SUMPRODUCT Function for Multiple Columns. Instead of using the AutoSum feature you can also use the SUM function directly to calculate the total sales for a month. For this type =SUM(B2:B9). Now after pressing Enter, drag this formulated cell in cell C10 and D10 to calculate the total sales for the month of February and March.


Always it is easier to work on data in long format. Hence, change class_df to long format using tidyr:gather and join with class_fg. Perform analysis in long format on your data. Finally, spread in wide-format to match expected result.

library(tidyverse)

class_df %>% gather(key, value, -id) %>% 
  inner_join(class_fg, by=c("key" = "class")) %>%
  group_by(id, fg) %>%
  summarise(value = sum(value)) %>%
  spread(fg, value) %>%
  inner_join(class_df, by="id") %>% as.data.frame()

#   id   X   Y   Z    A   B    C    D    E   F
# 1  1 0.0 0.4 0.6 0.20 0.3 0.10 0.15 0.25 0.0
# 2  2 0.4 0.4 0.2 0.05 0.1 0.05 0.30 0.10 0.4
# 3  3 0.3 0.4 0.3 0.10 0.1 0.10 0.20 0.20 0.3

Data:

class_fg <- read.table(text = 
"class         fg
                 A          Z
                 B          Z
                 C          Z
                 D          Y
                 E          Y
                 F          X",
header = TRUE, stringsAsFactors = FALSE)

class_df  <- read.table(text = 
"id    A    B    C    D    E    F
1 0.20 0.30 0.10 0.15 0.25 0.00 
2 0.05 0.10 0.05 0.30 0.10 0.40
3 0.10 0.10 0.10 0.20 0.20 0.30",
header = TRUE, stringsAsFactors = FALSE)

SQL SUM() with GROUP by, SQL SUM() using multiple columns with group by. To get data of 'cust_city' and the sum of 'opening_amt' and 'receive_amt' for each individual  Sumif with multiple criteria in one column In Excel, SUMIF function is a useful function for us to sum cells with multiple criteria in different columns, but with this function, we can also sum cells based on multiple criteria in one column.


Yet another option, along with the already contributed working solutions, would be to use quasiquotation with the rlang package to build expressions to calculate the sums in each group.

library(tidyverse)

First, define a helper function for doing an elementwise sum of vectors:

psum <- function(...) reduce(list(...), `+`)

Extracting the groupings into a list from class_fg we can then construct a list of expressions to calculate the sum in each group:

sum_exprs <- with(class_fg, split(class, fg)) %>% 
  map(~ rlang::expr(psum(!!!rlang::syms(.x))))

sum_exprs
#> $X
#> psum(F)
#> 
#> $Y
#> psum(D, E)
#> 
#> $Z
#> psum(A, B, C)

With the list of expressions ready, we can just "bang-bang-bang" (!!!) them into the data with mutate:

class_df %>%
  mutate(!!!sum_exprs)
#>   id    A   B    C    D    E   F   X   Y   Z
#> 1  1 0.20 0.3 0.10 0.15 0.25 0.0 0.0 0.4 0.6
#> 2  2 0.05 0.1 0.05 0.30 0.10 0.4 0.4 0.4 0.2
#> 3  3 0.10 0.1 0.10 0.20 0.20 0.3 0.3 0.4 0.3

(I used the code provided by @MKR in his answer to create the data).

Created on 2018-05-22 by the reprex package (v0.2.0).

How to sum values by group in Excel?, Here are two columns, one is the product name, and the other is Sales. You can sum values by group with one formula easily in Excel. Super Formula Bar (​easily edit multiple lines of text and formula); Reading Layout (easily read and edit  If you need to sum numbers based on multiple criteria, you can use the SUMIFS function. In the example shown, the formula in G6 is: = SUMIFS ( D4:D11 , B4:B11 , "blue" , C4:C11 , ">15" ) How this formula works The first range


My usual approach is to stick to base R as long as the data sets don't get too large. In your case, a base R solution would be:

class_df=as.data.frame(
  c(class_df,
    lapply(split(class_fg,class_fg$fg),
           function(x) rowSums(class_df[,x$class,drop=FALSE]))))
class_df
#  id    A   B    C    D    E   F   X   Y   Z
#1  1 0.20 0.3 0.10 0.15 0.25 0.0 0.0 0.4 0.6
#2  2 0.05 0.1 0.05 0.30 0.10 0.4 0.4 0.4 0.2
#3  3 0.10 0.1 0.10 0.20 0.20 0.3 0.3 0.4 0.3

If the data sets get too large, I use data.table. A data.table solution for your problem:

library(data.table)

class_dt=data.table(class_df)
grps=split(class_fg,class_fg$fg)

for (g in grps) class_dt[,c(g$fg[1]):=rowSums(.SD),.SDcols=g$class,]
class_dt
#   id    A   B    C    D    E   F   X   Y   Z
#1:  1 0.20 0.3 0.10 0.15 0.25 0.0 0.0 0.4 0.6
#2:  2 0.05 0.1 0.05 0.30 0.10 0.4 0.4 0.4 0.2
#3:  3 0.10 0.1 0.10 0.20 0.20 0.3 0.3 0.4 0.3

How can I "group by" and sum a column in excel?, Use function: Sum The formula will look in the D column for the text grocery, if it is present it will add the cells in column J that Click Structure > Grouping. AutoSum multiple rows and Columns. 1. Select the range you want to sum plus blank column right to this range and blank row below this range as left screen shot shown. 2. Apply the AutoSum function with pressing the Alt + = keys simultaneously.


Another tidyverse solution using rowSums on column subsets :

library(tidyverse)
class_fg %>%
  group_by(fg) %>% 
  summarize(list(rowSums(class_df[class]))) %>%
  spread(1,2) %>%
  unnest() %>%
  bind_cols(class_df, .)

#>   id    A   B    C    D    E   F   X   Y   Z
#> 1  1 0.20 0.3 0.10 0.15 0.25 0.0 0.0 0.4 0.6
#> 2  2 0.05 0.1 0.05 0.30 0.10 0.4 0.4 0.4 0.2
#> 3  3 0.10 0.1 0.10 0.20 0.20 0.3 0.3 0.4 0.3

Or for the glory of code golf :

x <- with(class_fg, tapply(class, fg, reformulate))
mutate(class_df, !!!map(x, ~as.list(.)[[2]]))
#>   id    A   B    C    D    E   F   X   Y   Z
#> 1  1 0.20 0.3 0.10 0.15 0.25 0.0 0.0 0.4 0.6
#> 2  2 0.05 0.1 0.05 0.30 0.10 0.4 0.4 0.4 0.2
#> 3  3 0.10 0.1 0.10 0.20 0.20 0.3 0.3 0.4 0.3

Sum data by using a query - Access - Office Support, values and then group them by two the data used as column headings,  Multiple columns SUM As you can see the above sum is sum of marks of each student. This is horizontal sum in our table. Now we will try to get sum of marks of all students in all subjects.


Grouping and summarizing report data, You can group report data by columns or rows you select, to help you better understand the data. You can also set reports to automatically display the sum,  My question involves summing up values across multiple columns of a data frame and creating a new column corresponding to this summation using dplyr. The data entries in the columns are binary(0,1). I am thinking of a row-wise analog of the summarise_each or mutate_each function of dplyr.


Aggregations in Power Pivot - Excel, Aggregations are a way of collapsing, summarizing, or grouping data. from another table, and then sum, count, or average those values in the related rows. SUM is used with a GROUP BY clause. The aggregate functions summarize the table data. Once the rows are divided into groups, the aggregate functions are applied in order to return just one value per group. It is better to identify each summary row by including the GROUP BY clause in the query resulst.


How to Group and Summarize your Results (simple explanation , The SQL GROUP BY Clause is used to output a row across specified column learn about grouping results returned from your queries using the GROUP BY clause. It is typically used in conjunction with aggregate functions such as SUM or