How to sum values across different rows and summarise as one row (R)

I have employee payments data in which one row = one payment record. The variables give the employee's name, the type of payment, and its value.

My end goal is to have a data frame in which each employee = one row with the different types of payments summed up and each payment type has its own variable.

Please see example:

data <- data.frame("name" = c("John", "John", "John", "Marie", "Marie", "Alex"),
               "payment.reason" = c("bonus", "bonus", "commission", "commission", "commission", "discretionary bonus"),
               "value" = c(1000, 5000, 2500, 1500, 500, 2500))

which looks like this:

   name      payment.reason value
1  John               bonus  1000
2  John               bonus  5000
3  John          commission  2500
4 Marie          commission  1500
5 Marie          commission   500
6  Alex discretionary bonus  2500

and this is the end result I am after:

goal
   name bonus commission discretionary.bonus
1  John  6000       2500                   0
2 Marie     0       2000                   0
3  Alex     0          0                2500

I know I'll need to spread the data to push the payment.reason values into columns, but I am struggling to figure out how to sum each individual payment type value for each person and have the data come out grouped by each person.

Thank you in advance!

We can do all of this with pivot_wider in tidyr:

library(tidyr)

# id_cols is passed by name; recent tidyr releases no longer accept it positionally
pivot_wider(data, id_cols = name, names_from = payment.reason, values_from = value, values_fn = list(value = sum))
#> # A tibble: 3 x 4
#>   name  bonus commission `discretionary bonus`
#>   <fct> <dbl>      <dbl>                 <dbl>
#> 1 John   6000       2500                    NA
#> 2 Marie    NA       2000                    NA
#> 3 Alex     NA         NA                  2500

Created on 2019-12-23 by the reprex package (v0.3.0)

Note (as in @AlexB's answer) that you can also add values_fill = list(value = 0) if you need explicit 0s instead of NA.
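
To make that note concrete, here is a minimal sketch that combines values_fn and values_fill in a single call. It rebuilds the example data from the question so it can be run on its own; the object name `wide` is only illustrative, and the list forms of both arguments are used because those are the ones shown in this thread.

```r
library(tidyr)

# Example data from the question
data <- data.frame(
  name = c("John", "John", "John", "Marie", "Marie", "Alex"),
  payment.reason = c("bonus", "bonus", "commission",
                     "commission", "commission", "discretionary bonus"),
  value = c(1000, 5000, 2500, 1500, 500, 2500)
)

# Sum duplicate name/payment.reason pairs and fill missing combinations with 0
wide <- pivot_wider(
  data,
  id_cols = name,
  names_from = payment.reason,
  values_from = value,
  values_fn = list(value = sum),
  values_fill = list(value = 0)
)
wide
```

The result should have one row per employee with 0 instead of NA wherever an employee has no payment of a given type. The column for "discretionary bonus" keeps its space, so it has to be referred to with backticks.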

We can use dcast from data.table and pass sum as its fun.aggregate argument:

library(data.table)
dcast(setDT(data), name ~ payment.reason, value.var = 'value', fun.aggregate = sum)
#    name bonus commission discretionary bonus
#1:  Alex     0          0                2500
#2:  John  6000       2500                   0
#3: Marie     0       2000                   0

Or xtabs from base R, which returns a contingency table rather than a data frame:

xtabs(value ~ name + payment.reason, data)
#    payment.reason
#name    bonus commission discretionary bonus
#  Alex      0          0                2500
#  John   6000       2500                   0
#  Marie     0       2000                   0
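
Since xtabs gives back a table object, an extra step is needed if a plain data frame with a name column is wanted, as in the question's goal. The following is a sketch of one way to do that in base R; the names `tab` and `wide` are made up for the example.

```r
# Example data from the question
data <- data.frame(
  name = c("John", "John", "John", "Marie", "Marie", "Alex"),
  payment.reason = c("bonus", "bonus", "commission",
                     "commission", "commission", "discretionary bonus"),
  value = c(1000, 5000, 2500, 1500, 500, 2500)
)

# Cross-tabulate: sums of value for every name / payment.reason pair
tab <- xtabs(value ~ name + payment.reason, data)

# Keep the wide layout when converting (as.data.frame() alone would give long format)
wide <- as.data.frame.matrix(tab)

# Move the row names into a regular column
wide <- cbind(name = rownames(wide), wide)
rownames(wide) <- NULL
wide
```

Calling as.data.frame() directly on the table would return one row per name/payment.reason pair, which is why as.data.frame.matrix() is used here.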

library(dplyr)   # for %>%, group_by() and summarise()
library(tidyr)
data %>%
  group_by(name, payment.reason) %>%
  summarise(value = sum(value)) %>%
  pivot_wider(id_cols = name, names_from = payment.reason, values_from = value, values_fill = list(value = 0))

  name  `discretionary bonus` bonus commission
  <fct>                 <dbl> <dbl>      <dbl>
1 Alex                   2500     0          0
2 John                      0  6000       2500
3 Marie                     0     0       2000

Using data.table:

library(data.table)
setDT(data)[, value := sum(value), by = c("name", "payment.reason")]
data <- unique(data)
data <- reshape(data, idvar = "name", timevar = "payment.reason", direction = "wide")
data[is.na(data)] <- 0
# strip only the literal "value." prefix that reshape() adds
colnames(data) <- sub("^value\\.", "", colnames(data))
data

#     name bonus commission discretionary bonus
# 1:  John  6000       2500                   0
# 2: Marie     0       2000                   0
# 3:  Alex     0          0                2500

Here is a base R solution that uses aggregate() to sum per name and payment type, then reshape() to spread the result:

dfout <- reshape(aggregate(data[3],data[-3],FUN = sum),
                 direction = "wide",
                 idvar = "name",
                 timevar = "payment.reason")
dfout[is.na(dfout)] <- 0

such that

> dfout
   name value.bonus value.commission value.discretionary bonus
1  John        6000             2500                         0
3 Marie           0             2000                         0
4  Alex           0                0                      2500
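
If the "value." prefix and the leftover row numbers are unwanted, they can be cleaned up afterwards. This is a self-contained sketch of the same aggregate() plus reshape() approach with that clean-up added; it uses the formula interface of aggregate() and the names `totals` and `wide`, which are choices made for this example rather than part of the answer above.

```r
# Example data from the question
data <- data.frame(
  name = c("John", "John", "John", "Marie", "Marie", "Alex"),
  payment.reason = c("bonus", "bonus", "commission",
                     "commission", "commission", "discretionary bonus"),
  value = c(1000, 5000, 2500, 1500, 500, 2500)
)

# Step 1: one row per name / payment.reason pair, with values summed
totals <- aggregate(value ~ name + payment.reason, data = data, FUN = sum)

# Step 2: spread payment.reason into columns
wide <- reshape(totals, direction = "wide",
                idvar = "name", timevar = "payment.reason")

# Step 3: missing combinations become 0, prefixes and row names are tidied
wide[is.na(wide)] <- 0
names(wide) <- sub("^value\\.", "", names(wide))
rownames(wide) <- NULL
wide
```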

Comments
  • Nice use of values_fn
  • Thank you - one line solution - really elegant!