map over columns and apply custom function

I'm missing something small here and struggling to pass columns to a function. I just want to map (or lapply) over columns and apply a custom function to each of them. Minimal example:

library(tidyverse)
set.seed(10)
df <- data.frame(id = c(1,1,1,2,3,3,3,3),
                 r_r1 = sample(c(0,1), 8, replace = TRUE),
                 r_r2 = sample(c(0,1), 8, replace = TRUE),
                 r_r3 = sample(c(0,1), 8, replace = TRUE))
df
#   id r_r1 r_r2 r_r3
# 1  1    0    0    1
# 2  1    0    0    1
# 3  1    1    0    1
# 4  2    1    1    0
# 5  3    1    0    0
# 6  3    0    0    1
# 7  3    1    1    1
# 8  3    1    0    0

A function that just filters and counts the unique ids remaining in the dataset:

cnt_un <-  function(var) {
  df %>% 
    filter({{var}} == 1) %>% 
    group_by({{var}}) %>% 
    summarise(n_uniq = n_distinct(id)) %>% 
    ungroup()
}

It works outside of map:

cnt_un(r_r1)
# A tibble: 1 x 2
   r_r1 n_uniq
  <dbl>  <int>
1     1      3

I want to apply the function over all r_r columns to get something like:

df2
#      y n_uniq
# 1 r_r1      3
# 2 r_r2      2
# 3 r_r3      2

I thought the following would work, but it doesn't:

map(dplyr::select(df, matches("r_r")), ~ cnt_un(.x))

Any suggestions? Thanks.

Here is another solution. I changed the signature of your function: you now supply the pattern of the columns you want to select.

cnt_un <-  function(var_pattern) {
  df %>%
    pivot_longer(cols = contains(var_pattern), values_to = "vals", names_to = "y") %>%
    filter(vals == 1) %>%
    group_by(y) %>%
    summarise(n_uniq = n_distinct(id)) %>% 
    ungroup()
}

cnt_un("r_r")
#> # A tibble: 3 x 2
#>   y     n_uniq
#>   <chr>  <int>
#> 1 r_r1       2
#> 2 r_r2       3
#> 3 r_r3       2


Here's a base R solution that uses lapply. The tricky bit is that your function isn't actually running on single columns; it's using id, too, so you can't use canned functions that iterate column-wise.

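# For each r_r column, keep the rows where it equals 1 and count the distinct ids
do.call(rbind, lapply(grep("r_r", colnames(df), value = TRUE), function(i) {
  X <- subset(df, df[, i] == 1)
  # Return a one-row data frame; do.call(rbind, ...) stacks them
  data.frame(y = i, n_uniq = length(unique(X$id)), stringsAsFactors = FALSE)
}))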

     y n_uniq
1 r_r1      2
2 r_r2      3
3 r_r3      2


I'm not sure if there's a direct tidyeval way to do this with something like map. The issue you're running into is that in calling map(df, *whatever_function*), the function is being called on each column of df as a vector, whereas your function expects a bare column name in the tidyeval style. To verify that:

map(df, class)

will return "numeric" for each column.
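
Since map hands each column over as a plain vector, one option is to write the mapped function against vectors rather than bare column names. The following is only a sketch of that idea, not the approach from the question: df is typed in from the printed output in the question (an assumption, so the counts do not depend on how sample() behaves in a given R version), and res is a name introduced here for illustration.

```r
library(dplyr)
library(purrr)

# df copied from the printed output in the question (assumption: this
# avoids relying on sample(), whose results differ across R versions)
df <- data.frame(id   = c(1, 1, 1, 2, 3, 3, 3, 3),
                 r_r1 = c(0, 0, 1, 1, 1, 0, 1, 1),
                 r_r2 = c(0, 0, 0, 1, 0, 0, 1, 0),
                 r_r3 = c(1, 1, 1, 0, 0, 1, 1, 0))

# .x is the column as a numeric vector; id is looked up in df directly.
# .id = "y" stores the column name that each result row came from.
res <- df %>%
  select(matches("r_r")) %>%
  map_dfr(~ tibble(n_uniq = n_distinct(df$id[.x == 1])), .id = "y")

res
```

With this df, the result should have one row per r_r column, matching the desired df2 from the question. The function still reaches into df for id, so it is not a purely column-wise operation.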

An alternative is to iterate over column names as strings, and convert those to symbols; this takes just one additional line in the function.

library(dplyr)
library(tidyr)
library(purrr)

cnt_un_name <- function(varname) {
  var <- ensym(varname)
  df %>% 
    filter({{var}} == 1) %>% 
    group_by({{var}}) %>% 
    summarise(n_uniq = n_distinct(id)) %>% 
    ungroup()
}

Calling the function is a little awkward because each result keeps only its own column (calling it on "r_r1" returns columns "r_r1" and "n_uniq", and so on). One way is to take the vector of column names you want, name it with set_names() so that map_dfr can add an ID column, and then drop the extra columns, which will be mostly NA after binding.

grep("^r_r\\d+", names(df), value = TRUE) %>%
  set_names() %>%
  map_dfr(cnt_un_name, .id = "y") %>%
  select(y, n_uniq)
#> # A tibble: 3 x 2
#>   y     n_uniq
#>   <chr>  <int>
#> 1 r_r1       3
#> 2 r_r2       2
#> 3 r_r3       2

A better way is to call the function, then bind after reshaping.

grep("^r_r\\d+", names(df), value = TRUE) %>%
  map(cnt_un_name) %>%
  map_dfr(pivot_longer, 1, names_to = "y") %>%
  select(y, n_uniq)
# same output as above

Alternatively (and maybe better and more scalable), do the column renaming inside the function definition.
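
A minimal sketch of what that could look like, assuming the function takes the column name as a string and uses the .data pronoun instead of ensym. The names cnt_un_chr, data, and res are hypothetical, and df is again typed in from the printed output in the question.

```r
library(dplyr)
library(purrr)

# df copied from the printed output in the question (assumption)
df <- data.frame(id   = c(1, 1, 1, 2, 3, 3, 3, 3),
                 r_r1 = c(0, 0, 1, 1, 1, 0, 1, 1),
                 r_r2 = c(0, 0, 0, 1, 0, 0, 1, 0),
                 r_r3 = c(1, 1, 1, 0, 0, 1, 1, 0))

# Hypothetical variant: the column name arrives as a string, and the
# label column y is created inside the function, so every result has
# the same two columns and no reshaping is needed before binding.
cnt_un_chr <- function(varname, data = df) {
  data %>%
    filter(.data[[varname]] == 1) %>%
    summarise(y = varname, n_uniq = n_distinct(id))
}

res <- grep("^r_r\\d+", names(df), value = TRUE) %>%
  map_dfr(cnt_un_chr)

res
```

Because each call already returns columns y and n_uniq, map_dfr can row-bind the pieces directly, without set_names(), .id, or a trailing select().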
