Returning pmap results as columns in a dataframe

I want to divide each column named in list a by the corresponding column named in list b, and then add each ratio as a new column to a dataframe that already exists.

I figured out a general way to do it with the following code (using the diamonds dataset from ggplot2 as an example):

library(tidyverse)

results <- list(
  lst("depth", "table", "price"),
  lst("x", "y", "z")
) %>%
  pmap_dfc(~diamonds %>% mutate(var = !!sym(.x)/!!sym(.y))) %>%
  select(c(1:ncol(diamonds)), matches("var")) %>%
  rename(new1 = var,
         new2 = var1,
         new3 = var2)

My problem is that this duplicates the entire dataframe for each new variable I create, and I then need to deselect the duplicated columns. That isn't an issue here, but it could be when I need to do this with (1) more variables and/or (2) larger dataframes.
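
To illustrate the duplication, here is roughly what the intermediate pmap_dfc() output looks like before the select step (column counts assume the stock diamonds data from ggplot2, which has 10 columns):

library(tidyverse)

# Each mutate() call returns a full copy of diamonds (10 columns) plus one
# new column, and pmap_dfc() column-binds all three copies, so the
# intermediate result already has 3 * 11 = 33 columns before the select().
intermediate <- list(
  lst("depth", "table", "price"),
  lst("x", "y", "z")
) %>%
  pmap_dfc(~diamonds %>% mutate(var = !!sym(.x)/!!sym(.y)))

ncol(diamonds)      # 10
ncol(intermediate)  # 33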

Any advice on how to create only the new columns and bind them to the diamonds dataframe (i.e., avoid the select() step in my code)?

EDIT

The desired result is what's currently in the results object above (and pasted below); the process of getting there in my code just feels wrong to me.

> results
# A tibble: 53,940 x 13
   carat cut       color clarity depth table price     x     y     z  new1  new2  new3
   <dbl> <ord>     <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 0.23  Ideal     E     SI2      61.5    55   326  3.95  3.98  2.43  15.6  13.8  134.
 2 0.21  Premium   E     SI1      59.8    61   326  3.89  3.84  2.31  15.4  15.9  141.
 3 0.23  Good      E     VS1      56.9    65   327  4.05  4.07  2.31  14.0  16.0  142.
 4 0.290 Premium   I     VS2      62.4    58   334  4.2   4.23  2.63  14.9  13.7  127.
 5 0.31  Good      J     SI2      63.3    58   335  4.34  4.35  2.75  14.6  13.3  122.
 6 0.24  Very Good J     VVS2     62.8    57   336  3.94  3.96  2.48  15.9  14.4  135.
 7 0.24  Very Good I     VVS1     62.3    57   336  3.95  3.98  2.47  15.8  14.3  136.
 8 0.26  Very Good H     SI1      61.9    55   337  4.07  4.11  2.53  15.2  13.4  133.
 9 0.22  Fair      E     VS2      65.1    61   337  3.87  3.78  2.49  16.8  16.1  135.
10 0.23  Very Good H     VS1      59.4    61   338  4     4.05  2.39  14.8  15.1  141.
# ... with 53,930 more rows

Just use transmute() and then bind the new columns to the original data frame:

library(tidyverse)

results <- list(
  lst("depth", "table", "price"),
  lst("x", "y", "z")
) %>%
  pmap_dfc(~diamonds %>% transmute(var = !!sym(.x)/!!sym(.y))) %>%
  bind_cols(diamonds, .)
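
If you want the new1/new2/new3 names from the question, one option (just a sketch; set_names() comes from purrr) is to rename the ratio columns before binding, since pmap_dfc() otherwise repairs the duplicated var names automatically (e.g. to var...1, var...2, var...3 in recent dplyr):

library(tidyverse)

results <- list(
  lst("depth", "table", "price"),
  lst("x", "y", "z")
) %>%
  pmap_dfc(~diamonds %>% transmute(var = !!sym(.x)/!!sym(.y))) %>%
  set_names(c("new1", "new2", "new3")) %>%   # overwrite the repaired var* names
  bind_cols(diamonds, .)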

You can generate those three new columns separately. Since the row order is the same, you can then use bind_cols() to attach them to the original data frame.

I just wanted to avoid intermediate variables, so I wrote everything in a single pipeline.

diamonds %>%
    bind_cols(
        list(
            lst("depth", "table", "price"),
            lst("x", "y", "z")
        ) %>%
            pmap_dfc(~diamonds[[.x]]/diamonds[[.y]]) %>%
            {
                # give the three auto-named ratio columns readable names
                colnames(.) <- c("var1","var2","var3")
                return(.)
            }
    )
# A tibble: 53,940 x 13
   carat cut       color clarity depth table price     x     y     z  var1  var2  var3
   <dbl> <ord>     <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 0.23  Ideal     E     SI2      61.5    55   326  3.95  3.98  2.43  15.6  13.8  134.
 2 0.21  Premium   E     SI1      59.8    61   326  3.89  3.84  2.31  15.4  15.9  141.
 3 0.23  Good      E     VS1      56.9    65   327  4.05  4.07  2.31  14.0  16.0  142.
 4 0.290 Premium   I     VS2      62.4    58   334  4.2   4.23  2.63  14.9  13.7  127.
 5 0.31  Good      J     SI2      63.3    58   335  4.34  4.35  2.75  14.6  13.3  122.
 6 0.24  Very Good J     VVS2     62.8    57   336  3.94  3.96  2.48  15.9  14.4  135.
 7 0.24  Very Good I     VVS1     62.3    57   336  3.95  3.98  2.47  15.8  14.3  136.
 8 0.26  Very Good H     SI1      61.9    55   337  4.07  4.11  2.53  15.2  13.4  133.
 9 0.22  Fair      E     VS2      65.1    61   337  3.87  3.78  2.49  16.8  16.1  135.
10 0.23  Very Good H     VS1      59.4    61   338  4     4.05  2.39  14.8  15.1  141.
# ... with 53,930 more rows
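
For comparison, here is the same idea written with an intermediate variable instead of one pipeline (a sketch; the result should be identical):

library(tidyverse)

# Build just the three ratio columns...
new_cols <- list(
  lst("depth", "table", "price"),
  lst("x", "y", "z")
) %>%
  pmap_dfc(~diamonds[[.x]]/diamonds[[.y]]) %>%
  set_names(c("var1", "var2", "var3"))

# ...then attach them to the original data frame.
bind_cols(diamonds, new_cols)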

One solution is to compose unevaluated expressions that capture the desired computation, then pass those expressions directly to mutate:

# Inputs
a <- c("depth", "table", "price")
b <- c("x", "y", "z")

# Compose unevaluated expressions
e <- map2( a, b, ~expr(!!sym(.x)/!!sym(.y)) )

# Pass them to mutate using unquote-splice
R <- diamonds %>% mutate( !!!set_names(e, c("new1","new2","new3")) )

# Compare to desired output
all_equal( R, results )     # TRUE
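
If the number of column pairs varies, the names can be generated rather than hard-coded (a small sketch reusing e from above):

e[[1]]   # depth/x -- a captured, unevaluated expression

# new1, new2, ...: one name per expression, however many there are
R2 <- diamonds %>% mutate( !!!set_names(e, paste0("new", seq_along(e))) )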
