Generate dataframe from pandas groupby object with apply function returning multiple values

Sample Input:

import numpy as np
import pandas as pd

df = pd.DataFrame(data=[[0, 1, 2, 3], [0, 1, 3, 4], [0, 2, 5, 6], [0, 2, 7, 8]], columns=['id1', 'id2', 'var1', 'var2'])

function f:

def f(var1, var2):
    return [np.sum(var1)*10, np.sum(var2)*10]

output needed:

The method I have used to generate this is:

result_df = pd.DataFrame(df.groupby(['id1', 'id2'])[['var1', 'var2']].apply(lambda x: f(x['var1'], x['var2'])))
pd.DataFrame(result_df[0].tolist(), columns=['result_var1', 'result_var2'], index=result_df.index).reset_index()

Is there a better way to generate a DataFrame by applying a function that returns multiple values to a pandas groupby object?


Use agg:

result = df.groupby(['id1', 'id2'], as_index=False).agg(lambda x: x.sum() * 10)
print(result)

Output

   id1  id2  var1  var2
0    0    1    50    70
1    0    2   120   140

A more general application of agg is the following:

def general_var1(x):
    return x.sum() * 10


def general_var2(x):
    return x.sum() * 5 + 2


result = df.groupby(['id1', 'id2'], as_index=False).agg({'var1': general_var1, 'var2': general_var2})

Output

   id1  id2  var1  var2
0    0    1    50    37
1    0    2   120    72

More examples can be found in the linked documentation.
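
If the output columns also need new names, named aggregation (available in pandas 0.25 and later) may be worth considering. The following is a minimal sketch, not part of the original answer; the helper name times_ten is hypothetical, and the column names result_var1 and result_var2 are taken from the question.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[0, 1, 2, 3], [0, 1, 3, 4], [0, 2, 5, 6], [0, 2, 7, 8]],
    columns=['id1', 'id2', 'var1', 'var2'],
)


def times_ten(x):
    # Hypothetical helper: sum one column within a group, then scale by 10.
    return x.sum() * 10


# Each keyword names an output column; the tuple is (input column, function).
result = df.groupby(['id1', 'id2'], as_index=False).agg(
    result_var1=('var1', times_ten),
    result_var2=('var2', times_ten),
)
print(result)
```

This keeps the renaming and the aggregation in a single call, so no separate rename or add_prefix step is needed.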



Try this:

(df.groupby(['id1', 'id2']).sum() * 10).reset_index()

Out[247]:
   id1  id2  var1  var2
0    0    1    50    70
1    0    2   120   140



If you need the output exactly as in the question (including the renamed columns), use the code below.

(df.groupby(['id1', 'id2'])[['var1', 'var2']].sum() * 10).add_prefix('result_').reset_index()

Output

   id1  id2  result_var1  result_var2
0    0    1           50           70
1    0    2          120          140
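
The answers above work because f treats each column independently. When the function genuinely needs several columns at once, one option is to have apply return a pandas Series, which pandas expands into one column per Series label. The sketch below reuses f from the question; the wrapper name f_as_series is hypothetical and is not part of any answer here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data=[[0, 1, 2, 3], [0, 1, 3, 4], [0, 2, 5, 6], [0, 2, 7, 8]],
    columns=['id1', 'id2', 'var1', 'var2'],
)


def f(var1, var2):
    # Function from the question: returns two values per group.
    return [np.sum(var1) * 10, np.sum(var2) * 10]


def f_as_series(group):
    # Hypothetical wrapper: label the two returned values so that
    # groupby.apply turns them into two named columns.
    values = f(group['var1'], group['var2'])
    return pd.Series(values, index=['result_var1', 'result_var2'])


result = (
    df.groupby(['id1', 'id2'])[['var1', 'var2']]
    .apply(f_as_series)
    .reset_index()
)
print(result)
```

Returning a Series avoids the intermediate column of lists and the tolist() step from the question, although apply is usually slower than agg or the built-in sum on large data.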
