Using apply() on pandas dataframe, with other dataframe columns as inputs

Related searches

I have an dataframe of football results, and am attempting to make a new column at the end of the dataframe which shows which team won. I'm attempting to do this using df.apply. Here is what i have so far:

def match_winner(winner,home_team,away_team,home_goals,away_goals):

    if home_goals>away_goals:
        winner = home_team
    elif home_goals<away_goals:
        winner = away_team
    else:
        winner = "None"

match['Winning Team'] =""
match['Winning Team'].apply(match_winner,args=[match['Home Team'],match['Away Team'],match['home_team_goal'],match['away_team_goal']])

And here is the structure of the match dataframe:

<bound method DataFrame.info of           id  country_id  league_id     season  stage                 date  \
145      146           1          1  2008/2009     24  2009-02-27 00:00:00   
153      154           1          1  2008/2009     25  2009-03-08 00:00:00   
155      156           1          1  2008/2009     25  2009-03-07 00:00:00   
162      163           1          1  2008/2009     26  2009-03-13 00:00:00   
168      169           1          1  2008/2009     26  2009-03-14 00:00:00   
...      ...         ...        ...        ...    ...                  ...   
25972  25973       24558      24558  2015/2016      8  2015-09-13 00:00:00   
25974  25975       24558      24558  2015/2016      9  2015-09-22 00:00:00   
25975  25976       24558      24558  2015/2016      9  2015-09-23 00:00:00   
25976  25977       24558      24558  2015/2016      9  2015-09-23 00:00:00   
25978  25979       24558      24558  2015/2016      9  2015-09-23 00:00:00   

       match_api_id  home_team_api_id  away_team_api_id  home_team_goal  \
145          493017              8203              9987               2   
153          493025              9984              8342               1   
155          493027              8635             10000               2   
162          493034              8203              8635               2   
168          493040             10000              9999               0   
...             ...               ...               ...             ...   
25972       1992089             10243             10191               3   
25974       1992091             10190             10191               1   
25975       1992092              9824             10199               1   
25976       1992093              9956             10179               2   
25978       1992095             10192              9931               4   

       away_team_goal  goal shoton shotoff foulcommit  card cross corner  \
145                 1  None   None    None       None  None  None   None   
153                 3  None   None    None       None  None  None   None   
155                 0  None   None    None       None  None  None   None   
162                 1  None   None    None       None  None  None   None   
168                 0  None   None    None       None  None  None   None   
...               ...   ...    ...     ...        ...   ...   ...    ...   
25972               3  None   None    None       None  None  None   None   
25974               0  None   None    None       None  None  None   None   
25975               2  None   None    None       None  None  None   None   
25976               0  None   None    None       None  None  None   None   
25978               3  None   None    None       None  None  None   None   

      possession   BSA                Home Team         Away Team  \
145         None  2.25              KV Mechelen          KRC Genk   
153         None  2.38        KSV Cercle Brugge    Club Brugge KV   
155         None  7.00           RSC Anderlecht  SV Zulte-Waregem   
162         None  1.75              KV Mechelen    RSC Anderlecht   
168         None  4.33         SV Zulte-Waregem     KSV Roeselare   
...          ...   ...                      ...               ...   
25972       None   NaN                FC Zürich           FC Thun   
25974       None   NaN            FC St. Gallen           FC Thun   
25975       None   NaN                 FC Vaduz         FC Luzern   
25976       None   NaN  Grasshopper Club Zürich           FC Sion   
25978       None   NaN           BSC Young Boys          FC Basel   

                         League      Country  home_player_1  home_player_2  \
145      Belgium Jupiler League      Belgium          False          False   
153      Belgium Jupiler League      Belgium          False          False   
155      Belgium Jupiler League      Belgium          False          False   
162      Belgium Jupiler League      Belgium          False          False   
168      Belgium Jupiler League      Belgium          False          False   
...                         ...          ...            ...            ...   
25972  Switzerland Super League  Switzerland          False          False   
25974  Switzerland Super League  Switzerland          False          False   
25975  Switzerland Super League  Switzerland          False          False   
25976  Switzerland Super League  Switzerland          False          False   
25978  Switzerland Super League  Switzerland          False          False   

       home_player_3  home_player_4  home_player_5  home_player_6  \
145            False          False          False          False   
153            False          False          False          False   
155            False          False          False          False   
162            False          False          False          False   
168            False          False          False          False   
...              ...            ...            ...            ...   
25972          False          False          False          False   
25974          False          False          False          False   
25975          False          False          False          False   
25976          False          False          False          False   
25978          False          False          False          False   

       home_player_7  home_player_8  home_player_9  home_player_10  \
145            False          False          False           False   
153            False          False          False           False   
155            False          False          False           False   
162            False          False          False           False   
168            False          False          False           False   
...              ...            ...            ...             ...   
25972          False          False          False           False   
25974          False          False          False           False   
25975          False          False          False           False   
25976          False          False          False           False   
25978          False          False          False           False   

       home_player_11  away_player_1  away_player_2  away_player_3  \
145             False          False          False          False   
153             False          False          False          False   
155             False          False          False          False   
162             False          False          False          False   
168             False          False          False          False   
...               ...            ...            ...            ...   
25972           False          False          False          False   
25974           False          False          False          False   
25975           False          False          False          False   
25976           False          False          False          False   
25978           False          False          False          False   

       away_player_4  away_player_5  away_player_6  away_player_7  \
145            False          False          False          False   
153            False          False          False          False   
155            False          False          False          False   
162            False          False          False          False   
168            False          False          False          False   
...              ...            ...            ...            ...   
25972          False          False          False          False   
25974          False          False          False          False   
25975          False          False          False          False   
25976          False          False          False          False   
25978          False          False          False          False   

       away_player_8  away_player_9  away_player_10  away_player_11  \
145            False          False           False           False   
153            False          False           False           False   
155            False          False           False           False   
162            False          False           False           False   
168            False          False           False           False   
...              ...            ...             ...             ...   
25972          False          False           False           False   
25974          False          False           False           False   
25975          False          False           False           False   
25976          False          False           False           False   
25978          False          False           False           False   

      Winning Team  
145                 
153                 
155                 
162                 
168                 
...            ...  
25972               
25974               
25975               
25976               
25978               

[21374 rows x 47 columns]>

I need to send multiple columns as input arguements in order to find who's won but i'm not sure if .apply() allows you to pass in a series as an argument? Is it possible to do this using .apply(). If so a solution would be helpful, but if anyone knows a better alternative that would also be useful.

Use numpy.select instead apply, which should be not good idea here (loops under the hood):

match = pd.DataFrame({
        'Home Team':list('abcdef'),
        'Away Team':list('ghijkl'),
         'home_team_goal':[14,5,4,5,5,4],
         'away_team_goal':[7,8,2,4,8,4],
})

m1 = match.home_team_goal>match.away_team_goal
m2 = match.home_team_goal<match.away_team_goal
match['winner'] = np.select([m1, m2], [match['Home Team'],  match['Away Team']], default=None)
print (match)
  Home Team Away Team  home_team_goal  away_team_goal winner
0         a         g              14               7      a
1         b         h               5               8      h
2         c         i               4               2      c
3         d         j               5               4      d
4         e         k               5               8      k
5         f         l               4               4   None

How to apply a function to two columns in a Pandas DataFrame in , DataFrame.apply() to apply a function to two columns in a DataFrame DataFrame to apply a single-input func to each element in each column. print(df) Use any two-input function to obtain a new column based on two other columns. In this article, we will learn different ways to apply a function to single or selected columns or rows in Dataframe. We will use Dataframe/series.apply() method to apply a function. Syntax: Dataframe/series.apply(func, convert_dtype=True, args=())

Avoid DataFrame.apply (which is usually run as a hidden loop) and instead consider nested conditional logic with numpy.where on columns:

match['Winning Team'] = np.where(match['home_team_goal'] > match['away_team_goal'],
                                 match['Home Team'],
                                 np.where(match['home_team_goal'] < match['away_team_goal'],
                                          match['Away Team'],
                                          np.nan
                                         )
                                )

pandas.DataFrame.apply — pandas 1.1.1 documentation, DataFrame. apply (func, axis=0, raw=False, result_type=None, args=(), **kwds)[ source]�. Apply a False : passes each row or column as a Series to the function . True : the Additional keyword arguments to pass as keywords arguments to func . Using a numpy universal function (in this case the same as np.sqrt(df) ):. Python’s Pandas Library provides an member function in Dataframe class to apply a function along the axis of the Dataframe i.e. along each row or column i.e. DataFrame.apply(func, axis=0, broadcast=None, raw=False, reduce=None, result_type=None, args=(), **kwds)

Here's another way:

Creating dataframe with the teams:

results_di = {
    "Home Team": [
        "KV Mechelen",
        "KSV Cercle Brugge",
        "RSC Anderlecht",
        "KV Mechelen",
        "SV Zulte-Waregem",
        "FC Zürich",
        "FC St. Gallen",
        "FC Vaduz",
        "Grasshopper Club Zürich",
        "BSC Young Boys",
    ],
    "Away Team": [
        "KRC Genk",
        "Club Brugge KV",
        "SV Zulte-Waregem",
        "RSC Anderlecht",
        "KSV Roeselare",
        "FC Thun",
        "FC Thun",
        "FC Luzern",
        "FC Sion",
        "FC Basel",
    ],
    "home_team_goal": [2, 1, 2, 2, 0, 3, 1, 1, 2, 4],
    "away_team_goal": [1, 3, 0, 1, 0, 3, 0, 2, 0, 3],
}

df = pd.DataFrame(results_di)
df['winner'] = 'none'
home_win = df['home_team_goal'] - df["away_team_goal"] > 0
away_win = df["away_team_goal"] - df['home_team_goal'] > 0

df.loc[home_win, "winner"] = df.loc[home_win, 'Home Team']
df.loc[away_win, "winner"] = df.loc[away_win, 'Away Team']

print(df)

                 Home Team         Away Team  home_team_goal  away_team_goal  \
0              KV Mechelen          KRC Genk               2               1   
1        KSV Cercle Brugge    Club Brugge KV               1               3   
2           RSC Anderlecht  SV Zulte-Waregem               2               0   
3              KV Mechelen    RSC Anderlecht               2               1   
4         SV Zulte-Waregem     KSV Roeselare               0               0   
5                FC Zürich           FC Thun               3               3   
6            FC St. Gallen           FC Thun               1               0   
7                 FC Vaduz         FC Luzern               1               2   
8  Grasshopper Club Zürich           FC Sion               2               0   
9           BSC Young Boys          FC Basel               4               3   

                    winner  
0              KV Mechelen  
1           Club Brugge KV  
2           RSC Anderlecht  
3              KV Mechelen  
4                     none  
5                     none  
6            FC St. Gallen  
7                FC Luzern  
8  Grasshopper Club Zürich  
9           BSC Young Boys  

pandas.DataFrame.combine — pandas 0.24.2 documentation, Perform column-wise combine with another DataFrame based on a passed function. Combines a DataFrame with other DataFrame using func to element- wise Function that takes two series as inputs and return a Series or a scalar. s2: s1 if s1.sum() < s2.sum() else s2 >>> df1.combine(df2, take_smaller) A B 0 0 3 1 0 3. Apply a function to each row or column in Dataframe using pandas.apply() Apply a function to single or selected columns or rows in Pandas Dataframe; Ways to apply an if condition in Pandas DataFrame; Apply uppercase to a column in Pandas dataframe; Ways to apply an if condition in Pandas DataFrame; Highlight Pandas DataFrame's specific columns

pandas.DataFrame.apply — pandas 0.24.2 documentation, Input/Output � General functions � Series � DataFrame � Constructor False : passes each row or column as a Series to the function. If the DataFrame is empty, apply will use reduce to determine whether the result should be a Series or a DataFrame. Additional keyword arguments to pass as keywords arguments to func . Here’s an example using apply on the dataframe, which I am calling with axis = 1.. Note the difference is that instead of trying to pass two values to the function f, rewrite the function to accept a pandas Series object, and then index the Series to get the values needed.

Apply a function to every row in a pandas dataframe, You can use .apply to send a single column to a function. This is useful when cleaning up data - converting formats, altering values etc. # What's our data look like? Apply a lambda function to all the columns in dataframe using Dataframe.apply () and inside this lambda function check if column name is ‘z’ then square all the values in it i.e. # Apply function numpy.square () to square the value one column only i.e. with column name 'z' modDfObj = dfObj.apply(lambda x: np.square(x) if x.name == 'z' else x)

pandas.DataFrame.apply¶ DataFrame.apply (func, axis = 0, raw = False, result_type = None, args = (), ** kwds) [source] ¶ Apply a function along an axis of the DataFrame. Objects passed to the function are Series objects whose index is either the DataFrame’s index (axis=0) or the DataFrame’s columns (axis=1).

Comments
  • Check your answer, i think you meant match["Away Team"] in the second filter.
  • @jezrael ... all roads lead to Rome -right? I am used to the set-based, nested logic, similar and transferable to other langauges (i.e., R's ifelse, SQL's case). np.select is interesting!
  • I still rememeber first meet with this function from unutbu and it was really nice knowledge ;) I also used double, triple np.where, but a bit worse readable ;)