## Assign a group to values

I have a pandas DataFrame with 20 columns and 16404 rows. One of the columns is `['age']`. I would like to plot other metrics, such as `['Income']`, over an age category. For example: what is the income for all males under 20 years old, or for females aged between 20 and 40?

I tried this type of condition:

    for i in range(len(df['age'])):
        if df['age'][i]<25 and df['Gender'][i]==1:
            df['group'][i]=1

But I get the following error: `The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()`

Could you please show me how to assign a group to a row depending on these conditions?

All the series are int64

Best

- The "ambiguous" error can be solved with `(df['age'] < 25) & (df['Gender'] == 1)`. Note that I used `&` instead of `and`.
- If you did that inside your loop, you would be evaluating an entire column and assigning an entire column for every row, which is very wasteful.
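To see the difference concretely, here is a minimal sketch on a made-up three-row frame. The values are invented for illustration; only the column names `age` and `Gender` come from the question. `and` asks each Series for a single truth value and raises, while `&` compares element by element:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"age": [18, 30, 22], "Gender": [1, 1, 0]})

try:
    # `and` needs one boolean per operand, but a Series holds many values.
    mask = (df["age"] < 25) and (df["Gender"] == 1)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# `&` combines the two comparisons row by row.
mask = (df["age"] < 25) & (df["Gender"] == 1)
print(mask.tolist())
```

The parentheses around each comparison matter, because `&` binds more tightly than `<` and `==`.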

Do this to get booleans:

    df['group'] = df['age'].lt(25) & df['Gender'].eq(1)

You can convert that to integers `0` and `1` in many ways, for example:

    df['group'] = df['group'].astype(int)
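Since the goal in the question is to look at `Income` per group, here is a rough sketch of how the flag could feed a `groupby`. The sample rows are invented, and `income_by_group` is a name chosen for this example:

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "age": [18, 30, 22, 45],
    "Gender": [1, 1, 0, 1],
    "Income": [1000, 3000, 2000, 5000],
})

# Vectorized flag: 1 where age < 25 and Gender == 1, otherwise 0.
df["group"] = (df["age"].lt(25) & df["Gender"].eq(1)).astype(int)

# Average income per group; the result is a Series indexed by group.
income_by_group = df.groupby("group")["Income"].mean()
print(income_by_group)
```

The resulting Series can be plotted directly, for instance with `income_by_group.plot.bar()` if matplotlib is installed.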

You should use the `apply` method instead (see doc):

    def your_function(row):
        if row['age'] < 25 and row['Gender'] == 1:
            return 1
        else:
            return 0

    df['group'] = df.apply(your_function, axis=1)

    import numpy as np

    cond_1 = df['age'] < 25
    cond_2 = df['Gender'] == 1
    df['group'] = np.where(cond_1 & cond_2, 1, 0)

It will assign `1` where both conditions are satisfied and `0` everywhere else.

Taking into account your comments, this method doesn't have to be binary. You can include as many conditions as you need, and you can replace the `1` with any int or str you want. You can also change the `0` to `np.nan`.
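For more than two groups, nesting `np.where` calls works but gets hard to read. One alternative (a different NumPy function, not the `np.where` call above) is `np.select`, which takes a list of conditions and a matching list of labels. The sketch below assumes `Gender == 1` means male and `Gender == 0` means female, which the question does not state; the sample rows and label names are made up:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({"age": [18, 30, 22, 45], "Gender": [1, 0, 0, 1]})

# Assumption: Gender == 1 is male, Gender == 0 is female.
conditions = [
    (df["age"] < 20) & (df["Gender"] == 1),
    df["age"].between(20, 40) & (df["Gender"] == 0),  # bounds are inclusive
]
labels = ["male_under_20", "female_20_to_40"]

# Rows matching no condition get the default label.
df["group"] = np.select(conditions, labels, default="other")
print(df)
```

Conditions are checked in order, so if a row matches several of them, the first match wins.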

