Repeat the cell value in dataframe by condition
In the dataframe d1 below, A1 corresponds to B1, A2 corresponds to B2, and so on. I want to change the B1-B3 values according to the matching A value: for B or C, repeat the value 2 times; for D, repeat it 3 times, as in the dataframe target.
    from pandas import DataFrame

    d1 = DataFrame([
        {'A1': 'A', 'A2': 'A', 'A3': '', 'B1': '2', 'B2': '2', 'B3': ''},
        {'A1': 'A', 'A2': 'C', 'A3': '', 'B1': '2', 'B2': '2', 'B3': ''},
        {'A1': 'A', 'A2': 'B', 'A3': 'C', 'B1': '2', 'B2': '4', 'B3': '4'},
        {'A1': 'A', 'A2': 'C', 'A3': 'D', 'B1': '2', 'B2': '2', 'B3': '4'},
    ])
    d1
      A1 A2 A3 B1 B2 B3
    0  A  A     2  2
    1  A  C     2  2
    2  A  B  C  2  4  4
    3  A  C  D  2  2  4
    target = DataFrame([
        {'A1': 'A', 'A2': 'A', 'A3': '', 'B1': '2', 'B2': '2', 'B3': ''},
        {'A1': 'A', 'A2': 'C', 'A3': '', 'B1': '2', 'B2': '22', 'B3': ''},
        {'A1': 'A', 'A2': 'B', 'A3': 'C', 'B1': '2', 'B2': '44', 'B3': '44'},
        {'A1': 'A', 'A2': 'C', 'A3': 'D', 'B1': '2', 'B2': '22', 'B3': '444'},
    ])
    target
      A1 A2 A3 B1  B2   B3
    0  A  A     2   2
    1  A  C     2  22
    2  A  B  C  2  44   44
    3  A  C  D  2  22  444
I tried using np.where for the B and C condition, but it seems to apply only to B when copying the value. Is there a way to achieve this?
    import numpy as np

    Acol = ['A1', 'A2', 'A3']
    Bcol = ['B1', 'B2', 'B3']
    d1[Bcol] = np.where(d1[Acol] == ('B' or 'C'), d1[Bcol] + d1[Bcol], d1[Bcol])
    d1
      A1 A2 A3 B1  B2 B3
    0  A  A     2   2
    1  A  C     2   2
    2  A  B  C  2  44  4
    3  A  C  D  2   2  4
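A likely reason only B is matched: Python evaluates `'B' or 'C'` before the comparison, and `or` returns its first truthy operand, so the condition is effectively `d1[Acol] == 'B'`. The plain-Python sketch below illustrates this, together with the string repetition the target relies on. The variable names are illustrative only.

```python
# `or` returns its first truthy operand, so the parenthesised expression is just 'B'.
needle = ('B' or 'C')
print(needle)  # B

# Testing membership in a collection checks both letters instead.
matches = [letter in ('B', 'C') for letter in ['A', 'B', 'C', 'D']]
print(matches)  # [False, True, True, False]

# Adding or multiplying strings repeats the text rather than doing arithmetic.
doubled = '4' + '4'
tripled = '4' * 3
print(doubled, tripled)  # 44 444
```

In pandas, the equivalent of the membership test is `isin(['B', 'C'])` or two comparisons joined with `|`.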
I'd suggest storing the multiplier conditions for A, B, … in a dictionary and applying it like this:
    multiplier_map = {'': 1, 'A': 1, 'B': 2, 'C': 2, 'D': 3}
    for i in [1, 2, 3]:
        # Multiplying a string by n repeats it n times
        d1['B{0}'.format(i)] = d1['B{0}'.format(i)] * d1['A{0}'.format(i)].map(multiplier_map)
Note that multiplier_map also needs to contain the empty string as a key, because map returns NaN for any value missing from the dictionary.
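A self-contained sketch of the dictionary approach, run against the question's d1. It assumes the B columns hold strings, so that multiplying by an integer repeats the text. The `zip` over column names and the `astype(object)` call are illustrative additions, not part of the original answer.

```python
import pandas as pd

d1 = pd.DataFrame([
    {'A1': 'A', 'A2': 'A', 'A3': '', 'B1': '2', 'B2': '2', 'B3': ''},
    {'A1': 'A', 'A2': 'C', 'A3': '', 'B1': '2', 'B2': '2', 'B3': ''},
    {'A1': 'A', 'A2': 'B', 'A3': 'C', 'B1': '2', 'B2': '4', 'B3': '4'},
    {'A1': 'A', 'A2': 'C', 'A3': 'D', 'B1': '2', 'B2': '2', 'B3': '4'},
])

# How many times to repeat the B value for each letter; '' covers blank cells.
multiplier_map = {'': 1, 'A': 1, 'B': 2, 'C': 2, 'D': 3}

for a_col, b_col in zip(['A1', 'A2', 'A3'], ['B1', 'B2', 'B3']):
    repeats = d1[a_col].map(multiplier_map)
    # astype(object) keeps plain Python strings, so * means repetition
    # ('4' * 3 == '444') whatever string dtype pandas chose for the column.
    d1[b_col] = d1[b_col].astype(object) * repeats

print(d1)
```

Keeping the rule in one dictionary means a new letter only needs a new key, with no change to the loop.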
Using np.select
    import numpy as np

    for col in ('A1', 'A2', 'A3'):
        new_col = 'B' + col[-1]  # A1 -> B1, A2 -> B2, A3 -> B3
        mask1 = d1[col] == 'A'
        mask2 = (d1[col] == 'B') | (d1[col] == 'C')
        mask3 = d1[col] == 'D'
        d1[new_col] = d1[new_col].astype('str')
        d1[new_col] = np.select(
            [mask1, mask2, mask3],
            [d1[new_col], d1[new_col] * 2, d1[new_col] * 3],
            d1[new_col],  # default: leave the value unchanged
        )
Output:
      A1 A2 A3 B1  B2   B3
    0  A  A     2   2
    1  A  C     2  22
    2  A  B  C  2  44   44
    3  A  C  D  2  22  444
Maybe these four lines:
    d1.loc[d1['A2'].eq('B') | d1['A2'].eq('C'), 'B2'] += d1.loc[d1['A2'].eq('B') | d1['A2'].eq('C'), 'B2']
    d1.loc[d1['A2'].eq('D'), 'B2'] += d1.loc[d1['A2'].eq('D'), 'B2'] + d1.loc[d1['A2'].eq('D'), 'B2']
    d1.loc[d1['A3'].eq('B') | d1['A3'].eq('C'), 'B3'] += d1.loc[d1['A3'].eq('B') | d1['A3'].eq('C'), 'B3']
    d1.loc[d1['A3'].eq('D'), 'B3'] += d1.loc[d1['A3'].eq('D'), 'B3'] + d1.loc[d1['A3'].eq('D'), 'B3']
And now:
    print(d1)
Is:
      A1 A2 A3 B1  B2   B3
    0  A  A     2   2
    1  A  C     2  22
    2  A  B  C  2  44   44
    3  A  C  D  2  22  444
Try below:
    import numpy as np

    # B or C: repeat twice; D: repeat three times; otherwise keep the value
    d1['B1'] = np.where(d1['A1'].isin(['B', 'C']), d1['B1'] * 2,
                        np.where(d1['A1'].isin(['D']), d1['B1'] * 3, d1['B1']))
    d1['B2'] = np.where(d1['A2'].isin(['B', 'C']), d1['B2'] * 2,
                        np.where(d1['A2'].isin(['D']), d1['B2'] * 3, d1['B2']))
    d1['B3'] = np.where(d1['A3'].isin(['B', 'C']), d1['B3'] * 2,
                        np.where(d1['A3'].isin(['D']), d1['B3'] * 3, d1['B3']))
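Since the three statements differ only in the column suffix, they can also be generated in a loop, which keeps each A column paired with its own B column. This is a minimal sketch under the same assumption that the B columns hold strings. The `zip` pairing and the `values` helper name are illustrative, not from the original answer.

```python
import numpy as np
import pandas as pd

d1 = pd.DataFrame([
    {'A1': 'A', 'A2': 'A', 'A3': '', 'B1': '2', 'B2': '2', 'B3': ''},
    {'A1': 'A', 'A2': 'C', 'A3': '', 'B1': '2', 'B2': '2', 'B3': ''},
    {'A1': 'A', 'A2': 'B', 'A3': 'C', 'B1': '2', 'B2': '4', 'B3': '4'},
    {'A1': 'A', 'A2': 'C', 'A3': 'D', 'B1': '2', 'B2': '2', 'B3': '4'},
])

for a_col, b_col in zip(['A1', 'A2', 'A3'], ['B1', 'B2', 'B3']):
    # Plain Python strings, so * repeats the text instead of doing arithmetic.
    values = d1[b_col].astype(object)
    d1[b_col] = np.where(
        d1[a_col].isin(['B', 'C']), values * 2,                   # repeat twice
        np.where(d1[a_col].isin(['D']), values * 3, values),      # thrice, or keep
    )

print(d1)
```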