How to replace column values based on the previous value and select rows from a DataFrame by condition

I have a DataFrame with two columns, X1 and X2.

First thing: X2 contains the values 0 and 1. Whenever X2 changes from 1 to 0, the next 20 rows should be set to 1 instead of 0.

For example:

desired X2=(0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)

Second thing: if X1 = 0 and X2 = 1, select rows from the DataFrame for as long as X2 remains 1. I tried the following code, but it selects only one row:

df1 = df[(df['X1'] == 0) & (df['X2'] == 1)]

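Edited to include both parts:

import numpy as np

# First thing: turn 0 into NaN, forward-fill each 1 over at most 20 rows,
# then set the remaining NaN back to 0. The result has a float dtype.
df['X2'] = df['X2'].replace({0: np.nan}).ffill(limit=20).fillna(0)

# Second thing: mark the rows where a selection starts (1) and where it stops (0),
# then forward-fill the marker over the rows in between.
df.loc[(df['X1'] == 0) & (df['X2'] == 1), 'new X2'] = 1
df.loc[(df['X2'] == 0), 'new X2'] = 0
df['new X2'] = df['new X2'].ffill()
df.loc[df['new X2'] == 1]  # selected rows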
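
To see what the forward-fill approach does, here is a minimal sketch on made-up data. The sample values, the helper name extend_ones, and the limit of 3 (instead of 20, to keep the output short) are assumptions for illustration only. The trailing astype(int) restores the integer dtype that the NaN round trip turns into float.

```python
import numpy as np
import pandas as pd

# Hypothetical sample column; the original post does not include its input data.
x2 = pd.Series([0, 1, 1, 0, 0, 0, 0, 0, 0, 0])

def extend_ones(s, n):
    """Keep every 1 and extend it over up to n following rows that were 0."""
    # 0 -> NaN, forward-fill at most n rows, remaining NaN -> 0, back to int
    return s.replace({0: np.nan}).ffill(limit=n).fillna(0).astype(int)

extended = extend_ones(x2, n=3)
print(extended.tolist())  # [0, 1, 1, 1, 1, 1, 0, 0, 0, 0]
```

With n=20 this is the same expression as in the answer above, plus the cast back to int.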

Your DataFrame is not big, so you could simply use a loop to solve your problem:

# First program: assumes the default integer index (0, 1, 2, ...)
original = df['X2'].to_numpy().copy()   # test the original values, not the ones filled below
for index in range(len(original) - 1):
    if original[index] == 1 and original[index + 1] == 0:
        df.loc[index + 1: index + 20, 'X2'] = 1   # set the next 20 rows to 1


# Second program: assumes columns X1 and X2 and the default integer index
df['select'] = False
for index, row in df.iterrows():
    # keep selecting while the previous row was selected and X2 is still 1
    if index > 0 and df.loc[index - 1, 'select'] and row.X2 == 1:
        df.loc[index, 'select'] = True
    # start selecting when X1 == 0 and X2 == 1
    if row.X1 == 0 and row.X2 == 1:
        df.loc[index, 'select'] = True

df = df[df['select']].drop(columns='select')
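
For larger frames, the second program's row-by-row selection can probably be expressed without a Python loop. The following is a sketch, not the answerer's method: it assumes X2 holds only 0 and 1, and the sample data and the names run_id, trigger, and selected are made up for illustration. Each 0 in X2 starts a new group, so a run of 1s shares a group with the 0 before it, and a cumulative maximum inside the group carries a trigger (X1 == 0 and X2 == 1) forward to the end of the run.

```python
import pandas as pd

# Hypothetical sample data; the original post does not include its input.
df = pd.DataFrame({
    "X1": [1, 1, 0, 1, 1, 1, 0, 0],
    "X2": [0, 1, 1, 1, 0, 1, 0, 1],
})

# Every 0 in X2 opens a new group; the 1s that follow it belong to that group.
run_id = (df["X2"] == 0).cumsum()

# Rows where a selection starts.
trigger = (df["X1"] == 0) & (df["X2"] == 1)

# Within each group, a row is selected once a trigger has been seen.
seen = trigger.astype(int).groupby(run_id).cummax().astype(bool)
selected = seen & (df["X2"] == 1)

print(df[selected].index.tolist())  # [2, 3, 7]
```

Row 5 has X2 == 1 but is not selected, because no row with X1 == 0 appears in its run before it.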


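Here is a solution to the "first thing" using NumPy.

import numpy as np

# Positions where X2 drops from 1 to 0 (assumes the default integer index)
locs = np.where(df['X2'].diff() == -1)[0]
for loc in locs:
    df.loc[loc:loc + 19, 'X2'] = 1  # .loc slices are inclusive: rows loc..loc+19 are 20 rows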
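
The same idea can be sketched without the loop over drop positions by using a rolling maximum: a row is set to 1 if a 1-to-0 drop occurred within the last n rows, counting the row itself. This is an alternative under assumptions, not part of the original answer: the helper name fill_after_drop and the sample values are hypothetical, and n is 3 instead of 20 to keep the output short.

```python
import pandas as pd

def fill_after_drop(s, n):
    """Set the n rows starting at each 1 -> 0 drop to 1."""
    # 1 at every position where the value fell from 1 to 0
    drops = s.diff().eq(-1).astype(int)
    # 1 if any drop occurred in the window of the last n rows (current row included)
    recent_drop = drops.rolling(n, min_periods=1).max().astype(int)
    # keep the original value unless a recent drop says the row must be 1
    return s.where(recent_drop == 0, 1)

# Hypothetical sample column with two separate drops
x2 = pd.Series([0, 1, 1, 0, 0, 0, 0, 0, 1, 0])
result = fill_after_drop(x2, n=3)
print(result.tolist())  # [0, 1, 1, 1, 1, 1, 0, 0, 1, 1]
```

Because the drops are detected on the original column before anything is written, the filled rows never trigger further fills.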

  • Please provide a sample DataFrame so that we can understand your question better.
  • These are two different questions and should probably be asked separately. Also please provide the code you have tried so far.
  • @ecortazar For the first problem, X2 is a column of the DataFrame.
  • @john sloper That is the code I tried for the second problem.
  • @Nickel The code for the Second thing is correct. Are you sure there are multiple rows that fulfill this in the dataframe?
  • That is very neat. It does change the dtype to float though. May or may not be a problem.
  • Adding astype(int) at the end couldn't hurt. It would just make the solution a bit uglier.
  • @ecortazar The first problem is solved, but the code for the second problem is not working: it returns all rows of the DataFrame, not only the selected rows.
  • @Nickel Sorry, I thought it was clear that you needed to do df[df['new X2']] to see only the selected rows. I've amended the answer.
  • If the DataFrame is not big, your code works well, but I think that as my DataFrame grows, these loops will take a long time to finish.
  • @Frenchy Sorry, I forgot to mention the data size: I have more than 10000 rows.
  • You are right, a loop is not a good choice once you have more than 5000 rows, and in that case there are plenty of other solutions. But as you hadn't indicated the number of rows in your DataFrame, I chose a solution that is easy to understand. For the first program, loop or no loop is not the issue, because only the first test is triggered, so you wouldn't gain much time. Only the second program could be a problem.
  • For the second thing you describe just one step. Following your explanation, there is only one step; you don't specify whether the process goes on once the first is done (first: next 20 rows = 1).