pandas: How to compare value of column with next value

I have a dataframe which looks as follows:

  colA  colB
0    A    10
1    B    20
2    C     5
3    D     2
4    F    30

I would like to compare the values of the second column (colB) to detect two successive decrements. That is, I want to report the rows that are followed by two successive decrements in colB. For example, I want to report 'B' because colB decreases in each of the two rows following B (20, then 5, then 2). I am not sure how to approach this without writing a loop. (If there is no way to avoid a loop, I'd like to know.)


You can use loc for this:

desired=frame.loc[(frame["colB"]>=frame["colB"].shift(-1)) &
          (frame["colB"].shift(-1)>=frame["colB"].shift(-2) )]

The output will be:

   colA colB
1   B   20
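
As a self-contained sketch of the approach above, the example below rebuilds the question's frame and wraps the mask in a helper. The helper name `followed_by_two_decrements` and its `strict` parameter are illustrative assumptions, not part of the answer: `strict=True` swaps `>=` for `>` so that equal neighbouring values are not counted as decrements. The `ties` frame is made-up data that shows the difference.

```python
import pandas as pd


def followed_by_two_decrements(frame, strict=False):
    """Return colA for rows whose colB drops on each of the next two rows.

    Hypothetical helper: strict=False mirrors the >= comparison used above,
    strict=True requires a real decrease (>).
    """
    cur = frame["colB"]
    nxt = cur.shift(-1)    # colB one row below
    nxt2 = cur.shift(-2)   # colB two rows below
    if strict:
        mask = (cur > nxt) & (nxt > nxt2)
    else:
        mask = (cur >= nxt) & (nxt >= nxt2)
    # Comparisons against the NaN padding at the end evaluate to False
    return frame.loc[mask, "colA"]


frame = pd.DataFrame({"colA": ["A", "B", "C", "D", "F"],
                      "colB": [10, 20, 5, 2, 30]})
print(followed_by_two_decrements(frame).tolist())              # ['B']

# Made-up data with a tie: 7, 7, 3
ties = pd.DataFrame({"colA": ["P", "Q", "R"], "colB": [7, 7, 3]})
print(followed_by_two_decrements(ties).tolist())               # ['P']
print(followed_by_two_decrements(ties, strict=True).tolist())  # []
```

Whether a tie should count as a decrement depends on the data; the question does not say, so both variants are shown.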

If you only wish to report the value B:

desired=frame["colA"].loc[(frame["colB"]>=frame["colB"].shift(-1)) &
          (frame["colB"].shift(-1)>=frame["colB"].shift(-2) )]

The output will be:

1    B
Name: colA, dtype: object

Yes, you can do this without using a loop.

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'colA':['A', 'B', 'C', 'D', 'F'], 'colB':[10, 20, 5, 2, 30]})
>>> df['colC'] = df['colB'].diff(-1)
>>> df
  colA  colB  colC
0    A    10 -10.0
1    B    20  15.0
2    C     5   3.0
3    D     2 -28.0
4    F    30   NaN

'colC' is the difference between each row's colB and the next row's colB (diff(-1)), so a positive value means colB decreases on the next row.

>>> df['colD'] = np.where(df['colC'] > 0, 1, 0)
>>> df
  colA  colB  colC  colD
0    A    10 -10.0     0
1    B    20  15.0     1
2    C     5   3.0     1
3    D     2 -28.0     0
4    F    30   NaN     0

'colD' flags with 1 the rows where the difference is greater than 0, that is, where colB decreases on the next row.

>>> df['s'] = df['colD'].shift(-1)
>>> df
  colA  colB  colC  colD    s
0    A    10 -10.0     0  1.0
1    B    20  15.0     1  1.0
2    C     5   3.0     1  0.0
3    D     2 -28.0     0  0.0
4    F    30   NaN     0  NaN

Column 's' holds 'colD' shifted up by one row, so each row also sees the flag of the row below it.

>>> df['flag'] = np.where((df['colD'] == 1) & (df['colD'] == df['s']), 1, 0)
>>> df
  colA  colB  colC  colD    s  flag
0    A    10 -10.0     0  1.0     0
1    B    20  15.0     1  1.0     1
2    C     5   3.0     1  0.0     0
3    D     2 -28.0     0  0.0     0
4    F    30   NaN     0  NaN     0

'flag' is then the required column: it is 1 on the rows that are followed by two successive decrements.
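
The same steps can be condensed so that no helper columns are added to the frame. This is only a sketch, assuming a boolean mask is acceptable in place of the 0/1 flags; the names `drop` and `flag` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'colA': ['A', 'B', 'C', 'D', 'F'],
                   'colB': [10, 20, 5, 2, 30]})

# True where the next row's colB is smaller than this row's colB
drop = df['colB'].diff(-1) > 0

# A row qualifies when it and the row below it both start a decrement;
# fill_value=False keeps the shifted mask boolean at the end of the frame
flag = drop & drop.shift(-1, fill_value=False)

print(df.loc[flag, 'colA'].tolist())  # ['B']
```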

This needs a little bit of logic:

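s = df.colB.diff().gt(0)  # True where colB rose from the previous row, i.e. where a new run starts
# Rows sharing a cumsum value form one run: a rise followed by its decrements.
# A run of at least 3 rows means its first row is followed by two decrements.
df.loc[df.groupby(s.cumsum()).colA.transform('count').ge(3) & s, 'colA']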
1    B
Name: colA, dtype: object
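
If the number of successive decrements should be configurable, one possible generalisation is to combine shifted copies of a "next value is smaller" mask. The function name `followed_by_n_decrements` and its parameter `n` are assumptions made for this sketch. It is an alternative to the groupby approach above rather than a rewrite of it: it reports every row followed by `n` decrements, not only the first row of a run.

```python
import pandas as pd


def followed_by_n_decrements(values, n):
    """Boolean mask: True where the next n rows each decrease (hypothetical helper)."""
    # drop[i] is True when values[i] > values[i + 1]
    drop = values.diff(-1).gt(0)
    mask = pd.Series(True, index=values.index)
    for k in range(n):
        # Require a decrement starting at row i + k as well
        mask = mask & drop.shift(-k, fill_value=False)
    return mask


df = pd.DataFrame({'colA': ['A', 'B', 'C', 'D', 'F'],
                   'colB': [10, 20, 5, 2, 30]})

print(df.loc[followed_by_n_decrements(df['colB'], 2), 'colA'].tolist())  # ['B']
print(df.loc[followed_by_n_decrements(df['colB'], 1), 'colA'].tolist())  # ['B', 'C']
```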

  • Elegant! Thanks.
  • Could you explain what shift does there?
  • shift returns the series "shifted" up or down by n rows, so shift(-1) gives, for each row, the value from the row after it
  • Nice. It is a generalization of the question asked. Good info. Thanks.
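
To illustrate the comment about shift, here is a minimal sketch on the first three colB values from the question. The variable names are illustrative; the positions with no neighbour are filled with NaN, which is why the shifted values become floats.

```python
import pandas as pd

s = pd.Series([10, 20, 5])

next_vals = s.shift(-1)  # each row gets the value from the row after it
prev_vals = s.shift(1)   # each row gets the value from the row before it

print(next_vals.tolist())  # [20.0, 5.0, nan]
print(prev_vals.tolist())  # [nan, 10.0, 20.0]
```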