Check each value in one column with each value of other column in one dataframe

I have the following dataframe:

import pandas as pd
# Named "data" rather than "dict" to avoid shadowing the built-in
data = {'val1': ["3.2", "2.4", "-2.3", "-4.9", "0"],
        'class': ["1", "0", "0", "0", "1"],
        'val2': ["3.2", "2.7", "1.7", "-7.1", "0"]}
df = pd.DataFrame(data)
df
    val1    class   val2
0   3.2     1       3.2
1   2.4     0       2.7
2  -2.3     0       1.7
3  -4.9     0      -7.1
4   0.0     1       0.0

I want to check two things. 1) The sign: if the sign of the value in column val1 differs from the sign of the value in column val2 (for example, at index 2), change the sign of val2 to match the sign of val1. The desired output is:

    val1    class   val2
0   3.2     1       3.2
1   2.4     0       2.7
2  -2.3     0      -1.7
3  -4.9     0      -7.1
4   0.0     1       0.0

2) The range: check whether the value in column val2 lies within the interval [val1 - 2, val1 + 2]. For example, at index 1, val2 = 2.7 lies in [2.4 - 2, 2.4 + 2] = [0.4, 4.4]. If the condition is true, I want to change class from 0 to 1. The desired output is:

    val1    class   val2
0   3.2     1       3.2
1   2.4     1       2.7
2  -2.3     1      -1.7
3  -4.9     0      -7.1
4   0.0     1       0.0

First convert the values to floats if necessary, then fix the sign with numpy.sign, and use Series.between for the second check:

import numpy as np

df['val1'] = df['val1'].astype(float)
df['val2'] = df['val2'].astype(float)

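# Multiply val2 by the product of the signs: -1 where the signs differ,
# +1 where they match (and 0 if either value is 0)
df['val2'] *= np.sign(df['val1']) * np.sign(df['val2'])
# Note: this overwrites class for every row (1 if in range, else 0)
df['class'] = df['val2'].between(df['val1'] - 2, df['val1'] + 2).astype(int)
# Alternative:
# df['class'] = np.where(df['val2'].between(df['val1'] - 2, df['val1'] + 2), 1, 0)
print(df)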
   val1  class  val2
0   3.2      1   3.2
1   2.4      1   2.7
2  -2.3      1  -1.7
3  -4.9      0  -7.1
4   0.0      1   0.0
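
The question asks to change class from 0 to 1 when the range check passes, while the assignment above replaces class with the check result for every row. On the sample data both give the same output. A minimal sketch of the difference, using hypothetical data (not from the question) in which the last row already has class 1 but fails the range check, and Series.mask as one possible way to promote only:

```python
import pandas as pd

# Hypothetical data: the last row already has class 1 but val2 is out of range.
df = pd.DataFrame({
    "val1": [2.4, -4.9, 1.0],
    "class": [0, 0, 1],
    "val2": [2.7, -7.1, 9.0],
})

# True where val2 lies in [val1 - 2, val1 + 2] (both ends inclusive by default).
in_range = df["val2"].between(df["val1"] - 2, df["val1"] + 2)

# Overwriting, as in the answer: class becomes exactly the in-range flag.
overwritten = in_range.astype(int)

# Promoting only: rows in range become 1, all other rows keep their old class.
promoted = df["class"].mask(in_range, 1)

print(overwritten.tolist())  # [1, 0, 0]
print(promoted.tolist())     # [1, 0, 1]
```

Which behavior is wanted depends on whether an existing class of 1 should ever be reset; the question does not say.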

Try this:

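import numpy as np
# Assumes val1 and val2 are already numeric (e.g. via astype(float))
# Check 1: give val2 the sign of val1
df['val2'] = df.apply(lambda x: np.sign(x['val1']) * np.sign(x['val2']) * x['val2'], axis=1)

# Check 2: class is 1 when val1 and val2 differ by less than 2 (strict), else 0
df['class'] = df.apply(lambda x: int(abs(x['val1'] - x['val2']) < 2), axis=1)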
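
The row-wise apply in Check 2 can usually be written as a single column-wise expression, which tends to be faster on large frames. A small sketch comparing the two on the question's values after the sign fix (the variable names row_wise and vectorized are only for illustration):

```python
import pandas as pd

# Values from the question, with val2 already sign-corrected at index 2.
df = pd.DataFrame({
    "val1": [3.2, 2.4, -2.3, -4.9, 0.0],
    "val2": [3.2, 2.7, -1.7, -7.1, 0.0],
})

# Row-wise version, as in the answer above.
row_wise = df.apply(lambda x: int(abs(x["val1"] - x["val2"]) < 2), axis=1)

# Column-wise version: the same comparison applied to whole columns at once.
vectorized = ((df["val1"] - df["val2"]).abs() < 2).astype(int)

# Both approaches should agree row for row.
print(row_wise.tolist() == vectorized.tolist())
```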

I think this will solve your query without using any other library:

def signfunc(x,y):
    if x*y >= 0:
        return y
    else:
        return -1*y

df['val1'] = df['val1'].astype(float)
df['val2'] = df['val2'].astype(float)
df['val2'] = df.apply(lambda x: signfunc(x.val1, x.val2), axis=1)
print(df)

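# class starts out as strings; convert it so the column does not mix "0" and 1
df['class'] = df['class'].astype(int)
df.loc[abs(df["val1"] - df["val2"]) <= 2, 'class'] = 1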

print(df)
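
One behavioral difference from the np.sign product used in the other answers shows up when val1 is 0 and val2 is not. The sample data has no such row, so the inputs below are hypothetical; signfunc is restated from the answer in compact form:

```python
import numpy as np

def signfunc(x, y):
    # Same logic as the answer: keep y unless x and y have opposite signs.
    return y if x * y >= 0 else -1 * y

# Hypothetical pair that does not occur in the question's data.
val1, val2 = 0.0, -3.0

kept = signfunc(val1, val2)                       # product is 0, so val2 is kept
zeroed = val2 * np.sign(val1) * np.sign(val2)     # np.sign(0.0) is 0, so val2 becomes 0

print(kept, zeroed)
```

Neither result is obviously right, since the question does not define what a zero in val1 should do to val2.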

Comments
  • As far as I can see, if val1 is positive and val2 is negative, it won't change the sign of val2.
  • @Boendal Even if the sign of val1 is positive and the sign of val2 is negative, val2 should change its sign to the sign of val1.
  • As shown here: repl.it/repls/FinancialDeadlyStatistic, val2 in the last row is still negative, and that is a copy of your solution.
  • @Boendal - Understood, then it is necessary to use df['val2'] *= np.sign(df['val1']) * np.sign(df['val2'])
  • @Sascha - It is a shortcut; it is the same as df['val2'] = df['val2'] * np.sign(df['val1']) * np.sign(df['val2'])