Boolean Dataframe filter for another Dataframe

The following dataframe, df1, contains numerical values:

   IDs          Value1      Value2        Value     Value4
   AB              1          1             1       5
   BC              2          2             2       3
   BG              1          1             4       1
   RF              2          2             2       7

and this dataframe df2 contains Boolean values:

   Index          0                1             2         3
   1              True           False          True       True
   2              False          False          True       False
   3              False          False          True       False
   4              False          False          False      False

with the same number of columns and rows.

What I need is to subset df1 as follows: keep only the columns that have at least one True value in df2.

Meaning the following:

   IDs          Value1         Value      Value4
   AB              1              1       5
   BC              2              2       3
   BG              1              4       1
   RF              2              2       7

I have tried the following code:

df2_true = np.any(df2,axis=1)

However, the line above returns a list, which cannot be used here:

result = df1[:,df2_true]

Any help would be welcome

I think this will work:

df1.loc[:,df2.any(0).values.tolist()]
Out[741]: 
     Value1  Value  Value4
IDs                       
AB        1      1       5
BC        2      2       3
BG        1      4       1
RF        2      2       7
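
The conversion to a plain list matters because df2.any(0) is a Series labelled with df2's column names (0 to 3), which do not match df1's column names. The sketch below rebuilds the two frames from the question (the construction code is my own reconstruction, not the asker's) and shows, at least in the pandas versions I am familiar with, that passing the labelled Series directly fails while the positional mask works.

```python
import pandas as pd

# Reconstruction of the question's data (assumed, not the asker's code).
df1 = pd.DataFrame(
    {"Value1": [1, 2, 1, 2], "Value2": [1, 2, 1, 2],
     "Value": [1, 2, 4, 2], "Value4": [5, 3, 1, 7]},
    index=pd.Index(["AB", "BC", "BG", "RF"], name="IDs"),
)
df2 = pd.DataFrame(
    [[True, False, True, True],
     [False, False, True, False],
     [False, False, True, False],
     [False, False, False, False]],
    index=[1, 2, 3, 4],
)

# One boolean per column of df2; the Series is labelled 0..3.
col_mask = df2.any(axis=0)

# A labelled boolean Series is aligned on labels, and 0..3 do not match
# "Value1", "Value2", ..., so pandas refuses to use it as an indexer.
try:
    df1.loc[:, col_mask]
    aligned_ok = True
except Exception:
    aligned_ok = False

# Dropping the labels makes the mask purely positional.
result = df1.loc[:, col_mask.to_numpy()]
print(result)
```

Using .to_numpy() here plays the same role as .values.tolist() in the answer above: both discard the labels so that only the positions count.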

Use loc with np.any evaluated per column (axis=0):

result = df1.loc[:, np.any(df2.values,axis=0)]
print (result)
     Value1  Value  Value4
IDs                       
AB        1      1       5
BC        2      2       3
BG        1      4       1
RF        2      2       7

You're already headed in the right direction. Since you're interested in masking the columns, you just need to apply the any() operation along the other axis (axis=0) and then apply the resulting boolean mask to the columns attribute of the original dataframe. That gives the labels of the columns to keep, which can then be used to select from df1:

masked_df = df1[df1.columns[df2.any(axis=0).values]]
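
Note that indexing df1.columns with the mask yields column labels, not a DataFrame. A minimal sketch of the two steps, using data rebuilt from the question (the construction and the name kept_labels are my own assumptions):

```python
import pandas as pd

# Reconstruction of the question's data (assumed, not the asker's code).
df1 = pd.DataFrame(
    {"Value1": [1, 2, 1, 2], "Value2": [1, 2, 1, 2],
     "Value": [1, 2, 4, 2], "Value4": [5, 3, 1, 7]},
    index=pd.Index(["AB", "BC", "BG", "RF"], name="IDs"),
)
df2 = pd.DataFrame(
    [[True, False, True, True],
     [False, False, True, False],
     [False, False, True, False],
     [False, False, False, False]],
    index=[1, 2, 3, 4],
)

# Step 1: the labels of the columns whose df2 counterpart has any True.
kept_labels = df1.columns[df2.any(axis=0).to_numpy()]

# Step 2: select those columns from df1 by label.
masked_df = df1[kept_labels]

print(list(kept_labels))
print(masked_df)
```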

Comments
  • Shouldn't it be result = df1.loc[:, np.any(df2.values,axis=1)]?
  • @user37143 - No, axis=0 is needed.
  • @user37143 - axis=1 only appears to work here because the number of rows equals the number of columns; add one row and axis=1 stops working.
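
The point about the axis can be checked with a small sketch. The frames below follow the question's data with one extra row added (the row labelled "ZZ" and its values are hypothetical, chosen only to make the shape non-square):

```python
import numpy as np
import pandas as pd

# Question's data plus one hypothetical extra row ("ZZ"): 5 rows, 4 columns.
df1 = pd.DataFrame(
    {"Value1": [1, 2, 1, 2, 9], "Value2": [1, 2, 1, 2, 9],
     "Value": [1, 2, 4, 2, 9], "Value4": [5, 3, 1, 7, 9]},
    index=pd.Index(["AB", "BC", "BG", "RF", "ZZ"], name="IDs"),
)
df2 = pd.DataFrame(
    [[True, False, True, True],
     [False, False, True, False],
     [False, False, True, False],
     [False, False, False, False],
     [False, False, False, False]],
)

# axis=0 collapses the rows: one boolean per column (length 4).
mask_cols = np.any(df2.values, axis=0)

# axis=1 collapses the columns: one boolean per row (length 5).
mask_rows = np.any(df2.values, axis=1)

# The per-column mask has the right length for selecting columns.
result = df1.loc[:, mask_cols]

# The per-row mask has 5 entries for 4 columns, so the selection fails.
try:
    df1.loc[:, mask_rows]
    axis1_failed = False
except Exception:
    axis1_failed = True

print(mask_cols.shape, mask_rows.shape, axis1_failed)
```

With the original 4x4 frames both masks have length 4, so axis=1 raises no error but selects columns according to which rows contain a True value, which is not what was asked.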