Pandas filter by more than one "contains" for not one cell but entire column

I have a bunch of dataframes, and I want to find the dataframes that contain both of the words I specify. For example, I want to find all dataframes that contain the words hello and world. A & B would qualify, C would not.

I've tried df[(df[column].str.contains('hello')) & (df[column].str.contains('world'))], which only picks up B, and df[(df[column].str.contains('hello')) | (df[column].str.contains('world'))], which picks up all three.

I need something that picks up only A & B.

A =

    Name    Data   
0   Mike    hello    
1   Mike    world    
2   Mike    hello   
3   Fred    world
4   Fred    hello
5   Ted     world

B =

    Name    Data   
0   Mike    helloworld
1   Mike    world    
2   Mike    hello   
3   Fred    world
4   Fred    hello
5   Ted     world

C =

    Name    Data   
0   Mike    hello
1   Mike    hello    
2   Mike    hello   
3   Fred    hello
4   Fred    hello
5   Ted     hello

You want a single bool value indicating whether 'hello' is found anywhere and 'world' is found anywhere in one column:

df.Data.str.contains('hello').any() & df.Data.str.contains('world').any()
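
For example, a minimal sketch that applies the same check to each of the question's DataFrames (A, B and C recreated here from the sample data) and keeps only the ones containing both words:

import pandas as pd

# Recreate the question's example DataFrames
A = pd.DataFrame({'Name': ['Mike', 'Mike', 'Mike', 'Fred', 'Fred', 'Ted'],
                  'Data': ['hello', 'world', 'hello', 'world', 'hello', 'world']})
B = A.assign(Data=['helloworld', 'world', 'hello', 'world', 'hello', 'world'])
C = A.assign(Data=['hello'] * 6)

def contains_both(df, words=('hello', 'world')):
    # True only if every word is found somewhere in the Data column
    return all(df.Data.str.contains(w).any() for w in words)

matching = [name for name, df in {'A': A, 'B': B, 'C': C}.items() if contains_both(df)]
print(matching)  # ['A', 'B']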

If you have a list of words and need to check over the entire DataFrame, try:

import numpy as np

lst = ['hello', 'world']
# True only if every word in lst is a substring of at least one cell anywhere in df
np.logical_and.reduce([any(word in x for x in df.values.ravel()) for word in lst])

Sample Data
print(df)
   Name   Data   Data2
0  Mike  hello  orange
1  Mike  world  banana
2  Mike  hello  banana
3  Fred  world  apples
4  Fred  hello   mango
5   Ted  world    pear

lst = ['apple', 'hello', 'world']
np.logical_and.reduce([any(word in x for x in df.values.ravel()) for word in lst])
#True

lst = ['apple', 'hello', 'world', 'bear']
np.logical_and.reduce([any(word in x for x in df.values.ravel()) for word in lst])
# False
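
Note that word in x assumes every cell is a string; if the DataFrame also holds numeric cells, the in test raises a TypeError. A small sketch that coerces everything to str first:

cells = df.astype(str).values.ravel()
np.logical_and.reduce([any(word in x for x in cells) for word in lst])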


Using re, you can flatten the whole DataFrame into one string and test both words with lookaheads:

import re

bool(re.search(r'^(?=.*hello)(?=.*world)', df.sum().sum()))
# True
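
If the word list can grow, one option (a sketch, not from the original answer) is to build the lookahead pattern from the list. This assumes every column holds strings, so df.sum() concatenates each column and df.sum().sum() yields one big string:

import re

words = ['hello', 'world']
pattern = '^' + ''.join('(?=.*{})'.format(re.escape(w)) for w in words)
bool(re.search(pattern, df.sum().sum()))
# True for A and B, False for C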


If hello and world are standalone strings in your data, df.eq() should do the job and you don't need str.contains. It's not a string method and works on the entire DataFrame.

(((df == 'hello').any()) & ((df == 'world').any())).any()

True
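
The same idea extends to a list of standalone words, e.g. a minimal sketch with plain all():

words = ['hello', 'world']
all(df.eq(word).any().any() for word in words)
# True only if every word appears as a whole cell value somewhere in df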
