In Python, how to get the rows of a DataFrame where a particular string is present in any of the columns (string values)

My DataFrame contains the columns Name, Age, Task1, Task2 and Task3. I need to get all the rows in which a given string appears in any of the Task1, Task2 or Task3 columns. Say I want to check for the keyword 'Drafting': if 'Drafting' is present as part of the value in any of these columns, the entire row has to be added to the resulting frame.

I tried isin(), but it only gives me True or False. I need to extract the N rows that contain a particular keyword. I also tried df.columns[df.Task1.str.contains("Drafting")], but this checks only a single column. Does anyone know how to use str.contains, or any other method, to compare the string values of several columns and get all the rows that satisfy the condition?

  Name  Age              Task1    Task2            Task3
0  Ann   43  Drafting a Letter  sending           paking
1  Juh   29            sending   paking  Letter Drafting
2  Jeo   42            Pasting  sending           paking
3  Sam   59            sending  pasting  Letter Drafting

I need to check whether the keyword 'Drafting' is present in any of the columns (each value contains 3 to 4 words, and I need to check whether 'Drafting' is among those words). The result should be:

  Name  Age              Task1    Task2            Task3
0  Ann   43  Drafting a Letter  sending           paking
1  Juh   29            sending   paking  Letter Drafting
3  Sam   59            sending  pasting  Letter Drafting

Or just (note that this checks the entire df, not specific columns):

df[df.astype(str).apply(lambda x: x.str.contains('Drafting')).any(axis=1)]
# for a case-insensitive match, use the line below instead
#df[df.astype(str).apply(lambda x: x.str.contains('Drafting',case=False)).any(axis=1)]

  Name  Age              Task1    Task2            Task3
0  Ann   43  Drafting a Letter  sending           paking
1  Juh   29            sending   paking  Letter Drafting
3  Sam   59            sending  pasting  Letter Drafting
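If only the Task columns should be searched, as the question asks, the same idea can be restricted with df.filter. The following is a minimal sketch, assuming the sample data from the question and assuming that every relevant column has 'Task' in its name. The regex=False and na=False arguments are additions here: the first treats the keyword as a literal string, the second counts missing values as non-matches.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Name': ['Ann', 'Juh', 'Jeo', 'Sam'],
    'Age': [43, 29, 42, 59],
    'Task1': ['Drafting a Letter', 'sending', 'Pasting', 'sending'],
    'Task2': ['sending', 'paking', 'sending', 'pasting'],
    'Task3': ['paking', 'Letter Drafting', 'paking', 'Letter Drafting'],
})

# Keep only the columns whose name contains 'Task'.
tasks = df.filter(like='Task')

# Boolean frame: True where a cell contains the keyword as a substring.
hits = tasks.apply(
    lambda col: col.str.contains('Drafting', regex=False, na=False)
)

# Keep the rows with at least one True across the Task columns.
result = df[hits.any(axis=1)]
print(result)
```

Because Name and Age are never searched, a name that happens to contain the keyword would not produce a false match.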

A quick comparison of the given answers on 20k rows of data:

@Alollz (in comments)

%timeit df.loc[df.filter(like='Task').applymap(lambda x: 'Drafting' in x).any(1)]
25.2 ms ± 2.09 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

@Sergey Bushmanov

%timeit df[df.Task1.str.contains("Drafting") | df.Task2.str.contains("Drafting") | df.Task3.str.contains("Drafting")]
58.7 ms ± 9.25 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


%timeit df[df.filter(like='Task').apply(lambda x: x.str.contains('Drafting')).any(axis=1)]
88.6 ms ± 12.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit df[df.astype(str).apply(lambda x: x.str.contains('Drafting')).any(axis=1)]
128 ms ± 14.9 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


%timeit  df.loc[df.filter(like='Task').stack().str.split(expand=True).eq('Drafting').any(1).any(level=0)]
290 ms ± 29.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
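These timings depend on the machine and the pandas version. To repeat the comparison outside IPython, something like the sketch below could be used. How the original 20k-row test data was built is not stated, so repeating the four sample rows 5000 times is an assumption, and the function names are illustrative only.

```python
import timeit

import pandas as pd

sample = pd.DataFrame({
    'Name': ['Ann', 'Juh', 'Jeo', 'Sam'],
    'Age': [43, 29, 42, 59],
    'Task1': ['Drafting a Letter', 'sending', 'Pasting', 'sending'],
    'Task2': ['sending', 'paking', 'sending', 'pasting'],
    'Task3': ['paking', 'Letter Drafting', 'paking', 'Letter Drafting'],
})

# Assumption: 4 sample rows repeated 5000 times gives 20,000 rows.
big = pd.concat([sample] * 5000, ignore_index=True)


def per_column_or(df):
    # One str.contains call per column, combined with |.
    return df[df.Task1.str.contains('Drafting')
              | df.Task2.str.contains('Drafting')
              | df.Task3.str.contains('Drafting')]


def filter_apply(df):
    # Select the Task columns by name, then test each column.
    return df[df.filter(like='Task')
                .apply(lambda col: col.str.contains('Drafting'))
                .any(axis=1)]


for func in (per_column_or, filter_apply):
    # Average over 5 calls; absolute numbers will vary between machines.
    seconds = timeit.timeit(lambda: func(big), number=5) / 5
    print(f'{func.__name__}: {seconds * 1000:.1f} ms per call')
```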

You may try:

new_df = df[df.Task1.str.contains("Drafting") | df.Task2.str.contains("Drafting") | df.Task3.str.contains("Drafting")]

This returns a new_df with the rows that contain "Drafting" in any of the Task1, Task2 or Task3 columns.
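Writing out one condition per column becomes tedious as the number of columns grows. One possible way to keep the per-column approach while avoiding the repetition is to build the mask from a list of column names. This is a sketch only; the columns list and the use of functools.reduce are choices made here, not part of the answer above.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame({
    'Name': ['Ann', 'Juh', 'Jeo', 'Sam'],
    'Age': [43, 29, 42, 59],
    'Task1': ['Drafting a Letter', 'sending', 'Pasting', 'sending'],
    'Task2': ['sending', 'paking', 'sending', 'pasting'],
    'Task3': ['paking', 'Letter Drafting', 'paking', 'Letter Drafting'],
})

# Columns to search; extend this list as needed.
columns = ['Task1', 'Task2', 'Task3']

# One boolean Series per column; na=False treats missing values as no match.
masks = (df[c].str.contains('Drafting', na=False) for c in columns)

# Combine the Series with element-wise OR.
mask = reduce(operator.or_, masks)

new_df = df[mask]
print(new_df)
```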

This can be achieved using np.where:

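import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ann', 'Juh', 'Jeo', 'Sam'],
    'Age': [43, 29, 42, 59],
    'Task1': ['Drafting a letter', 'Sending', 'Pasting', 'Sending'],
    'Task2': ['Sending', 'Paking', 'Sending', 'Pasting'],
    'Task3': ['Packing', 'Letter Drafting', 'Paking', 'Letter Drafting']
})

# np.where returns a tuple of position arrays; "+" joins the tuples and
# np.concatenate merges them into one array of matching row positions.
# Note: a row that matches in several columns would be listed more than once.
df_new = df.iloc[df.index[np.concatenate(
                np.where(df['Task1'].str.contains('Drafting')) +
                np.where(df['Task2'].str.contains('Drafting')) +
                np.where(df['Task3'].str.contains('Drafting')))]]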

  Name  Age              Task1    Task2            Task3
0  Ann   43  Drafting a letter  Sending          Packing
1  Juh   29            Sending   Paking  Letter Drafting
3  Sam   59            Sending  Pasting  Letter Drafting

You can try something like this,

new_df = df[(df['Task1'] == 'Drafting') | (df['Task2'] == 'Drafting') | (df['Task3'] == 'Drafting')]

This selects the rows where Task1, Task2 or Task3 is exactly equal to 'Drafting'.
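Because == compares the whole cell value, this behaves differently from a substring search. A small check on the question's sample data, where 'Drafting' only appears inside longer strings, illustrates the difference; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ann', 'Juh', 'Jeo', 'Sam'],
    'Age': [43, 29, 42, 59],
    'Task1': ['Drafting a Letter', 'sending', 'Pasting', 'sending'],
    'Task2': ['sending', 'paking', 'sending', 'pasting'],
    'Task3': ['paking', 'Letter Drafting', 'paking', 'Letter Drafting'],
})

tasks = ['Task1', 'Task2', 'Task3']

# Exact equality: the cell must be 'Drafting' and nothing else.
exact = df[(df[tasks] == 'Drafting').any(axis=1)]

# Substring search: 'Drafting' may appear anywhere in the cell.
partial = df[df[tasks].apply(lambda col: col.str.contains('Drafting')).any(axis=1)]

print(len(exact), len(partial))
```

On this data the equality filter returns no rows, while the substring filter returns rows 0, 1 and 3.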

  • It's advisable to create a small sample dataframe which demonstrates the issue and post it as text. Please also post the expected output, showing the difference between input and output. This will help users get a clear picture of what is needed and will drive more answers. Thanks
  • Hi, please post a small example of the original data.
  • Is the string-contains logic you want to implement meant for complete word matches? Should 'raft' match 'Drafting', or only the isolated word 'raft' (which may appear in a sentence such as 'I like to use a raft')?
  • Yes, I need to get the columns that contain exactly 'Drafting'. No other combination (a regular expression is not useful).
  • I like this, though may need to match on something like '(?<![a-zA-Z\-])Drafting' if a word like redrafting is not meant to match.
  • @anky many thanks, as this is working for more than one word. Many thanks for your kind reply.
  • @Shara My pleasure. :)
  • @anky_91 could you please help me with the problem…
  • @dondapati str.contains('|'.join(list_of_words))? If not, could you please post a fresh question? :) Thanks
  • how about df.filter(like='Task').apply(lambda x: x.str.contains('Drafting')).any(axis=1) ??
  • Bit less than previous but not close to @Sergey.
  • thanks for that. :) think calling the series individually wins on time but calling them individually is manual.
  • I think the fastest is df.loc[df.filter(like='Task').applymap(lambda x: 'Drafting' in x).any(1)]. Though at the expense of the Series.str.contains NaN and error handling.
  • equals, not contains
  • I have to check for a keyword within the sentence in a column, not for an exact match of the whole value. I have added a sample of the required result.
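The comments above raise two refinements: matching whole words only (so that a word like 'Redrafting' is not picked up) and searching for several words at once. A possible sketch of both, using a regular expression with word boundaries, is shown below. The data and the keywords list are hypothetical, and \b is used in place of the lookbehind suggested in the comments.

```python
import re

import pandas as pd

# Hypothetical data: row 1 contains 'Redrafting', which should not match.
df = pd.DataFrame({
    'Task1': ['Drafting a Letter', 'Redrafting a memo', 'sending'],
    'Task2': ['sending', 'paking', 'Pasting labels'],
})

keywords = ['Drafting', 'Pasting']  # hypothetical list of words

# \b marks a word boundary, so 'Redrafting' does not match 'Drafting'.
# re.escape protects keywords that contain regex metacharacters.
pattern = r'\b(?:' + '|'.join(re.escape(w) for w in keywords) + r')\b'

# case=False makes the match case-insensitive; na=False ignores NaN cells.
mask = df.apply(
    lambda col: col.str.contains(pattern, case=False, na=False)
).any(axis=1)

result = df[mask]
print(result)
```

Without the word boundaries, a case-insensitive search for 'drafting' would also select the 'Redrafting a memo' row.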