Keeping only the rows that satisfy a condition with respect to another column

So right now I have a Pandas DF like this:

Name     Year      Label

Jeff     2018        0
Jeff     2019        1
Matt     2018        0
John     2018        0
Mary     2018        1
Mary     2019        1

For each unique name, I want to keep all of its rows only if that name has both years, 2018 and 2019.

The result should look something like this:

Name     Year      Label

Jeff     2018        0
Jeff     2019        1
Mary     2018        1
Mary     2019        1

Matt and John were removed because they didn't have both 2018 AND 2019.

Any ideas would be appreciated!

Use crosstab to find the names that appear in both years, then select their rows with isin:

s=pd.crosstab(df.Name,df.Year)[[2018,2019]].eq(1).sum(1)==2
df.loc[df.Name.isin(s.index[s])]
Out[463]: 
   Name  Year
0  Jeff  2018
1  Jeff  2019
4  Mary  2018
5  Mary  2019
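
For anyone who wants to run this end to end, here is a self-contained sketch of the same crosstab + isin idea. The DataFrame is rebuilt from the table in the question. The names counts and has_both are only illustrative, and the test uses gt(0) instead of eq(1), on the assumption that a name appearing more than once in the same year should still count as present.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Name": ["Jeff", "Jeff", "Matt", "John", "Mary", "Mary"],
    "Year": [2018, 2019, 2018, 2018, 2018, 2019],
    "Label": [0, 1, 0, 0, 1, 1],
})

# One row per name, one column per year, cells hold row counts.
counts = pd.crosstab(df["Name"], df["Year"])

# True for names with at least one row in both 2018 and 2019.
# (Assumption: duplicates within a year still count as "present".)
has_both = counts[[2018, 2019]].gt(0).all(axis=1)

# Keep every original row whose name passed the test.
result = df[df["Name"].isin(has_both.index[has_both])]
print(result)
```

Note that counts[[2018, 2019]] raises a KeyError if one of the two years never occurs in the data at all, so this sketch assumes both years are present somewhere in the Year column.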

Using groupby + transform:

m1 = df.Year.eq(2018)   
m2 = df.Year.eq(2019)

df[m1.groupby(df.Name).transform('any') & m2.groupby(df.Name).transform('any')]

  Name  Year
0  Jeff  2018
1  Jeff  2019
4  Mary  2018
5  Mary  2019

Generalising:

years = [2018, 2019]
M = [df.Year.eq(year) for year in years]
df[np.logical_and.reduce([m.groupby(df.Name).transform('any') for m in M])]

   Name  Year
0  Jeff  2018
1  Jeff  2019
4  Mary  2018
5  Mary  2019
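
The generalised version can be wrapped in a small helper so that it runs on its own. This is only a sketch: the function name keep_names_with_all_years is made up for illustration, the sample data is rebuilt from the question, and years is assumed to be a non-empty list.

```python
import numpy as np
import pandas as pd

def keep_names_with_all_years(df, years):
    """Keep rows whose Name has at least one row for every year in `years`.

    Hypothetical helper; assumes `years` is non-empty.
    """
    # For each year: mark every row of a name that has that year somewhere.
    masks = [
        df["Year"].eq(year).groupby(df["Name"]).transform("any")
        for year in years
    ]
    # A row survives only if its name passed the test for all years.
    return df[np.logical_and.reduce(masks)]

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Name": ["Jeff", "Jeff", "Matt", "John", "Mary", "Mary"],
    "Year": [2018, 2019, 2018, 2018, 2018, 2019],
    "Label": [0, 1, 0, 0, 1, 1],
})

result = keep_names_with_all_years(df, [2018, 2019])
print(result)
```

Unlike the crosstab approach, this does not fail when a requested year is missing from the data entirely; the mask for that year is simply all False and no rows are returned.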

You can select the rows for each year separately, inner merge the two selections on 'Name' to get the names that have both years, then use isin:

df.loc[df.Name.isin(df[df.Year == 2018].merge(df[df.Year == 2019],
                                              on='Name',how='inner').Name)]
   Name  Year  Label
0  Jeff  2018      0
1  Jeff  2019      1
4  Mary  2018      1
5  Mary  2019      1
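
A step-by-step sketch of the same merge idea, in case the one-liner is hard to follow. The sample data is rebuilt from the question, and the intermediate names (names_2018, names_2019, both) are only illustrative. Selecting just the Name column before merging is a small variation that avoids the suffixed Year_x / Year_y columns the full merge would produce.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Name": ["Jeff", "Jeff", "Matt", "John", "Mary", "Mary"],
    "Year": [2018, 2019, 2018, 2018, 2018, 2019],
    "Label": [0, 1, 0, 0, 1, 1],
})

# Names seen in each year, kept as one-column DataFrames for merging.
names_2018 = df.loc[df["Year"] == 2018, ["Name"]]
names_2019 = df.loc[df["Year"] == 2019, ["Name"]]

# The inner merge keeps only names present on both sides.
both = names_2018.merge(names_2019, on="Name", how="inner")

# Keep every original row whose name appears in the merged result.
result = df[df["Name"].isin(both["Name"])]
print(result)
```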

Comments
  • Does your year column only have these two years? Or is your actual problem to find names that are present in all groups? The solutions are different, so please be specific.
  • The Year column ONLY has 2018 and 2019.
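
If the Year column really contains nothing but 2018 and 2019, as stated in the comments, a shorter route may be to count the distinct years per name and keep the names that have two. This is a sketch under that assumption only; it is not one of the answers above, and it would give wrong results if other years were present.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Name": ["Jeff", "Jeff", "Matt", "John", "Mary", "Mary"],
    "Year": [2018, 2019, 2018, 2018, 2018, 2019],
    "Label": [0, 1, 0, 0, 1, 1],
})

# Number of distinct years for each row's name, aligned with df's index.
years_per_name = df.groupby("Name")["Year"].transform("nunique")

# Assumption: only 2018 and 2019 occur, so two distinct years means "both".
result = df[years_per_name.eq(2)]
print(result)
```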