Only show specific groups in a pandas DataFrame

Hello, I need to focus on specific groups within a table.

Here is an example:

groups col1 
A 3
A 4
A 2
A 1
B 3
B 3
B 4
C 2
D 4
D 3

I would like to show only the groups that contain 3 and 4 but no other number. Here I should get:

groups col1 
B 3
B 3
B 4
D 4
D 3

Here are two possible approaches. The first tests the values for membership with Series.isin, then marks the groups in which every value is True using GroupBy.transform with 'all', and finally filters by boolean indexing:

df1 = df[df['col1'].isin([3,4]).groupby(df['groups']).transform('all')]
print(df1)
  groups  col1
4      B     3
5      B     3
6      B     4
8      D     4
9      D     3
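The one-liner above can be split into its three steps to make the intermediate masks visible. This is only a sketch: the DataFrame is rebuilt from the table in the question, and the names is_3_or_4 and group_ok are illustrative, not part of the original answer.

```python
import pandas as pd

# Example table from the question, rebuilt so the snippet runs on its own.
df = pd.DataFrame({
    "groups": list("AAAABBBCDD"),
    "col1": [3, 4, 2, 1, 3, 3, 4, 2, 4, 3],
})

# Step 1: row-level test, True where col1 is 3 or 4.
is_3_or_4 = df["col1"].isin([3, 4])

# Step 2: for each group, check that every row passed, and broadcast
# that single answer back to all rows of the group.
group_ok = is_3_or_4.groupby(df["groups"]).transform("all")

# Step 3: boolean indexing keeps only the rows of fully matching groups.
df1 = df[group_ok]
print(df1)
```

Because transform returns a Series aligned with the original index, the mask can be used directly for indexing without any merge or join.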

Another approach is to first collect the groups that contain any value other than 3 or 4, then pass them to a second isin call and invert the mask:

df1 = df[~df['groups'].isin(df.loc[~df['col1'].isin([3,4]), 'groups'])]
print(df1)
  groups  col1
4      B     3
5      B     3
6      B     4
8      D     4
9      D     3
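Written out in two steps, the second approach might look like the sketch below. The DataFrame is rebuilt from the question, and bad_groups is an illustrative name; calling unique() is an optional extra that is not in the original one-liner.

```python
import pandas as pd

# Example table from the question, rebuilt so the snippet runs on its own.
df = pd.DataFrame({
    "groups": list("AAAABBBCDD"),
    "col1": [3, 4, 2, 1, 3, 3, 4, 2, 4, 3],
})

# Step 1: labels of groups that have at least one value outside {3, 4}.
bad_groups = df.loc[~df["col1"].isin([3, 4]), "groups"].unique()
print(bad_groups)

# Step 2: keep only the rows whose group is not one of the offending groups.
df1 = df[~df["groups"].isin(bad_groups)]
print(df1)
```

This version needs no groupby at all, which can be convenient when only a simple membership test is involved.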


We can also use GroupBy.filter:

new_df = df.groupby('groups').filter(lambda x: x.col1.isin([3, 4]).all())
print(new_df)

  groups  col1
4      B     3
5      B     3
6      B     4
8      D     4
9      D     3

An alternative that moves Series.isin out of the lambda function by using a temporary helper column:

df['aux'] = df['col1'].isin([3, 4])
new_df = df.groupby('groups').filter(lambda x: x.aux.all()).drop('aux', axis=1)
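Note that the isin(...).all() test accepts any group whose values are all within {3, 4}, so a group made only of 3s would also pass. If the requirement is read strictly (each group must contain both 3 and 4 and nothing else), one possible variant compares the set of values in each group. The group E below is hypothetical and added only to show the difference; loose and strict are illustrative names.

```python
import pandas as pd

# Table from the question plus a hypothetical group E that contains only 3s.
df = pd.DataFrame({
    "groups": list("AAAABBBCDDEE"),
    "col1": [3, 4, 2, 1, 3, 3, 4, 2, 4, 3, 3, 3],
})

# Loose reading: every value in the group is 3 or 4.
loose = df.groupby("groups").filter(lambda g: g["col1"].isin([3, 4]).all())

# Strict reading: the distinct values in the group are exactly 3 and 4.
strict = df.groupby("groups").filter(lambda g: set(g["col1"]) == {3, 4})

print(loose["groups"].unique())
print(strict["groups"].unique())
```

On the original data from the question both readings give the same result, since B and D each contain both values.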


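Using df.loc[] with an ordinary boolean condition also works for row-level filtering. Note that the condition below keeps every row with a value of 3 or more, including the 3 and 4 in group A, so it selects rows rather than whole groups.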

import pandas as pd

data = [['A', 3],
        ['A', 4],
        ['A', 2],
        ['A', 1],
        ['B', 3],
        ['B', 3],
        ['B', 4],
        ['C', 2],
        ['D', 4],
        ['D', 3]]
# Column names match the table in the question.
df = pd.DataFrame(data, columns=["groups", "col1"])

# Keep the rows whose value is 3 or more.
df = df.loc[df["col1"] >= 3]
print(df)
