Group by and aggregate by multiple columns

Example tables

taccount

tuser

tproject

What I want to achieve:

accountName count(u.id) count(p.id)
-----------------------------------
Account A   1           1
Account B   1           1
Account C   2           3

In other words, I want a single query that joins these tables together and counts the users and projects per account.

I tried:

SELECT
    a.name as "accountName",
    count(u.name),
    count(p.id)
FROM "taccount" a
INNER JOIN "tuser" u ON u.account_id = a.id
INNER JOIN "tproject" p ON p.admin_id = u.id
GROUP BY u.name, a.name, p.id

But it's not grouping by account. It's giving me the following result

Any advice?

You can try the query below:

SELECT
    a.name as "accountName",
    count(distinct u.name),
    count(p.id)
FROM "taccount" a
INNER JOIN "tuser" u ON u.account_id = a.id
INNER JOIN "tproject" p ON p.admin_id = u.id
GROUP BY  a.name
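
To see why DISTINCT is needed here, the sketch below runs the query with and without it against an in-memory SQLite database. The sample rows are hypothetical (the question's example tables were not preserved) and are chosen only to match the desired counts; SQLite stands in for PostgreSQL, which is an assumption.

```python
import sqlite3

# Hypothetical sample data: Account C has 2 users and 3 projects.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taccount (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tuser    (id INTEGER PRIMARY KEY, name TEXT, account_id INTEGER);
    CREATE TABLE tproject (id INTEGER PRIMARY KEY, name TEXT, admin_id INTEGER);
    INSERT INTO taccount VALUES (1, 'Account A'), (2, 'Account B'), (3, 'Account C');
    INSERT INTO tuser VALUES (1, 'User A1', 1), (2, 'User B1', 2),
                             (3, 'User C1', 3), (4, 'User C2', 3);
    INSERT INTO tproject VALUES (1, 'P1', 1), (2, 'P2', 2),
                                (3, 'P3', 3), (4, 'P4', 3), (5, 'P5', 4);
""")

def counts(user_count_expr):
    """Run the per-account query with the given user-count expression."""
    sql = f"""
        SELECT a.name AS "accountName", {user_count_expr}, COUNT(p.id)
        FROM "taccount" a
        INNER JOIN "tuser" u ON u.account_id = a.id
        INNER JOIN "tproject" p ON p.admin_id = u.id
        GROUP BY a.name
        ORDER BY a.name
    """
    return conn.execute(sql).fetchall()

# The join repeats a user once per project, so a plain COUNT over-counts users.
plain = counts("COUNT(u.name)")
distinct = counts("COUNT(DISTINCT u.name)")

print(plain)     # Account C shows 3 users
print(distinct)  # Account C shows 2 users
```

Note that COUNT(DISTINCT u.name) relies on user names being unique within an account; counting u.id instead avoids that assumption.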

When you use aggregate functions, every selected column that is not inside an aggregate must appear in your GROUP BY, because aggregate functions perform a calculation on a set of rows and return a single row per group.

SELECT
   a.name as "accountName",
   count(distinct u.name),
   count(p.id)
FROM 
   "taccount" a
   INNER JOIN "tuser" u ON u.account_id = a.id
   INNER JOIN "tproject" p ON p.admin_id = u.id
GROUP BY  
   a.name

So you just need to GROUP BY the column behind "accountName", which is a.name.
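
The effect of the GROUP BY list can be checked with a small sketch like the one below. It uses an in-memory SQLite database with hypothetical rows (the question's tables were not preserved) and compares the question's grouping with grouping by the account name only.

```python
import sqlite3

# Hypothetical sample data: Account C has 2 users and 3 projects.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taccount (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tuser    (id INTEGER PRIMARY KEY, name TEXT, account_id INTEGER);
    CREATE TABLE tproject (id INTEGER PRIMARY KEY, name TEXT, admin_id INTEGER);
    INSERT INTO taccount VALUES (1, 'Account A'), (2, 'Account B'), (3, 'Account C');
    INSERT INTO tuser VALUES (1, 'User A1', 1), (2, 'User B1', 2),
                             (3, 'User C1', 3), (4, 'User C2', 3);
    INSERT INTO tproject VALUES (1, 'P1', 1), (2, 'P2', 2),
                                (3, 'P3', 3), (4, 'P4', 3), (5, 'P5', 4);
""")

def grouped(group_by):
    """Run the question's query with a chosen GROUP BY column list."""
    sql = f"""
        SELECT a.name, COUNT(u.name), COUNT(p.id)
        FROM taccount a
        INNER JOIN tuser u ON u.account_id = a.id
        INNER JOIN tproject p ON p.admin_id = u.id
        GROUP BY {group_by}
        ORDER BY a.name
    """
    return conn.execute(sql).fetchall()

# One group per (user, account, project) combination: every count is 1.
per_combination = grouped("u.name, a.name, p.id")
# One group per account: the counts are aggregated per account.
per_account = grouped("a.name")

print(len(per_combination), len(per_account))
```

With this data the first grouping returns one row per joined row, which is why the original query did not appear to group by account.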

Change your GROUP BY so that it groups by the account name only:

SELECT
    a.name as "accountName",
    count(distinct u.id),
    count(p.id)
FROM "taccount" a
INNER JOIN "tuser" u ON u.account_id = a.id
INNER JOIN "tproject" p ON p.admin_id = u.id
GROUP BY  a.name
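
Which column goes inside COUNT(DISTINCT ...) matters. When grouping by account, u.account_id has a single value within each group, so counting it distinctly always yields 1, while u.id counts the users. A minimal sketch, assuming hypothetical sample rows and an in-memory SQLite database in place of PostgreSQL:

```python
import sqlite3

# Hypothetical sample data: Account C has 2 users and 3 projects.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taccount (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tuser    (id INTEGER PRIMARY KEY, name TEXT, account_id INTEGER);
    CREATE TABLE tproject (id INTEGER PRIMARY KEY, name TEXT, admin_id INTEGER);
    INSERT INTO taccount VALUES (1, 'Account A'), (2, 'Account B'), (3, 'Account C');
    INSERT INTO tuser VALUES (1, 'User A1', 1), (2, 'User B1', 2),
                             (3, 'User C1', 3), (4, 'User C2', 3);
    INSERT INTO tproject VALUES (1, 'P1', 1), (2, 'P2', 2),
                                (3, 'P3', 3), (4, 'P4', 3), (5, 'P5', 4);
""")

def user_counts(column):
    """Return (account name, distinct count of the given tuser column)."""
    sql = f"""
        SELECT a.name, COUNT(DISTINCT u.{column})
        FROM taccount a
        INNER JOIN tuser u ON u.account_id = a.id
        INNER JOIN tproject p ON p.admin_id = u.id
        GROUP BY a.name
        ORDER BY a.name
    """
    return conn.execute(sql).fetchall()

by_account_id = user_counts("account_id")  # constant within each group
by_user_id = user_counts("id")             # one value per user

print(by_account_id)
print(by_user_id)
```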

This will work:

select a.name, count(distinct u.id), count(p.id)
from taccount a, tuser u, tproject p
where a.id = u.account_id
  and u.id = p.admin_id
group by a.name;

Comments
  • You typically GROUP BY the columns you select, except those that are arguments to set functions. E.g., try GROUP BY u.name.
  • Please provide the tables.
  • Most people here want sample table data as formatted text, not as images.
  • Please provide them as text.
  • Thanks, that distinct was needed indeed. Can't accept as answer yet, still have to wait a few mins.
  • @Jim-Y, yes, without distinct you won't get the proper count for u.name.
  • Yes, it will work, but you are using an implicit join, and the community discourages that join style.
  • @dwir182 It will work. Can I ask what's wrong with it?
  • PostgreSQL treats both forms as semantically the same, but the community discourages this style. :)
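
The point made in the last comments, that the implicit (comma) join and the explicit INNER JOIN return the same result, can be checked with a sketch like this one. It uses hypothetical sample rows and an in-memory SQLite database rather than PostgreSQL, so treat it as an illustration only.

```python
import sqlite3

# Hypothetical sample data: Account C has 2 users and 3 projects.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taccount (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tuser    (id INTEGER PRIMARY KEY, name TEXT, account_id INTEGER);
    CREATE TABLE tproject (id INTEGER PRIMARY KEY, name TEXT, admin_id INTEGER);
    INSERT INTO taccount VALUES (1, 'Account A'), (2, 'Account B'), (3, 'Account C');
    INSERT INTO tuser VALUES (1, 'User A1', 1), (2, 'User B1', 2),
                             (3, 'User C1', 3), (4, 'User C2', 3);
    INSERT INTO tproject VALUES (1, 'P1', 1), (2, 'P2', 2),
                                (3, 'P3', 3), (4, 'P4', 3), (5, 'P5', 4);
""")

# Explicit join: the join conditions live in ON clauses.
explicit_sql = """
    SELECT a.name, COUNT(DISTINCT u.id), COUNT(p.id)
    FROM taccount a
    INNER JOIN tuser u ON u.account_id = a.id
    INNER JOIN tproject p ON p.admin_id = u.id
    GROUP BY a.name
    ORDER BY a.name
"""

# Implicit join: tables are comma-separated and the conditions move to WHERE.
implicit_sql = """
    SELECT a.name, COUNT(DISTINCT u.id), COUNT(p.id)
    FROM taccount a, tuser u, tproject p
    WHERE a.id = u.account_id
      AND u.id = p.admin_id
    GROUP BY a.name
    ORDER BY a.name
"""

explicit_rows = conn.execute(explicit_sql).fetchall()
implicit_rows = conn.execute(implicit_sql).fetchall()

print(explicit_rows == implicit_rows)
```

The explicit form is usually preferred because it keeps join conditions separate from filters, which makes a forgotten condition easier to spot.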