Using FIRST_VALUE without including inner columns in group by

I am using a table that looks like this:

userID, eventDate, eventName
1  2019-01-01  buySoup
2  2019-01-01  buyEggs
2  2019-01-03  buyMilk
2  2019-01-04  buyMilk
3  2019-01-02  buyBread
3  2019-01-03  buyBread

My current query is:

SELECT
    userID,
    FIRST_VALUE(eventName) OVER (
        PARTITION BY userID ORDER BY eventDate ASC
    ) AS firstBought 
FROM table 
GROUP BY userID

I feel like this should return:

userID, firstBought
1  buySoup
2  buyEggs
3  buyBread

Instead, it gives the error:

'ERROR: Column "table.eventName" must appear in the GROUP BY clause or be used in an aggregate function'

Is there a way to grab this value without including it in the group by function, or creating a sub query? I'm using PostgreSQL.

If I do include it in the group by clause, it returns

userID, firstBought
1  buySoup
2  buyEggs
2  buyEggs
2  buyEggs
3  buyBread
3  buyBread

I understand that I could make it a subquery and then group by userID, firstBought, but I'd rather not create another subquery.

Instead of group by, use select distinct:

select distinct userID,
       FIRST_VALUE(eventName) over (partition by userID order by eventDate ASC) as firstBought
from "table";

Or, you can use arrays:

select userId,
       (array_agg(eventName order by eventDate))[1] as firstBought
from "table"
group by userId;

Postgres doesn't have a "first" aggregation function, but this works pretty well.
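Because array_agg is an ordinary aggregate, it can sit next to other aggregates under a single GROUP BY. The following is only a sketch for PostgreSQL: the CTE name events and the lastEventDate column are made up for illustration, and the rows are the sample data from the question.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
-- "events" is a hypothetical name standing in for the real table.
WITH events (userID, eventDate, eventName) AS (
    VALUES
        (1, DATE '2019-01-01', 'buySoup'),
        (2, DATE '2019-01-01', 'buyEggs'),
        (2, DATE '2019-01-03', 'buyMilk'),
        (2, DATE '2019-01-04', 'buyMilk'),
        (3, DATE '2019-01-02', 'buyBread'),
        (3, DATE '2019-01-03', 'buyBread')
)
SELECT
    userID,
    -- Earliest eventName per user: order the array, take element 1.
    (array_agg(eventName ORDER BY eventDate))[1] AS firstBought,
    -- Any other aggregate can be added alongside it.
    max(eventDate) AS lastEventDate
FROM events
GROUP BY userID
ORDER BY userID;
-- userid | firstbought | lasteventdate
--      1 | buySoup     | 2019-01-01
--      2 | buyEggs     | 2019-01-04
--      3 | buyBread    | 2019-01-03
```

If two events for the same user share an eventDate, the first element is not deterministic unless a tie-breaking column is added to the ORDER BY inside array_agg.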


I agree with A. Saunders.

You need an outside query.

With the exception of SELECT DISTINCT, which actually boils down to a GROUP BY on all columns of the SELECT list, you can't mix OLAP (window) functions and GROUP BY aggregate functions in the same SELECT.

So, if you also need MAX(), you have to compute the window function in an inner query and aggregate in an outer one:

WITH -- your input data ...
input(userID,eventDate,eventName) AS (
          SELECT 1,DATE '2019-01-01','buySoup'
UNION ALL SELECT 2,DATE '2019-01-01','buyEggs'
UNION ALL SELECT 2,DATE '2019-01-03','buyMilk'
UNION ALL SELECT 2,DATE '2019-01-04','buyMilk'
UNION ALL SELECT 3,DATE '2019-01-02','buyBread'
UNION ALL SELECT 3,DATE '2019-01-03','buyBread'
)
,
getfirstbought AS (
  SELECT 
    userid
  , eventdate
  , FIRST_VALUE(eventname) OVER (
      PARTITION BY userid ORDER BY eventdate
   ) AS firstbought
  FROM input
)
SELECT
  userid
, firstbought
, MAX(eventdate) AS maxdt
FROM getfirstbought
GROUP BY 1,2;
-- out  userid | firstbought |   maxdt    
-- out --------+-------------+------------
-- out       2 | buyEggs     | 2019-01-04
-- out       3 | buyBread    | 2019-01-03
-- out       1 | buySoup     | 2019-01-01
-- out (3 rows)
-- out 
-- out Time: First fetch (3 rows): 22.157 ms. All rows formatted: 22.208 ms


I guess PostgreSQL's DISTINCT ON could do the trick:

SELECT DISTINCT ON (userid)
       userid, eventdate, eventname
FROM "table"
ORDER BY userid, eventdate;

This will give you the row per userid with the minimum eventdate.
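As a rough, self-contained illustration on the question's sample data (the CTE name events is hypothetical): PostgreSQL requires the DISTINCT ON expressions to match the leading ORDER BY expressions, so the sketch sorts by userid first and then by eventdate to pick the earliest row per user.

```sql
-- Sample rows from the question; "events" is a made-up name.
WITH events (userid, eventdate, eventname) AS (
    VALUES
        (1, DATE '2019-01-01', 'buySoup'),
        (2, DATE '2019-01-01', 'buyEggs'),
        (2, DATE '2019-01-03', 'buyMilk'),
        (2, DATE '2019-01-04', 'buyMilk'),
        (3, DATE '2019-01-02', 'buyBread'),
        (3, DATE '2019-01-03', 'buyBread')
)
-- Keep the first row of each userid group in ORDER BY order.
SELECT DISTINCT ON (userid)
       userid, eventdate, eventname
FROM events
ORDER BY userid, eventdate;
-- userid | eventdate  | eventname
--      1 | 2019-01-01 | buySoup
--      2 | 2019-01-01 | buyEggs
--      3 | 2019-01-02 | buyBread
```

DISTINCT ON is a PostgreSQL extension, so this form may not be available in other databases.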


FIRST_VALUE is not an aggregate function. It is an analytic window function. So your base query does not need a GROUP BY clause. It should be re-written as:

SELECT
        userID,
        FIRST_VALUE(eventName) over (PARTITION BY userID ORDER BY eventDate ASC) AS firstBought
FROM "table";
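To see what the window function does without a GROUP BY, here is a small sketch on the question's sample data (the CTE name events is an assumption). It returns one row per input row, each carrying the first eventName of its user, so duplicates still have to be collapsed with DISTINCT or an outer GROUP BY.

```sql
-- Sample rows from the question; "events" is a hypothetical name.
WITH events (userID, eventDate, eventName) AS (
    VALUES
        (1, DATE '2019-01-01', 'buySoup'),
        (2, DATE '2019-01-01', 'buyEggs'),
        (2, DATE '2019-01-03', 'buyMilk'),
        (2, DATE '2019-01-04', 'buyMilk'),
        (3, DATE '2019-01-02', 'buyBread'),
        (3, DATE '2019-01-03', 'buyBread')
)
SELECT
    userID,
    -- Evaluated per row over that user's partition; no rows are collapsed.
    FIRST_VALUE(eventName) OVER (
        PARTITION BY userID ORDER BY eventDate ASC
    ) AS firstBought
FROM events
ORDER BY userID, eventDate;
-- userid | firstbought
--      1 | buySoup
--      2 | buyEggs
--      2 | buyEggs
--      2 | buyEggs
--      3 | buyBread
--      3 | buyBread
```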

From one of your above comments it sounds like there are other functions that you are using including aggregate functions like MAX. To accomplish what you are trying to do, you will need to use the above query as a subquery. This will allow you to use aggregate functions and get unique values from your base query. The query can look something like this (I added a price column as an example).

SELECT userID, firstBought, MAX(price)
FROM (
        SELECT userID, price, FIRST_VALUE(eventName) over (partition by userID order by eventDate ASC) as firstBought 
        from test
) x
GROUP BY userId, firstBought;

This should do the trick! You can use other aggregate functions in the outer query and additional window functions in the subquery.





Comments
  • Although that's useful, and works exactly as intended, my table actually also includes more columns, and my query includes max() and other aggregate functions on those columns. This means I do need the group by at the end. Is there any other solution?
  • @Cyborgcanoe . . . This answer provides two solutions. The second one uses group by.
  • Thank you, the second answer is great. Unfortunately, I'm actually using Vertica Analytic Database v9.1.1-5, a branch of postgres which doesn't include array_agg. Thanks anyways!
  • @Cyborgcanoe . . . You should correctly tag your questions.