Return only a single value for a column in cases where there are multiple rows

I would like to return a single arbitrarily selected value for a query against a data source that has multiple rows.

The raw data:

user_id   account   role
paa2013   52501050  PD/PI
paa2013   52501050  Principal Investigator

What I want:

user_id   account   role
paa2013   52501050  PD/PI

My query:

select distinct 
  user_id, 
  account,
  case 
    when role = 'PD/PI' then 'PD/PI'
    when role = 'Principal Investigator' then 'Principal Investigator'
  end  
from table
where account = '52501050' 
group by 
  user_id, 
  account,
  case 
    when role = 'PD/PI' then 'PD/PI'
    when role = 'Principal Investigator' then 'Principal Investigator'
  end

What I get:

user_id   account   role
paa2013   52501050  PD/PI
paa2013   52501050  Principal Investigator

Thanks for any help!

To literally answer your question, you just need to use MIN(), as 'PD/PI' sorts before 'Principal Investigator' ('D' comes before 'r').

SELECT
  user_id,
  account,
  MIN(role)   AS min_role
FROM
  table
WHERE
  account = '52501050'
GROUP BY
  user_id,
  account
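
As a quick check of how the two role strings sort, here is a minimal sketch that runs the grouped query with both MIN() and MAX(). It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table name raw_data is an assumption for illustration; collations can differ between engines, so treat it as a sanity check rather than proof.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_data (user_id TEXT, account TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO raw_data VALUES (?, ?, ?)",
    [
        ("paa2013", "52501050", "PD/PI"),
        ("paa2013", "52501050", "Principal Investigator"),
    ],
)

# One row per user_id/account; MIN and MAX pick opposite ends of the sort order.
query = """
SELECT user_id, account, MIN(role) AS min_role, MAX(role) AS max_role
FROM raw_data
WHERE account = '52501050'
GROUP BY user_id, account
"""
rows = conn.execute(query).fetchall()
print(rows)
# [('paa2013', '52501050', 'PD/PI', 'Principal Investigator')]
```

This only works because of how these two particular strings happen to sort; with more roles, an explicit ranking is safer.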

To be more general, there are a lot of options. For example, rank the roles in a lookup and keep the lowest rank per user and account.

WITH
  roles AS
(
  SELECT 1 AS rank, 'PD/PI' AS role
  UNION ALL
  SELECT 2 AS rank, 'Principal Investigator' AS role
  UNION ALL
  SELECT 3 AS rank, 'another' AS role
),
  grouped_data AS
(
  SELECT
    table.user_id,
    table.account,
    MIN(roles.rank)  AS min_role_rank
  FROM
    table
  INNER JOIN
    roles
      ON roles.role = table.role
  GROUP BY
    table.user_id,
    table.account
)
SELECT
  *
FROM
  grouped_data
INNER JOIN
  roles
    ON roles.rank = grouped_data.min_role_rank

Or...

WITH
  ranked_data AS
(
  SELECT
    table.*,
    ROW_NUMBER() OVER (PARTITION BY table.user_id,
                                    table.account
                           ORDER BY role_rank.id
                      )
                         AS user_role_rank
  FROM
    table
  CROSS APPLY
  (
    SELECT
      CASE table.role
        WHEN 'PD/PI'                  THEN 1
        WHEN 'Principal Investigator' THEN 2
        WHEN 'an other'               THEN 3
                                      ELSE 4
      END
          AS id
  )
    role_rank
)
SELECT
  *
FROM
  ranked_data 
WHERE
  user_role_rank = 1

Or...

WITH
  roles AS
(
  SELECT 1 AS rank, 'PD/PI' AS role
  UNION ALL
  SELECT 2 AS rank, 'Principal Investigator' AS role
  UNION ALL
  SELECT 3 AS rank, 'another' AS role
),
  ranked_data AS
(
  SELECT
    table.*,
    ROW_NUMBER() OVER (PARTITION BY table.user_id,
                                    table.account
                           ORDER BY roles.rank
                      )
                         AS user_role_rank
  FROM
    table
  INNER JOIN
    roles
      ON roles.role = table.role
)
SELECT
  *
FROM
  ranked_data 
WHERE
  user_role_rank = 1

In a more perfect world, you would have one user or account table, constrained so that this can't happen, and a second user_role table for the 0..many roles that a user/account may be associated with.

 id | account                user_id | role_id
----+---------              ---------+---------
 11 | aaaaaaa                   11   |     1
 22 | bbbbbbb                   11   |     2
                                22   |     2
                                22   |     3

Then you'd have a role table with things such as the ranking ordinals....

 role_id | rank | name | etc
---------+------+------+-----
     1   |  30  |  aa  | ???
     2   |  10  |  bb  | ???
     3   |  20  |  cc  | ???

Then the query becomes relatively concise...

SELECT
  *
FROM
  user
CROSS APPLY
(
  SELECT TOP 1 role.*
    FROM user_role
    JOIN role ON role.role_id = user_role.role_id
   WHERE user_role.user_id = user.id
   ORDER BY role.rank
)
  AS role

(This demonstrates both a different structure and a different approach; either or both may be helpful to you.)
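
To make the normalised layout concrete, here is a minimal sketch that builds the three tables from the sample rows above and picks the best-ranked role per user. It uses Python's built-in sqlite3 module as a stand-in for SQL Server; since SQLite has no CROSS APPLY or TOP, a correlated subquery with LIMIT 1 is swapped in. The aliases u, ur, r and r2 are illustrative assumptions.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user      (id INTEGER PRIMARY KEY, account TEXT);
CREATE TABLE role      (role_id INTEGER PRIMARY KEY, rank INTEGER, name TEXT);
CREATE TABLE user_role (user_id INTEGER, role_id INTEGER);

INSERT INTO user      VALUES (11, 'aaaaaaa'), (22, 'bbbbbbb');
INSERT INTO role      VALUES (1, 30, 'aa'), (2, 10, 'bb'), (3, 20, 'cc');
INSERT INTO user_role VALUES (11, 1), (11, 2), (22, 2), (22, 3);
""")

# For each user, the correlated subquery finds the role_id with the lowest rank
# (the SQLite counterpart of CROSS APPLY ... TOP 1 ... ORDER BY role.rank).
query = """
SELECT u.id, u.account, r.name
FROM user AS u
JOIN role AS r
  ON r.role_id = (
       SELECT ur.role_id
       FROM user_role AS ur
       JOIN role AS r2 ON r2.role_id = ur.role_id
       WHERE ur.user_id = u.id
       ORDER BY r2.rank
       LIMIT 1
     )
ORDER BY u.id
"""
best_roles = conn.execute(query).fetchall()
print(best_roles)
# [(11, 'aaaaaaa', 'bb'), (22, 'bbbbbbb', 'bb')]
```

Both users end up with role 'bb' because it has the lowest rank (10) among the roles each of them holds. As with CROSS APPLY, a user with no roles would drop out of the result.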

EDIT:

I've also noticed that SQL Server now supports WITH TIES, which gives yet another approach (similar to the ROW_NUMBER() approach, with slightly shorter code):

  SELECT TOP(1) WITH TIES
    table.*
  FROM
    table
  CROSS APPLY
  (
    SELECT
      CASE table.role
        WHEN 'PD/PI'                  THEN 1
        WHEN 'Principal Investigator' THEN 2
        WHEN 'an other'               THEN 3
                                      ELSE 4
      END
          AS id
  )
    role_rank
  ORDER BY
    ROW_NUMBER() OVER (PARTITION BY table.user_id,
                                    table.account
                           ORDER BY role_rank.id
                      )

This can be confusing at first. It selects the first row (TOP(1)) plus every row that ties with it on the ORDER BY expression. Because the ORDER BY here is the ROW_NUMBER() itself, every row numbered 1 within its partition ties for first place, so it's functionally the same as WHERE ROW_NUMBER() = 1. (SQL Server doesn't allow ROW_NUMBER() directly in a WHERE clause.)

You can use row_number() with an ORDER BY clause in which you assign priorities to the roles.

SELECT user_id,
       account,
       role
       FROM (SELECT user_id,
                    account,
                    role,
                    row_number() OVER (PARTITION BY user_id,
                                                    account
                                       ORDER BY CASE role
                                                  WHEN 'PD/PI' THEN
                                                    1
                                                  WHEN 'Principal Investigator' THEN
                                                    2
                                                  ...
                                                END) rn
                    FROM table) x
       WHERE rn = 1;
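
A minimal runnable sketch of this approach, using Python's built-in sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The table name raw_data, the second user abc1234, and the ELSE 3 fallback are assumptions added for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_data (user_id TEXT, account TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO raw_data VALUES (?, ?, ?)",
    [
        ("paa2013", "52501050", "Principal Investigator"),
        ("paa2013", "52501050", "PD/PI"),
        ("abc1234", "52501050", "Principal Investigator"),  # hypothetical extra user
    ],
)

# Number the rows within each user_id/account group by role priority,
# then keep only the highest-priority row (rn = 1) of each group.
query = """
SELECT user_id, account, role
FROM (
    SELECT user_id, account, role,
           ROW_NUMBER() OVER (
               PARTITION BY user_id, account
               ORDER BY CASE role
                          WHEN 'PD/PI' THEN 1
                          WHEN 'Principal Investigator' THEN 2
                          ELSE 3  -- hypothetical catch-all for unlisted roles
                        END
           ) AS rn
    FROM raw_data
) x
WHERE rn = 1
ORDER BY user_id
"""
picked = conn.execute(query).fetchall()
print(picked)
# [('abc1234', '52501050', 'Principal Investigator'), ('paa2013', '52501050', 'PD/PI')]
```

Each user/account pair yields exactly one row, and the CASE expression decides which role wins when there are several.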

Simply use the LIMIT clause if you want to keep only the top row for a given selection of columns. The argument after LIMIT is the number of rows the query should return.

select user_id, account, role from raw_data limit 1;

However, if you want to keep the first entry for a given user_id/account/role combination, filter the data on that condition and then apply the limit. For example, the query below restricts the select to a particular account (52501050) and returns the top row.

select user_id, account, role from raw_data where account = '52501050' limit 1;
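
One thing worth checking with this approach is how many rows come back when the filter matches more than one user. The sketch below is a minimal illustration using Python's built-in sqlite3 module (SQLite supports LIMIT; SQL Server spells it TOP); the second user abc1234 is a hypothetical row added for the example.

```python
import sqlite3

# In-memory SQLite database used only for illustration (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_data (user_id TEXT, account TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO raw_data VALUES (?, ?, ?)",
    [
        ("paa2013", "52501050", "PD/PI"),
        ("paa2013", "52501050", "Principal Investigator"),
        ("abc1234", "52501050", "Principal Investigator"),  # hypothetical second user
    ],
)

# LIMIT 1 caps the whole result set at one row.
limited = conn.execute(
    "select user_id, account, role from raw_data "
    "where account = '52501050' limit 1"
).fetchall()

# Number of distinct user_id/account pairs matching the same filter.
groups = conn.execute(
    "select count(*) from (select distinct user_id, account "
    "from raw_data where account = '52501050')"
).fetchone()[0]

print(len(limited), groups)
# 1 2
```

So the limit returns one row in total, not one row per user and account; it fits when the filter already narrows the data to a single user/account pair. Without an ORDER BY, which of the matching rows is returned is not guaranteed.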

Comments
  • Do you only have those two values (PD/PI and Principal Investigator)? Or are there other values in the role column? And which SQL Dialect are you using? (MySQL, MS SQL Server, PostgreSQL, Oracle, etc?)
  • I'm using SQL Server. There are 5,000 rows with different combinations of person IDs, accounts, and roles. The ultimate goal is to return one role per account and person ID.
  • I suppose I could do a subquery and assign an ordinal value to each role. For example, PD/PI = 1, Principal Investigator = 2, etc. and then take the MIN or MAX of that.
  • @PaulAlbert Put your ordinals in a table, but otherwise, yeah.
  • Congrats on solving this three times. :-) I ended up using the last option you gave me. Thanks so much!
  • LIMIT in SQL Server should be TOP.
  • @Eric - To be fair, the answer pre-dated the tag (which was not included in the original question).
  • @MatBailie Shouldn't have answered until sure which dbms is used.
  • @eric You were still unnecessarily harsh in my opinion