GROUP BY and Mysql 5.7 - wrong grouping

mysql group by
mysql enable only_full_group_by
mysql 5.7 having clause
mysql group by non aggregate columns
expression #2 of select list is not in group by clause and contains nonaggregated column
mysql-error 1055
mysql any_value
mysql group by default

My hosting changed mysql version from 5.6 to 5.7 (only_full_group_by disabled) and I got a problem with my query.

SELECT  `id`, `category`, `name`, `number`, `lang`
FROM `test`
WHERE `category` = 'Cat1'
ORDER BY FIELD(lang, 'EN', 'JP')

This shows good results like this:

1 Cat1 Test1 23 EN
2 Cat1 Test2 21 EN
2 Cat1 Test1 23 JP
4 Cat1 Test1 23 JP
5 Cat1 Test2 21 JP

Then I am grouping results to get just one, like this:

SELECT *
FROM
(
SELECT  `id`, `category`, `name`, `number`, `lang`
FROM `test`
WHERE `category` = 'Cat1'
ORDER BY FIELD(lang, 'EN', 'JP')
) as table
GROUP BY number
ORDER BY number DESC

In mysql 5.6 it worked. In 5.7 not.

Result from 5.6:

1 Cat1 Test2 21 EN
2 Cat1 Test1 23 EN

Result from 5.7:

1 Cat1 Test2 21 JP
2 Cat1 Test1 23 EN

Why in mysql 5.7 GROUP BY doesn't take first element, but random one?

Using GROUP BY without aggregate function picks values arbitrarily. MySQL doesn't ensure this.

SELECT 
T.*
FROM
`test` T
INNER JOIN
(
    SELECT  
    MIN(`id`) min_id, 
    `category`,
    `number`
    FROM `test`
    WHERE `category` = 'Cat1'
    GROUP BY `number`
) as t
ON T.`id` = t.min_id AND T.category = t.category AND t.`number` = T.`number`
ORDER BY T.number DESC;

EDIT:

In order to get a single row for each number under category Cat1 with lang='EN' as highest priority :

SELECT 
*
FROM 
(
    SELECT 
    *,
    IF(@sameNumber = `number`, @rn := @rn + 1 ,
       IF(@sameNumber := `number`, @rn := 1, @rn := 1)
       ) AS groupWiseRankNumber
    FROM test 
    CROSS JOIN (SELECT @sameNumber := NULL , @rn := 1) var
    WHERE category = 'Cat1'
    ORDER BY `number` , FIELD(lang, 'EN','JP')
) AS t
WHERE t.groupWiseRankNumber <= 1;

MySQL 5.7 Reference Manual :: 12.20.3 MySQL Handling , MySQL 5.7.5 and later also permits a nonaggregate column not named in a GROUP that the selected column is functionally dependent on a grouping column. ERROR 1140 (42000): In aggregated query without GROUP BY, expression #1  A MySQL extension permits a column that does not appear in the GROUP BY list to be named in the select list. (For information about nonaggregated columns and GROUP BY, see Section 12.20.3, “MySQL Handling of GROUP BY”.)

You are merely observing what MySQL documentation has said all along: non-aggregated columns in the select of a group byquery come from indeterminate rows. You should not be using this syntax anyway. It is a MySQL (mis)feature that simply causes more problems than it solves, and it is not available in any other non-MySQL-derived database.

Here is a method to do what you want that should be much more efficient:

select t.*
from test t
where number = 21 and lang = 'EN'
union all
select t.*
from test t
where number = 21 and lang = 'JP' and
      not exists (select 1 from test t2 where t2.number = t.number and t2.lang = 'EN');

This can take advantage of an index on test(number, lang) for optimal performance.

MySQL 5.7 Reference Manual :: 12.20.2 GROUP BY , ROLLUP has a more complex effect when there are multiple GROUP BY columns​. In this case, each time there is a change in value in any but the last grouping  MySQL 5.7.5 and later also permits a nonaggregate column not named in a GROUP BY clause when ONLY_FULL_GROUP_BY SQL mode is enabled, provided that this column is limited to a single value, as shown in the following example:

It seems that all ORDER BY will be ignored if they stand in sub-query before GROUP BY in mysql 5.7.

I have the same issue in my query. I found work-around. You can add "ORDER BY" by key column. If id is primary key in your table, you can add

GROUP BY `id`

before

ORDER BY FIELD(lang, 'EN', 'JP')

Query:

`SELECT *
FROM
(
  SELECT  `id`, `category`, `name`, `number`, `lang`
  FROM `test`
  WHERE `category` = 'Cat1'
  GROUP BY `id`
  ORDER BY FIELD(lang, 'EN', 'JP')
) as table
GROUP BY number
ORDER BY number DESC`

MySQL 5.7 Reference Manual :: 8.2.1.15 GROUP BY , The first method applies the grouping operation together with all range predicates (if any). The second method first performs a range scan, and then groups the  The most efficient way to process GROUP BY is when an index is used to directly retrieve the grouping columns. With this access method, MySQL uses the property of some index types that the keys are ordered (for example, BTREE ).

You can using limit in subquery to make order by work:

SELECT
    *
FROM
    (
        SELECT
            `id`,
            `category`,
            `name`,
            `number`,
            `lang`
        FROM
            `test`
        WHERE
            `category` = 'Cat1'
        ORDER BY
            FIELD(lang, 'EN', 'JP')
        LIMIT 999999999
    ) AS TABLE
GROUP BY
    number
ORDER BY
    number DESC

Or using GROUP_CONCAT() to do it:

SELECT
    `id`,
    `category`,
    `name`,
    `number`,
    SUBSTRING_INDEX(
        GROUP_CONCAT(
            lang
            ORDER BY
                FIELD(lang, 'EN', 'JP')
        ),
        ',',
        1
    ) AS lang
FROM
    `test`
WHERE
    `category` = 'Cat1'
GROUP BY
    number
ORDER BY
    number DESC

WL#8693: Remove the syntax for GROUP BY ASC and , This is better in comparison with having to sort all the rows before grouping the data. Deprecation of the syntax GROUP BY ASC/DESC in MySQL 5.7. At the same time MySQL comes with a number of aggregate functions. They can process data from the individual rows while GROUP BY is executing. For example I can aggregate by user_id, but also remember and then list all the values from id column: mysql> SELECT user_id, GROUP_CONCAT(id) _id, COUNT(1) FROM bets WHERE user_id = 99 GROUP BY user_id;

Removal of implicit and explicit sorting for GROUP BY, mysql-5.7> SELECT id, SUM(cnt) FROM t GROUP BY id;. +------+----------+ One of them is to sort the rows before grouping them. This makes it  Luckily, the correct behaviour can be enabled by adding a flag (i.e. ONLY_FULL_GROUP_BY) or by using MySQL v5.7.5 or higher. In that case, the query would raise the following error: In that case

MySQL 5.7 -> Getting Error when trying to GROUP BY? Try this (Part I), The question that the MySQL server asks now is “Ok, I grouped it by name, but WHICH group should I show first? Bob? Jim? or Frank? This is  In MySQL 5.7 and below versions, users find the following warning in the manual. “ GROUP BY implicitly sorts by default (that is, in the absence of ASC or DESC designators for GROUP BY columns). However, relying on implicit GROUP BY sorting (that is, sorting in the absence of ASC or DESC designators) or explicit sorting for GROUP BY (that is, by using explicit ASC or DESC designators for GROUP BY columns) is deprecated.

GROUP BY, are you sure you know it? – gabi.dev, MySQL 5.7 is full of new features, like virtual columns, virtual indexes What MySQL is complaining about here is this: you grouped rows by  In MySQL Group Replication, a set of servers forms a replication group. A group has a name, which takes the form of a UUID. The group is dynamic and servers can leave (either voluntarily or involuntarily) and join it at any time. If a server joins

Comments
  • It always took a random element. It just happened to be the first one in your case. It is a documented behaviour.
  • In general, in the absence of any aggregating functions, there should be no GROUP BY clause
  • Try to add LIMIT 1000000 in the subquery. That worked for MariaDB after they "optimized" away the ORDER BY clause in the subqueries. But i can't say, if it works for MySQL 5.7. However this is just a fast fix. You should rewrite all those queries. There are many other ways to solve that problem. And most of them are faster.
  • The OP's sample data returns three rows, not one.
  • But the later query expects one row in the result set.
  • Where is lang sorting? :)
  • Would you please check the query under edit section? @Matt