MySQL query gets slower when it uses an index
I have reached a point where I cannot understand why the following MySQL query gets slower when an index is used for my WHERE clause. The column that is driving me crazy is called deleted. The table contains 4.8M rows.
SELECT SQL_NO_CACHE SUM(amount)/100 FROM transactions WHERE (type="Payment" or type="Refund") and deleted is NULL
That query takes slightly above 11 seconds when the deleted column is indexed, and 3 seconds when it is not indexed or when I add USE INDEX(), which tells the optimizer not to use any index.
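For reference, a sketch of what the hinted variant described above might look like. The empty index list is the documented way to tell MySQL to consider no indexes for that table; the exact placement in the asker's real query is an assumption.

```sql
-- Same query as above, with an empty index list so that the optimizer
-- ignores every index on transactions and falls back to a table scan.
SELECT SQL_NO_CACHE SUM(amount)/100
FROM transactions USE INDEX ()
WHERE (type = 'Payment' OR type = 'Refund')
  AND deleted IS NULL;
```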
MySQL version 5.6, tested in AWS Aurora db.r5.xlarge (4CPU/32GB)
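-- Table definition; the table name is taken from the query above
CREATE TABLE transactions (
  id int(11) NOT NULL,
  type enum('Charge','Payment','Refund','Credit Adjustment','Debit Adjustment','Transfer') NOT NULL,
  amount int(11) NOT NULL,
  deleted datetime DEFAULT NULL,
  deleted_by int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- Single-column secondary indexes
ALTER TABLE transactions
  ADD KEY type (type),
  ADD KEY deleted (deleted);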
I would appreciate any clues here!
I used EXPLAIN to check whether the above query can use the index. In my results, the index is not used with either the OR operator or IN, so I think UNION is a better choice. I also think you don't need an index on the deleted column, because it is not used either.
"explain" result for IN operator:
"explain" result for OR operator:
index on "deleted" column doesn't work:
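The EXPLAIN outputs themselves are not reproduced here. As a rough sketch, statements along these lines would produce the plans being compared; the IN form is an assumed rewrite of the original OR condition. In each result, the key, rows and Extra columns show which index (if any) was chosen and how many rows MySQL expects to examine.

```sql
-- Plan for the original OR form
EXPLAIN
SELECT SUM(amount)/100
FROM transactions
WHERE (type = 'Payment' OR type = 'Refund')
  AND deleted IS NULL;

-- Plan for the equivalent IN form
EXPLAIN
SELECT SUM(amount)/100
FROM transactions
WHERE type IN ('Payment', 'Refund')
  AND deleted IS NULL;
```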
I think I have a plausible explanation for why using the indexed column causes the slowdown. The problem lies in the data of that column, specifically in its highly skewed distribution of values and therefore of the B-tree index entries. About 4.8 M rows share the same NULL value and only 30 K rows hold 3 K unique values.
When the deleted index is used to find the NULL values, it barely reduces the set of rows that MySQL must process further, yet it adds significant overhead from traversing the B-tree index and then fetching each matching row. I suspect that without the index the summing is fast enough, even with a full table scan, to outperform the index path, whose small reduction in rows comes at the cost of significant index overhead.
The data in the deleted column inflates the cardinality of the deleted index and makes the optimizer prefer it over the index on type, which has a cardinality of just 10. If the values in both columns were evenly distributed, it would be logical to prefer the index with the higher cardinality, since it would yield a smaller subset for further processing. However, the values in deleted are heavily skewed toward NULL. As described above, using the deleted index to find NULL values adds a lot of overhead without helping performance, and it prevents the use of other, more relevant indexes, which results in the delay.
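One way to check this reasoning against the real table is to measure the distribution directly. The following is only a sketch: the column aliases are made up for illustration, and the counts quoted above are the answer's estimates, not output of this query.

```sql
-- How skewed is the deleted column?
SELECT COUNT(*)                AS total_rows,
       SUM(deleted IS NULL)    AS null_rows,          -- rows the query keeps
       COUNT(deleted)          AS non_null_rows,      -- rows the query discards
       COUNT(DISTINCT deleted) AS distinct_non_null   -- what drives index cardinality
FROM transactions;

-- Cardinality estimates the optimizer sees for each index
SHOW INDEX FROM transactions;
```

If null_rows is close to total_rows, the condition deleted IS NULL filters out almost nothing, so an index lookup on deleted saves little work.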
If you remove the index on just deleted and add this "composite" index:
INDEX(deleted, type) -- in this order
it may run faster. Note that the = column comes first (IS NULL counts), then the IN (which your OR turns into).
Even faster might be to make the index "covering":
INDEX(deleted, type, amount) -- in this order
UNION is a good trick, but not necessary here.
If deleted is rarely NULL, then the Optimizer may prefer that index, even if it turns out to be less efficient. (This may explain the problem you pose. My composite index avoids the issue.)
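A minimal sketch of the index change suggested in this answer, assuming the table and index names from the question. The name deleted_type_amount is hypothetical; any unused index name works.

```sql
-- Replace the single-column index with the covering composite index.
ALTER TABLE transactions
  DROP INDEX deleted,
  ADD INDEX deleted_type_amount (deleted, type, amount);  -- hypothetical name

-- Re-check the plan. "Using index" in the Extra column would indicate
-- that the query is answered from the index alone.
EXPLAIN
SELECT SUM(amount)/100
FROM transactions
WHERE deleted IS NULL
  AND type IN ('Payment', 'Refund');
```

Building the index on a 4.8M-row table takes time, so it is worth trying this on a copy of the data first.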
Independent issue: Why have deleted? Can't you simply have NULL to indicate the same thing?
(Edit: Apparently this is wrong for this particular situation. This answer only applies if the OR'd conditions involve different fields, or create a range check that prevents taking advantage of fields farther into an index. See comments for details.)
MySQL does not take advantage of indexes very well when presented with OR conditions. Often you can speed up a query like
SELECT a FROM b WHERE y = n1 OR y = n2
by expanding it to a union like this
SELECT a FROM b WHERE y = n1 UNION SELECT a FROM b WHERE y = n2
I've heard that more recent versions handle such conditions a little more efficiently when they are expressed in the form y IN (n1, n2), but my primary work in the last few years has been in MS SQL, so I can't say how much it has improved.
This can even be used for your straightforward summing, with a little more expansion:
SELECT SUM(subt) FROM (
  SELECT SUM(amount)/100 AS subt FROM transactions WHERE type="Payment" AND deleted IS NULL
  UNION ALL -- plain UNION would drop one subtotal if both happen to be equal
  SELECT SUM(amount)/100 AS subt FROM transactions WHERE type="Refund" AND deleted IS NULL
) AS subq