MySQL index strategy when there is no WHERE clause
I have this table
CREATE TABLE gotrax1.wifi_log (
  WifiID int(11) NOT NULL AUTO_INCREMENT,
  UnitID int(11) DEFAULT NULL,
  ServerTime timestamp NULL DEFAULT CURRENT_TIMESTAMP(),
  FileTime bigint(20) DEFAULT NULL,
  WLANTYPE text DEFAULT NULL,
  MACSRC varchar(25) DEFAULT NULL,
  MACDST varchar(25) DEFAULT NULL,
  BSSID varchar(25) DEFAULT NULL,
  SIG int(11) DEFAULT NULL,
  ESSID text DEFAULT NULL,
  PRIMARY KEY (WifiID)
);
I need to run this query on it
SELECT
  COUNT(DISTINCT(MACDST)) AS MACDST,
  COUNT(DISTINCT(MACSRC)) AS MACSRC,
  COUNT(DISTINCT(BSSID)) AS BSSID,
  COUNT(DISTINCT(MACDST))-COUNT(DISTINCT(MACSRC)) AS UnitDIFF,
  UnitID,
  FileTime,
  WLANTYPE
FROM wifi_log
GROUP BY FileTime, UnitID, WLANTYPE
ORDER BY FileTime DESC;
It is dog slow and does a full filesort. Normally I know to add an index that follows the order of the WHERE clause, but I have no idea how to do that with this query and this table to avoid the filesort. Any suggestions would be terrific, thank you.
You can't create an index on WLANTYPE as it is, because if you try to index a TEXT or BLOB column, you get this error:
ERROR 1170 (42000): BLOB/TEXT column 'wlantype' used in key specification without a key length
I would question whether you need WLANTYPE to be TEXT. Perhaps a shorter VARCHAR would be just as good.
alter table wifi_log modify wlantype varchar(10);
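Before shrinking the column, it may be worth checking what is actually stored in it, since values longer than the new length would be truncated or rejected, depending on the SQL mode. A minimal sketch, assuming the table from the question; the length of 10 above is only an example and should be compared against the result:

```sql
-- Longest WLANTYPE value currently stored, in characters.
SELECT MAX(CHAR_LENGTH(WLANTYPE)) AS max_len
FROM wifi_log;

-- Number of distinct values, to judge whether a short VARCHAR is realistic.
SELECT COUNT(DISTINCT WLANTYPE) AS distinct_types
FROM wifi_log;
```

If max_len is larger than 10, pick a VARCHAR length that fits the data instead.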
Then you can add a covering index:
alter table wifi_log add key (filetime,unitid,wlantype,macdst,macsrc,bssid);
Also get rid of the ORDER BY FileTime so you don't have to sort the result. Sort the result in your application after fetching it, if it isn't already in the order you want.
EXPLAIN SELECT
  COUNT(DISTINCT(MACDST)) AS MACDST,
  COUNT(DISTINCT(MACSRC)) AS MACSRC,
  COUNT(DISTINCT(BSSID)) AS BSSID,
  COUNT(DISTINCT(MACDST))-COUNT(DISTINCT(MACSRC)) AS UnitDIFF,
  UnitID,
  FileTime,
  WLANTYPE
FROM wifi_log
GROUP BY FileTime, UnitID, WLANTYPE
ORDER BY NULL\G

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: wifi_log
   partitions: NULL
         type: index
possible_keys: FileTime
          key: FileTime
      key_len: 366
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using index
The type: index in this explain report shows that it still has to scan the whole index, which is nearly as expensive as a table-scan. But that's natural for your query, which needs to get counts from every row.
The advantage of making this an index scan may be that it has to examine fewer pages. One index, even on 6 columns, is smaller than the whole table.
Also getting rid of the filesort will help.
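To see how much smaller the index is than the table, one option is to compare page counts from InnoDB's persistent statistics. This is a sketch that assumes InnoDB with persistent statistics enabled and the schema name gotrax1 from the question; the figures are estimates, and running ANALYZE TABLE first refreshes them:

```sql
-- One row per index; PRIMARY is the clustered index, i.e. the table data itself.
SELECT index_name,
       stat_value AS pages,
       stat_value * @@innodb_page_size AS bytes
FROM mysql.innodb_index_stats
WHERE database_name = 'gotrax1'
  AND table_name = 'wifi_log'
  AND stat_name = 'size';
```

Comparing the PRIMARY row with the row for the six-column index gives a rough idea of how many fewer pages the index scan has to read.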
I have a technique for approximating COUNT(DISTINCT ...) from summarized data. Could you build daily summaries of the data, then roll up the summaries for the totals? That is easy for COUNT (sum of counts) and SUM (sum of sums), but rather tricky for 'uniques'. The technique gives only approximations, usually within 1% of the exact result. Here is an overview of it: http://mysql.rjweb.org/doc.php/uniques
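The easy part of that idea (counts and sums) can be sketched as follows. The table wifi_log_daily and its columns are hypothetical names, and bucketing by DATE(ServerTime) is an assumption, since the question does not say how days should be derived. This is not the approximation technique from the linked page; it only shows why COUNT and SUM roll up cleanly.

```sql
-- Hypothetical summary table: one row per day and unit.
CREATE TABLE wifi_log_daily (
  LogDay date NOT NULL,
  UnitID int NOT NULL,
  Cnt    int unsigned NOT NULL,   -- rows logged that day
  SigSum bigint NOT NULL,         -- sum of SIG that day
  PRIMARY KEY (LogDay, UnitID)
);

-- Summarize yesterday's rows (run once per day).
INSERT INTO wifi_log_daily (LogDay, UnitID, Cnt, SigSum)
SELECT DATE(ServerTime), UnitID, COUNT(*), COALESCE(SUM(SIG), 0)
FROM wifi_log
WHERE ServerTime >= CURDATE() - INTERVAL 1 DAY
  AND ServerTime <  CURDATE()
  AND UnitID IS NOT NULL
GROUP BY DATE(ServerTime), UnitID;

-- Roll up: totals are the sum of daily counts and the sum of daily sums.
SELECT UnitID, SUM(Cnt) AS total_rows, SUM(SigSum) AS total_sig
FROM wifi_log_daily
GROUP BY UnitID;
```

Distinct counts cannot be rolled up this way, because the same MAC address seen on two days would be counted twice; that is the 'uniques' problem the linked page addresses.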
I would suggest creating a column to hold a hash of your WLANTYPE.
Add an index on the hash column and a trigger to set it on insert and update.
Then change your query slightly:
SELECT
  COUNT(DISTINCT(MACDST)) AS MACDST,
  COUNT(DISTINCT(MACSRC)) AS MACSRC,
  COUNT(DISTINCT(BSSID)) AS BSSID,
  COUNT(DISTINCT(MACDST))-COUNT(DISTINCT(MACSRC)) AS UnitDIFF,
  UnitID,
  FileTime,
  MAX(WLANTYPE) AS WLANTYPE
FROM wifi_log
GROUP BY FileTime, UnitID, WLANTYPEHash
ORDER BY FileTime DESC;
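The hash column, triggers, and index described above might be set up roughly like this. It is a sketch, not the answerer's exact method: the choice of MD5, the column type, the trigger and index names, and the index column order (matching the GROUP BY) are all assumptions.

```sql
-- Hash column; MD5() returns 32 hex characters (assumed hash function).
ALTER TABLE wifi_log
  ADD COLUMN WLANTYPEHash CHAR(32) CHARACTER SET ascii DEFAULT NULL;

-- Backfill existing rows.
UPDATE wifi_log SET WLANTYPEHash = MD5(WLANTYPE);

-- Keep the hash in sync on insert and update (hypothetical trigger names).
CREATE TRIGGER wifi_log_hash_bi BEFORE INSERT ON wifi_log
  FOR EACH ROW SET NEW.WLANTYPEHash = MD5(NEW.WLANTYPE);

CREATE TRIGGER wifi_log_hash_bu BEFORE UPDATE ON wifi_log
  FOR EACH ROW SET NEW.WLANTYPEHash = MD5(NEW.WLANTYPE);

-- Index in the same column order as the GROUP BY (hypothetical index name).
ALTER TABLE wifi_log
  ADD KEY idx_wifi_log_hash (FileTime, UnitID, WLANTYPEHash);
```

Two caveats: grouping by the hash treats any two WLANTYPE values with the same hash as one group, and because the query still reads WLANTYPE itself for MAX(WLANTYPE), this index does not cover the query.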
- COUNT(DISTINCT) will also slow things down.
- Run EXPLAIN SELECT ... to see the execution plan.
- You can have two indexes: (1) a composite index on [FileTime, UnitID, WLANTYPE] and (2) one on FileTime in descending order.
- Thanks. I had run EXPLAIN prior to posting, and all it says is "Using filesort" (which I included above).
- Thanks. I added these two indexes (I had put DESC on FileTime, but it didn't seem to stick). No difference; it is still using filesort.
  ALTER TABLE gotrax1.wifi_log ADD INDEX IDX_wifi_log_FileTime (FileTime);
  ALTER TABLE gotrax1.wifi_log ADD INDEX IDX_wifi_log2 (FileTime, UnitID, WLANTYPE (1));
- I doubt there is an extra sort for the ORDER BY, since it adequately matches the GROUP BY. This would indicate whether there is an extra sort: EXPLAIN FORMAT=JSON SELECT ...
- I see you're right! I get "using_filesort": false, in the JSON EXPLAIN report.
- So the best optimization is to use the index-scan instead of a table-scan. I doubt that will make much difference at the end of the day.
- First, "covering" gives a speedup. Then there are two potential sorts to try to avoid. I would expect the first 3 columns of your index to avoid the 'filesort' for GROUP BY; does it? Then, the similarity between GROUP BY and ORDER BY should obviate the filesort for ORDER BY. And, in getting rid of the first filesort, the "temp table" may go away. I see up to 4 possible speedups. (Need to see the schema and EXPLAIN FORMAT=JSON to know for sure.) Also, is it InnoDB?
- It's always InnoDB. What else? :-) I didn't populate the table with any data, which could also make a difference. If you want to do all that work and post your own answer, be my guest. :)
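The EXPLAIN FORMAT=JSON check suggested in the comments, written out in full for the original query, would look like this. It is only a sketch of how to run the check; the result depends on the indexes and data actually present.

```sql
EXPLAIN FORMAT=JSON
SELECT
  COUNT(DISTINCT(MACDST)) AS MACDST,
  COUNT(DISTINCT(MACSRC)) AS MACSRC,
  COUNT(DISTINCT(BSSID)) AS BSSID,
  COUNT(DISTINCT(MACDST))-COUNT(DISTINCT(MACSRC)) AS UnitDIFF,
  UnitID,
  FileTime,
  WLANTYPE
FROM wifi_log
GROUP BY FileTime, UnitID, WLANTYPE
ORDER BY FileTime DESC;
```

In MySQL's JSON output, the things to look for are the "using_filesort" and "using_temporary_table" flags on the ordering and grouping nodes; other servers such as MariaDB may lay the report out differently.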