"select count(id) from table" takes up to 30 minutes to calculate in SQL Azure
I have a database in SQL Azure which is not taking between 15 and 30 minutes to do a simple:
select count(id) from mytable
The database is about 3.3GB and the count is returning approx 2,000,000 but I have tried it locally and it takes less than 5 seconds!
I have also run a:
ALTER INDEX ALL ON mytable REBUILD
On all the tables in the database.
Would appreciate if anybody could point me to some things to try to diagnose/fix this.
(Please skip to UPDATE 3 below as I now think this is the issue but I still do not understand it).
UPDATE 1: It appears to take 99% of the time in a clustered index scan as image below shows. I have
UPDATE 2: And this is what the statistics messages come back as when I do:
SET STATISTICS IO ON SET STATISTICS TIME ON select count(id) from TABLE
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms. SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms. SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 317037 ms. SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms. SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms. (1 row(s) affected) Table 'TABLE'. Scan count 1, logical reads 279492, physical reads 8220, read-ahead reads 256018, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. (1 row(s) affected) SQL Server Execution Times: CPU time = 297 ms, elapsed time = 438004 ms. SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms. SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms.
UPDATE 3: OK - I have another theory now. The Azure portal is suggesting each time I do test this simply select query it is maxing out my DTU percentage to nearly 100%. I am using a Standard Azure SQL instance with performance level S1 (20 DTUs). Is it possible that this simple query is being slowed down by my DTU limit?
I realize this is old, but I had the same issue. I had a table with 2.5 million rows that I imported from an on-prem database into Azure SQL and ran at S3 level.
Select Count(0) from Table resulted in a 5-7 minute execution time vs milliseconds on-premise.
In Azure, index and table scans seem to be penalized tremendously in performance, so adding a 'useless'
WHERE to the query that forces it to perform an index seek on the clustered index helped.
In my case, this performed almost identical
Select count(0) from Table where id > 0 resulted in performance matching the on premise query.
Quick refinement of @FoggyDay post. If your tables are partitioned, you'll want to sum the rowcount.
SELECT t.name, SUM(s.row_count) row_count FROM sys.tables t JOIN sys.dm_db_partition_stats s ON t.object_id = s.object_id AND t.type_desc = 'USER_TABLE' AND t.name not like '%dss%' AND s.index_id = 1 GROUP BY t.name
- Have you checked to make sure there is no deadlock?
- Not formally but I am pretty sure there is not - I have turned off any updates so my query is the only thing which should be hitting the DB.
- Q: So did you ever resolve your Azure performance problem?
- @FoggyDay Not yet, I think I have narrowed it to be something to do with my understanding of DTUs but need to find time to investigate further. I can't understand how a simply count(id) can max out my quota so I must be missing something.
- The main answer is that DTU is a terrible metric because some queries are IO bound and others CPU bound, and DTU is some black box "blend" of both. Upgrade to a higher performance tier, but even then you may be throttled... Seems to be a design choice by MS - it's about small transactions, not aggregation or analytics.
- WOW. This worked for me too. I have a rather complex query, which contains a subquery. When i add the "useless" where clause to the subquery it went from 22 seconds to 1 second on Azure and 600ms to 160ms locally. However, this is a query I have control over. My app also uses EF which causes me enough uncertainty to look into moving back to VM.
- The select count(*) has not helped - actually, after a long period I got "A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)" I will check your other points soon.
- Have ammended my answer above after viewing also the plan - seems a clustered index scan is taking up all the time. Should this really take so long though for a simply select count(id)?
- Thank you for getting the statistics. You're doing a full table scan of 2M rows - that's bad. POSSIBLE WORKAROUND: query system table "row_count" instead: blogs.msdn.com/b/arunrakwal/archive/2012/04/09/…
- Thanks for the response. The problem is, this is only a sample query - but what if I want to do something like "select count(id) from TABLE where column = 1" for example? It is still slow. Should such a simple query really be so slow? Is 2M rows really a lot for SQL Server such that it would take over 10 minutes? Do I need to manually add new indexes? I thought part of the SQL Azure benefit was minimal management.
- I've had similar issues on a very small 500 megabyte table - full scans on this table take milliseconds when running from a SQL Server VM locally, but in Azure on a 10 DTU instance, it takes minutes to do a simple SELECT max(column). The myth of "full table scans are bad" persists. If my workload is analysis & aggregation, SQL Azure significantly underperforms. It really seems it's designed for OLTP workloads that can be easily sharded.