"select count(id) from table" takes up to 30 minutes to calculate in SQL Azure

I have a database in SQL Azure which is not taking between 15 and 30 minutes to do a simple:

select count(id) from mytable

The database is about 3.3GB and the count is returning approx 2,000,000 but I have tried it locally and it takes less than 5 seconds!

I have also run a:

ALTER INDEX ALL ON mytable REBUILD

On all the tables in the database.

Would appreciate if anybody could point me to some things to try to diagnose/fix this.

(Please skip to UPDATE 3 below as I now think this is the issue but I still do not understand it).

UPDATE 1: It appears to take 99% of the time in a clustered index scan as image below shows. I have

UPDATE 2: And this is what the statistics messages come back as when I do:

SET STATISTICS IO ON
SET STATISTICS TIME ON
select count(id) from TABLE

Statistics:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 317037 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

(1 row(s) affected)
Table 'TABLE'. Scan count 1, logical reads 279492, physical reads 8220, read-ahead reads 256018, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 297 ms,  elapsed time = 438004 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

UPDATE 3: OK - I have another theory now. The Azure portal is suggesting each time I do test this simply select query it is maxing out my DTU percentage to nearly 100%. I am using a Standard Azure SQL instance with performance level S1 (20 DTUs). Is it possible that this simple query is being slowed down by my DTU limit?

I realize this is old, but I had the same issue. I had a table with 2.5 million rows that I imported from an on-prem database into Azure SQL and ran at S3 level. Select Count(0) from Table resulted in a 5-7 minute execution time vs milliseconds on-premise.

In Azure, index and table scans seem to be penalized tremendously in performance, so adding a 'useless' WHERE to the query that forces it to perform an index seek on the clustered index helped.

In my case, this performed almost identical Select count(0) from Table where id > 0 resulted in performance matching the on premise query.

Quick refinement of @FoggyDay post. If your tables are partitioned, you'll want to sum the rowcount.

SELECT t.name, SUM(s.row_count) row_count
FROM sys.tables t
JOIN sys.dm_db_partition_stats s
ON t.object_id = s.object_id
  AND t.type_desc = 'USER_TABLE'
  AND t.name not like '%dss%'
  AND s.index_id = 1
GROUP BY t.name

Comments
  • Have you checked to make sure there is no deadlock?
  • Not formally but I am pretty sure there is not - I have turned off any updates so my query is the only thing which should be hitting the DB.
  • Q: So did you ever resolve your Azure performance problem?
  • @FoggyDay Not yet, I think I have narrowed it to be something to do with my understanding of DTUs but need to find time to investigate further. I can't understand how a simply count(id) can max out my quota so I must be missing something.
  • The main answer is that DTU is a terrible metric because some queries are IO bound and others CPU bound, and DTU is some black box "blend" of both. Upgrade to a higher performance tier, but even then you may be throttled... Seems to be a design choice by MS - it's about small transactions, not aggregation or analytics.
  • WOW. This worked for me too. I have a rather complex query, which contains a subquery. When i add the "useless" where clause to the subquery it went from 22 seconds to 1 second on Azure and 600ms to 160ms locally. However, this is a query I have control over. My app also uses EF which causes me enough uncertainty to look into moving back to VM.
  • The select count(*) has not helped - actually, after a long period I got "A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)" I will check your other points soon.
  • Have ammended my answer above after viewing also the plan - seems a clustered index scan is taking up all the time. Should this really take so long though for a simply select count(id)?
  • Thank you for getting the statistics. You're doing a full table scan of 2M rows - that's bad. POSSIBLE WORKAROUND: query system table "row_count" instead: blogs.msdn.com/b/arunrakwal/archive/2012/04/09/…
  • Thanks for the response. The problem is, this is only a sample query - but what if I want to do something like "select count(id) from TABLE where column = 1" for example? It is still slow. Should such a simple query really be so slow? Is 2M rows really a lot for SQL Server such that it would take over 10 minutes? Do I need to manually add new indexes? I thought part of the SQL Azure benefit was minimal management.
  • I've had similar issues on a very small 500 megabyte table - full scans on this table take milliseconds when running from a SQL Server VM locally, but in Azure on a 10 DTU instance, it takes minutes to do a simple SELECT max(column). The myth of "full table scans are bad" persists. If my workload is analysis & aggregation, SQL Azure significantly underperforms. It really seems it's designed for OLTP workloads that can be easily sharded.