
We have a query that creates a temporary table to store IDs for deletion and then deletes the corresponding records from a table in an Azure SQL database using transactions. The database is part of an Azure SQL Elastic Pool. While this query worked perfectly on one Azure SQL server, it did not perform as expected on another. On the problematic server, the query keeps running for hours without any noticeable change in the file size of the table. What could be the root cause of this issue, and how can we investigate it?

DECLARE @r INT;
DECLARE @batchsize INT;
 
CREATE TABLE #UnusedBlobIDs (
    ID UNIQUEIDENTIFIER PRIMARY KEY
);
 
SET @r = 1;
SET @batchsize=1000;
 
WITH [ExistingBlobs] ([BlobId])
    AS
    (SELECT [Blobs].[BlobId]
    FROM [Blobs]
    JOIN [SharedFields]
    ON '{' + CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) + '}' = [SharedFields].[Value]
    UNION
    SELECT [Blobs].[BlobId]
    FROM [Blobs]
    JOIN [SharedFields]
    ON CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) = [SharedFields].[Value]
    UNION
    SELECT [Blobs].[BlobId]
    FROM [Blobs]
    JOIN [VersionedFields]
    ON '{' + CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) + '}' = [VersionedFields].[Value]
    UNION
    SELECT [Blobs].[BlobId]
    FROM [Blobs]
    JOIN [VersionedFields]
    ON CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) = [VersionedFields].[Value]
    UNION
    SELECT [Blobs].[BlobId]
    FROM [Blobs]
    JOIN [UnversionedFields]
    ON '{' + CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) + '}' = [UnversionedFields].[Value]
    UNION
    SELECT [Blobs].[BlobId]
    FROM [Blobs]
    JOIN [UnversionedFields]
    ON CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) = [UnversionedFields].[Value]
    UNION
    SELECT [Blobs].[BlobId]
    FROM [Blobs]
    JOIN [ArchivedFields]
    ON '{' + CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) + '}' = [ArchivedFields].[Value]
    UNION
    SELECT [Blobs].[BlobId]
    FROM [Blobs]
    JOIN [ArchivedFields]
    ON CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) = [ArchivedFields].[Value])
     
    INSERT INTO #UnusedBlobIDs (ID) SELECT DISTINCT  [Blobs].[BlobId]
    FROM [Blobs]
    WHERE NOT EXISTS
    (  SELECT NULL
    FROM [ExistingBlobs]
    WHERE [ExistingBlobs].[BlobId] = [Blobs].[BlobId])
     
WHILE @r > 0
   BEGIN
       BEGIN TRANSACTION;
        DELETE TOP (@batchsize) FROM [Blobs] WHERE [Blobs].[BlobId] IN (SELECT ID FROM #UnusedBlobIDs);
        SET @r = @@ROWCOUNT;
        COMMIT TRANSACTION;
    END
 

DROP TABLE #UnusedBlobIDs;

On further investigation, I suspect that the INSERT into the temporary table itself is never completing successfully.

Solution that worked for me: instead of using a single query with joins across multiple tables, I populated the temporary table by querying the tables one by one.
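Roughly, one way to structure that split (a simplified sketch, not the exact script I ran; TRY_CAST is used here on the assumption that [Value] may also hold non-GUID text, in which case it simply does not match):

INSERT INTO #UnusedBlobIDs (ID)
SELECT [BlobId] FROM [Blobs];

-- Remove referenced IDs one table at a time instead of one big multi-join query.
DELETE U FROM #UnusedBlobIDs AS U
WHERE EXISTS (SELECT 1 FROM [SharedFields] AS SF
              WHERE TRY_CAST(SF.[Value] AS UNIQUEIDENTIFIER) = U.ID);
-- Repeat the same DELETE for [VersionedFields], [UnversionedFields] and [ArchivedFields].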

  • Could you please share the query that you're running?
  • Also, is there any difference in service tier, compute size, or storage usage between the two servers?
  • The query ran successfully on a Standard 50 eDTU elastic pool. It is not completing on a Standard 200 eDTU pool.

2 Answers


The query likely runs slowly on one Azure SQL server due to inefficient join logic, missing indexes, tempdb contention, or differences in query plans. Here are the key areas to investigate:

  1. CTE complexity and join pattern:

Your ExistingBlobs CTE combines multiple UNIONs of joins on computed expressions such as:

'{' + CONVERT(NVARCHAR(MAX), [BlobId]) + '}' = [Value]

These string concatenations and conversions disable index usage, resulting in full table scans. On large datasets, this can drastically increase query time.
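For example, one sketch of a sargable alternative (assuming [Value] holds GUID text, with or without braces; TRY_CONVERT returns NULL for anything else) is to convert [Value] instead of [BlobId]:

SELECT [Blobs].[BlobId]
FROM [Blobs]
JOIN [SharedFields]
    ON [Blobs].[BlobId] = TRY_CONVERT(UNIQUEIDENTIFIER, [SharedFields].[Value]);
-- Converting [Value] instead of [BlobId] keeps the primary key on Blobs usable for the join,
-- and a single join covers both the '{guid}' and 'guid' formats.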

  2. Missing or ineffective indexes:

Check if the [Value] column in [SharedFields], [VersionedFields], etc. is indexed.

Even where an index exists, wrapping a column in a scalar conversion such as CONVERT(NVARCHAR(MAX), ...) on one side of the join prevents the optimizer from seeking on that column's index.
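One way to check whether [Value] is indexed (dbo.SharedFields is used as an example; repeat for the other field tables):

SELECT i.[name] AS index_name, i.type_desc
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.[object_id] = i.[object_id] AND ic.index_id = i.index_id
JOIN sys.columns AS c
    ON c.[object_id] = ic.[object_id] AND c.column_id = ic.column_id
WHERE i.[object_id] = OBJECT_ID(N'dbo.SharedFields')
  AND c.[name] = N'Value';
-- If nothing comes back, [Value] is not part of any index.
-- Note: an NVARCHAR(MAX) column cannot be an index key column at all.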

  3. High tempdb usage or contention:

Since you're inserting into a temp table, a large INSERT ... SELECT may hit tempdb contention. To check the session's tempdb usage, run:

SELECT session_id, user_objects_alloc_page_count, internal_objects_alloc_page_count
FROM sys.dm_db_session_space_usage
WHERE session_id = @@SPID;
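It is also worth checking, from a second connection, what the long-running session is actually waiting on (55 below is a placeholder; substitute the session id of the stuck batch):

SELECT r.session_id, r.status, r.command, r.wait_type,
       r.wait_time, r.blocking_session_id
FROM sys.dm_exec_requests AS r
WHERE r.session_id = 55;   -- placeholder session id
-- PAGELATCH_* waits on tempdb pages suggest tempdb contention;
-- a non-zero blocking_session_id points to lock blocking instead.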
  4. Run a simplified version of the insert logic to confirm if it's the slow part:
SELECT COUNT(*) 
FROM [Blobs] 
WHERE NOT EXISTS (
    SELECT 1 
    FROM [SharedFields] 
    WHERE [SharedFields].[Value] LIKE '%' + CAST([Blobs].[BlobId] AS NVARCHAR(36)) + '%'
);

If this takes a long time, then your joins in ExistingBlobs are the root cause.

  5. Query plan differences:

The slow server may be using a bad query plan. This can happen if statistics differ or Query Store forced a suboptimal plan. Use Query Store to compare plans between the fast and slow servers:

SELECT qs.query_id, rs.avg_duration, qp.query_plan
FROM sys.query_store_query_text qt
JOIN sys.query_store_query qs ON qt.query_text_id = qs.query_text_id
JOIN sys.query_store_plan qp ON qs.query_id = qp.query_id
JOIN sys.query_store_runtime_stats rs ON qp.plan_id = rs.plan_id
WHERE qt.query_sql_text LIKE '%UnusedBlobIDs%';

If a better plan already exists in that database's Query Store, you can pin it with sp_query_store_force_plan. Note that plans cannot be copied between servers, so the fast server's plan can only serve as a reference for comparison.

For reference: sp_query_store_force_plan
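For example (42 and 7 are placeholders; use the query_id/plan_id pair of the good plan returned by the Query Store query above):

EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 7;
-- To remove the forced plan later:
-- EXEC sp_query_store_unforce_plan @query_id = 42, @plan_id = 7;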



Not knowing the full structure of your tables, I can see a couple of issues with your statements.

  • You join every table twice in your WITH block because you want to support two formats of [Value] in your secondary tables.

    • This is not required: a value containing '12341234-1234-1234-1234-123412341234' and one containing '{12341234-1234-1234-1234-123412341234}' both convert to a UNIQUEIDENTIFIER.

    • By moving the cast away from Blobs.BlobId, the system will also be able to use the primary key of Blobs to do the lookup.

  • You use UNION, which results in a fragmented execution plan.

    • In your scenario, you can actually combine the execution plan by using LEFT JOINs and NULL checks.

    • The NOT EXISTS check can be replaced in the same manner with a NULL check.

    • The execution plan will then no longer contain as many concatenation and merge steps to calculate the end result.

Try the following statement:

INSERT INTO #UnusedBlobIDs (ID)
SELECT DISTINCT [Blobs].[BlobId]
FROM [Blobs]
LEFT OUTER JOIN [SharedFields] ON [Blobs].[BlobId] = CAST([SharedFields].[Value] AS UNIQUEIDENTIFIER)
LEFT OUTER JOIN [VersionedFields] ON [Blobs].[BlobId] = CAST([VersionedFields].[Value] AS UNIQUEIDENTIFIER)
LEFT OUTER JOIN [UnversionedFields] ON [Blobs].[BlobId] = CAST([UnversionedFields].[Value] AS UNIQUEIDENTIFIER)
LEFT OUTER JOIN [ArchivedFields] ON [Blobs].[BlobId] = CAST([ArchivedFields].[Value] AS UNIQUEIDENTIFIER)
WHERE [SharedFields].[Value] IS NULL
AND [VersionedFields].[Value] IS NULL
AND [UnversionedFields].[Value] IS NULL
AND [ArchivedFields].[Value] IS NULL

Note: I can't be completely sure this will change anything as I do not know the full setup.
Note 2: Keeping a LEFT JOIN on an ExistingBlobs set (or a NOT EXISTS) instead of the NULL checks should make no real difference to the execution plan. I just wanted to introduce the option as a thought experiment.
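Note 3: CAST([Value] AS UNIQUEIDENTIFIER) will raise a conversion error if any row of [Value] holds something that is not a GUID (with or without braces). I do not know your data, but if that can happen, TRY_CAST is a safer drop-in, e.g.:

LEFT OUTER JOIN [SharedFields] ON [Blobs].[BlobId] = TRY_CAST([SharedFields].[Value] AS UNIQUEIDENTIFIER)
-- TRY_CAST returns NULL instead of failing, so non-GUID values simply do not match.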

2 Comments

Hi Kevin, I tried breaking the query into smaller queries and that approach worked for me. But I am not able to figure out how the same query worked on a 50 eDTU SQL server but not on the 200 eDTU one. I checked most of the configurations and everything seems to be the same.
Query plans are a weird thing; two servers sometimes come up with different behaviour. The answer by Mihir Saxena actually goes into detail on that and is accurate (I cannot add much more). If query plans are not your thing (as is the case for many people), know this: sometimes it is better to fragment your logic (as you did) to create faster handling. Giving SQL Server smaller chunks to handle will, more often than not, increase performance, as the plans will be smaller and simpler to handle. It is a very generic answer and not true in some scenarios, but it will help most of the time.
