
I use the following SQL query to update MyTable. The code takes between 5 and 15 minutes to update MyTable as long as the row count is <= 100000000, but when it is > 100000000 the update takes exponentially longer. How can I change this code to use a set-based approach instead of a while loop?

DECLARE @startTime DATETIME
DECLARE @batchSize INT
DECLARE @iterationCount INT
DECLARE @i INT
DECLARE @from INT
DECLARE @to INT

SET @batchSize = 10000
SET @i = 0

SELECT @iterationCount = COUNT(*) / @batchSize
FROM MyTable
WHERE LitraID = 8175
    AND id BETWEEN 100000000 AND 300000000

WHILE @i <= @iterationCount BEGIN

    BEGIN TRANSACTION T

    SET @startTime = GETDATE()
    SET @from = @i * @batchSize
    SET @to = (@i + 1) * @batchSize - 1

    ;WITH data
    AS (
        SELECT DoorsReleased, ROW_NUMBER() OVER (ORDER BY id) AS Row
        FROM MyTable
        WHERE LitraID = 8175
            AND id BETWEEN 100000000 AND 300000000
    )
    UPDATE data
    SET DoorsReleased = ~DoorsReleased
    WHERE row BETWEEN @from AND @to

    SET @i = @i + 1

    COMMIT TRANSACTION T

END
  • Edit your question and explain what you are trying to do. Commented Mar 14, 2016 at 11:35
  • Please provide the recovery model for your database and the execution plan for a single iteration. Commented Mar 14, 2016 at 11:45
  • Use some real, indexed column to count batches instead of ROW_NUMBER processing. Why don't you use id itself? Commented Mar 14, 2016 at 12:26
  • Thanks for the responses. I cannot lock the table in production for a long time, which is why each loop iteration updates 10000 rows in its own transaction. Commented Mar 15, 2016 at 6:34

2 Answers


One of your issues is that your select statement in the loop fetches all records for LitraID = 8175, sets row numbers, then filters in the update statement. This happens on every iteration.

One way round this would be to get all the ids for the update before entering the loop and store them in a temporary table. Then you can write a query similar to the one you have, but joining to this table of ids, as sketched below.
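
A minimal sketch of that temp-table approach (the #ids table, its rn column, and the variable names are illustrative, and it assumes id uniquely identifies rows in MyTable):

DECLARE @batchSize INT
DECLARE @i INT
DECLARE @rowCount INT

SET @batchSize = 10000
SET @i = 0

-- Collect the target ids once, numbering them as they are inserted
CREATE TABLE #ids (rn INT IDENTITY(1, 1) PRIMARY KEY, id INT NOT NULL)

INSERT INTO #ids (id)
SELECT id
FROM MyTable
WHERE LitraID = 8175
    AND id BETWEEN 100000000 AND 300000000

SELECT @rowCount = COUNT(*) FROM #ids

WHILE @i * @batchSize < @rowCount BEGIN

    BEGIN TRANSACTION T

    -- Each batch joins to the prebuilt id list instead of
    -- renumbering every matching row again
    UPDATE t
    SET DoorsReleased = ~t.DoorsReleased
    FROM MyTable t
    INNER JOIN #ids ids ON ids.id = t.id
    WHERE ids.rn BETWEEN @i * @batchSize + 1 AND (@i + 1) * @batchSize

    COMMIT TRANSACTION T

    SET @i = @i + 1

END

DROP TABLE #ids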

However, there is an even easier way if you know approximately how many records have LitraID = 8175 and if they are spread throughout the table, not bunched together with similar ids.

DECLARE @batchSize INT
DECLARE @minId INT
DECLARE @maxId INT

SET @batchSize = 10000 --adjust according to how frequently LitraID = 8175, larger numbers if infrequent
SET @minId = 100000000

WHILE @minId <= 300000000 BEGIN

    SET @maxId = @minId + @batchSize - 1
    IF @maxId > 300000000 BEGIN
        SET @maxId = 300000000
    END

    BEGIN TRANSACTION T

        UPDATE MyTable
        SET DoorsReleased = ~DoorsReleased
        WHERE LitraID = 8175 -- keep this filter so only the intended rows are flipped
            AND id BETWEEN @minId AND @maxId

    COMMIT TRANSACTION T

    SET @minId = @maxId + 1
END

This will use the value of id to control the loop, meaning you don't need the extra step to calculate @iterationCount. It uses small batches so that the table isn't locked for long periods. It doesn't have any unnecessary SELECT statements and the WHERE clause in the update is efficient assuming id has an index.

It won't have exactly the same number of records updated in every transaction, but there's no reason it needs to.


3 Comments

It still takes a long time. I stopped the query after nearly 3 hours and it had not finished.
A single statement is already a transaction, so that explicit transaction does nothing. Still a good answer.
@Nokomo Then you need to look at another overall design. There is no T-SQL fix. Flipping millions of rows on an active table is just odd.

This will eliminate the loop

UPDATE MyTable
   set DoorsReleased = ~DoorsReleased
 WHERE LitraID = 8175
   AND id BETWEEN 100000000 AND 300000000 
   AND DoorsReleased is not null -- if DoorsReleased is nullable
-- AND DoorsReleased <> ~DoorsReleased

If you are set on looping, note that the code below will NOT work. I thought ~ was part of the column name, but it is the bitwise NOT operator, so the WHERE condition keeps re-selecting the rows it has already flipped.

select 1; -- seed @@ROWCOUNT so the WHILE condition passes on the first iteration
WHILE (@@ROWCOUNT > 0)
BEGIN
    UPDATE top (100000) MyTable
       set DoorsReleased = ~DoorsReleased
     WHERE LitraID = 8175
       AND id BETWEEN 100000000 AND 300000000 
       AND (       DoorsReleased <> ~DoorsReleased 
             or (  DoorsReleased is null and ~DoorsReleased is not null )
           )
END

Looping inside a single transaction would have no value, as the transaction log cannot clear. And a batch size of 10,000 is small.

As stated in a comment, if you want to loop then batch on id itself rather than ROW_NUMBER(); recomputing row numbers on every iteration is expensive.

You might also be able to use OFFSET, as sketched below.
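
A rough sketch of that idea, assuming SQL Server 2012 or later for OFFSET/FETCH and an index on id (the page CTE name is illustrative; note that OFFSET still has to skip over all preceding rows, so later batches get progressively slower):

DECLARE @batchSize INT
DECLARE @offset INT
DECLARE @rows INT

SET @batchSize = 10000
SET @offset = 0
SET @rows = 1

WHILE @rows > 0 BEGIN

    -- Page through the matching ids and flip one page per pass;
    -- flipping DoorsReleased does not change the WHERE clause, so
    -- the paging stays stable between passes
    ;WITH page AS (
        SELECT id
        FROM MyTable
        WHERE LitraID = 8175
            AND id BETWEEN 100000000 AND 300000000
        ORDER BY id
        OFFSET @offset ROWS FETCH NEXT @batchSize ROWS ONLY
    )
    UPDATE t
    SET DoorsReleased = ~t.DoorsReleased
    FROM MyTable t
    INNER JOIN page p ON p.id = t.id

    SET @rows = @@ROWCOUNT -- capture before the next statement resets it
    SET @offset = @offset + @batchSize

END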

6 Comments

If DoorsReleased is a non-nullable bit then DoorsReleased <> ~DoorsReleased is always true. This means that the loop will be an infinite loop as it will keep processing the same 100,000 records, switching the values from 0 to 1 then back again next time.
@AndyNichols DoorsReleased <> ~DoorsReleased is not true if DoorsReleased = ~DoorsReleased
@AndyNichols I did not know ~ was a not operator for bit
DoorsReleased is nullable.
This query locks MyTable in production for a long time.
