I have a table T with about 500,000 records. The table is hierarchical. My goal is to update it by self-joining it on a parent-child condition. The update query is taking really long because the number of rows is so high. I have created a unique index on the columns that identify the rows to update (meaning X and Y). After creating the index the cost dropped, but the query is still very slow.

This is my query format:

update T
set (a1, b1)
= (select T_parent.a1, T_parent.b1
   from T T_parent, T T_child
   where T_parent.id = T_child.parent_id
   and T.X = T_child.X
   and T.Y = T_child.Y)

After creating the index, the execution plan shows an index scan for CRS.PARENT but a full table scan for CRS.CHILD, and another full table scan during the update. As a result the query takes forever to complete.

Please suggest any tips or recommendations to solve this problem

    Please run EXPLAIN PLAN FOR update T ....(rest of your query)...., then SELECT * FROM table( DBMS_XPLAN.Display ), then copy the result of the last query (as text, not as a bitmap) and append it to the question. You can find more on the EXPLAIN PLAN command here: docs.oracle.com/cd/B28359_01/server.111/b28274/… Commented Feb 10, 2017 at 17:44
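The two steps requested in the comment above can be sketched as follows. The UPDATE shown is a cleaned-up form of the query format in the question; the aliases T_PARENT and T_CHILD and the column names are assumptions taken from that pseudocode, not the real schema.

```sql
-- Step 1: store the plan for the statement (the statement itself is not executed).
EXPLAIN PLAN FOR
UPDATE T
SET    (A1, B1) = ( SELECT T_PARENT.A1, T_PARENT.B1
                    FROM   T T_PARENT, T T_CHILD
                    WHERE  T_PARENT.ID = T_CHILD.PARENT_ID
                    AND    T.X = T_CHILD.X
                    AND    T.Y = T_CHILD.Y );

-- Step 2: print the stored plan as text, ready to paste into the question.
SELECT * FROM TABLE( DBMS_XPLAN.DISPLAY );
```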

1 Answer

You are updating all 500,000 rows, so using an index is a bad idea. Driving the update through 500,000 separate index lookups will take much longer than it needs to.

You would be better served using a MERGE statement.

It is hard to tell exactly what your table structure is, but it would look something like this, assuming X and Y are the primary key columns in T (...could be wrong about that):

MERGE INTO T
USING ( -- each child row (identified by X, Y) paired with its parent's A1 and B1
        SELECT  TC.X,
                TC.Y,
                TP.A1,
                TP.B1
        FROM    T TC
        INNER JOIN T TP ON TP.ID = TC.PARENT_ID ) U
ON ( T.X = U.X AND T.Y = U.Y )
WHEN MATCHED THEN UPDATE SET T.A1 = U.A1,
                             T.B1 = U.B1;
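One difference from the UPDATE in the question is worth checking before switching. An UPDATE with no WHERE clause sets the target columns to NULL on every row where the subquery finds no parent, whereas the MERGE above only touches rows that have one. A minimal sketch for counting the affected rows, assuming the ID and PARENT_ID columns from the question:

```sql
-- Rows with no matching parent (including rows where PARENT_ID is NULL).
-- The correlated UPDATE would set their target columns to NULL;
-- the MERGE leaves them unchanged.
SELECT COUNT(*)
FROM   T TC
WHERE  NOT EXISTS ( SELECT 1
                    FROM   T TP
                    WHERE  TP.ID = TC.PARENT_ID );
```

If the count is zero, the two statements should produce the same result.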

7 Comments

If you are updating all 500,000 rows, it's probably even better to re-create the table and do INSERT ... SELECT ...
@BobC You might be right, but updating two columns in a 500k row table shouldn't take too long. I don't think I would introduce the complexity of DDL into this process unless I really needed to maximize the performance.
The question as posted was about performance, which is my area of expertise. You are right that performing an update might be "good enough", but I wanted to at least show that excellent performance can be achieved. The DDL/DML will not be that complex. However it opens up the possibility of direct path load (via the APPEND hint) and parallelism. Doing a transformation vs modification with direct path load will also eliminate most of the overhead of undo and redo. Again, I just want readers of the post to understand what is possible.
@BobC Fair enough. But to be clear, it also introduces at least a short window during which the data will be unavailable. Unless you intend to delete instead of truncate, in which case you lose most, if not all, of the performance benefits.
@Matthew McPeak, thank you for understanding the question and providing the best solution.
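The rebuild approach discussed in the comments could be sketched roughly as below. This is only an illustration under assumptions: T_NEW is a hypothetical table name, and the column list (ID, PARENT_ID, X, Y, A1, B1) is guessed from the question and would need to match the real table.

```sql
-- Empty copy of the table structure (T_NEW is a hypothetical name).
-- Indexes, primary keys and grants are not copied and must be re-created.
CREATE TABLE T_NEW AS SELECT * FROM T WHERE 1 = 0;

-- Direct-path load of the transformed rows via the APPEND hint.
-- LEFT JOIN keeps rows without a parent, with NULL in A1 and B1.
INSERT /*+ APPEND */ INTO T_NEW (ID, PARENT_ID, X, Y, A1, B1)
SELECT TC.ID, TC.PARENT_ID, TC.X, TC.Y, TP.A1, TP.B1
FROM   T TC
LEFT JOIN T TP ON TP.ID = TC.PARENT_ID;

-- A direct-path insert must be committed before the session reads T_NEW.
COMMIT;
```

Swapping T_NEW in for T afterwards is where the availability window mentioned in the comments comes from. To keep a row's own values when it has no parent, NVL(TP.A1, TC.A1) and NVL(TP.B1, TC.B1) could be selected instead.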