
I have an issue with a table in our old database (Postgres 9.4) that contains some duplicate rows. I want to ensure that no more duplicate rows are created.

However, I also want to keep the duplicate rows that already exist, which is why I cannot apply a unique constraint on those columns (it spans multiple columns).

I have created a trigger that checks whether the row already exists and raises an exception if it does, but it fails to catch duplicates when concurrent transactions are inserting at the same time.

Example:

TAB1

col1   |  col2  |  col3
-------+--------+--------
1      |  A     |  B       -- already present duplicates for columns col2 and col3 (allowed)
2      |  A     |  B       -- already present duplicates for columns col2 and col3 (allowed)
3      |  C     |  D

INSERT INTO TAB1 VALUES(4 , 'A' , 'B') ; -- This insert statement will not be allowed.

Note: I cannot use ON CONFLICT because it is not available in this older version of the database.

  • Can you add more information, such as the table structure and some example rows? That would let others reproduce your problem and give you faster and better answers. Commented May 23, 2020 at 11:29
  • @WilliamPrigolLopes I've added an example. Commented May 23, 2020 at 12:14

2 Answers


Presumably, you don't want new rows to duplicate historical rows. If so, you can do this but it requires modifying the table and adding a new column.

alter table t add duplicate_seq int default 1;

Then update this column to identify existing duplicates:

update t
    set duplicate_seq = seqnum
    from (select t.*, row_number() over (partition by col order by col) as seqnum
          from t
         ) tt
    where t.<primary key> = tt.<primary key>;

Now, create a unique index or constraint:

alter table t add constraint unq_t_col_seq unique (col, duplicate_seq);

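When you insert rows, do not provide a value for duplicate_seq, so the default of 1 is used. A new row will then conflict with any existing row that has the same column values, whether that row is historical or was inserted more recently. Historical duplicates are still allowed because they were given different duplicate_seq values by the update.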
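Applied to the table from the question, the steps above might look like the sketch below. It assumes that col1 is the primary key of TAB1 (the question does not say so) and the constraint name is made up for illustration.

```sql
-- Sketch only: assumes col1 is the primary key of tab1.

-- 1. Add the helper column; every existing row starts with 1.
ALTER TABLE tab1 ADD COLUMN duplicate_seq int DEFAULT 1;

-- 2. Number the rows inside each (col2, col3) group: 1, 2, 3, ...
UPDATE tab1 t
   SET duplicate_seq = tt.seqnum
  FROM (SELECT col1,
               row_number() OVER (PARTITION BY col2, col3
                                  ORDER BY col1) AS seqnum
          FROM tab1) tt
 WHERE t.col1 = tt.col1;

-- 3. Enforce uniqueness on the columns plus the helper column
--    (hypothetical constraint name).
ALTER TABLE tab1
  ADD CONSTRAINT unq_tab1_col2_col3_seq UNIQUE (col2, col3, duplicate_seq);

-- Rejected: duplicate_seq defaults to 1 and collides with row (1, 'A', 'B', 1).
INSERT INTO tab1 (col1, col2, col3) VALUES (4, 'A', 'B');

-- Accepted: no existing row has ('E', 'F', 1).
INSERT INTO tab1 (col1, col2, col3) VALUES (5, 'E', 'F');
```

Because the check is done by a unique index rather than a trigger, it also holds when concurrent transactions insert the same values: the second transaction waits for the first and then fails with a unique violation if the first commits.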


2 Comments

It's a really nice solution and I will be implementing it. But just out of curiosity, is there any other way that does not require altering the table?
@SABER-FICTIONALCHARACTER . . . It might be possible to use your trigger in combination with the filtered unique constraint suggested by pilfor, although I'm not 100% sure. You could also handle it by taking table locks on the inserts, but presumably you don't want to incur that much overhead.
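For reference, the table-lock approach mentioned in the comment could be sketched roughly as follows. This is an illustration against the TAB1 example from the question, assuming the default READ COMMITTED isolation level; it serializes every writer of the table, which is the overhead the comment warns about.

```sql
BEGIN;

-- SHARE ROW EXCLUSIVE conflicts with itself and with the ROW EXCLUSIVE lock
-- taken by ordinary INSERT/UPDATE/DELETE, so only one writer proceeds at a time.
LOCK TABLE tab1 IN SHARE ROW EXCLUSIVE MODE;

-- Insert only if no row with the same (col2, col3) exists yet.
INSERT INTO tab1 (col1, col2, col3)
SELECT 4, 'A', 'B'
 WHERE NOT EXISTS (SELECT 1
                     FROM tab1
                    WHERE col2 = 'A' AND col3 = 'B');

COMMIT;
```

With this pattern a duplicate insert affects zero rows instead of raising an error, so the caller would need to check the row count if it must report the rejection.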

You can try creating a partial unique index, so that uniqueness is enforced only for a subset of the table's rows. For example:

create unique index on t(x) where (d > '2020-01-01');
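For the table in the question, a partial unique index might look like the sketch below. It assumes that col1 is an ever-increasing key and that 3 is the highest existing value, so the predicate covers only rows inserted from now on; the index name is hypothetical.

```sql
-- Sketch only: assumes new rows always get col1 > 3.
CREATE UNIQUE INDEX unq_tab1_col2_col3_new
    ON tab1 (col2, col3)
 WHERE col1 > 3;

-- Accepted: rows 1 and 2 are outside the index, so nothing collides yet.
INSERT INTO tab1 (col1, col2, col3) VALUES (4, 'A', 'B');

-- Rejected: collides with row 4, which is inside the index.
INSERT INTO tab1 (col1, col2, col3) VALUES (5, 'A', 'B');
```

Note that the first insert succeeds even though ('A', 'B') already exists in historical rows, because the index only compares rows that satisfy its predicate.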

1 Comment

Although a reasonable solution, this allows new rows to duplicate historical values.
