Randomly selecting rows and updating another table per row in Postgres

Question

I'm using Postgres 10 and I'm looking to randomise some data.

I start by creating a temporary table and fill it with 1,000 rows of random data.

I then want to merge that into another table that may have fewer or more rows than the random data.

For each row in my dimension table I want to select a random row from the random data in the temporary table, setting the values in the dimension table to the randomly selected row's values in the temporary table.

For example:

I have a table called reference.tv_shows with the fields Name and Category.

I have a temporary table called random_tv_shows with the fields Name and Category. This data is completely random and consists of 1,000 rows.

I want to go through EACH row in the reference.tv_shows and pick a random row in the random_tv_shows table and set the reference.tv_shows Name and Category to be that of the selected row in random_tv_shows.

I tried running a fairly simple update with a subselect, but it looks as though the subselect is evaluated once and its result is then applied to every row (or maybe RANDOM() is only random once per transaction?).

UPDATE reference.tv_shows SET "Name" = (SELECT "Name" FROM random_tv_shows ORDER BY RANDOM() LIMIT 1)

Is there a way to do this in postgres?

You forgot to explain clearly what you are trying to do. You only described what you are doing, not what you have and what you want to get. Please add sample input and desired output (as in a minimal reproducible example).
– Luuk, Commented Mar 14, 2022 at 10:34
@Luuk Wow, really? I'm quite surprised. I figured that it was quite clear but obviously not. I'll try and fix it so that it's easier to understand...
– Alex, Commented Mar 14, 2022 at 10:39
@Luuk I suppose it depends how you read it. I could rephrase it a little, as it may come across as "I tried this and it didn't work" when really I was trying to ask "How would I do this?". Let me rephrase :)
– Alex, Commented Mar 14, 2022 at 10:41
So, you are really trying to pick a random row from a table that has random values (random_tv_shows)?
– Luuk, Commented Mar 14, 2022 at 10:53

Luuk · Accepted Answer · 2022-03-14 13:00:12Z


Suppose I have a table test with a field a, which is an integer. If I do this:

update test set a=random()*1000;

I will get a different random value for every record in my table.

But when I do this:

update test set a=(select random()*1000);

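All values for a will be the same, because the subquery does not reference the row being updated, so it is evaluated only once for the whole statement.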

This is shown in this DBFIDDLE
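
For anyone without access to the fiddle, the two behaviours can be reproduced with a small script along these lines. This is only a sketch: the temporary table, its id column, and the row count are made up for illustration, and the count of distinct values in the first case will vary from run to run.

```sql
-- Hypothetical scratch table, for illustration only.
CREATE TEMP TABLE test (id int, a int);
INSERT INTO test (id) SELECT generate_series(1, 100);

-- random() called directly in SET: evaluated once per updated row.
UPDATE test SET a = random() * 1000;
SELECT count(DISTINCT a) AS distinct_direct FROM test;    -- many distinct values

-- Uncorrelated scalar subquery: evaluated once for the whole statement.
UPDATE test SET a = (SELECT random() * 1000);
SELECT count(DISTINCT a) AS distinct_subquery FROM test;  -- a single value
```

The same thing happens in the UPDATE from the question: the subselect with ORDER BY RANDOM() LIMIT 1 does not depend on the row being updated, so one row is picked and reused for every tv_show.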

Because every row in reference.tv_shows has to be matched with exactly one random row, you need a unique identifier for every tv_show. Currently that information is not available in the question.

EDIT: I tried to reproduce your data (less records, and lack of imagination on categories, but... 😉).

When you have a unique id in your tables you can do:

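-- Number the random rows in shuffled order, then pair row R with tv_shows.id = R.
-- ROW_NUMBER() OVER () would number the rows before the outer ORDER BY is applied,
-- so the shuffle has to happen inside the window definition.
-- Assumes tv_shows.id is unique and runs 1..N, with N <= the number of random rows.
UPDATE tv_shows
SET Name = rts.Name,
    Category = rts.Category
FROM (SELECT ROW_NUMBER() OVER (ORDER BY RANDOM()) R, Name, Category
      FROM random_tv_shows) rts
WHERE rts.R = tv_shows.id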

see DBFIDDLE
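
The join on id only works when tv_shows.id lines up with the generated row numbers and random_tv_shows has at least as many rows as tv_shows. A possible alternative, sketched here under assumptions and not taken from the fiddle, is a correlated subquery: referencing the outer row inside the subquery usually makes Postgres re-run it for each updated row, so every tv_show gets its own independent pick (the same random row may be chosen more than once). The table definitions and sample data below are hypothetical, and the predicate tv_shows.id = tv_shows.id exists only to create the correlation. It is worth checking with EXPLAIN on your own version that the subquery shows up as a per-row SubPlan.

```sql
-- Hypothetical tables and data, for illustration only.
CREATE TEMP TABLE tv_shows (id int, name text, category text);
CREATE TEMP TABLE random_tv_shows (name text, category text);

INSERT INTO tv_shows (id, name, category)
SELECT g, 'old name ' || g, 'old category'
FROM generate_series(1, 20) AS g;

INSERT INTO random_tv_shows (name, category)
SELECT 'show ' || g, 'category ' || (g % 5)
FROM generate_series(1, 1000) AS g;

-- Multi-column SET takes name and category from the same randomly picked row.
UPDATE tv_shows
SET (name, category) = (
    SELECT r.name, r.category
    FROM random_tv_shows r
    WHERE tv_shows.id = tv_shows.id   -- outer reference: makes the subquery correlated
    ORDER BY random()
    LIMIT 1
);

SELECT id, name, category FROM tv_shows ORDER BY id;
```

This sorts random_tv_shows once per updated row, which is fine for 1,000 random rows but could become slow for large tables.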

edited Mar 14, 2022 at 13:00

answered Mar 14, 2022 at 11:54


3 Comments

Alex Over a year ago

Haha, bingo! Thank you. Sorry I didn't provide any data. I thought it might be a quick thing, but it looks as though it was a bit more complex and some data may have helped. Thank you again.

Nishant Ghodke Over a year ago

I am working on a similar task and wondering how to update each row without a common unique id between the two tables.

Luuk Over a year ago

@NishantGhodke: When you have a new question, please ask a new question.
