2

I have a jsonb column (called info) in Postgres which structure looks like this:

{ name: 'john', house_id: null, extra_attrs: [{ attr_id: 4, attr_value: 'a value' }, { attr_id: 5, attr_value: 'another value' }] }

It can have N extra_attrs but we know that each of them will have just two keys: the attr_id and the attr_value.

Now, what is the best way to query for info that has extra_attrs with a specific attr_id and attr_value. I have done it like this, and it works:

Given the following data structure to query for:

[{ attr_id: 4, values: ['a value', 'something else'] }, { attr_id: 5, values: ['another value'] }]

The following query works:

select * from people
where (info @> '{"extra_attrs": [{ "attr_id": 4, "attr_value": "a value" }]} OR info @> '{"extra_attrs": [{ "attr_id": 4, "attr_value": "something else" }]) AND info @> '{"extra_attrs": [{ "attr_id": 5, "attr_value": "another value" }]}

I am wondering if there is a better way to do so or this is fine.

1 Answer 1

1

One alternate method would involve json functions and transforming data to apply the filter on:

SELECT people.info
FROM people,
LATERAL (SELECT DISTINCT True is_valid
         FROM jsonb_array_elements(info->'extra_attrs') y
         WHERE (y->>'attr_id', y->>'attr_value') IN (
             ('4', 'a value'), 
             ('4', 'something else'),
             ('5','another value')
         )
) y
WHERE is_valid

I believe this method more convenient for dynamic filters since the id/value pairs are added in only 1 place.

A similar (and perhaps slightly faster) method would use WHERE EXISTS and compare json documents like below.

SELECT people.info
FROM people
WHERE EXISTS (SELECT TRUE 
       FROM jsonb_array_elements(info->'extra_attrs') attrs
       WHERE attrs @> ANY(ARRAY[
           JSONB '{ "attr_id": 4, "attr_value": "a value" }',
           JSONB '{ "attr_id": 4, "attr_value": "something else" }',
           JSONB '{ "attr_id": 5, "attr_value": "another value" }'
              ]
       )
)
Sign up to request clarification or add additional context in comments.

1 Comment

I'm not sure which method would be faster. it would probably depend on the data. if you have a test dataset, please run the three queries (yours, and the two I suggest) and post the execution times here.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.