3
t1
id|entity_type
9|3
9|4
9|5
2|3
2|5
           
t2  
id|entity_type
1|3
1|4
1|5     

SELECT t1.id, array_agg(t1.entity_type)
    FROM t1
GROUP BY
    t1.id
HAVING ARRAY_AGG(t1.entity_type ORDER BY t1.entity_type) = 
    (SELECT ARRAY_AGG(t2.entity_type ORDER BY t2.entity_type) 
        FROM t2
    WHERE t2.id = 1
    GROUP BY t2.id);

Result:

t1.id = 9|array_agg{3,4,5}      
    

I have two tables t1 and t2. I want to get value of t1.id where t1.entity_type array equals t2.entity_type array.

In this scenario everything works fine. For t2.id = 1 I receive t1.id = 9. Both have the same array of entity_type: {3,4,5}

Now I'd like to get t1.id not only for equal sets, but also for smaller sets. If I modify t2 this way:

t2  
id|entity_type
1|3
1|4

and modify query this way:

SELECT t1.id, array_agg(t1.entity_type)
    FROM t1
GROUP BY
    t1.id
HAVING ARRAY_AGG(t1.entity_type ORDER BY t1.entity_type) >= /*MODIFICATION*/
    (SELECT ARRAY_AGG(t2.entity_type ORDER BY t2.entity_type) 
        FROM t2
    WHERE t2.id = 1
    GROUP BY t2.id);
    

I don't receive the expected result:

t1.id = 1 has {3, 4, 5}     
t2.id = 1 has {3, 4}

Arrays in t1 that contain the array in t2 should qualify. I expect to receive results as in first case but I get no rows.
Is there any method like: ARRAY_AGG contains another ARRAY_AGG?

2
  • 1
    That's not what >= does. I think you want @>. It's all explained in the docs: postgresql.org/docs/9.3/static/functions-array.html Commented Oct 22, 2014 at 14:19
  • Is the combination of (id, entity_type) unique in both tables? Ideally, you should provide complete table definitions to clarify the details. And your version of Postgres, of course. Commented Oct 22, 2014 at 15:40

1 Answer 1

5

Clean up

It would be inefficient to use two differing invocations of array_agg(). Use the same (ORDER BY in SELECT list and HAVING clause):

SELECT id, array_agg(entity_type ORDER BY entity_type) AS arr
FROM   t1
GROUP  BY 1
HAVING array_agg(entity_type ORDER BY entity_type) = (
   SELECT array_agg(entity_type ORDER BY entity_type)
   FROM   t2
   WHERE  id = 1
   -- GROUP  BY id   -- not needed
   );

Syntax basics in the manual.

"contains" operator @>

Like Nick commented, your 2nd query would work with the "array contains" operator @>

SELECT id, array_agg(entity_type ORDER BY entity_type) AS arr
FROM   t1
GROUP  BY 1
HAVING array_agg(entity_type ORDER BY entity_type) @> (
   SELECT array_agg(entity_type ORDER BY entity_type)
   FROM   t2
   WHERE  id = 1
   );

But this is very inefficient for big tables.

Faster query

The problem can be cast as a case of relational division. Depending on your table definition, there are more efficient techniques. We have gathered a whole arsenal under this related question:

Assuming (id, entity_type) is unique in both tables, this should be substantially faster for big tables, especially because it can use an index on t1 (as opposed to your original query):

SELECT t1.id
FROM   t2
JOIN   t1 USING (entity_type)
WHERE  t2.id = 1
GROUP  BY 1
HAVING count(*) = (SELECT count(*) FROM t2 WHERE id = 1);

You need two indexes:

First on t2(id), typically covered by the primary key.
Second:

CREATE INDEX t1_foo_idx ON t1 (entity_type, id);

The added id column is optional to allow index-only scans. Sequence of columns is essential:

fiddle
Old sqlfiddle

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.