I have a table like this:

CREATE TABLE preferences (name varchar, preferences varchar[]);
INSERT INTO preferences (name, preferences) 
VALUES 
    ('John','{pizza, spaghetti}'), 
    ('Charlie','{spaghetti, rice}'), 
    ('Lucy','{rice, potatoes}'), 
    ('Beth','{bread, cheese}'), 
    ('Trudy','{rice, milk}');

So from the table

John      {pizza, spaghetti}
Charlie   {spaghetti, rice}
Lucy      {rice, potatoes}
Beth      {bread, cheese}
Trudy     {rice, milk}

I would like to group all rows that have elements in common, even if the connection runs through other people. So in this case I would like to end up with:

{John,Charlie,Lucy,Trudy}     {pizza,spaghetti,rice,potatoes,milk}
{Beth}                        {bread, cheese}

because John's preferences intersect with those of Charlie, and those of Charlie intersect with those of Lucy and with those of Trudy.

I already have an array_intersection function like this:

CREATE OR REPLACE FUNCTION array_intersection(anyarray, anyarray)
  RETURNS anyarray
  language sql
as $FUNCTION$
    SELECT ARRAY(
        SELECT UNNEST($1)
        INTERSECT
        SELECT UNNEST($2)
    );
$FUNCTION$;

and I know the array_agg function for aggregating arrays, but the step I am missing is how to turn these into the grouping I want.

  • How do you plan to use your function? Create an aggregate on it? Compare with window? Commented Sep 29, 2017 at 9:52
  • You want array_intersects because you can't use the && operator? How is it different? Commented Sep 29, 2017 at 10:04
  • @VaoTsun This is just a sample to reduce the question to the bare essentials. IRL the data is more complex, but basically I want to group by in the way illustrated in the example. From the first table to the second. On all other columns in the real world case I will use aggregate functions. Commented Sep 29, 2017 at 12:58
  • @VaoTsun I can indeed (and did) remove my array_intersects function in favor of &&. I kind of forgot about &&. But it doesn't change the question. Commented Sep 29, 2017 at 12:59

1 Answer

This is a typical task for a recursive query. You first need an auxiliary function that merges two arrays into a single sorted array without duplicates:

create or replace function public.array_merge(arr1 anyarray, arr2 anyarray)
    returns anyarray
    language sql immutable
as $function$
    -- union removes duplicates; order by elem sorts the result,
    -- so equal sets of elements always yield identical arrays
    select array_agg(elem order by elem)
    from (
        select unnest(arr1) as elem
        union
        select unnest(arr2)
    ) s
$function$;

Use the function in the recursive query:

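with recursive cte(name, preferences) as (
    -- non-recursive term: every person starts with their own preferences
    select name, preferences
    from preferences
union
    -- recursive term: give each person p the merge of their own array with
    -- any accumulated array that overlaps it; union discards repeated rows,
    -- so the recursion stops once no new arrays appear
    select p.name, array_merge(c.preferences, p.preferences)
    from cte c
    join preferences p
    on c.preferences && p.preferences
    and c.name <> p.name
)
-- keep the largest merged array per name, then group names sharing that array
select array_agg(name) as names, preferences
from (
    select distinct on(name) *
    from cte
    order by name, cardinality(preferences) desc
    ) s
group by preferences;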

           names           |             preferences              
---------------------------+--------------------------------------
 {Charlie,John,Lucy,Trudy} | {milk,pizza,potatoes,rice,spaghetti}
 {Beth}                    | {bread,cheese}
(2 rows)    
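
The grouping asked for here amounts to finding connected components: two people belong to the same group if a chain of shared preferences links them. As a cross-check of the expected result outside the database, here is a minimal Python sketch that uses a union-find structure instead of the recursive query above. The function name group_by_shared_items and the dict-based input are assumptions made for illustration; they are not part of the question or of PostgreSQL.

```python
from collections import defaultdict


def group_by_shared_items(rows):
    """Group names that are linked, directly or transitively, by shared items.

    `rows` maps a name to a list of items (hypothetical input shape).
    Returns a sorted list of (sorted names, sorted items) tuples.
    """
    parent = {name: name for name in rows}

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every name to the first name seen holding the same item.
    first_owner = {}
    for name, items in rows.items():
        for item in items:
            if item in first_owner:
                union(first_owner[item], name)
            else:
                first_owner[item] = name

    # Collect names and items per component root.
    names_by_root = defaultdict(set)
    items_by_root = defaultdict(set)
    for name, items in rows.items():
        root = find(name)
        names_by_root[root].add(name)
        items_by_root[root].update(items)

    return sorted(
        (sorted(names_by_root[r]), sorted(items_by_root[r]))
        for r in names_by_root
    )


# Sample data copied from the question.
preferences = {
    "John": ["pizza", "spaghetti"],
    "Charlie": ["spaghetti", "rice"],
    "Lucy": ["rice", "potatoes"],
    "Beth": ["bread", "cheese"],
    "Trudy": ["rice", "milk"],
}

groups = group_by_shared_items(preferences)
for names, items in groups:
    print(names, items)
```

For the sample data this yields the same two groups as the query: Charlie, John, Lucy and Trudy with milk, pizza, potatoes, rice and spaghetti, and Beth alone with bread and cheese.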