Is there something like a zip() function in PostgreSQL that combines two arrays?

Question

I have two array values of the same length in PostgreSQL:

{a,b,c} and {d,e,f}

and I'd like to combine them into

{{a,d},{b,e},{c,f}}

Is there a way to do that?

Erwin Brandstetter · Accepted Answer · 2021-05-22 22:14:41Z

43

Postgres 9.5 or later

has array_agg(array expression):

array_agg ( anyarray ) → anyarray

Concatenates all the input arrays into an array of one higher dimension. (The inputs must all have the same dimensionality, and cannot be empty or null.)

This is a drop-in replacement for my custom aggregate function array_agg_mult() demonstrated below. It's implemented in C and considerably faster. Use it.

Postgres 9.4

Use the ROWS FROM construct or the updated unnest() which takes multiple arrays to unnest in parallel. Each can have a different length. You get (per documentation):

[...] the number of result rows in this case is that of the largest function result, with smaller results padded with null values to match.

Use this cleaner and simpler variant:

SELECT ARRAY[a,b] AS ab
FROM   unnest('{a,b,c}'::text[] 
            , '{d,e,f}'::text[]) x(a,b);

Postgres 9.3 or older

Simple zip()

Consider the following demo for Postgres 9.3 or earlier:

SELECT ARRAY[a,b] AS ab
FROM  (
   SELECT unnest('{a,b,c}'::text[]) AS a
        , unnest('{d,e,f}'::text[]) AS b
    ) x;

Result:

  ab
-------
 {a,d}
 {b,e}
 {c,f}

Note that both arrays must have the same number of elements to unnest in parallel, or you get a cross join instead.

You can wrap this into a function, if you want to:

CREATE OR REPLACE FUNCTION zip(anyarray, anyarray)
  RETURNS SETOF anyarray LANGUAGE SQL AS
$func$
SELECT ARRAY[a,b] FROM (SELECT unnest($1) AS a, unnest($2) AS b) x;
$func$;

Call:

SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]);

Same result.

zip() to multi-dimensional array:

Now, if you want to aggregate that new set of arrays into one 2-dimenstional array, it gets more complicated.

SELECT ARRAY (SELECT ...)

or:

SELECT array_agg(ARRAY[a,b]) AS ab
FROM  (
   SELECT unnest('{a,b,c}'::text[]) AS a
         ,unnest('{d,e,f}'::text[]) AS b
    ) x

or:

SELECT array_agg(ARRAY[ARRAY[a,b]]) AS ab
FROM  ...

will all result in the same error message (tested with pg 9.1.5):

ERROR: could not find array type for data type text[]

But there is a way around this, as we worked out under this closely related question.
Create a custom aggregate function:

CREATE AGGREGATE array_agg_mult (anyarray) (
   SFUNC    = array_cat
 , STYPE    = anyarray
 , INITCOND = '{}'
);

And use it like this:

SELECT array_agg_mult(ARRAY[ARRAY[a,b]]) AS ab
FROM  (
   SELECT unnest('{a,b,c}'::text[]) AS a
        , unnest('{d,e,f}'::text[]) AS b
    ) x

Result:

{{a,d},{b,e},{c,f}}

Note the additional ARRAY[] layer! Without it and just:

SELECT array_agg_mult(ARRAY[a,b]) AS ab
FROM ...

You get:

{a,d,b,e,c,f}

Which may be useful for other purposes.

Roll another function:

CREATE OR REPLACE FUNCTION zip2(anyarray, anyarray)
  RETURNS SETOF anyarray LANGUAGE SQL AS
$func$
SELECT array_agg_mult(ARRAY[ARRAY[a,b]])
FROM (SELECT unnest($1) AS a, unnest($2) AS b) x;
$func$;

Call:

SELECT zip2('{a,b,c}'::text[],'{d,e,f}'::text[]); -- or any other array type

Result:

{{a,d},{b,e},{c,f}}

edited May 22, 2021 at 22:14

answered Sep 13, 2012 at 21:14

Erwin Brandstetter

669k160 gold badges1.2k silver badges1.3k bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Craig Ringer Over a year ago

In PostgreSQL 9.3 and above it'll be possible to rewrite this to use the SQL-standard LATERAL instead of PostgreSQL's sometimes "interesting" set-returning-functions-in-select-list feature. Until then, beware when using arrays of different lengths, you might be surprised by the results.

Erwin Brandstetter Over a year ago

@CraigRinger: Right. For the purpose of this question, the OP defined arrays of the same length, though.

Craig Ringer Over a year ago

Yep, just making sure it's noted, because others who find this might not. Personally I'd write it in PL/PgSQL and add a sanity check.

Craig Ringer · Accepted Answer · 2013-07-30 02:23:26Z

8

Here's another approach that's safe for arrays of differing lengths, using the array multi-aggregation mentioned by Erwin:

CREATE OR REPLACE FUNCTION zip(array1 anyarray, array2 anyarray) RETURNS text[]
AS $$
SELECT array_agg_mult(ARRAY[ARRAY[array1[i],array2[i]]])
FROM generate_subscripts(
  CASE WHEN array_length(array1,1) >= array_length(array2,1) THEN array1 ELSE array2 END,
  1
) AS subscripts(i)
$$ LANGUAGE sql;

regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]);
         zip         
---------------------
 {{a,d},{b,e},{c,f}}
(1 row)


regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f,g}'::text[]);
             zip              
------------------------------
 {{a,d},{b,e},{c,f},{NULL,g}}
(1 row)

regress=> SELECT zip('{a,b,c,z}'::text[],'{d,e,f}'::text[]);
             zip              
------------------------------
 {{a,d},{b,e},{c,f},{z,NULL}}
(1 row)

If you want to chop off the excess rather than null-padding, just change the >= length test to <= instead.

This function does not handle the rather bizarre PostgreSQL feature that arrays may have a stating element other than 1, but in practice nobody actually uses that feature. Eg with a zero-indexed 3-element array:

regress=> SELECT zip('{a,b,c}'::text[], array_fill('z'::text, ARRAY[3], ARRAY[0]));
          zip           
------------------------
 {{a,z},{b,z},{c,NULL}}
(1 row)

wheras Erwin's code does work with such arrays, and even with multi-dimensional arrays (by flattening them) but does not work with arrays of differing length.

Arrays are a bit special in PostgreSQL, they're a little too flexible with multi-dimensional arrays, configurable origin index, etc.

In 9.4 you'll be able to write:

SELECT array_agg_mult(ARRAY[ARRAY[a,b])
FROM unnest(array1) WITH ORDINALITY as (o,a)
NATURAL FULL OUTER JOIN
unnest(array2) WITH ORDINALITY as (o,b);

which will be a lot nicer, especially if an optimisation to scan the functions together rather than doing a sort and join goes in.

edited Jul 30, 2013 at 2:23

answered Jul 29, 2013 at 6:03

Craig Ringer

329k84 gold badges742 silver badges820 bronze badges

3 Comments

dan Over a year ago

Thanks for the answer. So you're not a fan of using the Array datatype in Postgres?

Craig Ringer Over a year ago

@dan On the contrary, it's extremely useful, particularly as an intermediate result in complex queries and as an optimisation. It's just that the way arrays are implemented in Pg makes their behaviour a little ... surprising ... at times. Arrays with configurable starting indexes, multi-dimensional arrays under a single array type, etc. I just wish there was a more conventional simplified array type.

Erwin Brandstetter Over a year ago

Nice add and interesting bit of news from pg 9.4. BTW.: One can use arr_var[-2147483648:2147483647] to "normalize" array subscripts; by courtesy of Daniel.

Collectives™ on Stack Overflow

Is there something like a zip() function in PostgreSQL that combines two arrays?

2 Answers 2

Postgres 9.5 or later

Postgres 9.4

Postgres 9.3 or older

Simple zip()

zip() to multi-dimensional array:

3 Comments

3 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Postgres 9.5 or later

Postgres 9.4

Postgres 9.3 or older

Simple zip()

zip() to multi-dimensional array:

3 Comments

3 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related