PostgreSQL: Comparing two arrays but including the order

Question

I wish to compare two arrays in a Postgres query, returning true when the first array is embedded within the second array. The smaller array can occur at any point within the larger one. It's probably best shown with an example. For the following, *cmp* is the magical operator that I'm hoping to find.

{b,c}   *cmp* {a,b,b,c,d} -- true
{b,d}   *cmp* {a,b,b,c,d} -- false
{a,b,b} *cmp* {a,b,b,c,d} -- true
{a,b}   *cmp* {a,b,b,c,d} -- true
{a,b,c} *cmp* {a,b,b,c,d} -- false

I know of the <@ operator, which is a good start, but does not take into account the order of elements.

   {b,d} <@ {a,b,b,c,d} -- true, but I want false

I have a workaround in my code which is quite ugly (Perl's DBD::Pg uses '?' as a placeholder):

array_values::text similar to '%({|,)' || ? || '(,|})%'

It seems to work, but I'd love to be able to use an index here. It will also fall over whenever quotes are used in the text representation, but fortunately that won't happen in my use case. Am I missing a trick?

EDIT

I probably should have given better examples. Here are some more:

{bb,c}   *cmp* {a,b,bb,c,d} -- true
{b,c}    *cmp* {a,b,bb,c,d} -- false
{a,b,bb} *cmp* {a,b,bb,c,d} -- true
{a,b,b}  *cmp* {a,b,bb,c,d} -- false
{c,d}    *cmp* {a,b,bb,c,d} -- true

I cannot think of anything smarter than a substring match either. That can be made fast with a trigram index.
– Laurenz Albe, commented Nov 12, 2019 at 11:38
You can write a function that doesn't use string concatenation, but I don't see a way to make that use an index.
– user330315, commented Nov 12, 2019 at 14:22

Jeremy · Accepted Answer · 2019-11-12 15:41:01Z

2

You can do this without comparing text versions, though I'm not sure the performance will be better. Basically, use the @> operator as a (hopefully) fast fail, then find every position of the candidate array's first element in the test array. For each position, take a slice of the test array with the same length as the candidate array and check whether it equals the candidate array.

CREATE TABLE test (a text[]);
INSERT INTO test VALUES ('{bb,c}'), ('{b,c}'), ('{a,b,bb}'), ('{a,b,b}'), ('{c,d}');

SELECT a,
       -- Fast fail: every element of a must occur somewhere in the test array.
       '{a, b, bb, c, d, b}' @> a AND (
         -- For each position of a's first element, compare the slice of the
         -- same length as a against a itself.
         SELECT bool_or(
               ('{a, b, bb, c, d, b}'::text[])[x:(x+array_length(a,1) - 1)] = a
               )
         FROM unnest(array_positions('{a, b, bb, c, d, b}', a[1])) as pos(x)
       )
FROM test;

    a     | ?column?
----------+----------
 {bb,c}   | t
 {b,c}    | f
 {a,b,bb} | t
 {a,b,b}  | f
 {c,d}    | t

I added an extra 'b' to the test array so array_positions would return more than one result for the second test.
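
If the check is needed in more than one query, the same logic could be wrapped in a function. The following is only a sketch: the function name array_contains_sequence and its parameter names are made up for illustration, and it assumes one-dimensional text arrays with the default lower bound of 1. The COALESCE is an addition so that the result is false rather than NULL when there is no candidate position to check (for example, an empty needle).

```sql
-- Hypothetical helper (name and parameters are illustrative, not from the
-- original answer): true when needle occurs as a contiguous, ordered run
-- inside haystack. Assumes one-dimensional arrays indexed from 1.
CREATE OR REPLACE FUNCTION array_contains_sequence(haystack text[], needle text[])
RETURNS boolean
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT haystack @> needle                      -- fast fail on missing elements
       AND COALESCE(
             (SELECT bool_or(
                       haystack[x:(x + array_length(needle, 1) - 1)] = needle
                     )
              -- every position where needle's first element occurs
              FROM unnest(array_positions(haystack, needle[1])) AS pos(x)),
             false);
$$;

-- Usage with two of the examples from the question:
SELECT array_contains_sequence('{a,b,bb,c,d}', '{bb,c}');  -- true
SELECT array_contains_sequence('{a,b,bb,c,d}', '{b,c}');   -- false
```

Like the inline query, this does not make the order check itself indexable; only an @> condition written directly in the WHERE clause can be served by an index on the array column.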


Wow, that's quite an involved process! Similar in intent to @jjanes's solution, but this one benefits from being general purpose, whereas jjanes's will fail for elements with quotes embedded in them.
– Christopher Causer, commented over a year ago

jjanes · Accepted Answer · 2019-11-12 14:18:23Z

0

Use @> so that the query can use an index on the array column (a GIN index, for example), then recheck the order using your current method.

array_values::text similar to '%({|,)' || $1 || '(,|})%' and array_values @> ('{'||$1||'}')::text[]

How fast this is depends on how many rows have all the right values but not in the right order.

I use $1 rather than ? so that you don't have to specify the same parameter twice from your Perl code.
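
As a rough sketch of how this might look end to end: the table name items, the index name, and the prepared statement name find_seq are all hypothetical; only the column name array_values and the two conditions come from the question and the expression above. A GIN index on a text[] column supports @> by default, but whether the planner actually uses it depends on table size and statistics.

```sql
-- Hypothetical table and index names; only array_values is from the question.
CREATE TABLE items (id serial PRIMARY KEY, array_values text[]);
CREATE INDEX items_array_values_gin ON items USING gin (array_values);

INSERT INTO items (array_values) VALUES
    ('{a,b,bb,c,d}'),   -- contains bb,c in order
    ('{c,a,bb}');       -- contains both values, but not as an ordered run

-- $1 is the comma-separated sequence to look for, e.g. 'bb,c'.
PREPARE find_seq(text) AS
SELECT id, array_values
FROM items
WHERE array_values @> ('{' || $1 || '}')::text[]                  -- prefilter that can use the index
  AND array_values::text SIMILAR TO '%({|,)' || $1 || '(,|})%';   -- order recheck

EXECUTE find_seq('bb,c');  -- only the first row satisfies both conditions
```

The same caveat as in the question applies: the text comparison breaks down once elements need quoting in the array's text representation.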
