PostgreSQL - How to get distinct on two columns separately?

Question

I've a table like this:

Source table "tab"
column1   column2
      x         1
      x         2
      y         1
      y         2
      y         3
      z         3

How can I build a query that returns the unique values of each of the two columns separately? For example, I'd like to get a result like one of these sets:

column1   column2
      x         1
      y         2
      z         3

or

column1   column2
      x         2
      y         1
      z         3

or ...

Thanks.

@Rory: "3" should be in column2 results, because it's unique.
– ptb, Commented Mar 1, 2014 at 10:59
This is a complex combinatorics problem that cannot easily be solved with standard SQL commands. The difficulty is to pick existing combinations so that every element is represented once in the result. A solution is often impossible: consider the set (x, 1), (y, 1).
– Erwin Brandstetter, Commented Mar 1, 2014 at 20:30

Rory · Accepted Answer · 2014-03-02 16:59:29Z


What you're asking for is difficult because it's unusual: SQL treats a row as a set of related fields, but you're asking for two separate lists (the distinct values from col1 and the distinct values from col2) displayed in one output table, without caring how the rows match up.

You can do this by writing the SQL along those lines. Write a separate select distinct for each column, then put them together somehow. I'd put them together by giving each row in each result a row number, then joining them both to a big list of numbers.
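The "big list of numbers" idea can be sketched in PostgreSQL roughly as follows. This is a minimal sketch, not a definitive implementation: the table name tab and the columns column1 and column2 are taken from the question, while the names c1, c2, n, d and rn are made up for illustration. It assumes generate_series is acceptable as the source of the number list.

```sql
-- Sketch only: tab/column1/column2 come from the question;
-- c1, c2, n, d and rn are hypothetical names chosen for this example.
with c1 as (
    -- distinct values of column1, numbered 1..k in sorted order
    select column1, row_number() over (order by column1) as rn
    from (select distinct column1 from tab) d
),
c2 as (
    -- distinct values of column2, numbered 1..m in sorted order
    select column2, row_number() over (order by column2) as rn
    from (select distinct column2 from tab) d
),
n as (
    -- the "big list of numbers": 1 .. length of the longer distinct list
    select g.rn
    from generate_series(
        1,
        greatest((select count(*) from c1), (select count(*) from c2))
    ) as g(rn)
)
select c1.column1, c2.column2
from n
left join c1 on c1.rn = n.rn
left join c2 on c2.rn = n.rn
order by n.rn;
```

For the six sample rows shown in the question this should pair x with 1, y with 2 and z with 3. The pairing is purely positional, so a returned row is not guaranteed to exist as a combination in the source table, and positions that exist in only one list come back as NULL in the other column.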

It's not clear what you want null to mean. Does it mean there's a null in one of the columns, or that there's not the same number of distinct values in each column? This is one problem that comes from asking for things that don't match up with typical relational logic.

Here's an example. I've removed the null value from the data since that confuses the issue, and used different data values so that rowNumber isn't confused with the data and so that there are 3 distinct values in one column and 4 in the other. This works for SQL Server; presumably there's a variation for PostgreSQL.

if object_id('mytable') is not null drop table mytable;
create table mytable ( col1 nvarchar(10) null, col2 nvarchar(10) null);
insert into mytable
            select 'x', 'a'
union all   select 'x', 'b'
union all   select 'y', 'c'
union all   select 'y', 'b'
union all   select 'y', 'd'
union all   select 'z', 'a';

select c1.col1, c2.col2
from
    -- derived table giving distinct values of col1 and a rownumber column
(   select col1
        , row_number() over (order by col1) as rowNumber
    from ( select distinct col1 from mytable ) x ) as c1
full outer join
    -- derived table giving distinct values of col2 and a rownumber column
(   select col2
        , row_number() over (order by col2) as rowNumber
    from ( select distinct col2 from mytable ) x ) as c2
on c1.rowNumber = c2.rowNumber;
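
A PostgreSQL variation of the SQL Server example above might look like the sketch below. It is an assumed port, not something taken from the original answer and not checked against a particular server version: it swaps object_id for drop table if exists, nvarchar for varchar, and the union all chain for a multi-row values list. It also adds an order by, so the row order is deterministic.

```sql
-- Assumed PostgreSQL port of the SQL Server example; same table and column names.
drop table if exists mytable;
create table mytable (col1 varchar(10), col2 varchar(10));
insert into mytable (col1, col2) values
    ('x', 'a'), ('x', 'b'), ('y', 'c'), ('y', 'b'), ('y', 'd'), ('z', 'a');

select c1.col1, c2.col2
from (
    -- distinct values of col1 with a row number
    select col1, row_number() over (order by col1) as rownumber
    from (select distinct col1 from mytable) x
) as c1
full outer join (
    -- distinct values of col2 with a row number
    select col2, row_number() over (order by col2) as rownumber
    from (select distinct col2 from mytable) x
) as c2
    on c1.rownumber = c2.rownumber
-- added for a stable output order; not part of the original query
order by coalesce(c1.rownumber, c2.rownumber);
```

With this data col1 has three distinct values and col2 has four, so the last row should carry NULL in col1 next to 'd' in col2. That NULL means "no more distinct values on this side", not a NULL stored in the table.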

edited Mar 2, 2014 at 16:59

answered Mar 1, 2014 at 10:40

Rory


2 Comments

ptb Over a year ago

Unfortunately, I put a NULL into the table by mistake. We can ignore it.

ptb Over a year ago

"Write a separate select distinct for each column, then put them together somehow. I'd put them together by giving each row in each result a row number, then joining them both to a big list of numbers." Can you explain it with an example?
