3

What is the equivalent of pandas df.groupby('v1').apply(lambda x: x['v2'].nunique()) in PostgreSQL?

That is, given a table, I want to know the number of unique values of v2 for each v1.

3 Answers

3

Maybe you mean

SELECT v1, count(DISTINCT v2)
FROM df
GROUP BY v1;
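To sanity-check that this query matches what nunique() computes, here is a minimal sketch that runs it on a small table. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only to keep the example self-contained; the query text is the one above), and the sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical sample data: (v1, v2) pairs, invented for illustration.
rows = [("a", 1), ("a", 1), ("a", 2), ("b", 3), ("b", 3)]

# An in-memory SQLite database stands in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df (v1 TEXT, v2 INTEGER)")
conn.executemany("INSERT INTO df VALUES (?, ?)", rows)

# The query from the answer: one row per v1 with its distinct v2 count.
sql_counts = dict(
    conn.execute("SELECT v1, count(DISTINCT v2) FROM df GROUP BY v1")
)

# Plain-Python emulation of a per-group nunique():
# collect the set of v2 values seen for each v1, then take its size.
seen = {}
for v1, v2 in rows:
    seen.setdefault(v1, set()).add(v2)
py_counts = {v1: len(values) for v1, values in seen.items()}

print(sql_counts)
print(py_counts)
```

Both dictionaries should agree, since each counts every distinct v2 once per v1 group.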

2 Comments

Can the result be sorted as well?
Yes, just add an ORDER BY clause.
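As a rough illustration of that suggestion, the sketch below sorts the groups by their distinct count. The alias n_unique and the sample rows are invented here, and sqlite3 is again only a stand-in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df (v1 TEXT, v2 INTEGER)")
# Hypothetical rows: "a" has 2 distinct v2 values, "b" has 1, "c" has 3.
conn.executemany(
    "INSERT INTO df VALUES (?, ?)",
    [("a", 1), ("a", 2), ("b", 3), ("c", 4), ("c", 5), ("c", 6)],
)

# Largest distinct count first; ties are broken alphabetically by v1.
query = """
    SELECT v1, count(DISTINCT v2) AS n_unique
    FROM df
    GROUP BY v1
    ORDER BY n_unique DESC, v1
"""
sorted_counts = conn.execute(query).fetchall()
print(sorted_counts)
```

Dropping DESC would sort from the smallest count upward instead.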
0

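SELECT v1, COUNT(v2) FROM t GROUP BY v1;

counts every non-NULL v2 in each group, duplicates included. To count each distinct value only once, which is what nunique() does, use:

SELECT v1, COUNT(DISTINCT v2) FROM t GROUP BY v1;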
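The two queries are not interchangeable, and a small experiment makes the gap visible. The sketch below assumes invented rows that contain a duplicate and a NULL, and uses sqlite3 as a stand-in for PostgreSQL; COUNT(*) is added only for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v1 TEXT, v2 INTEGER)")
# Hypothetical rows: group "a" holds a duplicate value and a NULL.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1), ("a", 1), ("a", None), ("b", 2), ("b", 3)],
)

# COUNT(*) counts rows, COUNT(v2) counts non-NULL values,
# COUNT(DISTINCT v2) counts distinct non-NULL values.
result = conn.execute(
    "SELECT v1, COUNT(*), COUNT(v2), COUNT(DISTINCT v2) "
    "FROM t GROUP BY v1 ORDER BY v1"
).fetchall()

for v1, n_rows, n_non_null, n_distinct in result:
    print(v1, n_rows, n_non_null, n_distinct)
```

For group "a" the three counts differ (3 rows, 2 non-NULL values, 1 distinct value). Only the last one corresponds to pandas nunique(), which also skips missing values by default.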

0

Also check this post on array_agg. It was helpful to me. It gives you an array of the grouped values. I just did something like:

SELECT directory, ARRAY_AGG(file_name) FROM table WHERE type = 'ZIP' GROUP BY directory;

And the result was something like:

parent_directory        | array_agg                       |
------------------------+---------------------------------+
/home/postgresql/files  | {zip_1.zip,zip_2.zip,zip_3.zip} |
/home/postgresql/files2 | {file1.zip,file2.zip}           |
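For readers who want to try the idea without a PostgreSQL server, here is a loose sketch of the same grouping. SQLite has no ARRAY_AGG, so this swaps in its group_concat aggregate and splits the resulting string in Python; that substitution, the table name files, and the sample rows are all assumptions made for illustration, not the setup used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, loosely modeled on the output shown above.
conn.execute("CREATE TABLE files (directory TEXT, file_name TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [
        ("/home/postgresql/files", "zip_1.zip", "ZIP"),
        ("/home/postgresql/files", "zip_2.zip", "ZIP"),
        ("/home/postgresql/files", "notes.txt", "TXT"),
        ("/home/postgresql/files2", "file1.zip", "ZIP"),
    ],
)

# group_concat joins each group's values with commas (a stand-in for
# ARRAY_AGG, which returns a real array in PostgreSQL).
rows = conn.execute(
    "SELECT directory, group_concat(file_name) FROM files "
    "WHERE type = 'ZIP' GROUP BY directory"
).fetchall()

# Split the joined string back into a list; sort because the order of
# concatenated elements is not guaranteed.
files_by_dir = {d: sorted(names.split(",")) for d, names in rows}
print(files_by_dir)
```

The split on commas only works because these file names contain no commas; a real array type avoids that limitation.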


This post also helped me a lot: "Group By" in SQL and Python Pandas. Its main point is that plain SQL is more convenient whenever it can do the job, while pandas can add extra functionality in the filtering step.

I hope it helps.
