
I have a dataframe thousands of rows long that looks like this:

ID  Email  Address
1   ...    ... 
2   ...    ... 
3   ...    ... 
4   ...    ... 
1   ...    ... 
2   ...    ... 
5   ...    ... 
5   ...    ... 
6   ...    ... 

What I want to do is drop duplicate IDs so that each ID appears only once. I can't use drop_duplicates() because most people don't have an ID, and it drops those rows too (not good!).

Is there a way to remove specific rows and keep only one instance of each ID?

I have a dataframe of all the duplicate IDs I want to remove, if that helps. E.g. for the example I gave above:

ID  Email  Address
1   ...    ...
2   ...    ...
5   ...    ...

Maybe there's a way to turn this into a Series/array of IDs and remove them from the df that way?
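One way that idea might look, as a minimal sketch rather than a definitive solution. It assumes blank IDs are stored as NaN, and the names `dupes`, `is_repeat` and `deduped` plus the toy data are made up for illustration:

```python
import numpy as np
import pandas as pd

# Toy data standing in for the real frame; NaN marks a person with no ID (assumption).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 1, 2, 5, 5, 6, np.nan, np.nan],
    "Email": ["..."] * 11,
    "Address": ["..."] * 11,
})

# The separate frame listing the IDs that occur more than once.
dupes = pd.DataFrame({"ID": [1, 2, 5], "Email": ["..."] * 3, "Address": ["..."] * 3})

# Flag a row only if its ID is in the known-duplicate list AND it is not the
# first occurrence of that ID. Missing IDs never match isin() here, so those
# rows are left alone.
is_repeat = df["ID"].isin(dupes["ID"]) & df.duplicated(subset="ID")
deduped = df[~is_repeat]
print(deduped)
```

If the blanks are empty strings rather than NaN, the same mask should still work as long as the empty string is not in the list of duplicate IDs.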

  • What is expected output? Commented Dec 21, 2018 at 11:34
  • @nixon I think that blank entries are also being considered as duplicates so thousands of rows are being removed just because an ID is not present Commented Dec 21, 2018 at 11:42
  • Thanks @user8322222 Commented Dec 21, 2018 at 11:43
  • @user8322222 - Please check edited answer. Commented Dec 21, 2018 at 11:51

2 Answers


I believe you need to chain two conditions: duplicated with keep=False, which flags every row whose ID occurs more than once, and duplicated with default parameters, which flags every occurrence after the first:

df = df[df.duplicated(subset='ID', keep=False) & df.duplicated(subset='ID')]
print(df)
   ID Email Address
4   1   ...     ...
5   2   ...     ...
7   5   ...     ...
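The selection above returns the repeated rows themselves. To drop those rows instead while leaving people without an ID untouched, one possible sketch is to invert the mask and explicitly keep missing IDs. This assumes blank IDs are NaN; the toy data and the names `keep` and `result` are illustrative only:

```python
import numpy as np
import pandas as pd

# Toy data; NaN stands for a missing ID (assumption about how blanks are stored).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 1, 2, 5, 5, 6, np.nan, np.nan],
    "Email": ["..."] * 11,
    "Address": ["..."] * 11,
})

# Keep a row if it is the first occurrence of its ID, or if it has no ID at all.
keep = ~df.duplicated(subset="ID") | df["ID"].isna()
result = df[keep]
print(result)
```

The `isna()` term matters because `duplicated` treats missing values as equal to each other, so without it only the first row lacking an ID would survive.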

1 Comment

@user8322222 - Super, glad I could help!

Is this what you want?

df[df.duplicated(subset='ID')]

    ID Email Address
4   1   ...     ...
5   2   ...     ...
7   5   ...     ...
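A small check of how this expression behaves when IDs are missing, as a sketch under the assumption that blanks are NaN (the frame `ids` and its values are invented for the demonstration). pandas counts missing values as equal to one another when looking for duplicates, so every blank after the first is flagged:

```python
import numpy as np
import pandas as pd

# Two real IDs followed by three rows with no ID.
ids = pd.DataFrame({"ID": [1, 2, np.nan, np.nan, np.nan]})

# Every NaN after the first is reported as a duplicate of it.
flags = ids.duplicated(subset="ID")
print(flags.tolist())

# drop_duplicates therefore collapses all rows without an ID into one.
collapsed = ids.drop_duplicates(subset="ID")
print(len(collapsed))
```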

4 Comments

Hi nixon, unfortunately this seems to be dropping blank entries for ID too (same issue as drop_duplicates() I imagine)
Blank entries for ID? Please could you give an example of your desired output?
Seeing what you want from the other answer, you can simply do this
hi nixon, I was looking for the following: ID Email Address 1 ... ... 2 ... ... 3 ... ... 4 ... ... 5 ... ... and it was answered. Thanks for your help and time though! :D
