0

I'm using python pandas with Postgresql,and I have a table named stock:

        open    high    low close      volume   datetime
        383.97  384.22  383.66  384.08  1298649 2022-12-16 14:25:00
        383.59  384.065 383.45  383.98  991327  2022-12-16 14:20:00
        383.59  384.065 383.45  383.98  991327  2022-12-16 14:20:00
        383.59  384.065 383.45  383.98  991327  2022-12-16 14:20:00
        383.64  384.2099 383.54 383.61  1439271 2022-12-16 14:15:00

How can I remove the rows that have duplicated datetime ,and only keep 1 row of it,only keep the latest row of it.

The output should be:

        open    high    low close      volume   datetime
        383.97  384.22  383.66  384.08  1298649 2022-12-16 14:25:00
        383.59  384.065 383.45  383.98  991327  2022-12-16 14:20:00
        383.64  384.2099 383.54 383.61  1439271 2022-12-16 14:15:00

Something like:

delete from stock where datetime duplicated > 1

1 Answer 1

3

Use drop_duplicates() from pandas:

# Remove duplicates and keep only the last row
df = df.drop_duplicates(subset='datetime', keep='last')
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.