0

Here is the scenario I am researching a solution for at work. We have a table in Postgres which stores events happening on the network. Currently it works like this: rows get inserted as network events come in, and at the same time older records matching a specific timestamp get deleted, to keep the table limited to around 10,000 records. Basically, it is the same idea as log rotation. Network events come in bursts of thousands at a time, so the transaction rate is very high, which causes performance degradation; after some time the server either crashes or becomes very slow. On top of that, the customer is asking to keep the table size up to a million records, which will accelerate the performance degradation (since we have to keep deleting records matching a specific timestamp) and cause space management issues. We are using plain JDBC to read and write the table. Can the tech community out there suggest a better performing way to handle inserts and deletes in this table?
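
For reference, the current per-event logic is roughly like the sketch below (table and column names are simplified placeholders, not our real schema):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.Timestamp;

    public class CurrentApproach {

        // Simplified version of what we do today: one insert plus one delete per
        // incoming event, i.e. thousands of tiny transactions during a burst.
        public static void storeEvent(Connection conn, Timestamp eventTime, String payload,
                                      Timestamp timestampToDrop) throws Exception {
            try (PreparedStatement insert = conn.prepareStatement(
                        "INSERT INTO network_events (event_time, payload) VALUES (?, ?)");
                 PreparedStatement delete = conn.prepareStatement(
                        "DELETE FROM network_events WHERE event_time = ?")) {
                insert.setTimestamp(1, eventTime);
                insert.setString(2, payload);
                insert.executeUpdate();

                delete.setTimestamp(1, timestampToDrop);   // keeps the table near 10,000 rows
                delete.executeUpdate();
            }
        }
    }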

1 Comment

Do you have any numbers on the magnitude here? 10k or even 2 million records is in itself very little. Then again, it's quite a lot if you get bursts of 2 million records every second, while 2 million per hour isn't much. Also, providing the DB schema for this table, including the indexes you have and the typical queries you run, will help a lot in suggesting improvements. Commented Jan 26, 2011 at 15:19

2 Answers

4

I think I would use partitioned tables, perhaps something like 10 partitions covering the total desired size, inserting into the newest and dropping the oldest partition.

http://www.postgresql.org/docs/9.0/static/ddl-partitioning.html

This makes the "drop the oldest" step much cheaper than querying for the rows and deleting them.
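
As a rough sketch over JDBC (the table name network_events, the hourly child-table naming, and the rotation policy are made-up assumptions; this follows the inheritance-based partitioning described in the linked 9.0 docs, while newer PostgreSQL versions offer declarative partitioning instead):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class PartitionRotation {

        // Rotate partitions: create a fresh child table for new inserts and drop
        // the oldest child wholesale. Names and the hourly scheme are illustrative
        // assumptions, not the poster's actual schema.
        public static void rotate(Connection conn, String newSuffix, String oldSuffix) throws Exception {
            try (Statement st = conn.createStatement()) {
                // New child inherits from the parent; the application (or a trigger
                // on the parent, as in the docs) inserts into this newest child.
                st.execute("CREATE TABLE network_events_" + newSuffix
                        + " () INHERITS (network_events)");
                // Dropping (or truncating) an entire child is far cheaper than
                // DELETE ... WHERE event_time = ?, and leaves no dead rows behind.
                st.execute("DROP TABLE IF EXISTS network_events_" + oldSuffix);
            }
        }

        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/eventdb", "user", "password")) {
                rotate(conn, "2011_01_26_15", "2011_01_26_05");
            }
        }
    }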

Update: I agree with nos' comment, though; the inserts/deletes may not actually be your bottleneck. Some investigation first might be worthwhile.

2 Comments

I would settle for 24 partitions, one partition for every hour of the day, and then use TRUNCATE on the partitions that can be emptied, because TRUNCATE doesn't need VACUUM.
Thanks a lot for the quick and very helpful responses. I just started here last week; I will investigate our DB schema, indexes, queries, and whatever else could be the bottleneck, and will post the details soon.
0

Some things you could try -

  • Write to a log, and have a separate batch process write it to the table.
  • Keep the writes as they are, but do the deletes periodically or at times of lower traffic.
  • Do the writes to a buffer/cache, and have the actual DB writes happen from the buffer (roughly the sketch below).
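
For the buffer/cache idea, here is a minimal sketch under assumed names and sizes (the Event shape, queue capacity, batch size of 1,000, and one-second flush interval are all illustrative): receiving threads only append to an in-memory queue, and a single background thread drains it and hands each batch to a database writer, so a burst becomes a handful of transactions instead of thousands.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    public class EventBuffer {

        // Hypothetical event shape; the real one would carry the table's columns.
        public static final class Event {
            public final long timestampMillis;
            public final String payload;
            public Event(long timestampMillis, String payload) {
                this.timestampMillis = timestampMillis;
                this.payload = payload;
            }
        }

        private final BlockingQueue<Event> queue = new LinkedBlockingQueue<>(100_000);

        // Called by the threads receiving network events; never touches the database.
        public void submit(Event e) throws InterruptedException {
            queue.put(e);
        }

        // Single background thread: collect whatever arrived (up to 1,000 events,
        // waiting at most a second) and hand it to the database writer in one go.
        public void runWriterLoop(DatabaseWriter writer) throws Exception {
            List<Event> batch = new ArrayList<>(1_000);
            while (true) {
                Event first = queue.poll(1, TimeUnit.SECONDS);
                if (first == null) {
                    continue;                  // nothing arrived this second
                }
                batch.add(first);
                queue.drainTo(batch, 999);     // grab the rest of the burst
                writer.writeBatch(batch);      // one transaction per batch
                batch.clear();
            }
        }

        // The writer could use JDBC batching, as sketched under the next suggestion.
        public interface DatabaseWriter {
            void writeBatch(List<Event> events) throws Exception;
        }
    }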

A few general suggestions -

  • Since you're deleting based on timestamp, make sure the timestamp is indexed. You could also do this with a counter / auto-incremented row id (e.g. DELETE WHERE id < currentId - 1000000).
  • Also, JDBC batch writes are much faster than individual row writes (easily an order-of-magnitude speedup). Batch-writing 100 rows at a time will help tremendously if you can buffer the writes; see the sketch after this list.
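
A rough sketch of batched writes plus an id-based trim with plain JDBC; the table network_events, its columns, the indexed auto-incremented id (e.g. bigserial), and the one-million-row cap are all assumptions for illustration:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.Timestamp;
    import java.util.List;

    public class BatchedEventWriter {

        // Insert a whole burst in one transaction via JDBC batching, then trim the
        // table by id rather than deleting rows that match a timestamp.
        public static void writeBatch(Connection conn, List<Timestamp> eventTimes,
                                      List<String> payloads) throws Exception {
            conn.setAutoCommit(false);
            try (PreparedStatement insert = conn.prepareStatement(
                    "INSERT INTO network_events (event_time, payload) VALUES (?, ?)")) {
                for (int i = 0; i < eventTimes.size(); i++) {
                    insert.setTimestamp(1, eventTimes.get(i));
                    insert.setString(2, payloads.get(i));
                    insert.addBatch();         // queued client-side
                }
                insert.executeBatch();         // sent to the server in bulk
            }
            // Periodic trim: keep roughly the newest 1,000,000 rows. With an indexed,
            // auto-incremented id this is a cheap range condition instead of a
            // timestamp match.
            try (PreparedStatement trim = conn.prepareStatement(
                    "DELETE FROM network_events WHERE id < "
                    + "(SELECT max(id) FROM network_events) - 1000000")) {
                trim.executeUpdate();
            }
            conn.commit();
        }
    }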
