0

Here is the scenario I am researching a solution for at work. We have a table in Postgres which stores events happening on the network. Currently it works like this: rows get inserted as network events come in, and at the same time older records matching a specific timestamp get deleted, to keep the table limited to around 10,000 records. Basically, it is the same idea as log rotation. Network events come in bursts of thousands at a time, so the transaction rate is very high, which causes performance degradation; after some time the server either crashes or becomes very slow. On top of that, the customer is asking to keep the table size up to a million records, which will accelerate the performance degradation (since we have to keep deleting records matching a specific timestamp) and cause space management issues. We are using plain JDBC to read and write the table. Can the tech community out there suggest a better performing way to handle inserts and deletes in this table?
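
For reference, the current per-event logic is roughly like the sketch below (table and column names are simplified placeholders, not our real schema):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.Timestamp;

    public class CurrentApproach {

        // Simplified version of what we do today: one insert plus one delete per
        // incoming event, i.e. thousands of tiny transactions during a burst.
        public static void storeEvent(Connection conn, Timestamp eventTime, String payload,
                                      Timestamp timestampToDrop) throws Exception {
            try (PreparedStatement insert = conn.prepareStatement(
                        "INSERT INTO network_events (event_time, payload) VALUES (?, ?)");
                 PreparedStatement delete = conn.prepareStatement(
                        "DELETE FROM network_events WHERE event_time = ?")) {
                insert.setTimestamp(1, eventTime);
                insert.setString(2, payload);
                insert.executeUpdate();

                delete.setTimestamp(1, timestampToDrop);   // keeps the table near 10,000 rows
                delete.executeUpdate();
            }
        }
    }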

1 Comment

Do you have any numbers on the magnitude here? 10k or even 2 million records is in itself very little. Then again, it's quite a lot if you get bursts of 2 million records every second, while 2 million per hour isn't much. Also, providing the DB schema for this table, including the indexes you have and the typical queries you run, will help a lot in suggesting improvements. Commented Jan 26, 2011 at 15:19

2 Answers

4

I think I would use partitioned tables, perhaps something like 10 partitions covering the total desired size, inserting into the newest and dropping the oldest partition.

http://www.postgresql.org/docs/9.0/static/ddl-partitioning.html

This makes the "drop the oldest" step much cheaper than querying for the rows and deleting them.
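
As a rough sketch over JDBC (the table name network_events, the hourly child-table naming, and the rotation policy are made-up assumptions; this follows the inheritance-based partitioning described in the linked 9.0 docs, while newer PostgreSQL versions offer declarative partitioning instead):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class PartitionRotation {

        // Rotate partitions: create a fresh child table for new inserts and drop
        // the oldest child wholesale. Names and the hourly scheme are illustrative
        // assumptions, not the poster's actual schema.
        public static void rotate(Connection conn, String newSuffix, String oldSuffix) throws Exception {
            try (Statement st = conn.createStatement()) {
                // New child inherits from the parent; the application (or a trigger
                // on the parent, as in the docs) inserts into this newest child.
                st.execute("CREATE TABLE network_events_" + newSuffix
                        + " () INHERITS (network_events)");
                // Dropping (or truncating) an entire child is far cheaper than
                // DELETE ... WHERE event_time = ?, and leaves no dead rows behind.
                st.execute("DROP TABLE IF EXISTS network_events_" + oldSuffix);
            }
        }

        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/eventdb", "user", "password")) {
                rotate(conn, "2011_01_26_15", "2011_01_26_05");
            }
        }
    }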

Update: I agree with nos' comment, though; the inserts/deletes may not actually be your bottleneck. Some investigation first might be worthwhile.

2 Comments

I would settle for 24 partitions, one partition for every hour of the day, and then use TRUNCATE on the partitions that can be emptied, because TRUNCATE doesn't need VACUUM.
Thanks a lot for the quick and very helpful responses. I just started here last week; I will investigate our DB schema, indexes, queries, and whatever else could be the bottleneck, and will post the details soon.
0

Some things you could try -

  • Write to a log, and have a separate batch process write it to the table.
  • Keep the writes as they are, but do the deletes periodically or at times of lower traffic.
  • Do the writes to a buffer/cache, and have the actual DB writes happen from the buffer (roughly the sketch below).
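
For the buffer/cache idea, here is a minimal sketch under assumed names and sizes (the Event shape, queue capacity, batch size of 1,000, and one-second flush interval are all illustrative): receiving threads only append to an in-memory queue, and a single background thread drains it and hands each batch to a database writer, so a burst becomes a handful of transactions instead of thousands.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    public class EventBuffer {

        // Hypothetical event shape; the real one would carry the table's columns.
        public static final class Event {
            public final long timestampMillis;
            public final String payload;
            public Event(long timestampMillis, String payload) {
                this.timestampMillis = timestampMillis;
                this.payload = payload;
            }
        }

        private final BlockingQueue<Event> queue = new LinkedBlockingQueue<>(100_000);

        // Called by the threads receiving network events; never touches the database.
        public void submit(Event e) throws InterruptedException {
            queue.put(e);
        }

        // Single background thread: collect whatever arrived (up to 1,000 events,
        // waiting at most a second) and hand it to the database writer in one go.
        public void runWriterLoop(DatabaseWriter writer) throws Exception {
            List<Event> batch = new ArrayList<>(1_000);
            while (true) {
                Event first = queue.poll(1, TimeUnit.SECONDS);
                if (first == null) {
                    continue;                  // nothing arrived this second
                }
                batch.add(first);
                queue.drainTo(batch, 999);     // grab the rest of the burst
                writer.writeBatch(batch);      // one transaction per batch
                batch.clear();
            }
        }

        // The writer could use JDBC batching, as sketched under the next suggestion.
        public interface DatabaseWriter {
            void writeBatch(List<Event> events) throws Exception;
        }
    }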

A few general suggestions -

  • Since you're deleting based on timestamp, make sure the timestamp is indexed. You could also do this with a counter / auto-incremented row id (e.g. DELETE WHERE id < currentId - 1000000).
  • Also, JDBC batch writes are much faster than individual row writes (easily an order-of-magnitude speedup). Batch-writing 100 rows at a time will help tremendously if you can buffer the writes; see the sketch after this list.
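
A rough sketch of batched writes plus an id-based trim with plain JDBC; the table network_events, its columns, the indexed auto-incremented id (e.g. bigserial), and the one-million-row cap are all assumptions for illustration:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.Timestamp;
    import java.util.List;

    public class BatchedEventWriter {

        // Insert a whole burst in one transaction via JDBC batching, then trim the
        // table by id rather than deleting rows that match a timestamp.
        public static void writeBatch(Connection conn, List<Timestamp> eventTimes,
                                      List<String> payloads) throws Exception {
            conn.setAutoCommit(false);
            try (PreparedStatement insert = conn.prepareStatement(
                    "INSERT INTO network_events (event_time, payload) VALUES (?, ?)")) {
                for (int i = 0; i < eventTimes.size(); i++) {
                    insert.setTimestamp(1, eventTimes.get(i));
                    insert.setString(2, payloads.get(i));
                    insert.addBatch();         // queued client-side
                }
                insert.executeBatch();         // sent to the server in bulk
            }
            // Periodic trim: keep roughly the newest 1,000,000 rows. With an indexed,
            // auto-incremented id this is a cheap range condition instead of a
            // timestamp match.
            try (PreparedStatement trim = conn.prepareStatement(
                    "DELETE FROM network_events WHERE id < "
                    + "(SELECT max(id) FROM network_events) - 1000000")) {
                trim.executeUpdate();
            }
            conn.commit();
        }
    }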
