I would like to partition a table in Postgres by a value that is not known in advance. In my scenario that value is device_id, which is a string.

This is the current situation:

The table 'device_data' stores sensor data sent from devices and is defined by this DDL:

CREATE TABLE warehouse.device_data (
  id INTEGER PRIMARY KEY NOT NULL DEFAULT nextval('device_data_id_seq'::regclass),
  device_id TEXT NOT NULL,
  device_data BYTEA NOT NULL,
--   contains additional fields which are omitted for brevity
  received_at TIMESTAMP WITHOUT TIME ZONE DEFAULT now()
);

The table currently holds millions of records and queries are taking a huge amount of time. Most queries contain a WHERE device_id='something' clause.

The solution I have in mind is to create a table partition for each device_id.

Is it possible in Postgres to create a table partition for each device_id?

I went through the Postgres documentation and a couple of examples I found, but all of them use fixed boundaries to create partitions. My solution would require:

  1. creating a new table partition on the fly when a new device_id is first encountered
  2. storing to an existing partition if the device_id is already known and a partition for that device_id already exists

I would like this to be done using table partitions as it would allow querying across multiple device_ids.

Comments:
  • Why do you want to partition by device_id? A simple index on device_id will do the job. Performance questions should include EXPLAIN ANALYZE output and some information about table size, indexes, current performance, desired performance, etc. Slow is a relative term and we need a real value to compare against. – Oct 18, 2016
  • Have a look at pg_pathman. It is a tool that simplifies partition management in PostgreSQL, and it specifically supports a HASH partitioning strategy. The drawback is that the number of partitions is fixed and is chosen when you initialize your partition set. – May 22, 2018
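
Before reaching for partitioning, it may be worth trying the plain index the first comment suggests. A minimal sketch against the table from the question (the composite variant is only an assumption about typical predicates):

create index device_data_device_id_idx
    on warehouse.device_data (device_id);

-- if received_at is usually filtered or sorted on as well (an assumption),
-- a composite index may serve those queries better:
-- create index device_data_device_id_received_at_idx
--     on warehouse.device_data (device_id, received_at);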

1 Answer

I like the idea of dynamic partitioning. I do not know, though, how it will affect performance, as I have never used it.

Change the definition of id to int default 0 and create the sequence manually, so that nextval() is not called twice on a single insert (once by the parent's default and once more in the trigger, which supplies the real value):

create table device_data (
    id int primary key default 0,
    device_id text not null,
    device_data text not null, -- changed for tests
    received_at timestamp without time zone default now()
);
create sequence device_data_seq owned by device_data.id;

Use dynamic SQL in the trigger function:

create or replace function before_insert_on_device_data()
returns trigger language plpgsql as $$
begin
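    -- create the per-device child table the first time this device_id is seen;
    -- its CHECK constraint lets the planner exclude it for other device_ids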
    execute format(
        $f$
            create table if not exists %I (
            check (device_id = %L)
            ) inherits (device_data)
        $f$, 
        concat('device_data_', new.device_id), 
        new.device_id);
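    -- route the row into the child table; "return null" below keeps the parent empty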
    execute format(
        $f$
            insert into %I
            values (nextval('device_data_seq'), %L, %L, default)
        $f$, 
        concat('device_data_', new.device_id), 
        new.device_id, 
        new.device_data);
    return null;
end $$;

create trigger before_insert_on_device_data
    before insert on device_data
    for each row execute procedure before_insert_on_device_data();

Test:

insert into device_data (device_id, device_data) values
    ('first', 'data 1'),
    ('second', 'data 1'),
    ('first', 'data 2'),
    ('second', 'data 2');

select * from device_data_first;

 id | device_id | device_data |        received_at         
----+-----------+-------------+----------------------------
  1 | first     | data 1      | 2016-10-18 19:50:40.179955
  3 | first     | data 2      | 2016-10-18 19:50:40.179955
(2 rows)

select * from device_data_second;

 id | device_id | device_data |        received_at         
----+-----------+-------------+----------------------------
  2 | second    | data 1      | 2016-10-18 19:50:40.179955
  4 | second    | data 2      | 2016-10-18 19:50:40.179955
(2 rows)
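
Because the children inherit from device_data and each carries a CHECK constraint on device_id, queries against the parent still see all devices, and with constraint_exclusion left at its default of partition the planner can skip the irrelevant children. A minimal sketch of querying across partitions:

-- cross-device queries keep working through the parent table
select device_id, count(*) from device_data group by device_id;

-- with constraint_exclusion = partition (the default), the CHECK constraint
-- lets the planner scan only device_data_first (plus the empty parent)
explain select * from device_data where device_id = 'first';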