I'm new to partitioning in PostgreSQL (version 9) and I need some advice, please. I have a set of parts for country A and another set of parts for country B. Each set has about two million records. I need to load both sets into the database, and each set needs to be updated weekly. By updating a set I mean clearing all of its data and loading it again from a file. The sets should be updated independently, so when I clear the data of set A I must not clear the data of set B.

As I understand it, if I store each set in a separate partition I can truncate each partition independently, which is much faster than deleting records. So I decided to do it this way (there are more columns in the table 'part', but they aren't important for this question):
CREATE TABLE part (
    country CHAR(3) NOT NULL,
    manufacturer_code CHAR(2) NOT NULL,
    part_code CHAR(4) NOT NULL,
    part_description VARCHAR(100)
);

CREATE TABLE part_cze (
    CHECK (country = 'CZE')
) INHERITS (part);
CREATE INDEX idx__part_cze ON part_cze (part_code, manufacturer_code);

CREATE TABLE part_svk (
    CHECK (country = 'SVK')
) INHERITS (part);
CREATE INDEX idx__part_svk ON part_svk (part_code, manufacturer_code);
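To make the weekly refresh concrete, this is roughly what I plan to run for one country. It is only a sketch: the file path and the CSV format are placeholders I made up for this question, not my real setup.

```sql
BEGIN;

-- Empties only the CZE child table; part_svk is not touched.
TRUNCATE TABLE part_cze;

-- Hypothetical path and format. COPY ... FROM a file name reads the file
-- on the database server, so the path must be visible to the server.
COPY part_cze (country, manufacturer_code, part_code, part_description)
    FROM '/path/to/parts_cze.csv' WITH (FORMAT csv);

COMMIT;
```

The same statements with 'part_svk' and its own file would refresh the other set. As far as I understand, TRUNCATE locks the child table until the transaction commits, so queries that touch 'part_cze' would wait during the load.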
The application queries table 'part' to fetch the data. The query may look like this:
SELECT * FROM part WHERE country='CZE' AND part_code='4578' AND manufacturer_code='22'
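To see which child tables such a query actually touches, I assume I can inspect the plan as sketched below. The setting constraint_exclusion exists in PostgreSQL; as far as I know 'partition' is already its default in version 9, so the SET is only there to be explicit.

```sql
-- Let the planner use the CHECK constraints of the child tables
-- to skip partitions that cannot contain matching rows.
SET constraint_exclusion = partition;

EXPLAIN
SELECT *
FROM part
WHERE country = 'CZE'
  AND part_code = '4578'
  AND manufacturer_code = '22';
```

If exclusion works the way I think it does, the plan should mention 'part' and 'part_cze' but not 'part_svk'.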
My questions:
- Is the above scheme correct or would you recommend something else?
- Do I need indexes on table 'part' or 'part_cze'? If I understand correctly, Postgres fetches the data from table 'part_cze', which has the index, so table 'part' shouldn't need one.
- Should the index 'idx__part_cze' contain the column 'country', or is it sufficient that the data are separated into partitions by country?
- If I created an index on table 'part', should it contain the column 'country'?