
I just made a pg_dump backup of my database and its size is about 95 GB, but the size of the directory /pgsql/data is about 38 GB.

I ran a VACUUM FULL and the size of the dump did not change. My PostgreSQL version is 9.3.4, on a CentOS release 6.3 server.

Is it weird that the dump is so much bigger than the physical size, or can I consider this normal?
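For reference, this is roughly how the two sizes can be compared (the dump path and database name below are just examples):

    # Size of the dump file
    ls -lh /backups/mydb.dump

    # Size of the database as Postgres reports it
    psql -d mydb -c "SELECT pg_size_pretty(pg_database_size('mydb'));"

    # Physical size of the data directory
    du -sh /pgsql/data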

Thanks in advance!

Regards.

Neme.

    This can happen if you have a lot of (not NULLable, high-valued) numeric fields. The dump is basically ASCII, and a maximum-value 4-byte integer field takes about 10 bytes in ASCII (plus one byte for the \t or \n separator). Apparently you don't have many indexes on your tables, since indexes are not included in the dump, only the DDL to reconstruct them. Commented May 16, 2016 at 14:50
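To illustrate the point about the text representation being wider, a quick sketch (the database name is a placeholder):

    # A maximum-value 4-byte integer is 4 bytes on disk but 10 characters as text
    psql -d mydb -c "SELECT pg_column_size(2147483647::int4) AS bytes_on_disk,
                            length(2147483647::text)        AS bytes_as_text;"
    #  bytes_on_disk | bytes_as_text
    # ---------------+---------------
    #              4 |            10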

2 Answers


The size of pg_dump output and the size of a Postgres cluster (aka 'instance') on disk have very, very little correlation. Consider:

  • pg_dump has four output formats (plain, custom, directory, and tar); all but tar can compress on the fly (a short sketch of the common invocations follows this list)
  • pg_dump output contains only the schema definition and the raw table data in a text (or "binary") representation. It contains no index data.
  • The text/"binary" representation of a data type can be larger or smaller than the data actually stored in the database. For example, the number 1 stored in a bigint column takes 8 bytes in the cluster, but only 1 byte (plus a delimiter) in a plain-text dump.
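A minimal sketch of the common pg_dump invocations, assuming a database named mydb (the file names are placeholders):

    # Plain SQL text: uncompressed by default, usually the largest output
    pg_dump -Fp mydb > mydb.sql

    # Custom format: compressed by default, restored with pg_restore
    pg_dump -Fc mydb > mydb.dump

    # Directory format: one compressed file per table, allows parallel restore
    pg_dump -Fd mydb -f mydb_dir

    # The compression level can be set explicitly (0-9)
    pg_dump -Fc -Z 9 mydb > mydb_small.dump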

This is also why VACUUM FULL had no effect on the size of the backup.

Note that a Point In Time Recovery (PITR) based backup is entirely different from a pg_dump backup. PITR backups are essentially copies of the data on disk.
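As a rough illustration of the difference, a file-level base backup (the starting point for PITR) is taken with something like pg_basebackup rather than pg_dump; the target path below is a placeholder, and WAL archiving must also be configured for actual point-in-time recovery:

    # Copies the whole data directory, indexes and all, as compressed tar files
    pg_basebackup -D /backups/base -Ft -z -P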


1 Comment

Jim, your answer describes the opposite of what the original question reported: the backup/dump is twice the size of the actual database on disk. This seems unlikely, but I'm seeing exactly the same thing.

Postgres does compress its data in certain situations, using a technique called TOAST:

PostgreSQL uses a fixed page size (commonly 8 kB), and does not allow tuples to span multiple pages. Therefore, it is not possible to store very large field values directly. To overcome this limitation, large field values are compressed and/or broken up into multiple physical rows. This happens transparently to the user, with only small impact on most of the backend code. The technique is affectionately known as TOAST (or "the best thing since sliced bread").
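A quick way to see TOAST compression at work, sketched against an assumed database named mydb with a throwaway temp table:

    psql -d mydb -c "
      CREATE TEMP TABLE toast_demo (payload text);
      -- 800,000 characters of highly compressible text
      INSERT INTO toast_demo VALUES (repeat('abcdefgh', 100000));
      SELECT octet_length(payload)   AS logical_bytes,
             pg_column_size(payload) AS stored_bytes
      FROM toast_demo;"

    # stored_bytes comes out far smaller than logical_bytes because the value
    # is compressed before being stored.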

