How do I specify columns when loading new rows into PostgreSQL using pg_bulkload

Question

I'm experimenting with the pg_bulkload project to import millions of rows of data into a database. However, none of the new rows have a primary key, and only two of the table's columns are available in my input file. How do I tell pg_bulkload which columns I'm importing, and how do I generate the primary key field? Do I need to edit my import file to match exactly what the output of a COPY command would be and generate the id field myself?

For example, let's say my table columns are:

id         title        body        published

The data that I have is limited to title and published, listed in a tab-delimited file. My .ctl file looks like this:

TABLE = posts
INFILE = stdin
TYPE = CSV
DELIMITER = "   "

Tometzky · Accepted Answer · 2010-09-27 07:33:48Z


You can use the FILTER feature of pg_bulkload. Something like:

In the database:

CREATE FUNCTION pg_bulkload_filter(text, text) RETURNS record
AS $$
  SELECT nextval('tablename_id_seq'), NULL, NULL, $1, $2, NULL
$$ LANGUAGE SQL;

And in pg_bulkload control file:

FILTER = pg_bulkload_filter
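
Adapted to the posts table from the question, the filter might look like the sketch below. The column types (integer, text, text, timestamp), the sequence name posts_id_seq, and the function name posts_filter are all assumptions, since the question does not give them. The explicit casts, including those on NULL, follow the comment on this answer, which reports that every value had to be cast to the column's type.

```sql
-- Sketch only. Assumed table layout (types are guesses; adjust to your schema):
--   posts(id integer, title text, body text, published timestamp)
-- Assumes a sequence named posts_id_seq supplies values for id.
CREATE FUNCTION posts_filter(text, text) RETURNS record
AS $$
  SELECT nextval('posts_id_seq')::integer,  -- id: generated from the sequence
         $1::text,                          -- title: first field of the input file
         NULL::text,                        -- body: not present in the input file
         $2::timestamp                      -- published: second field of the input file
$$ LANGUAGE SQL;
```

The control file would then reference it with FILTER = posts_filter. The function returns one value per table column, in table order, so the two input fields land in title and published while id and body are filled in by the function.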


1 Comment

thetaiko Over a year ago

This does the trick. Looking back, it is in the documentation but it isn't too clear. Also, I had to cast everything, even the NULL values, to the appropriate types. Thanks for your help.
