Better late than never :-)
I've found that by far the easiest way to do this is to load the data from a web server or a file-based endpoint using the COPY FROM command.
See the Postgres manual for COPY.
Practical example
Let's imagine you have the following CSV data:
1,Fred,Flintstone
2,Barney,Rubble
3,Wilma,Flintstone
4,Betty,Rubble
with the columns being pkid, firstname and surname respectively.
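As a quick sanity check (a hypothetical snippet on my part, not part of the import itself), you can confirm the file parses into those three columns the same way COPY will see it, using Python's csv module:

```python
import csv
import io

# The sample data from above; in practice you'd read your real file.
sample = """\
1,Fred,Flintstone
2,Barney,Rubble
3,Wilma,Flintstone
4,Betty,Rubble
"""

# Parse it as a CSV reader (and Postgres COPY ... WITH (format csv)) would:
# each row should yield exactly three fields: pkid, firstname, surname.
rows = list(csv.reader(io.StringIO(sample)))
for pkid, firstname, surname in rows:
    print(pkid, firstname, surname)
```

If a row prints with the wrong number of fields here, COPY will reject it too.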
If you create this file on a web server (perhaps a server you're running locally but that can be reached from the outside), you should then be able to type:
http://myserver.blah/flintstones.csv
into your browser and see the file appear.
Once you're able to do this, and assuming the server you've used is public facing (so that Amazon's servers can see it), you then need to fire up a tool such as pgAdmin, or anything else that allows you to run SQL against your Postgres install.
How you run these commands is a matter for debate; I've used all manner of methods in the past.
One that works really well is to set up SSH login on your Amazon host, then use an SSH client that allows you to tunnel from your local host to the RDS instance; doing it this way allows you to use programs such as pgAdmin.
If you can't use a tunnel, then you could always hack together a quick Ruby/PHP/Node.js script that allows you to run the two SQL commands you need.
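If you go the quick-script route, a minimal Python sketch might look like this (a sketch only: the host, database name and credentials are placeholders for your own values, and it assumes the psycopg2 driver is available):

```python
# The two SQL commands described below, kept as plain strings.
DDL = """
CREATE TABLE theflintstones
(
    pkid integer primary key,
    firstname text,
    surname text
)
"""

COPY_SQL = (
    "COPY theflintstones FROM PROGRAM "
    "'curl -s http://myserver/flintstones.csv' WITH (format csv)"
)

def run_import(host, dbname, user, password):
    """Connect to the RDS instance and run the two commands in order."""
    import psycopg2  # imported here so the constants above work without the driver
    conn = psycopg2.connect(host=host, dbname=dbname,
                            user=user, password=password)
    try:
        # 'with conn' commits both statements on success, rolls back on error.
        with conn, conn.cursor() as cur:
            cur.execute(DDL)
            cur.execute(COPY_SQL)
    finally:
        conn.close()

# Example call (placeholder values):
# run_import("mydb.xxxx.rds.amazonaws.com", "mydb", "admin", "secret")
```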
Once you have the ability to run SQL commands against your RDS instance, you need to do two things:
- 1) Create the destination table
- 2) Use the copy command to import the data
Creating the destination table is easy; that's just a simple CREATE TABLE command.
For our example:
CREATE TABLE theflintstones
(
    pkid integer primary key,
    firstname text,
    surname text
);
The second command is a little trickier.
If you're going to load the data from a file system, then you need to make sure you copy the CSV file to a file system location that RDS has access to.
In my past experience, however, I can't recall ever getting access to the direct file system on an RDS instance, so you'll most likely have to use the remote HTTP method.
The problem with the HTTP method is that the RDS instance may not have either the wget or the curl tool installed.
In practice I've yet to come across one that doesn't have at least wget installed, as wget is quite often needed by the underlying OS to grab things it needs from the web. Often curl is installed too.
Once you're ready to import the data, you then need to use the following command:
COPY theflintstones FROM PROGRAM 'curl -s http://myserver/flintstones.csv' WITH (format csv);
where 'myserver' should be replaced with the host name or IP address where you stored the CSV data file, and 'flintstones.csv' should be replaced with the actual file name you want to load.
'curl -s [url]' runs curl in silent mode; if you have to use wget, then you should specify the program as 'wget -qO- [url]' instead.
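To avoid quoting mistakes, you can build the COPY statement programmatically; here's a hypothetical helper (the function name and parameters are my own, not a library API):

```python
def copy_from_url(table, url, use_wget=False):
    """Build a COPY ... FROM PROGRAM statement that fetches a CSV over HTTP.

    Uses 'curl -s' by default, or 'wget -qO-' when curl isn't available
    on the instance.
    """
    fetch = f"wget -qO- {url}" if use_wget else f"curl -s {url}"
    return f"COPY {table} FROM PROGRAM '{fetch}' WITH (format csv)"

print(copy_from_url("theflintstones", "http://myserver/flintstones.csv"))
# And the wget variant:
print(copy_from_url("theflintstones", "http://myserver/flintstones.csv",
                    use_wget=True))
```

You'd then execute the returned string against the RDS instance like any other SQL statement.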
If all goes well, Postgres should load the CSV from the remote source, then use the contents of that file to populate the columns in your table.
If you only need to populate some columns in your table, then use the table and column syntax:
COPY table (column, column, column ...)
and the CSV will only populate the named columns, setting the rest to their default values.
From Python, you'd use open to open a file, and maybe the csv module if your file is CSV. Then, you'd use the psycopg module to talk to Postgres.