What's the easiest way to load a large csv file into a Postgres RDS database in AWS using Python?
To load data into a local Postgres instance, I have previously used a psycopg2 connection to run SQL statements like:
COPY my_table FROM 'my_10gb_file.csv' DELIMITER ',' CSV HEADER;
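For reference, this is roughly what that looked like (a minimal sketch; the connection parameters, table name, and file name are placeholders):

```python
import psycopg2

# Minimal sketch of the approach I used against a local Postgres instance.
# Connection parameters and the table/file names are placeholders.
conn = psycopg2.connect(
    host="localhost",
    dbname="mydb",
    user="postgres",
    password="postgres",
)
with conn, conn.cursor() as cur:
    cur.execute(
        "COPY my_table FROM 'my_10gb_file.csv' DELIMITER ',' CSV HEADER;"
    )
conn.close()
```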
However, when executing this against a remote AWS RDS database, this generates an error because the .csv file is on my local machine rather than the database server:
ERROR: must be superuser to COPY to or from a file
SQL state: 42501
Hint: Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.
This answer explains why this doesn't work.
I'm now looking for the Python syntax to drive psql's \copy from a script. I have a large number of .csv files to upload, so this needs to be automated rather than run by hand.
psql -c "\COPY my_table FROM 'my_10gb_file.csv' DELIMITER ',' CSV HEADER;" would work from the shell, but I need to invoke it from Python for each file.
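Something along these lines is what I have in mind, shelling out to psql from Python (a rough sketch; it assumes psql is on the PATH, credentials come from ~/.pgpass or the PGPASSWORD environment variable, and the host, database, user, and table names are placeholders):

```python
import subprocess
from pathlib import Path

# Rough sketch: invoke psql's \copy once per local CSV file.
# Assumes psql is on PATH and credentials come from ~/.pgpass or PGPASSWORD.
# Host, database, user, and table names below are placeholders.
csv_dir = Path("csv_files")
for csv_file in sorted(csv_dir.glob("*.csv")):
    copy_cmd = f"\\COPY my_table FROM '{csv_file}' DELIMITER ',' CSV HEADER;"
    subprocess.run(
        [
            "psql",
            "-h", "my-instance.abc123.us-east-1.rds.amazonaws.com",
            "-d", "mydb",
            "-U", "my_user",
            "-c", copy_cmd,
        ],
        check=True,
    )
```

Is this a reasonable approach, or is there a cleaner way to do this entirely within Python?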