
I'm a bit confused: I need to import one batch of CSV files into a PostgreSQL database in a first phase, and then add another batch in a second phase. The second-phase CSVs are additional "update" files. Both phases need to run in a single program.

So my question is: how would I go about this in Python? Do I need to use two-phase commit, or is there another solution?

Any help is greatly appreciated!

1 Answer

You can use psycopg2's copy_from method. A rough example (change the user, filename, separator, etc. to your preference):

import psycopg2

def read_csv_file(file_name, table_name):
    # One connection per file; closed even if the COPY fails.
    con = psycopg2.connect("host=your_host dbname=your_db user=your_user")
    try:
        cur = con.cursor()
        with open(file_name, "r") as f:
            # COPY streams the whole file into the table server-side.
            cur.copy_from(f, table_name, sep=",")
        con.commit()
    finally:
        con.close()

csv_list = [
    ("file1.csv", "table1"),
    ("file2.csv", "table2"),
    ("file3.csv", "table3"),
]

for file_name, table_name in csv_list:
    read_csv_file(file_name, table_name)

Because copy_from appends to the table, you can repeat this for the "update" CSVs. One caveat: copy_from reads PostgreSQL's text COPY format, so sep="," only works if no field contains commas or quoting; for real CSV files, cur.copy_expert with a COPY ... FROM STDIN WITH (FORMAT csv) statement is more robust.
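Also note that PostgreSQL's two-phase commit (PREPARE TRANSACTION) is for coordinating one transaction across multiple servers; for two load phases against a single database, ordinary commits are enough. Here's a minimal sketch of running both phases in a single program over one connection; the file and table names are placeholders, and load_batch is a hypothetical helper, not part of psycopg2:

import psycopg2

def load_batch(cur, csv_list):
    # Load every (file, table) pair on an existing cursor.
    for file_name, table_name in csv_list:
        with open(file_name, "r") as f:
            cur.copy_from(f, table_name, sep=",")

phase1 = [("file1.csv", "table1"), ("file2.csv", "table2")]
phase2 = [("update1.csv", "table1"), ("update2.csv", "table2")]

con = psycopg2.connect("host=your_host dbname=your_db user=your_user")
try:
    cur = con.cursor()
    load_batch(cur, phase1)   # initial import
    con.commit()              # phase 1 is now durable
    load_batch(cur, phase2)   # "update" CSVs append on top
    con.commit()              # phase 2 committed separately
finally:
    con.close()

Committing after each phase means a failure in phase 2 leaves the phase-1 data intact; if you want all-or-nothing behavior instead, drop the first commit.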


2 Comments

So would I need to write the with statement for every CSV file, or could I point it at a path and select all CSV files of the first phase, and so on? Or would that be a problem, as the CSV files need to populate around 8 tables?
One call to copy_from reads one CSV. So you could put it in a function and call that multiple times, e.g. by globbing a directory, as sketched below.
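Following up on that comment, here's a hedged sketch of selecting all CSVs in a phase directory with glob, assuming each file is named after its target table; that naming convention (and the phase1 directory) is an assumption for illustration, not something from the question:

import glob
import os

# Assumption: phase1/table1.csv loads into table1, and so on for all ~8 tables.
for path in sorted(glob.glob("phase1/*.csv")):
    table_name = os.path.splitext(os.path.basename(path))[0]
    read_csv_file(path, table_name)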
