I wrote a Python script that takes a date argument and extracts data from a file (4.2 MB) to populate a table. When I run it, I get this error:

  File "./insert_pru_data.py", line 136, in <module>
    importYear(year)
  File "./insert_pru_data.py", line 124, in importYear
    SQLrequest += "(" + ", ".join(data_to_insert[i]) + "),\n"
MemoryError

My Code:

def importYear(year):
  go = True
  if isAlreadyInserted(year):
    if replace == False:
      print("data for year " + year + " already inserted, action cancelled")
      go = False
    else:
      print("data for year " + year + " already inserted, the data will be replaced")
      deleteData(year)

  if go:
    data_to_insert = getDataToInsert(data)
    SQLrequest = "INSERT INTO my_table (date_h, day, area, h_type, act, dir, ach) VALUES\n"
    i = 0
    print(data_to_insert)
    while i < len(data_to_insert) - 1:
        data_to_insert[i] = ["None" if element == None else element for element in data_to_insert[i]]
        SQLrequest += "(" + ", ".join(data_to_insert[i]) + "),\n"
    SQLrequest += "(" + ", ".join(data_to_insert[len(data_to_insert) - 1]) + ");"

    with psycopg2.connect(connString) as conn:    # Open the database connection
      with conn.cursor() as cur:
        cur.execute(SQLrequest)
        cur.execute("COMMIT")
        cur.close()

importYear(year)

Can someone help me understand how to solve this problem?

  • Sounds like you have a ridiculously huge amount of data in data_to_insert, and you're running out of memory (possibly on a 32 bit build of Python where you're limited to ~2 GB of virtual address space no matter how much RAM you have). We have no idea where data_to_insert came from, so that's all that can be said. Commented Jan 21, 2021 at 11:49
  • data_to_insert is a list built from a CSV file. Commented Jan 21, 2021 at 13:11
  • I read the data from a CSV file, and another function returns data_to_insert as a list. Commented Jan 21, 2021 at 13:17
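
To check the first comment's hypothesis about a 32-bit build, a quick sketch like the following reports the pointer width of the running interpreter. The function names are made up for illustration; only `struct.calcsize` and `sys.maxsize` come from the standard library.

```python
import struct
import sys


def pointer_bits():
    """Return the pointer width of the running interpreter, in bits."""
    return struct.calcsize("P") * 8


def is_64bit_build():
    # sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit builds.
    return sys.maxsize > 2**32


if __name__ == "__main__":
    print("pointer width:", pointer_bits(), "bits")
    print("64-bit build:", is_64bit_build())
```

If this reports 32 bits, the process is limited in address space regardless of installed RAM, as the comment describes.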

1 Answer

  • Firstly, avoid constructing an SQL query like this; sooner or later, one of the values to be inserted will have something like a quote and then everything will break. It's one of the more common security problems on the internet (SQL injection).

    The cur.execute() function can take two arguments - the query (with placeholders) and then the values to be inserted:

    cur.execute("insert into tbl (a, b) values (%s, %s)", (1, 2))
    
  • Rather than inserting all the data at once, read them from the file in groups of 100 or 1000 or something; small enough to fit into memory easily, large enough that there aren't too many round-trips.

  • There is an execute_values() function which does exactly what you want; you give it a query and a list of tuples:

    execute_values(cur, "insert into tbl (a, b) values %s", [(1, 2), (3, 4)])
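
The three points above can be combined into one sketch. It is only an illustration under stated assumptions: the helper names `read_rows`, `chunked`, and `insert_in_batches` are hypothetical, the CSV is assumed to have no header row and to hold the seven columns in table order, and empty fields are assumed to mean NULL. The table and column names are taken from the question. It also sidesteps the hand-built string and the `while` loop in the question, which as posted never increments `i`.

```python
import csv
from itertools import islice


def read_rows(csv_path):
    """Yield one tuple per CSV row without loading the whole file.

    Assumes no header row; adjust if the real file has one.
    """
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            # Assumption: an empty field means NULL. None is sent as SQL NULL.
            yield tuple(value if value != "" else None for value in row)


def chunked(rows, size):
    """Yield lists of at most `size` items from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def insert_in_batches(conn, rows, batch_size=1000):
    """Insert rows batch by batch using psycopg2's execute_values."""
    # Imported here so the helpers above work without psycopg2 installed.
    from psycopg2.extras import execute_values

    sql = (
        "INSERT INTO my_table (date_h, day, area, h_type, act, dir, ach) "
        "VALUES %s"
    )
    with conn.cursor() as cur:
        for batch in chunked(rows, batch_size):
            # Values are passed separately from the query text,
            # so quoting is handled by the driver.
            execute_values(cur, sql, batch)
    conn.commit()
```

Usage might look like `insert_in_batches(psycopg2.connect(connString), read_rows("data.csv"))`, where `data.csv` is a placeholder path. `execute_values` already splits its input into pages of `page_size` rows (100 by default); the outer batching only keeps the whole file from being held in memory at once.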
    