I wrote a script that iterates through a large database table (~150K rows). To avoid using too much memory I'm using a windowed_query helper (a sketch of it follows the script below). My script goes something like this:
query = db.query(Table)
count = 0
for row in windowed_query(query, Table.id, 1000):
    points = 0
    # +100 points for a logo
    if row.logo_id:
        points += 100
    # +10 points for each image
    points += 10 * len(row.images)  # images is a SQLAlchemy one-to-many relationship
    # ...the script continues with much of the same...
    row.points = points
    db.add(row)
    count += 1
    if count % 100 == 0:
        db.commit()
        print count
db.commit()  # final commit for the last partial batch
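For reference, my windowed_query is modeled on the WindowedRangeQuery recipe from the SQLAlchemy wiki. A minimal sketch of that kind of helper (not my exact code) looks like this:

from sqlalchemy import and_, func

def column_windows(session, column, windowsize):
    # Generate WHERE clauses that carve column's values into
    # windows of roughly windowsize rows each.
    def int_for_range(start_id, end_id):
        if end_id is not None:
            return and_(column >= start_id, column < end_id)
        return column >= start_id

    # Number each row, then keep every windowsize-th value as a boundary.
    q = session.query(
        column,
        func.row_number().over(order_by=column).label('rownum')
    ).from_self(column)
    if windowsize > 1:
        q = q.filter("rownum %% %d=1" % windowsize)

    intervals = [id for id, in q]
    while intervals:
        start = intervals.pop(0)
        end = intervals[0] if intervals else None
        yield int_for_range(start, end)

def windowed_query(q, column, windowsize):
    # Run q one window at a time, so only ~windowsize rows
    # are loaded per SELECT.
    for whereclause in column_windows(q.session, column, windowsize):
        for row in q.filter(whereclause).order_by(column):
            yield row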
When I run it on a CentOS server, it makes it through about 9000 rows before being killed by the kernel for using ~2GB of memory.
On my Mac development environment it works like a charm, even though it's running exactly the same versions of Python (2.7.3), SQLAlchemy (0.7.8), and psycopg2 (2.4.5).
I did some simple debugging with memory_profiler. On Linux, every piece of code that queried the database increased memory usage by a small amount, and the growth never stopped. On the Mac the same thing happened, but after growing by ~4MB it leveled off. It's as if nothing is being garbage collected on Linux. (I even tried running gc.collect() every 100 rows; it did nothing.)
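Here's roughly the kind of per-batch check I ran. This sketch uses the stdlib resource module instead of my exact memory_profiler setup, but it shows the same pattern (note the unit difference between Linux and OS X):

import gc
import resource
import sys

def rss_mb():
    # Peak resident set size in MB. ru_maxrss is reported in
    # kilobytes on Linux but in bytes on OS X.
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == 'darwin':
        return rss / (1024.0 * 1024.0)
    return rss / 1024.0

for count, row in enumerate(windowed_query(query, Table.id, 1000), 1):
    # ...scoring code as above...
    if count % 100 == 0:
        gc.collect()  # made no measurable difference on Linux
        print count, rss_mb()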
Does anybody have a clue what is happening?
Note: with windowed_query the largest single query is 1000 rows, so even if fetchone were using as much memory as fetchall, that wouldn't explain 2GB of memory use.