6

I'm working with an Oracle database. I can do this much:

    import pandas as pd
    import pandas.io.sql as psql
    import cx_Oracle as odb
    conn = odb.connect(_user +'/'+ _pass +'@'+ _dbenv)

    sqlStr = "SELECT * FROM customers"
    df = psql.frame_query(sqlStr, conn)

But I don't know how to handle bind variables, like so:

    sqlStr = """SELECT * FROM customers 
                WHERE id BETWEEN :v1 AND :v2
             """

I've tried these variations:

    params  = (1234, 5678)
    params2 = {"v1":1234, "v2":5678}

    df = psql.frame_query((sqlStr,params), conn)
    df = psql.frame_query((sqlStr,params2), conn)
    df = psql.frame_query(sqlStr,params, conn)
    df = psql.frame_query(sqlStr,params2, conn)

The following works:

    curs = conn.cursor()
    curs.execute(sqlStr, params)
    df = pd.DataFrame(curs.fetchall())
    df.columns = [rec[0] for rec in curs.description]

but this solution is just... inelegant. If I can, I'd like to do this without creating the cursor object. Is there a way to do the whole thing using just pandas?

2 Answers

1

Try using pandas.io.sql.read_sql_query, which accepts the bind values through its params argument. It worked for me with pandas version 0.20.1:

    import pandas as pd
    import pandas.io.sql as psql
    import cx_Oracle as odb
    conn = odb.connect(_user +'/'+ _pass +'@'+ _dbenv)

    sqlStr = """SELECT * FROM customers
                WHERE id BETWEEN :v1 AND :v2
    """
    # Keys match the :v1 and :v2 placeholders in the query
    pars = {"v1":1234, "v2":5678}
    df = psql.read_sql_query(sqlStr, conn, params=pars)
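
For anyone who wants to try the params argument without an Oracle instance, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in (the sqlite3 driver also accepts the `:name` placeholder style), and the table contents are made up for illustration. Which placeholder styles work depends on the driver, so treat this as a demonstration of the pandas call rather than of cx_Oracle behavior.

```python
import sqlite3

import pandas as pd

# Stand-in for the Oracle connection: an in-memory SQLite database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
# Made-up rows, chosen so that some fall outside the queried range.
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [
        (1000, "below range"),
        (1234, "low edge"),
        (3000, "middle"),
        (5678, "high edge"),
        (9000, "above range"),
    ],
)

sql_str = """SELECT * FROM customers
             WHERE id BETWEEN :v1 AND :v2
             ORDER BY id"""
pars = {"v1": 1234, "v2": 5678}

# pandas hands the params dict to the driver, which binds :v1 and :v2.
df = pd.read_sql_query(sql_str, conn, params=pars)
print(df)
```

The column names come from the query result, so there is no need to read them off a cursor description by hand.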

0

As far as I can tell, pandas expects that the SQL string be completely formed prior to passing it along. With that in mind, I would (and always do) use string interpolation:

    params = (1234, 5678)
    sqlStr = """
    SELECT * FROM customers
    WHERE id BETWEEN %d AND %d
    """ % params
    print(sqlStr)

which gives

    SELECT * FROM customers
    WHERE id BETWEEN 1234 AND 5678

So that should feed into psql.frame_query just fine. (It does in my experience with Postgres, MySQL, and SQL Server.)

5 Comments

I'd strongly advise against forming your SQL this way as it leaves your code vulnerable to SQL injection attacks. Even if your code/database isn't in a position to be vulnerable, you shouldn't get in the practice of forming your SQL this way. Bind variables are the safe way to go.
@DavidMarx agreed. I shouldn't have assumed the OP was working from a command-line (or just a basic script) like I normally do.
[FYI: I'm OP] Yeah, this is a self-contained file. I don't foresee any real issues with SQL injection in my current program since the people who will be using it will have direct access to the database anyway, but I would like to know for the future if I can use pandas in the way I described.
Oops. I see that now. FWIW, I'm able to recreate your experience/frustration using pyodbc on my databases. From my perspective, the path of least resistance is to hack on pandas so that read_frame can take a cursor object. Maybe I'll be able to get a PR together soon.
Well, your cur.fetchall() technique covers that. I'll stop blabbering now.
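
To make the injection concern from the first comment concrete, here is a small sketch. It assumes an in-memory SQLite database as a stand-in for Oracle, a made-up three-row table, and a hypothetical hostile input string. It uses `%s` formatting for the interpolated query; the `%d` format in the answer above would reject a non-numeric value, but string-typed filters have no such guard.

```python
import sqlite3

# Stand-in database with made-up rows (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)

# Hypothetical hostile input arriving as a string.
user_input = "1 OR 1=1"

# String interpolation: the input becomes part of the SQL text itself,
# so the query turns into "... WHERE id = 1 OR 1=1" and matches every row.
interpolated_sql = "SELECT * FROM customers WHERE id = %s" % user_input
rows_interpolated = conn.execute(interpolated_sql).fetchall()

# Bind variable: the input is passed as a single value and is never parsed as SQL.
rows_bound = conn.execute(
    "SELECT * FROM customers WHERE id = :v1", {"v1": user_input}
).fetchall()

print(len(rows_interpolated))  # every row in the table
print(len(rows_bound))         # no id equals the literal string
```

With the bound version the whole input is compared against id as one value, so the extra condition never reaches the SQL parser.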
