61

I'm having a problem executing some SQL from within Python, despite similar SQL working fine from the mysql command-line.

The table looks like this:

mysql> SELECT * FROM foo;
+-------+-----+
| fooid | bar |
+-------+-----+
|     1 | A   | 
|     2 | B   | 
|     3 | C   | 
|     4 | D   | 
+-------+-----+
4 rows in set (0.00 sec)

I can execute the following SQL query from the mysql command-line, without a problem:

mysql> SELECT fooid FROM foo WHERE bar IN ('A','C');
+-------+
| fooid |
+-------+
|     1 | 
|     3 | 
+-------+
2 rows in set (0.00 sec)

However, when I try to do the same from within Python, I get no rows, while I expected 2 rows:

import MySQLdb
import config
connection=MySQLdb.connect(
    host=config.HOST,user=config.USER,passwd=config.PASS,db='test')
cursor=connection.cursor()

sql='SELECT fooid FROM foo WHERE bar IN %s'
args=[['A','C']]
cursor.execute(sql,args)
data=cursor.fetchall()
print(data)
# ()

So the question is: how should the python code be modified to select those fooids where bar is in ('A','C')?

By the way, I noticed that if I switch the roles of bar and fooid, I can get the code to select those bars where fooid is in (1,3) successfully. I don't understand why one such query (below) works, while the other one (above) doesn't.

sql='SELECT bar FROM foo WHERE fooid IN %s'
args=[[1,3]]
cursor.execute(sql,args)
data=cursor.fetchall()
print(data)
# (('A',), ('C',))

And just to be absolutely clear, this is how the foo table was created:

mysql> DROP TABLE IF EXISTS foo;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE `foo` (
          `fooid` int(11) NOT NULL AUTO_INCREMENT,
          `bar` varchar(10) NOT NULL,
          PRIMARY KEY (`fooid`));
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT into foo (bar) values ('A'),('B'),('C'),('D');
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

Edit: When I enable the general query log with mysqld -l /tmp/myquery.log I see

mysqld, Version: 5.1.37-1ubuntu5.5-log ((Ubuntu)). started with:
Tcp port: 3306  Unix socket: /var/run/mysqld/mysqld.sock
Time                 Id Command    Argument
110101 11:45:41     1 Connect   unutbu@localhost on test
            1 Query set autocommit=0
            1 Query SELECT fooid FROM foo WHERE bar IN ("'A'", "'C'")
            1 Query SELECT bar FROM foo WHERE fooid IN ('1', '3')
            1 Quit

Indeed, it looks like too many quotes are being placed around A and C.

Thanks to @Amber's comment, I understand better what is going wrong. MySQLdb converts the parametrized argument ['A','C'] to ("'A'","'C'").

Is there a way to make a parametrized query using the IN SQL syntax? Or must one manually construct the SQL string?

1
  • Can you look in your MySQL server's logs and find out what query was actually run? It seems to have something to do with how strings/varchars are being interpreted (since python int -> mysql int works fine). Commented Jan 1, 2011 at 16:10

10 Answers

74

Unfortunately, you need to manually construct the query parameters, because as far as I know, there is no built-in bind method for binding a list to an IN clause, similar to Hibernate's setParameterList(). However, you can accomplish the same with the following:

Python 3:

args=['A', 'C']
sql='SELECT fooid FROM foo WHERE bar IN (%s)' 
in_p=', '.join(list(map(lambda x: '%s', args)))
sql = sql % in_p
cursor.execute(sql, args)

Python 2:

args=['A', 'C']
sql='SELECT fooid FROM foo WHERE bar IN (%s)' 
in_p=', '.join(map(lambda x: '%s', args))
sql = sql % in_p
cursor.execute(sql, args)
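The placeholder-building step can also be wrapped in a small helper. This is only a sketch: the name in_placeholders is made up for illustration and is not part of MySQLdb. It also guards against an empty list, since MySQL rejects "IN ()" as a syntax error.

```python
def in_placeholders(values):
    """Return a comma-separated run of %s placeholders, one per value."""
    values = list(values)
    if not values:
        # MySQL rejects an empty list, "IN ()", as a syntax error.
        raise ValueError("IN clause needs at least one value")
    return ", ".join(["%s"] * len(values))


args = ["A", "C"]
sql = "SELECT fooid FROM foo WHERE bar IN (%s)" % in_placeholders(args)
print(sql)  # SELECT fooid FROM foo WHERE bar IN (%s, %s)

# With a live connection, the values are still passed separately:
# cursor.execute(sql, args)
```

Only the number of placeholders depends on args; the values themselves never enter the SQL text.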

10 Comments

in_p = ', '.join(itertools.repeat('%s', len(args)))
This would be prone to sql injection. Isn't there something safer that we can do?
@Sohaib how would this be prone to injection? The data being used for the query (args) is passed separate from the sql text. The string manipulation only adds more placeholders '%s', not the actual data which would be vulnerable.
in_p = ', '.join(['%s']*len(args)), while we are at it…
Anyone who downvotes because of an SQL injection vulnerability: you are making a mistake. This answer does not contain this vulnerability, not even in previous edits or the initial answer. Please read the answer carefully. The arguments for the query are properly passed to execute; the formatting is only needed to put the right number of %s placeholders in the IN clause.
57

Here is a similar solution which I think is more efficient in building up the list of %s strings in the SQL:

Use the list_of_ids directly:

format_strings = ','.join(['%s'] * len(list_of_ids))
cursor.execute("DELETE FROM foo.bar WHERE baz IN (%s)" % format_strings,
                tuple(list_of_ids))

That way you avoid having to quote the values yourself, and you avoid SQL injection.

Note that the data (list_of_ids) is going directly to mysql's driver, as a parameter (not in the query text) so there is no injection. You can leave any chars you want in the string, no need to remove or quote chars.
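To see this concretely, here is a small sketch that builds the query text and the parameter tuple without touching a database. The hostile value is invented for illustration.

```python
# The second value is deliberately hostile.
list_of_ids = ["1", "2); DROP TABLE foo; --"]

format_strings = ",".join(["%s"] * len(list_of_ids))
query = "DELETE FROM foo.bar WHERE baz IN (%s)" % format_strings
params = tuple(list_of_ids)

# The query text contains placeholders only; the hostile string stays in params.
print(query)   # DELETE FROM foo.bar WHERE baz IN (%s,%s)
print(params)

# With a live connection:
# cursor.execute(query, params)
```

The string interpolation only sees the length of the list, so the worst an attacker can do is change how many placeholders appear.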

3 Comments

I like this, I would go as far as saying that the accepted answer is to be considered harmful!
Agreed - this is a much better solution. The above opens you up to SQL injection attacks, for example if a user types ); select username, password, credit_card from user; their code could be run against the database.
As others have noted, it is nonsense that this solution is somewhat more secure than João Silva's answer. Both solutions use string interpolation to adjust the number of SQL parameters. In both cases, an attacker could only change the number of SQL parameters, but not the query as such. Robert Berg's approach is, however, a bit more elegant.
11

If you have other parameters in the query, beyond the IN list, then the following extension to JG's answer may be useful.

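ids = [1, 5, 7, 213]
person_type = 'student'  # hypothetical value for the type column
sql = "select * from person where type=%s and id in (%s)"
in_ids = ', '.join(['%s'] * len(ids))
# Keep a literal %s for type and expand the IN list into one %s per id.
sql = sql % ('%s', in_ids)
# Parameters must follow the order of the placeholders in the query.
params = [person_type]
params.extend(ids)
cursor.execute(sql, tuple(params))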

That is, collect all the parameters into one flat list, in the same order as the placeholders, then pass it as a tuple to the execute method.
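The same idea as a reusable function, as a rough sketch. The name build_person_query and the value "student" are assumptions made for the example. Writing %%s in the template leaves a literal %s for the scalar parameter after the first formatting pass.

```python
def build_person_query(person_type, ids):
    """Return (sql, params) for one scalar filter plus an IN list."""
    ids = list(ids)
    in_ids = ", ".join(["%s"] * len(ids))
    # %%s survives the formatting as %s; the other %s receives the placeholder run.
    sql = "select * from person where type=%%s and id in (%s)" % in_ids
    params = (person_type,) + tuple(ids)
    return sql, params


sql, params = build_person_query("student", [1, 5, 7, 213])
print(sql)     # select * from person where type=%s and id in (%s, %s, %s, %s)
print(params)  # ('student', 1, 5, 7, 213)

# With a live connection:
# cursor.execute(sql, params)
```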

Comments

5

this works for me:

myTuple= tuple(myList)
sql="select fooid from foo where bar in "+str(myTuple)
cursor.execute(sql)

1 Comment

Never pass parameters directly in the SQL query! This leads to SQL injection vulnerabilities which is a major security issue. Parameters should always be passed in the parameter tuple so they get properly quoted by the SQL binding library. Different engines have different rules for quoting, and the only safe way to quote parameters is to let the lower-level library do it.
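Apart from injection, str(tuple(...)) follows Python's repr rules rather than SQL's, which shows up in a few edge cases. A quick sketch with made-up values:

```python
two = str(tuple(["A", "C"]))
one = str(tuple(["A"]))
empty = str(tuple([]))
quoted = str(tuple(["O'Brien"]))

print(two)     # ('A', 'C')
print(one)     # ('A',)        the trailing comma is not valid SQL
print(empty)   # ()            an empty IN list is not valid in MySQL
print(quoted)  # ("O'Brien",)  quoting is chosen by Python, not by the driver
```

So even with trusted input, the concatenated query breaks for one-element and empty lists.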
3

Maybe we can create a function to do what João proposed? Something like:

def cursor_exec(cursor, query, params):
    # Expand each list/tuple parameter into one %s per element.
    expansion_params = []
    real_params = []
    for p in params:
        if isinstance(p, (tuple, list)):
            real_params.extend(p)
            expansion_params.append(",".join(["%s"] * len(p)))
        else:
            real_params.append(p)
            expansion_params.append("%s")
    # The % operator needs a tuple here; a list would count as a single value.
    real_query = query % tuple(expansion_params)
    return cursor.execute(real_query, real_params)

Comments

2

Have been trying every variation on João's solution to get an IN List query to work with Tornado's mysql wrapper, and was still getting the accursed "TypeError: not enough arguments for format string" error. Turns out adding "*" to the list var "*args" did the trick.

args=['A', 'C']
sql='SELECT fooid FROM foo WHERE bar IN (%s)'
in_p=', '.join(list(map(lambda x: '%s', args)))
sql = sql % in_p
db.query(sql, *args)

Comments

2

Improving on João's and satru's code, I suggest creating a cursor mixin that can be used to build a cursor with an execute that accepts nested iterables and handles them correctly. A better name would be nice, though... For Python3, use str instead of basestring.

from MySQLdb.cursors import Cursor

class BetterExecuteMixin(object):
    """
    This mixin class provides an implementation of the execute method
    that properly handles sequence arguments for use with IN tests.
    Examples:
    execute('SELECT * FROM foo WHERE id IN (%s) AND type=%s', ([1,2,3], 'bar'))
    # Notice that when the sequence is the only argument, you still need
    # a surrounding tuple:
    execute('SELECT * FROM foo WHERE id IN (%s)', ([1,2,3],))
    """

    def execute(self, query, args=None):
        if args is not None:
            try:
                iter(args)
            except TypeError:
                args = (args,)
            else:
                if isinstance(args, basestring):
                    args = (args,)
            real_params = []
            placeholders = []
            for arg in args:
                # sequences that we treat as a single argument
                if isinstance(arg, basestring):
                    real_params.append(arg)
                    placeholders.append('%s')
                    continue
                try:
                    real_params.extend(arg)
                    placeholders.append(','.join(['%s']*len(arg)))
                except TypeError:
                    real_params.append(arg)
                    placeholders.append('%s')
            args = real_params
            query = query % tuple(placeholders)
        return super(BetterExecuteMixin, self).execute(query, args)

class BetterCursor(BetterExecuteMixin, Cursor):
    pass

This can then be used as follows (and it's still backwards compatible!):

import MySQLdb
conn = MySQLdb.connect(user='user', passwd='pass', db='dbname', host='host',
                       cursorclass=BetterCursor)
cursor = conn.cursor()
cursor.execute('SELECT * FROM foo WHERE id IN (%s) AND type=%s', ([1,2,3], 'bar'))
cursor.execute('SELECT * FROM foo WHERE id IN (%s)', ([1,2,3],))
cursor.execute('SELECT * FROM foo WHERE type IN (%s)', (['bar', 'moo'],))
cursor.execute('SELECT * FROM foo WHERE type=%s', 'bar')
cursor.execute('SELECT * FROM foo WHERE type=%s', ('bar',))
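For Python 3, the expansion logic can be tried out on its own, without a cursor. The function below is a simplified sketch, not the mixin itself: the name expand_sequences is made up, and it only treats list and tuple as sequences, so strings are passed through as single values.

```python
def expand_sequences(query, args):
    """Return (query, flat_args) with each list/tuple argument expanded."""
    placeholders = []
    flat_args = []
    for arg in args:
        if isinstance(arg, (list, tuple)):
            # One %s per element, and the elements join the flat argument list.
            placeholders.append(",".join(["%s"] * len(arg)))
            flat_args.extend(arg)
        else:
            placeholders.append("%s")
            flat_args.append(arg)
    return query % tuple(placeholders), flat_args


query, flat = expand_sequences(
    "SELECT * FROM foo WHERE id IN (%s) AND type=%s", ([1, 2, 3], "bar"))
print(query)  # SELECT * FROM foo WHERE id IN (%s,%s,%s) AND type=%s
print(flat)   # [1, 2, 3, 'bar']
```

One caveat with this approach: because the query goes through % formatting once before execute sees it, a literal percent sign in the SQL needs extra escaping.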

Comments

2

Very simple:

Just use the following form:

rules_id = ["9","10"]

sql2 = "SELECT * FROM attendance_rules_staff WHERE id in " + str(tuple(rules_id))

Note the str(tuple(rules_id)).

1 Comment

An elegant solution as long as rules_id never comes from an external source. If anyone outside your system can set the values for rules_id, your database will be vulnerable to SQL injection.
0

Why not just this in that case?

args = ['A', 'C']
sql = 'SELECT fooid FROM foo WHERE bar IN (%s)' 
in_p  =', '.join(list(map(lambda arg:  "'%s'" % arg, args)))
sql = sql % in_p
cursor.execute(sql)

results in:

SELECT fooid FROM foo WHERE bar IN ('A', 'C')

2 Comments

I think I answered my own question: If you pass the args separately, they will be properly escaped by the Mysqldb lib.
Never pass parameters directly in the SQL query! This leads to SQL injection vulnerabilities which is a major security issue. Parameters should always be passed in the parameter tuple so they get properly quoted by the SQL binding library. Different engines have different rules for quoting, and the only safe way to quote parameters is to let the lower-level library do it.
-1

args should be a tuple.

For example:

args = ('A','B')

args = ('A',) # in case of single

Comments
