Hello all. I'm using Django 1.2.5 with Python 2.6 on two machines: Ubuntu 11.04 (my local dev machine) and Debian Lenny (the remote server). I'm using django-fts with PostgreSQL (a single remote database shared by both installations) to run full-text search queries. The intriguing thing is that an FTS query containing Russian characters works fine on the local machine, but on the remote server it raises UnicodeEncodeError while evaluating my extra() clause.

In the django-fts PostgreSQL backend I found this code:

    ts_query = "plainto_tsquery('%s','%s')" % (self.language, unicode(query).replace("'", "''"))
    where = '%s @@ %s' % (qn(self.vector_field.column), ts_query)
    select = {}
    order = []
    if rank_field is not None:
        select[rank_field] = 'ts_rank(%s.%s, %s, %d)' % (qn(self.model._meta.db_table), qn(self.vector_field.column), ts_query, rank_normalization)
        order = ['-%s' % rank_field]
    return qs.extra(select=select, where=[where], order_by=order)

Here query contains u'\u0430' or u"а" (Cyrillic), so the encoding looks fine up to this point.

The error appears in django/db/models/sql/compiler.py on line 489, while assembling the query from the extras: result.append('(%s)' % str(col))
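For context, in Python 2 calling str() on a unicode object encodes it with the interpreter's default encoding, which is normally ASCII. The sketch below imitates that step with an explicit encode call so it behaves the same on any Python version; python2_str is a hypothetical helper written for illustration, not part of Django or django-fts, and the sample SQL fragment is made up.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def python2_str(value, default_encoding="ascii"):
    """Hypothetical stand-in for Python 2's str(unicode_obj).

    Assumption: the interpreter's default encoding is ASCII, which is
    the stock setting for Python 2.
    """
    return value.encode(default_encoding)


# A made-up extra-select fragment containing one Cyrillic character.
col = u"ts_rank(vector, plainto_tsquery('russian','\u0430'), 32)"

try:
    python2_str(col)
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print("implicit ASCII encode failed:", exc.reason)

# An explicit UTF-8 encode succeeds and round-trips without loss.
encoded = col.encode("utf-8")
print(encoded.decode("utf-8") == col)
```

If this is what happens in the compiler, the failure depends on the default encoding of the running process rather than on LANG or DEFAULT_CHARSET.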

So, I tried all kinds of encoding/decoding tricks. The LANG variable is "ru_RU.UTF-8" on both computers, and the same goes for DEFAULT_CHARSET in settings.py. I have no more ideas. Any help?

UPDATE: I've just found out that my application behaves the same way on both computers. The difference is between running it from the shell (python manage.py runserver localhost:8000) and starting it in debug mode from Eclipse's PyDev environment. So, can someone tell me whether a manual start and a PyDev debug start differ in a way that would affect my encoding problem?
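One way to compare the two launch methods is to print the encoding-related settings of the running interpreter in each of them. This is only a diagnostic sketch; encoding_report is a hypothetical helper name, and it uses nothing beyond the standard sys and locale modules.

```python
from __future__ import print_function

import locale
import sys


def encoding_report():
    """Collect the encodings that can differ between launch environments."""
    return {
        # Used by Python 2 for implicit str()/unicode() conversions.
        "default": sys.getdefaultencoding(),
        "filesystem": sys.getfilesystemencoding(),
        # May be None when stdout is redirected or replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Derived from the locale settings such as LANG.
        "preferred": locale.getpreferredencoding(),
    }


report = encoding_report()
for name in sorted(report):
    print("%s: %s" % (name, report[name]))
```

Running it once from the shell and once from the PyDev debug launch (for example from a view or from settings.py) would show whether the "default" value differs between the two.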

TIA. Petr

  • What's the encoding on the Postgres DB? Commented May 3, 2011 at 11:12
  • UTF8. I managed to make the full-text search itself work: return qs.extra(where=[where.encode('utf-8')]) does the trick. But if I try to do any extra selects on this QuerySet, I keep getting UnicodeEncodeError, even if I make sure that every string being added goes through encode('utf-8'). Commented May 3, 2011 at 14:58
  • Well, actually this wasn't true: the where clause works without specifying any encoding, but select still doesn't. Commented May 3, 2011 at 15:23

1 Answer

Add a u'' prefix to all the strings in your example,
like u'ts_rank(%s.%s, %s, %d)' % (...).
Also, interpolating variables with % without database escaping is bad.

1 Comment

Yeah, Evgeny, I know that. But it was done that way (I suppose) because using params and select_params in an extra query somehow breaks the quoting of strings. I tried to_tsquery(%s, %s) with params=[...] but got to_tsquery(russian, абв) in the SQL. This isn't right; I need those strings quoted. But if I write it as to_tsquery('%s', '%s'), the resulting SQL has a syntax error. Nevertheless, your answer was helpful, although I printed the type() of the query elements and all of them were already unicode.
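As a rough sketch of the parameterized approach discussed here: the SQL fragments keep bare %s placeholders (no surrounding quotes) and the values travel separately through params and select_params, so the database driver does the quoting. build_rank_extra is a hypothetical helper, not django-fts code, and it assumes the column name has already been quoted by the caller. The unquoted to_tsquery(russian, абв) may simply be how Django prints a query for debugging, since the printed form substitutes parameters without driver quoting.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def build_rank_extra(vector_column, language, query,
                     rank_field="rank", normalization=32):
    """Build keyword arguments for QuerySet.extra() (hypothetical helper).

    Assumption: vector_column is a trusted, already quoted column name.
    The user-supplied values are never interpolated into the SQL text.
    """
    ts_query = u"plainto_tsquery(%s, %s)"  # placeholders stay unquoted
    where = u"%s @@ %s" % (vector_column, ts_query)
    select = {
        rank_field: u"ts_rank(%s, %s, %d)"
        % (vector_column, ts_query, int(normalization)),
    }
    return {
        "select": select,
        "select_params": [language, query],
        "where": [where],
        "params": [language, query],
        "order_by": [u"-%s" % rank_field],
    }


extra = build_rank_extra(u"search_tsv", u"russian", u"\u0430\u0431\u0432")
print(extra["where"][0])
print(extra["select"]["rank"])
```

The result would then be passed as qs.extra(**extra); whether this avoids the UnicodeEncodeError in Django 1.2.5 is not something this sketch verifies.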
