I'm working on a new Django site, and, after migrating in a pile of data, have started running into a deeply frustrating DjangoUnicodeDecodeError. The bad character in question is a \xe8 (e-grave).

There are two sticky issues:

It only happens on the production server, which runs an Apache-fronted FastCGI process. Running the same code against the same database on the Django dev server causes no issues.

The stack trace in question is entirely within Django code. It occurs in the admin site (elsewhere too) when retrieving an item to display, though the field that contains the bad character is not actually ever rendered.

I'm not even entirely sure where to begin debugging this, short of trying to remove the offending characters manually. My guess is that it's a configuration issue, since it's environment-specific, but I'm not sure where to start there either.

EDIT: As Daniel Roseman pointed out, the error is almost certainly in the __unicode__ method, or more precisely in another method that it calls. Note that the offending characters are in a field not referenced at all in the code here. I suppose that the exception is raised in a method that builds the object from the db result: if the queryset is never evaluated (e.g. if not self.enabled), there's no error. Here's the code:

def get_blocking_events(self):
    return Event.objects.filter(<get a set of events>)

def get_blocking_reason(self):
    blockers = self.get_blocking_events()
    label = u''
    if not self.enabled:
        label = u'Sponsor disabled'
    elif len(blockers) > 0:
        label = u'Pending follow-up: "{0}" ({1})'.format(blockers[0], blockers[0].creator.email)
        if len(blockers) > 1:
            label += u" and {0} other event".format(len(blockers)-1)
        if len(blockers) > 2:
            label += u"s"
    return label

def __unicode__(self):
    label = self.name
    blocking_msg = self.get_blocking_reason()
    if len(blocking_msg):
        label += u" ({0})".format(blocking_msg)
    return label

Here's the tail of the stack trace, for fun:

File "/opt/opt.LOCAL/Django-1.2.1/django/template/__init__.py", line 954, in render
   dict = func(*args)

 File "/opt/opt.LOCAL/Django-1.2.1/django/contrib/admin/templatetags/admin_list.py", line 209, in result_list
   'results': list(results(cl))}

 File "/opt/opt.LOCAL/Django-1.2.1/django/contrib/admin/templatetags/admin_list.py", line 201, in results
   yield list(items_for_result(cl, res, None))

 File "/opt/opt.LOCAL/Django-1.2.1/django/contrib/admin/templatetags/admin_list.py", line 138, in items_for_result
   f, attr, value = lookup_field(field_name, result, cl.model_admin)

 File "/opt/opt.LOCAL/Django-1.2.1/django/contrib/admin/util.py", line 270, in lookup_field
   value = attr()

 File "/opt/opt.LOCAL/Django-1.2.1/django/db/models/base.py", line 352, in __str__
   return force_unicode(self).encode('utf-8')

 File "/opt/opt.LOCAL/Django-1.2.1/django/utils/encoding.py", line 88, in force_unicode
   raise DjangoUnicodeDecodeError(s, *e.args)

DjangoUnicodeDecodeError: 'utf8' codec can't decode bytes in position 956-958: invalid data. You passed in <Sponsor: [Bad Unicode data]> (<class 'SJP.alcohol.models.Sponsor'>)
3
  • The usual culprit in these circumstances is the __unicode__ method of the model. Can you show us the code? Commented Nov 6, 2010 at 18:17
  • What about the data in the database: are your tables using UTF-8 or some ISO Latin encoding? Commented Nov 6, 2010 at 19:07
  • The database is MS SQL Server, so my understanding is that there isn't a database-wide encoding; however, the column in question is an nvarchar, which means that the data is encoded as UTF-16. I'll also note that I have no problems reading the exact same database with the same application code via the django dev server. Commented Nov 6, 2010 at 20:06

2 Answers

1

The issue here is that in __unicode__ you use the following line:

label += " ({0})".format(blocking_msg)

Unfortunately, in Python 2.x this tries to format blocking_msg as an ASCII string. What you meant to type was:

label += u" ({0})".format(blocking_msg)
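To see why the u prefix matters, here is a small sketch. It spells out the ASCII conversion that Python 2 performs implicitly when a byte-string template is formatted with a unicode argument, so it behaves the same on any Python version. The helper name format_like_py2_bytestring and the sample value are made up for illustration, not taken from the question. Note that this failure names the ascii codec, whereas the trace in the question names 'utf8', so it may well not be the whole story.

```python
# -*- coding: utf-8 -*-
# Sketch only: emulates what Python 2's byte-string str.format() does
# implicitly with a unicode argument. The sample value is invented.
blocking_msg = u"Pending follow-up: \xe8"


def format_like_py2_bytestring(template, arg):
    """Hypothetical helper: convert the argument with the ASCII codec first,
    as Python 2 does behind the scenes for '...'.format(unicode_arg)."""
    as_ascii = arg.encode("ascii")  # raises for any non-ASCII character
    return template.format(as_ascii.decode("ascii"))


try:
    format_like_py2_bytestring(" ({0})", blocking_msg)
    outcome = "ok"
except UnicodeEncodeError as exc:
    outcome = "failed: %s codec" % exc.encoding

# Keeping the template as unicode text never touches the ASCII codec.
safe_label = u" ({0})".format(blocking_msg)

print(outcome)
print(repr(safe_label))
```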

2 Comments

Thanks, but no luck. To be sure, I converted all of the string literals in these functions to unicode strings.
What's the encoding in the db?
1

Turns out this is likely due to the FreeTDS layer that connects to the SQL Server. While FreeTDS provides some support for automatically converting encodings, my setup is either misconfigured or otherwise not working quite right.

Rather than fighting this battle, I've migrated to MySQL for now.
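For anyone who lands here with the same trace, a rough sketch of what I believe was going on: if the driver layer hands back the e-grave as the single byte \xe8 (as Latin-1 or cp1252 would encode it) and the bytes are then decoded as UTF-8, decoding fails, because in UTF-8 the byte 0xE8 announces a three-byte sequence. That would also fit the three-byte range "position 956-958" in the error. The sample bytes and the try_decode helper below are made up for illustration; this is not data from my database.

```python
# -*- coding: utf-8 -*-
# Illustration only: RAW is an invented value, not real data from the site.
RAW = b"caf\xe8 sponsor"  # e-grave as the single byte 0xE8 (Latin-1 / cp1252)


def try_decode(raw, encoding):
    """Hypothetical helper: return the decoded text, or None if the bytes
    are not valid in the given encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = try_decode(RAW, "utf-8")      # 0xE8 followed by a space is invalid UTF-8
as_latin1 = try_decode(RAW, "latin-1")  # 0xE8 maps directly to U+00E8

# The same character correctly encoded as UTF-8 takes two bytes, not one.
proper_utf8 = u"\xe8".encode("utf-8")

print(repr(as_utf8))
print(repr(as_latin1))
print(repr(proper_utf8))
```

FreeTDS has a "client charset" setting in freetds.conf that controls which encoding it converts to for the client. I never confirmed whether that was the piece misconfigured in my setup.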
