I have an Android app which uses

URLEncoder.encode(S.getSongArtist(),"UTF-8")

to encode a Unicode string that is posted to an App Engine Python (2.7) web service. On the service I use

urllib.unquote_plus(artist)

This is not giving me correct results. I have an input like this:

Marie+Lafor%C3%AAt

which is unquoted to

Marie LaforÃªt

If I use a JavaScript URL decoder, for instance http://meyerweb.com/eric/tools/dencoder/, I get

Marie Laforêt

which is the correct result.

I tried using

urllib.unquote(artist).decode('utf-8') 

but this generates an exception. Any hints at all are greatly appreciated.

EDIT

Taxellool had the right answer in the comments:

what you are trying to decode is already decoded. try this:

urllib.unquote_plus(artist.encode('utf-8')).decode('utf-8')
  • What exception do you get from the last attempt, urllib.unquote(artist).decode('utf-8')? It seems to work correctly under Python 2.7.5. Commented Mar 18, 2014 at 6:44
  • If I use decode at the end I get: UnicodeEncodeError: 'ascii' codec can't encode characters in position 11-12: ordinal not in range(128) Commented Mar 18, 2014 at 10:14
  • What you are trying to decode is already decoded. Try this: urllib.unquote_plus(artist.encode('utf-8')).decode('utf-8') Commented Mar 18, 2014 at 10:21
  • I tried this in a Python shell (as Taxellool did) and it works: print urllib.unquote_plus(artist).decode('utf-8') prints Marie Laforêt. On the server, this same line of code raises the exception. Commented Mar 18, 2014 at 10:26
  • @Taxellool: it should be urllib.unquote_plus(artist.encode('ascii')).decode('utf-8') Commented Mar 18, 2014 at 11:13

2 Answers

Taxellool had the right answer in the comments:

what you are trying to decode is already decoded. try this:

urllib.unquote_plus(artist.encode('utf-8')).decode('utf-8')
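For reference, here is a minimal sketch of that fix wrapped in a helper. The name decode_form_value is hypothetical (it is not from the question or from urllib), and the Python 3 branch is only an assumption added so the same snippet also runs there; on App Engine's Python 2.7 only the first branch matters.

```python
import sys

try:
    from urllib import unquote_plus        # Python 2.7
except ImportError:
    from urllib.parse import unquote_plus  # Python 3

PY2 = sys.version_info[0] == 2


def decode_form_value(value):
    """Percent-decode a form value and return it as text.

    Hypothetical helper for illustration, not part of urllib.
    """
    if PY2:
        # Python 2: make sure unquote_plus sees a byte string, then
        # decode the resulting UTF-8 bytes into a unicode object.
        if isinstance(value, unicode):  # noqa: F821 (name exists on Python 2 only)
            value = value.encode('utf-8')
        return unquote_plus(value).decode('utf-8')
    # Python 3: unquote_plus expects str and decodes %XX as UTF-8 by default.
    if isinstance(value, bytes):
        value = value.decode('ascii')
    return unquote_plus(value)


print(repr(decode_form_value('Marie+Lafor%C3%AAt')))
```

Since a percent-encoded value should contain only ASCII characters, encode('ascii') as suggested in the comments should work just as well, and it would fail loudly if something unexpected slipped through.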

I guess you are decoding before calling urllib.unquote_plus(), which gives the garbled result:

>>> print urllib.unquote_plus('Marie+Lafor%C3%AAt'.decode('utf-8'))
Marie LaforÃªt

If you decode after unquoting, the result is what you want:

>>> print urllib.unquote_plus('Marie+Lafor%C3%AAt').decode('utf-8')  
Marie Laforêt

Just make sure you don't pass a unicode object to urllib.unquote_plus.
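To see why a unicode input goes wrong, here is a small sketch that needs no urllib at all. It assumes the Python 2 behaviour described above, where each %XX escape in a unicode string becomes the code point with that value (U+00C3 and U+00AA here) rather than a byte. The variable names are made up for illustration.

```python
# What Python 2's unquote_plus returns when given a unicode object:
# the two UTF-8 bytes C3 AA end up as two separate code points.
garbled = u'Marie Lafor\xc3\xaat'

# Indexes of the non-ASCII characters. Calling .decode('utf-8') on a
# unicode object in Python 2 first encodes it with the ascii codec,
# which is where the UnicodeEncodeError comes from.
non_ascii = [i for i, ch in enumerate(garbled) if ord(ch) > 127]
print(non_ascii)

# Code points below U+0100 map one-to-one onto Latin-1 bytes, so
# encoding as Latin-1 recovers the original UTF-8 byte sequence.
repaired = garbled.encode('latin-1').decode('utf-8')
print(repaired == u'Marie Lafor\xeat')
```

The indexes printed are the same "position 11-12" reported in the server's error message. The Latin-1 round trip is only a way to repair a string that is already garbled; encoding before unquoting avoids the problem in the first place.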

2 Comments

On the server this line of code generates an exception: UnicodeEncodeError: 'ascii' codec can't encode characters in position 11-12: ordinal not in range(128)
@scratchy: See "How to print() a string in Python3?". My answer works on Python 2 too.
