Python CSV encoding

Question

I'm writing a little script that imports my Facebook contacts' email addresses into Gmail/Android. My input file has Unicode characters in it, like: Jasmin L\u00f3pez. The generated CSV output file looks like this:

Andr\u00e9 Zzz,,,,,,,,,,,,,,,,,,,,,,,,,,fbcontacts ::: * My Contacts,* Home,[email protected]
Andr\u00e9ia Ggg,,,,,,,,,,,,,,,,,,,,,,,,,,fbcontacts ::: * My Contacts,* Home,[email protected]
Andr\u00e9s Bbb,,,,,,,,,,,,,,,,,,,,,,,,,,fbcontacts ::: * My Contacts,* Home,[email protected]

As you can see, I have encoding problems. I'm creating a Google Contacts CSV file, and I need the names to display properly. I'm using this function to write the CSV:

def writecsv(self):
    if self.outfile != '':  # compare by value; "is not" tests object identity
        #fh = open(self.outfile, 'wb')
        #fh = codecs.open(self.outfile, "wb", "utf-8")
        fh = codecs.open(self.outfile, 'wb', encoding="latin-1")
    else:
        fh = sys.stdout

    csvhdlr = csv.writer(fh, quotechar='"', quoting=csv.QUOTE_MINIMAL)
    csvhdlr.writerow("Name,Given Name,Additional Name,Family Name,Yomi Name,Given Name Yomi,Additional Name Yomi,Family Name Yomi,Name Prefix,Name Suffix,Initials,Nickname,Short Name,Maiden Name,Birthday,Gender,Location,Billing Information,Directory Server,Mileage,Occupation,Hobby,Sensitivity,Priority,Subject,Notes,Group Membership,E-mail 1 - Type,E-mail 1 - Value".split(','))        
    for contact in self.clist:
        #csvhdlr.writerow(dict((vname, vtype, vnotes, vstereotype, vauthor, valias, vgenfile.encode('utf-8')) for vname, vtype, vnotes, vstereotype, vauthor, valias, vgenfile in row.iteritems()))
        row = contact.fullname + ',,,,,,,,,,,,,,,,,,,,,,,,,,fbcontacts ::: * My Contacts,* Home,' + contact.email
        csvhdlr.writerow(row.split(','))

Any ideas, please? I'm quite new to Python, and every time I have to deal with encodings, it doesn't work the way I'd like it to =(

Thanks a lot for your help!

BrenBarn · Accepted Answer · 2012-09-21 18:25:40Z

If I understand you correctly, your file doesn't contain actual non-ASCII Unicode characters; it contains Unicode escape sequences like "\u00f3" that represent those characters. If your file really contains the string "Jasmin L\u00f3pez" (with a literal backslash and u), then you'll need to decode it to actual Unicode characters before writing it. Take a look at the unicode_escape codec:

>>> x = b"\u00f3"
>>> print x
\u00f3
>>> print x.decode('unicode_escape')
ó
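
To tie the decoding step to the CSV writing, here is a minimal sketch. It is written for Python 3, not the Python 2 used in the question, and it is not the asker's actual fix. The helper names `decode_escapes` and `contacts_to_csv`, the two-column header, and the sample contact are all made up for illustration. It also assumes the input names are pure ASCII apart from the backslash escapes.

```python
import csv
import io


def decode_escapes(text):
    """Turn literal escapes such as '\\u00f3' into the characters they name.

    Assumption: `text` is ASCII apart from the backslash escapes, so the
    round trip through bytes cannot mangle anything.
    """
    return text.encode("ascii").decode("unicode_escape")


def contacts_to_csv(contacts, stream):
    """Write (fullname, email) pairs to an already-open text stream."""
    writer = csv.writer(stream, quotechar='"', quoting=csv.QUOTE_MINIMAL)
    # Shortened, hypothetical header; the real Google Contacts header is longer.
    writer.writerow(["Name", "E-mail 1 - Value"])
    for fullname, email in contacts:
        # Pass a list of fields instead of joining and splitting on commas.
        writer.writerow([decode_escapes(fullname), email])


# Hypothetical sample data: the raw string keeps the backslash literal.
contacts = [(r"Jasmin L\u00f3pez", "jasmin@example.com")]

# An in-memory stream stands in for a real output file here.
buffer = io.StringIO()
contacts_to_csv(contacts, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

For a real file in Python 3, opening it with `open(path, "w", encoding="utf-8", newline="")` lets the file object handle the encoding. In Python 2 the csv module does not accept unicode objects containing non-ASCII characters, so there each field would need to be encoded to UTF-8 bytes (for example `field.encode("utf-8")`) before being passed to `writerow`.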

2 Comments

Albert Vonpupp Over a year ago

Thanks for your fast answer, it works great on console but when I try to write to the csv, I get this: UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in position 3: ordinal not in range(128). Any clue?

Maurício Szabo Over a year ago

What did you do to fix this?
