361

Here is my code:


import imaplib
from email.parser import HeaderParser

conn = imaplib.IMAP4_SSL('imap.gmail.com')
conn.login('[email protected]', 'password')
conn.select()
conn.search(None, 'ALL')
data = conn.fetch('1', '(BODY[HEADER])')
header_data = data[1][0][1].decode('utf-8')

At this point I get the error message:

AttributeError: 'str' object has no attribute 'decode'

Python 3 doesn't have str.decode() anymore, so how can I fix this?

1
  • @MartijnPieters I think the other version is better overall. I don't really get why both questions attracted an answer concerning PyJWT, though. That seems like it belongs on a separate question - one which might not be suitable for Stack Overflow, as it's essentially tech support for that library. Commented Dec 29, 2022 at 23:35

15 Answers

304

You are trying to decode an object that is already decoded. You have a str, so there is no need to decode it from UTF-8.

Simply drop the .decode('utf-8') part:

header_data = data[1][0][1]
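
As a minimal sketch of the next step, assuming the fetched value really is a str: it can be passed straight to the HeaderParser that the question already imports. The sample header text and addresses below are made up for illustration and stand in for data[1][0][1].

```python
from email.parser import HeaderParser

# Hypothetical header text standing in for data[1][0][1] from the question
header_data = "From: Alice <alice@example.com>\r\nSubject: Hello\r\n\r\n"

# parsestr() expects a str, so no .decode() call is needed here
msg = HeaderParser().parsestr(header_data)
print(msg["Subject"])
print(msg["From"])
```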

3 Comments

Is there a simple way to do this conditionally? (I only want to decode if the message is encoded.)
@devinbost: in Python 3? Test for the object type or the decode attribute, or just catch the exception. try: data = data.decode('...') except AttributeError: pass.
@devinbost: however, you are usually better off decoding closer to the source of your data, where you'll usually know exactly what you have.
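
The type test suggested in the comments above could look roughly like this. The helper name ensure_text is hypothetical, and UTF-8 as the default encoding is an assumption; use whatever encoding your data source actually produces.

```python
def ensure_text(value, encoding="utf-8"):
    """Return value as str, decoding only when it is bytes-like."""
    if isinstance(value, (bytes, bytearray)):
        return value.decode(encoding)
    # Already a str (or some other type): hand it back unchanged
    return value


print(ensure_text(b"Subject: Hello"))  # bytes input is decoded
print(ensure_text("Subject: Hello"))   # str input passes through
```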
76

If you land here using jwt authentication after the PyJWT v2.0.0 release (22/12/2020), you might want to freeze your version of PyJWT to the previous release in your requirements.txt file.

PyJWT==1.7.1

2 Comments

GIVE THIS PERSON A MEDAL!!!! This was a dependency of the rest_framework_simplejwt package in our environment and was causing the issue.
Not a safe solution: see CVE-2022-29217 that affects PyJWT 1.x versions: github.com/jpadilla/pyjwt/security/advisories/…
57

Beginning with Python 3, all strings are Unicode objects.

  a = 'Happy New Year' # Python 3
  b = unicode('Happy New Year') # Python 2

The two statements above are equivalent. So I think you should remove the .decode('utf-8') part, because you already have a Unicode object.

Comments

56

Use this approach:

str.encode().decode()

9 Comments

bytearray(str, 'encoding').decode('another_encoding') would do the job if you need to decode idna or any other encoding
This is useless. You are encoding to UTF-8, then decoding the resulting bytes as UTF-8, ending up where you started. You are keeping the CPU warm with no other benefit.
@MartijnPieters "ending up where you started" - not if you have escape sequences in your string, for example: >>> '\u0159'.encode().decode() 'ř'
@Peter: no, you don't need encoding or decoding for that. '\u0159' prints the exact same output. You are confusing the string literal syntax with the canonical representation of the value.
You can use the string directly. There is no need to encode and then decode again.
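
A small check of the point made in these comments, as a sketch: the string literal syntax already produces the character, and an encode/decode round trip with the same codec returns an equal string. The variable names are illustrative only.

```python
text = "\u0159"   # the literal syntax already yields the single character ř
round_tripped = text.encode().decode()

print(round_tripped == text)  # the round trip changes nothing
print(len(text))              # one character

raw = r"\u0159"   # raw string: a backslash, 'u', and four digits
print(len(raw))               # six characters, untouched by encode().decode()
print(raw.encode().decode() == raw)
```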
33

In Python 3, the mental model is pretty straightforward:

  • Encoding is the process of converting a str to a bytes object
  • Decoding is the process of converting a bytes object to a str
┏━━━━━━━┓                ┏━━━━━━━┓
┃       ┃ -> encoding -> ┃       ┃
┃  str  ┃                ┃ bytes ┃
┃       ┃ <- decoding <- ┃       ┃
┗━━━━━━━┛                ┗━━━━━━━┛

In your case, you are calling data.decode("UTF-8"), but the variable is already a str object and is therefore already decoded. So just refer to data directly if a string is what you need.
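
The two directions of the diagram can be sketched in a few lines. The sample text is arbitrary, and UTF-8 is assumed as the codec.

```python
text = "Información"

data = text.encode("utf-8")       # encoding: str -> bytes
print(type(data).__name__, data)

restored = data.decode("utf-8")   # decoding: bytes -> str
print(restored == text)

# Only bytes has a decode() method in Python 3, hence the AttributeError on str
print(hasattr(text, "decode"), hasattr(data, "decode"))
```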

Comments

20

For Python 3:

html = """\\u003Cdiv id=\\u0022contenedor\\u0022\\u003E \\u003Ch2 class=\\u0022text-left m-b-2\\u0022\\u003EInformaci\\u00f3n del veh\\u00edculo de patente AA345AA\\u003C\\/h2\\u003E\\n\\n\\n\\n \\u003Cdiv class=\\u0022panel panel-default panel-disabled m-b-2\\u0022\\u003E\\n \\u003Cdiv class=\\u0022panel-body\\u0022\\u003E\\n \\u003Ch2 class=\\u0022table_title m-b-2\\u0022\\u003EInformaci\\u00f3n del Registro Automotor\\u003C\\/h2\\u003E\\n \\u003Cdiv class=\\u0022col-md-6\\u0022\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003ERegistro Seccional\\u003C\\/label\\u003E\\n \\u003Cp\\u003ESAN MIGUEL N\\u00b0 1\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EDirecci\\u00f3n\\u003C\\/label\\u003E\\n \\u003Cp\\u003EMAESTRO ANGEL D\\u0027ELIA 766\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EPiso\\u003C\\/label\\u003E\\n \\u003Cp\\u003EPB\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EDepartamento\\u003C\\/label\\u003E\\n \\u003Cp\\u003E-\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EC\\u00f3digo postal\\u003C\\/label\\u003E\\n \\u003Cp\\u003E1663\\u003C\\/p\\u003E\\n \\u003C\\/div\\u003E\\n \\u003Cdiv class=\\u0022col-md-6\\u0022\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003ELocalidad\\u003C\\/label\\u003E\\n \\u003Cp\\u003ESAN MIGUEL\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EProvincia\\u003C\\/label\\u003E\\n \\u003Cp\\u003EBUENOS AIRES\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003ETel\\u00e9fono\\u003C\\/label\\u003E\\n \\u003Cp\\u003E(11)46646647\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EHorario\\u003C\\/label\\u003E\\n \\u003Cp\\u003E08:30 a 12:30\\u003C\\/p\\u003E\\n \\u003C\\/div\\u003E\\n \\u003C\\/div\\u003E\\n\\u003C\\/div\\u003E \\n\\n\\u003Cp class=\\u0022text-center m-t-3 m-b-1 hidden-print\\u0022\\u003E\\n \\u003Ca 
href=\\u0022javascript:window.print();\\u0022 class=\\u0022btn btn-default\\u0022\\u003EImprim\\u00ed la consulta\\u003C\\/a\\u003E \\u0026nbsp; \\u0026nbsp;\\n \\u003Ca href=\\u0022\\u0022 class=\\u0022btn use-ajax btn-primary\\u0022\\u003EHacer otra consulta\\u003C\\/a\\u003E\\n\\u003C\\/p\\u003E\\n\\u003C\\/div\\u003E"""
print(html.replace("\\/", "/").encode().decode('unicode_escape'))

1 Comment

What has this got to do with the question? Can you explain what your answer is doing?
17

I'm not familiar with the library, but if your problem is that you don't want a byte array, one easy way is to specify an encoding type straight in a cast:

>>> my_byte_str
b'Hello World'

>>> str(my_byte_str, 'utf-8')
'Hello World'

3 Comments

They don’t have a bytes object to begin with, and str(bytes_object, codec) is just an alternative spelling for bytes_object.decode(codec). Both fail if you really have a str instead.
You're right, this specific question does have a str already. This answer could still be useful to people in the future that may have byte arrays (this was the issue I faced when I originally stumbled upon this post).
I'm not sure how you stumbled on this post, however, because my_byte_str.decode exists and works, and will not throw the exception in the question.
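
To illustrate the first comment above, a short sketch: both spellings give the same result for bytes, and str() with a codec fails when given a str. The variable names are illustrative.

```python
my_byte_str = b"Hello World"

# The two spellings are equivalent for a bytes object
same = str(my_byte_str, "utf-8") == my_byte_str.decode("utf-8")
print(same)

# Without a codec, str() returns the repr of the bytes, not the decoded text
print(str(my_byte_str))

# Passing a str together with a codec raises TypeError
try:
    str("Hello World", "utf-8")
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```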
8

It's already decoded in Python 3. Use it directly and it should work.

Comments

5

This worked for me:

html.replace("\\/", "/").encode().decode('unicode_escape', 'surrogatepass')

This is similar to json.loads(html) behaviour

3 Comments

That chain saved me, tried this solution randomly!
What has this got to do with the question? Can you explain what this code is doing?
This solution worked for me too. But can you please explain the surrogatepass and unicode_escape parameters? What do they do?
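
As a partial illustration of the unicode_escape part only: that codec turns literal backslash-u sequences in the text into the characters they name, much like json.loads does for a quoted JSON string. The sample string is shortened from the answer above, and the caveat at the end is worth checking against your own data.

```python
import json

escaped = r"Informaci\u00f3n del veh\u00edculo"  # literal backslash-u sequences

# encode() gives ASCII-safe bytes; unicode_escape interprets the \uXXXX escapes
decoded = escaped.encode().decode("unicode_escape")
print(decoded)

# json.loads applies the same \uXXXX rule to a quoted JSON string
via_json = json.loads('"' + escaped + '"')
print(decoded == via_json)

# Caveat: unicode_escape reads non-escape bytes as Latin-1,
# so text that already contains non-ASCII characters gets mangled
mangled = "é".encode().decode("unicode_escape")
print(mangled == "é")
```

If the input is really JSON, parsing it with json.loads is usually the safer route.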
3

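Use the codecs module's open() to read the file:

import codecs
# file_name is the path of the file to read; undecodable bytes are dropped
with codecs.open(file_name, 'r', encoding='utf-8', errors='ignore') as fdata:
    content = fdata.read()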
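
In Python 3 the built-in open() accepts the same encoding and errors arguments, so a sketch without the codecs module might look like this. The temporary file and its contents are made up so that the example can run on its own.

```python
import os
import tempfile

# Create a throwaway file containing one byte (0xff) that is invalid in UTF-8
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"caf\xc3\xa9 \xff ok")
    file_name = tmp.name

# errors="ignore" silently drops bytes that cannot be decoded
with open(file_name, "r", encoding="utf-8", errors="ignore") as fdata:
    content = fdata.read()

os.remove(file_name)
print(content)
```

Note that errors="ignore" discards data silently, so it hides encoding problems instead of fixing them.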

Comments

3

If anyone is getting the same error while working on a logistic regression problem on Kaggle, here is the solution:

logmodel = LogisticRegression(solver='liblinear')

1 Comment

How is this related to the question? I'm not sure it has anything to do with e-mails?
1

Other answers sort of hint at it, but the problem may arise from expecting a bytes object. In Python 3, decode is valid only when you have an object of class bytes. Running encode before decode may "fix" the problem, but it is a useless pair of operations that suggests the problem is upstream.

Comments

1

I got 'str' object has no attribute 'decode' while creating a JWT access_token using the Flask-JWT-Extended package.

To fix this issue, I upgraded my Flask-JWT-Extended package to Flask-JWT-Extended==4.1.0

For reference, see https://flask-jwt-extended.readthedocs.io/en/stable/

Comments

1

First, install a suitable JWT package:

pip3 install PyJWT

Then, in your code:

token.encode().decode('UTF-8')

This worked for me, and I think it will help you.

Comments

0

My case may have been a bit rare. I was working with Django, and my project ran locally but not once I deployed it. It seemed that I was getting multiple dependency errors because I was running pip freeze > requirements.txt. Running this instead fixed the issue:

pip3 freeze > requirements.txt

Comments