Codec find bytes string encoding

Question

Is there any way to find out which encoding was used for a bytes string using the codecs module in Python? chardet has a method for this, chardet.detect(string)['encoding']. Is there a similar method in codecs?

If there was such a method in the standard library, chardet would most probably not exist.
– MaxNoe, Commented May 2, 2020 at 8:29
Does this answer your question? How to detect string byte encoding?
– Joe, Commented May 2, 2020 at 9:15

Christoph Burschka · Accepted Answer · 2020-05-02 08:19:55Z

There isn't a built-in method, because the encoding cannot be reliably determined for arbitrary bytes and arbitrary encodings. (For example, text containing only ASCII characters produces the same bytes in ASCII, UTF-8, Latin-1, and most other ASCII-compatible encodings.)
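To illustrate the ambiguity, here is a small sketch. The sample strings and the list of codecs are just examples picked for the demonstration; they are not from the question.

```python
# ASCII-only text produces the same bytes in many encodings, so the
# bytes alone cannot tell you which encoding was "really" used.
data = "hello".encode("ascii")
decoded = {enc: data.decode(enc) for enc in ("ascii", "utf-8", "latin-1", "cp1252")}
all_same = len(set(decoded.values())) == 1  # True: every codec gives "hello"

# Non-ASCII bytes can also decode without error under the wrong codec.
raw = "é".encode("utf-8")          # b'\xc3\xa9'
mojibake = raw.decode("latin-1")   # no exception, but the text is wrong
print(all_same, repr(mojibake))
```

So a decode that succeeds only shows that the bytes are valid in that encoding, not that the guess is correct.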

The best you could do is a series of try/except blocks where you guess a series of encodings (e.g. UTF-8, UTF-16) and move on to the next one if decoding raises a UnicodeDecodeError.
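A minimal sketch of that approach might look like the following. The function name guess_encoding and the default candidate list are made up for illustration; nothing like this exists in codecs.

```python
from typing import Iterable, Optional


def guess_encoding(
    data: bytes,
    candidates: Iterable[str] = ("utf-8", "utf-16", "latin-1"),
) -> Optional[str]:
    """Return the first candidate that decodes `data` without error.

    Hypothetical helper: this is a guess, not a detection. The result
    only means the bytes are *valid* in that encoding.
    """
    for enc in candidates:
        try:
            data.decode(enc)
        except UnicodeDecodeError:
            continue  # invalid in this encoding, try the next one
        return enc
    return None  # no candidate could decode the bytes


print(guess_encoding("héllo".encode("utf-8")))
```

The order of the candidates matters. Strict encodings such as UTF-8 should come first, and latin-1 should come last (if at all), because it maps every possible byte to a character and therefore never fails.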
