How can I use the 'coding' header of a Python source file to read its contents properly?

Question

Python source files often come with a coding header similar to the following:

# -*- coding: iso-8859-1 -*-

How can I use this line to properly parse the contents of such a file? Is there a better way than manually opening the file in binary mode, reading one line, and checking whether it contains the header? Is there a library that does this?

Background: this comes in the context of fixing this bug, which crashes elpy when used in conjunction with python3 and importmagic. The code that I'm trying to fix uses

with open(filename) as fd:
    success = subtree.index_source(filename, fd.read())

and crashes on non-utf-8 files. Ideally I would like to keep changes to a minimum.

"better way" is such an extremely relative thing that I'm tempted to ignore your question. What is bad about the way you're currently doing it?
– Marcus Müller, Commented Feb 11, 2015 at 16:45
@MarcusMüller - considering that Python supports several source encoding schemes, it is reasonable to assume that there is already a Python library to read such files. There are several formats, 8- vs. 16-bit encodings, BOMs, etc.; it's not an obvious thing to do on your own.
– tdelaney, Commented Feb 11, 2015 at 16:55
ah, but there's a PEP that already describes how this should be handled
– Marcus Müller, Commented Feb 11, 2015 at 16:56
@tdelaney: I've added an answer based on your inspiration; thanks!
– Marcus Müller, Commented Feb 11, 2015 at 17:32

jfs · Accepted Answer · 2015-02-11 18:32:30Z


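tokenize.open() (available since Python 3.2) does exactly that: it opens a Python source file in read-only text mode, using the character encoding specified in the coding header (encoding declaration) or by a BOM, and falling back to UTF-8 when neither is present.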
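As a minimal sketch of how this could slot into the code from the question: the helper name `read_python_source` is hypothetical, and `subtree.index_source` is simply the call taken from the question, so it appears only in a comment.

```python
import tokenize


def read_python_source(filename):
    """Return the text of a Python source file, decoded per its coding header.

    Hypothetical helper for illustration; not part of any library.
    """
    # tokenize.open() inspects the BOM and the coding declaration on the
    # first two lines, then returns a text-mode file object that uses the
    # detected encoding (UTF-8 if nothing is declared).
    with tokenize.open(filename) as fd:
        return fd.read()


# In the code from the question, the change would then be roughly:
#
#     with tokenize.open(filename) as fd:
#         success = subtree.index_source(filename, fd.read())
```

Only the `open()` call changes, which keeps the patch small. A file whose declared encoding does not match its actual bytes can still raise an error, so a caller may want to handle that case separately.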

You could also decode remote Python files on the fly.
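For source that is already in memory as bytes (for example, after downloading it), one possible approach is to call tokenize.detect_encoding() directly. This is a sketch under that assumption, and `decode_python_source` is a hypothetical name, not a library function.

```python
import io
import tokenize


def decode_python_source(raw_bytes):
    """Decode raw Python source bytes using their BOM or coding header.

    Hypothetical helper for illustration; not part of any library.
    """
    # detect_encoding() expects a readline callable that yields bytes.
    # It reads at most two lines and returns (encoding, lines_read).
    encoding, _ = tokenize.detect_encoding(io.BytesIO(raw_bytes).readline)
    # When a UTF-8 BOM is present the reported encoding is 'utf-8-sig',
    # so decoding with it also strips the BOM.
    return raw_bytes.decode(encoding)
```

Unlike tokenize.open(), this sketch does not translate line endings, so `\r\n` sequences in the input are preserved in the result.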

edited Feb 11, 2015 at 18:32

answered Feb 11, 2015 at 18:24

