I am trying to execute SQL queries using pandas.read_sql. It usually works, but for certain queries I run into this error:
  File "C:\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1454, in _fetchall_as_list
    result = cur.fetchall()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb4 in position 3: ordinal not in range(128)
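For context, the same message can be reproduced without a database. The snippet below is only a sketch: the byte string is made up for illustration (the real column value is unknown), and it simply places byte 0xb4 at index 3 to show that the ASCII codec rejects it while a single-byte encoding such as Latin-1 decodes it.

```python
# Illustrative bytes only; the actual data returned by the query is not known.
raw = b"abc\xb4def"

try:
    # ASCII covers only bytes 0x00-0x7f, so 0xb4 cannot be decoded.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print(exc)
    failed_at = exc.start  # index of the first offending byte

# Latin-1 maps every byte value to a character, so this decode succeeds.
text = raw.decode("latin-1")
print(repr(text))
```

This suggests the column holds bytes outside the 7-bit range even though the database character set is declared as US7ASCII.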
I tried the solutions suggested for a very similar problem here (UnicodeDecodeError with pandas.read_sql), but they didn't resolve the issue.
I am using the cx_Oracle library for the database connection.
I tried
db = cx_Oracle.connect(user, pwd, dsn_dict[dbname], encoding='utf-8')
but when I check the encoding using
print(db.encoding)
print(db.nencoding)
I always get
ASCII
ASCII
I tried changing NLS_LANG using
os.environ['NLS_LANG'] = 'AMERICAN_AMERICA.US7ASCII'
but it results in the same error.
These are the database NLS parameters:
NLS_CHARACTERSET US7ASCII
NLS_NCHAR_CHARACTERSET AL16UTF16
I ran the same query in Access and noticed this character in the query result, which might be causing the issue:
¿
Basically, I don't know how to set the proper encoding to deal with this problem. Any help is appreciated. Thank you.
SOLUTION:
For reference, I solved this by setting
os.environ['NLS_LANG'] = 'AMERICAN_AMERICA.UTF8'
I don't like doing this, though. Better solutions are appreciated.
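A minimal sketch of how this workaround might be wired up. It assumes the variable needs to be in place before the Oracle client libraries are initialized, so it is set before cx_Oracle is imported. `read_query` is a hypothetical helper name, and `user`, `pwd` and `dsn` stand in for the real connection details.

```python
import os

# Set the client character set first; assumption: the Oracle client reads
# NLS_LANG when it is initialized, so this should come before cx_Oracle loads.
os.environ["NLS_LANG"] = "AMERICAN_AMERICA.UTF8"


def read_query(sql, user, pwd, dsn):
    """Hypothetical helper: run a query and return the result as a DataFrame."""
    # Imported here so that NLS_LANG is already set when the client loads.
    import cx_Oracle
    import pandas as pd

    conn = cx_Oracle.connect(user, pwd, dsn)
    try:
        return pd.read_sql(sql, conn)
    finally:
        # Always release the connection, even if the query fails.
        conn.close()
```

After connecting, printing the connection's encoding and nencoding attributes, as above, is one way to check whether the setting took effect.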