3

Hi i'm trying to save data that i get from this api into my Json column in my postgresql using sqlalchemy and python requests.

r = requests.get(api)

content = r.content

data = json.loads(content)

crawl_item = {}
crawl_item = session.query(CrawlItem).filter_by(site_id=3, href=list_id).first()
crawl_item.description = data['ad']['body']
crawl_item.meta_data = {}
crawl_item.meta_data["ward"] = data['ad_params']['ward']['value']

try:
      session.commit()
except:
      session.rollback()
      raise
finally:
      ret_id = crawl_item.id
      session.close()

my model:

class CrawlItem(Base):
    ...
    description = Column(Text)
    meta_data = Column(postgresql.JSON)

i want to get the value of ward :

"ward": {
      "id": "ward",
      "value": "Thị trấn Trạm Trôi",
      "label": " Phường, thị xã, thị trấn"
    }

I already encoding my postgresql to utf-8 so other fields that are not json column (description = Column(Text)) save utf-8 characters normally only my json column data are not decode:

enter image description here

{ 
   "ward":"Th\u1ecb tr\u1ea5n Tr\u1ea1m Tr\u00f4i"
}

description column:

description

meta_data column:

meta_data

i had tried using :

crawl_item.meta_data["ward"] = data['ad_params']['ward']['value'].decode('utf-8')

but the ward data don't get save

I have no idea what is wrong, hope someone can help me

EDIT:

i checked the data with psql and got these:

description column:

enter image description here

meta_data column:

enter image description here

It seems like only meta_data json column have trouble with the characters

9
  • Are you using Python 2 or 3? Commented Oct 18, 2019 at 5:33
  • What does the value look like if you select it using psql? I want to determine if it is stored in the database in escaped form or not. Commented Oct 18, 2019 at 6:17
  • @Ilja Everilä im using python 3.6 Commented Oct 18, 2019 at 6:59
  • 1
    Then my next question is: what is thr SQL statement that fills the column? You could set log_min_duration_statement = 0 in PostgresSQL, then the statement and the input data are logged. Then you know if the escaping happens in Python or in the database. Commented Oct 18, 2019 at 7:51
  • 1
    Come to think of it, maybe you're not using the proper text conversion(s) when fetching? The last SELECT statement you've shown is an example of that; you've selected the "raw" JSON produced by Python's json module, which contains escaped data. Commented Oct 18, 2019 at 9:02

2 Answers 2

6

Sqlalchemy serializes JSON field before save to db (see url and url and url).

json_serializer = dialect._json_serializer or json.dumps

By default, the PostgreSQL' dialect uses json.dumps and json.loads.


When you work with Text column, the data is converted in the following flow:

str -> bytes in utf-8 encoding

When you work with JSON column for PostgreSQL dialect, the data is converted in the following flow:

dict -> str with escaped non-ascii symbols -> bytes in utf-8 encoding

You can override the serializer in your engine configuration using json_serializer field:

json_serializer=partial(json.dumps, ensure_ascii=False)
Sign up to request clarification or add additional context in comments.

Comments

-2

use "jsonb" data type for your json column or cast "meta_data" field to "jsonb" like this:

select meta_data::jsonb from your_table;

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.