I have a UTF-8 text field that misbehaves a little. As originally fetched, it includes the character sequence \u00e2\u0080\u0099, which is basically an apostrophe whose bytes were decoded with the wrong encoding. I have decided to keep this corrupted state rather than repair it, even though I have found a few solutions online for reinterpreting this kind of error in my text field.

So I just want to insert the raw data as is. I have tried 2 ways to insert this row.

  1. Using Python with peewee (an ORM library). With everything configured correctly, this method actually works: the data is inserted and there is a row in the database.
    Selecting the column yields donâ\u0080\u0099t, which I am OK with keeping.

So far so good.

  2. Writing a Python script that prints tab-delimited text and loading it with \copy.
    Annoyingly, this method does not work and returns the following error:
ERROR:  invalid byte sequence for encoding "UTF8": 0x80
CONTEXT:  COPY comment, line 1: "'donâ\x80\x99t'"

(When I print the data from the Python script to the console, it shows up as donâ\x80\x99t.)

So clearly there is a difference between what peewee does and my naive printing of the string from Python, even though peewee and print receive the same string as input.
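
One way to narrow that difference down is to look at bytes rather than at what the console displays. The sketch below is only an illustration under assumptions: the variable `text` and the helper `is_valid_utf8` are hypothetical names, and the "broken" byte string is a guess at the kind of output that would trigger the 0x80 error, not something captured from the actual script.

```python
# The string as described: 'â' (U+00E2) followed by the control
# characters U+0080 and U+0099.
text = "don\u00e2\u0080\u0099t"

# Encoded as UTF-8, each of the three non-ASCII characters becomes a
# two-byte sequence.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)  # b'don\xc3\xa2\xc2\x80\xc2\x99t'


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Assumed example of a broken stream: 'â' encoded properly, but U+0080 and
# U+0099 written as the single bytes 0x80 and 0x99. A lone 0x80 is not a
# valid UTF-8 sequence, which matches the shape of the error message.
suspect_bytes = b"don\xc3\xa2\x80\x99t"

print(is_valid_utf8(utf8_bytes))     # True
print(is_valid_utf8(suspect_bytes))  # False
```

If the bytes the script really emits look like the second case, the string itself is fine and the problem lies in how it is turned into bytes on the way out.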

How do I encode this string correctly so I can use \copy to populate the row?
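
A possible direction, sketched under assumptions rather than tested against this database: encode each row to UTF-8 explicitly and write the bytes to `sys.stdout.buffer`, so that whatever encoding `print` would pick for the console or a pipe does not come into play. The helper names `copy_escape` and `to_copy_bytes` and the two-column row are hypothetical. The escaping covers the characters that COPY's text format treats specially (backslash, tab, newline, carriage return).

```python
import sys


def copy_escape(value: str) -> str:
    """Escape a value for COPY's tab-delimited text format."""
    # Backslash must be handled first so later escapes are not doubled.
    return (
        value.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_copy_bytes(rows) -> bytes:
    """Join rows into tab-delimited lines and encode them as UTF-8."""
    lines = ("\t".join(copy_escape(v) for v in row) + "\n" for row in rows)
    return "".join(lines).encode("utf-8")


if __name__ == "__main__":
    # Hypothetical row layout: an id column and the text column.
    rows = [("1", "don\u00e2\u0080\u0099t")]
    # Write bytes directly, bypassing the text encoding of sys.stdout.
    sys.stdout.buffer.write(to_copy_bytes(rows))
```

The output could be redirected to a file and loaded with something like `\copy comment from 'rows.tsv'`, where the file name is made up for the example. This also assumes the psql client encoding is UTF8, which `\encoding` in psql reports.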

  • Take a look at the answer from Wim on this question. Commented May 29, 2022 at 2:45
  • I know "You're doing it wrong" answers are a pain, but really you should fix the encoding problems as early as possible, i.e. I'm challenging your decision to insert the raw data as-is. It's like detecting a loose part on your car and deciding to fix it later, then only asking for help after it has fallen off during the next ride. You're more likely to get help for an easier problem (properly attaching the part) than a cumbersome one (looking for the lost part on the side of the road). Commented May 29, 2022 at 8:02
  • I do understand that, but I seriously just want to insert the raw data as is, despite the problem. Commented May 29, 2022 at 9:20
