The problem is converting bytes to Unicode when those bytes are already stored in a str. Here is an example:
s1 = '\xd0\xb1\xd0\xb0'
s2 = b'\xd0\xb1\xd0\xb0'
print(s1) # Here is the problem: prints garbage (Ð±Ð°)
print(s2.decode('utf-8')) # Everything is OK: prints 'ба' (two Cyrillic letters)
But how can I decode the data in s1 now? I can't put the b'' prefix in front of the s1 literal, because s1 may come from the internet, so I can't just declare s1 the way I declared s2. I found that the b'' prefix works like the bytes() function, but when I tried calling it:
s3 = bytes(s1, 'utf-8')
The result was garbage again:
print(s3.decode('utf-8')) # Ð±Ð°
So the question is: what should I do with s1 so that it comes out as 'ба' in the terminal?
I googled a lot, but nothing I found was what I need.
This is what I need:
s4 = SOME_WONDERFUL_MAGIC(s1)
print(s4) # Prints 'ба'
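For what it's worth, here is a minimal sketch of what that magic might look like. It assumes every character in s1 is a code point below 256 that stands for exactly one original byte; the helper name reinterpret_as_utf8 is made up for illustration. Encoding with 'latin-1' maps each such character back to its byte value, and the resulting bytes can then be decoded as UTF-8.

```python
# Assumption: each character of s1 has a code point below 256 and stands for
# one raw byte of UTF-8 data that was never decoded properly.
s1 = '\xd0\xb1\xd0\xb0'


def reinterpret_as_utf8(text):
    """Hypothetical helper: turn 'bytes stored as characters' into real text."""
    raw = text.encode('latin-1')  # U+0000..U+00FF map 1:1 to bytes 0x00..0xFF
    return raw.decode('utf-8')    # now decode those bytes as UTF-8


s4 = reinterpret_as_utf8(s1)
print(s4)  # ба
```

If s1 contains any character above U+00FF, the encode step raises UnicodeEncodeError, which is a sign that the assumption does not hold for that input.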
Many thanks to anyone who can help, and sorry for my bad English.
UPDATE: Oops, the problem is back. I hoped the first answer would help, but I found that:
s1 == '\xd0\xb1\xd0\xb0' # BUT
s1 != '\xd0\xb1\xd0\xb0'
What I mean: I used the 'requests' package to make a POST request to a Flask server. It responds with:
req = requests.post(hostName)
print(req.text) # b'testText'
# BUT!
print(req.text[2:-1]) # testText
This means the bytes representation of testText arrives as a string, like this:
s5 = "b'tumba'"
So the real question is: how do I extract tumba from "b'tumba'" (given that tumba may contain Cyrillic characters)?
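One possible approach, sketched under the assumption that the string is exactly what str() produces for a bytes object and that the payload is UTF-8: ast.literal_eval can parse the bytes literal back into a real bytes object, which can then be decoded. The helper name bytes_repr_to_text and the sample s6 are made up for illustration.

```python
import ast

s5 = "b'tumba'"
# Assumption: Cyrillic payloads arrive with escaped bytes, as in str(b'\xd0\xb1\xd0\xb0').
s6 = "b'\\xd0\\xb1\\xd0\\xb0'"


def bytes_repr_to_text(text, encoding='utf-8'):
    """Hypothetical helper: parse the repr of a bytes object and decode it."""
    value = ast.literal_eval(text)  # safely evaluates the literal, no arbitrary code
    if not isinstance(value, bytes):
        raise ValueError('expected the repr of a bytes object')
    return value.decode(encoding)


print(bytes_repr_to_text(s5))  # tumba
print(bytes_repr_to_text(s6))  # ба
```

Slicing with [2:-1] only removes the b'...' wrapper and leaves escapes such as \xd0 as literal text, which is why parsing the literal is safer. The cleaner fix is probably to make the server send decoded text in the first place, so that the client never sees the repr of a bytes object.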
sock.read(), and may not realize that requests or ElementTree or whatever is doing some magic with a default value or guess.
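To illustrate how such a default or guess can go wrong, here is a small sketch using plain bytes rather than any particular library. The choice of 'latin-1' as the wrongly guessed encoding is an assumption made for the example, not a statement about what requests or ElementTree actually pick.

```python
# The bytes a server would send for the text 'ба' encoded as UTF-8.
payload = '\u0431\u0430'.encode('utf-8')  # b'\xd0\xb1\xd0\xb0'

# Assumed wrong guess: treating every byte as one Latin-1 character.
guessed = payload.decode('latin-1')

# Decoding with the encoding the data was actually written in.
explicit = payload.decode('utf-8')

print(repr(guessed))  # four characters, one per byte
print(explicit)       # two Cyrillic letters
```

The wrongly decoded value is the same four-character string as s1 in the question, which suggests that the garbage appears at the moment the bytes are decoded with the wrong encoding.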