I have a device that returns a UTF-8 encoded string. I can only read from it byte-by-byte and the read is terminated by a byte of value 0x00.
I'm writing a Python 2.7 function that lets others access my device and returns the string.
In a previous design, when the device returned only ASCII, I used this in a loop:
    my_string = ''
    while True:
        x = read_next_byte()  # integer byte value, 0-255
        if x == 0:  # a 0x00 byte terminates the read
            break
        my_string += chr(x)
Where x is the latest byte value read from the device.
Now the device can return a UTF-8 encoded string, but I'm not sure how to convert the bytes that I get back into a UTF-8 encoded string/unicode.
chr(x) understandably causes an error when x > 127, so I thought unichr(x) might work. However, unichr expects a complete Unicode code point, and I only have one byte (0-255) of a possibly multi-byte sequence at a time.
So how can I convert the bytes that I get back from the device into a string that can be used in Python and still handle the full UTF-8 string?
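To make the goal concrete, here is a minimal sketch of one approach that might work: collect the raw bytes first and decode them in a single step once the 0x00 terminator arrives, rather than converting byte by byte. The name read_utf8_string and the fake byte source are hypothetical and only for illustration; read_next_byte is assumed to return an integer in the range 0-255.

```python
def read_utf8_string(read_next_byte):
    """Collect raw bytes until the 0x00 terminator, then decode them once."""
    buf = bytearray()
    while True:
        x = read_next_byte()  # assumed to return an int in 0-255
        if x == 0:  # 0x00 terminates the read
            break
        buf.append(x)
    # Decode only after all bytes are in, so multi-byte characters stay intact.
    return buf.decode('utf-8')


# Hypothetical stand-in for the device: the UTF-8 bytes of u'h\xe9', then 0x00.
_fake_bytes = iter([0x68, 0xC3, 0xA9, 0x00])
result = read_utf8_string(lambda: next(_fake_bytes))
```

In Python 2.7 the returned value is a unicode object. If the device ever sends an invalid byte sequence, decode raises UnicodeDecodeError, so callers may want to catch that.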
Likewise, if I were given a UTF-8 string in Python, how would I break it down into individual bytes to send to my device while keeping it valid UTF-8?
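For the sending direction, a sketch under similar assumptions: encode the text to UTF-8 once, then iterate over the result as a bytearray so each item is an integer byte value. The names send_utf8_string and send_byte are hypothetical, and appending a 0x00 terminator is an assumption that the device expects the same framing it uses when returning data.

```python
def send_utf8_string(text, send_byte):
    """Encode text as UTF-8 and pass each byte value (0-255) to send_byte."""
    # bytearray iterates as integers in both Python 2.7 and Python 3.
    for b in bytearray(text.encode('utf-8')):
        send_byte(b)
    send_byte(0)  # assumption: the device expects a 0x00 terminator


# Hypothetical stand-in for the device: record what would be sent.
sent = []
send_utf8_string(u'h\xe9', sent.append)
```

UTF-8 only produces a 0x00 byte for the character U+0000, so the terminator cannot collide with the payload unless the text itself contains a NUL character.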