For the past few days, I've been struggling to understand why this piece of code behaves the way it does:
code:
file1 = open("input.txt","r")
M = file1.read()
file1.close()
print(M)
print(M.encode("latin"))
print(type(M.encode("latin")))
print("\n-----------------------------\n")
t = "\xAC\x42\x4C\x45\x54\x43\x48\x49\x4E\x47\x4C\x45\x59"
print(t)
print(t.encode("latin"))
print(type(t.encode("latin")))
file "input.txt" content:
\xAC\x42\x4C\x45\x54\x43\x48\x49\x4E\x47\x4C\x45\x59
output:
\xAC\x42\x4C\x45\x54\x43\x48\x49\x4E\x47\x4C\x45\x59
b'\\xAC\\x42\\x4C\\x45\\x54\\x43\\x48\\x49\\x4E\\x47\\x4C\\x45\\x59'
<class 'bytes'>
-----------------------------
¬BLETCHINGLEY
b'\xacBLETCHINGLEY'
<class 'bytes'>
What I don't understand is why the same string is interpreted in two different ways, depending on whether I read it from the file or type it by hand into a variable. I know that the double "\" is probably the result of me printing the string to the console, but I cannot understand what is happening.
Your file contains the text \xAC\x42\x4C\x45\x54\x43\x48\x49\x4E\x47\x4C\x45\x59, so literally those characters: a backslash, an x, an A, a C, and so on. In Python source code, string literals treat a backslash + x combination as an escape sequence, but that translation only happens when Python parses source code, not when it reads data from a file.

Similarly, if you write hello\nworld in a text file, load it in Python and print it, you'll see hello\nworld on a single line. But if your source code contains print("hello\nworld"), you will see hello and then, on the next line, world.

In one case you have the source code t = "\xAC\x42\x4C\x45\x54\x43\x48\x49\x4E\x47\x4C\x45\x59". In the other, M, you have a string which happens to contain the same characters as that source code. But that won't make Python magically execute this string. In the same way, if you write [1,2,3] in a text file and load it the same way as M, then type(M) will be str, not magically list, because strings are not source code. You would need to use eval for that.
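If you actually want the text from the file to be interpreted like a string literal, one option is to run it through the unicode_escape codec. Here is a minimal sketch of that, assuming the file holds only ASCII backslash escapes as in your example. The raw string stands in for what file1.read() returns, and the names decoded and parsed are only for illustration. For the [1,2,3] case it uses ast.literal_eval, which accepts only Python literals and is a safer substitute for eval.

```python
import ast

# Stand-in for file1.read(): a raw string keeps every backslash as a
# literal character, just like the text stored in input.txt.
M = r"\xAC\x42\x4C\x45\x54\x43\x48\x49\x4E\x47\x4C\x45\x59"

# The same characters as a normal literal: the escapes are resolved
# when Python parses the source code.
t = "\xAC\x42\x4C\x45\x54\x43\x48\x49\x4E\x47\x4C\x45\x59"

print(len(M))  # 52 characters: 13 groups of backslash, x, two hex digits
print(len(t))  # 13 characters

# Explicitly interpret the escape sequences found in the file text.
# Assumption: M contains only ASCII, so the latin-1 round trip is lossless.
decoded = M.encode("latin-1").decode("unicode_escape")
print(decoded == t)               # True
print(decoded.encode("latin-1"))  # b'\xacBLETCHINGLEY'

# Same idea for the [1,2,3] example: the text stays a str until it is
# parsed explicitly. literal_eval only evaluates literals, unlike eval.
parsed = ast.literal_eval("[1,2,3]")
print(type(parsed))  # <class 'list'>
```

If the file ends with a newline, M will contain it as well, so you may want to call M.strip() before decoding.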