I have the following piece of code:
using (Stream inputFileStream = File.OpenRead("C:\\Users\\User\\Downloads\\test.txt"))
{
    using (Stream transcodingStream = Encoding.CreateTranscodingStream(inputFileStream, Encoding.GetEncoding(500), new UnicodeEncoding(bigEndian: true, byteOrderMark: true)))
    {
        using (Stream outputStream = File.OpenWrite("C:\\Users\\User\\Downloads\\test.txt"))
        {
            await transcodingStream.CopyToAsync(outputStream, cancellationToken);
        }
    }
}
Before transcoding, my file is EBCDIC-encoded (code page 500) and its first 16 bytes are:
F5 F1 F1 F0 F2 C2 D4 E6 40 40 40 40 40 40 F1 F1 = 51102BMW      11
After transcoding to Unicode with big-endian byte order and a byte order mark, I expect the file to begin with:
FF FE
However, I get:
00 35 00 31 00 31 00 30 00 32 00 42 00 4D 00 57 = �5�1�1�0�2�B�M�W
Where am I going wrong with this?
UTF-16 (which is what UnicodeEncoding uses) has 2 bytes for every character in the Basic Multilingual Plane (BMP), so it's behaving as you asked it to. What are your actual requirements here? (I'd normally advise UTF-8 as the encoding to choose where possible.)
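If the requirement really is big-endian UTF-16 with a byte order mark, note two things. First, the big-endian BOM is FE FF; FF FE is the little-endian one. Second, judging by the output in the question, the transcoding stream does not appear to emit the encoding's preamble, so one option is to write it yourself before copying. The following is only a sketch under those assumptions: TranscodeSketch and TranscodeWithBomAsync are hypothetical names, and the RegisterProvider call assumes .NET Core / .NET 5+, where code page 500 is not available by default.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class TranscodeSketch
{
    // Hypothetical helper: copies EBCDIC (code page 500) data from `input`
    // to `output` as big-endian UTF-16, writing the BOM first.
    public static async Task TranscodeWithBomAsync(
        Stream input, Stream output, CancellationToken cancellationToken = default)
    {
        // Assumption: running on .NET Core / .NET 5+, where code page
        // encodings must be registered before GetEncoding(500) works.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding source = Encoding.GetEncoding(500);
        Encoding target = new UnicodeEncoding(bigEndian: true, byteOrderMark: true);

        // Write the preamble by hand; for big-endian UTF-16 it is FE FF.
        byte[] bom = target.GetPreamble();
        await output.WriteAsync(bom, 0, bom.Length, cancellationToken);

        // Reading from the transcoding stream yields `target`-encoded bytes.
        // leaveOpen: true keeps `input` under the caller's control.
        using Stream transcodingStream =
            Encoding.CreateTranscodingStream(input, source, target, leaveOpen: true);
        await transcodingStream.CopyToAsync(output, cancellationToken);
    }
}
```

With the question's code, this would mean opening the input and output files as before (using two different paths) and passing both streams to the helper instead of calling CopyToAsync directly.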