I have a program that reads an array of bytes. The bytes are supposed to be ISO-8859-2 character codes in decimal. My test array has two elements: 103, which is the letter g, and 179, which is the letter ł (l with stroke). I then create a Blob object from the array and check its contents using two methods:

  1. FileReader
  2. objectURL

The first method gives correct results, but the second method produces an extra byte in the saved blob file.

Here is the code:

var bytes = [103, 179];
var chr1 = String.fromCharCode(bytes[0]);
var chr2 = String.fromCharCode(bytes[1]);
var str = '';
str += chr1;
str += chr2;
console.log(str.charCodeAt(0)); //103
console.log(str.charCodeAt(1)); //179
console.log(str.charCodeAt(2)); //NaN

var blob = new Blob([str]);
console.log(blob.size); //3

//Checking Blob contents using first method - FileReader
var reader = new FileReader();
reader.addEventListener("loadend", function() {
    var str1 = this.result;
    console.log(str1); //g³
    console.log(str1.charCodeAt(0)); //103
    console.log(str1.charCodeAt(1)); //179
    console.log(str1.charCodeAt(2)); //NaN
});
reader.readAsText(blob);

//Checking Blob contents using second method - objectURL
var url = URL.createObjectURL(blob);
$('<a>',{
    text: 'Download the blob',
    title: 'Download',
    href: url

}).appendTo('#my');

In order to use the second method I created a fiddle. In the fiddle, when you click the "Download" link and save and then open the file in a binary editor, it consists of the following bytes: 103, 194, 179.

My question is: where does the 194 come from, and how can I create a blob file (using the createObjectURL method) that contains only the bytes given in the original array ([103, 179] in this case)?

1 Answer

The extra 194 comes from an encoding issue:

179 is the Unicode code point of "SUPERSCRIPT THREE" (U+00B3), so the string str contains "g³". When the blob is created, that string is encoded as UTF-8: 0x67 for g and 0xC2 0xB3 for ³ (194, 179 in decimal), so it takes 3 bytes. If you read the blob back with a FileReader, readAsText decodes the UTF-8 again (UTF-8 is its default encoding), so you get the same 2 characters back, "g³".
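
The encoding step can be reproduced without a Blob. The sketch below is only an illustration, assuming an environment that provides TextEncoder (modern browsers and recent Node.js); the variable names are made up for this example. TextEncoder always produces UTF-8, which is the same encoding a Blob applies to string parts.

```javascript
// String.fromCharCode treats 179 as the Unicode code point U+00B3,
// not as an ISO-8859-2 byte, so this string is "g³".
const text = String.fromCharCode(103, 179);

// Encode the string as UTF-8, as a Blob does for string parts.
const utf8Bytes = Array.from(new TextEncoder().encode(text));

console.log(text.length); // 2 (two characters)
console.log(utf8Bytes);   // [103, 194, 179] (three bytes)
```

Any code point from 128 to 255 becomes two bytes in UTF-8, which is why only the second character grows.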

To avoid this (if you don't want to put everything in UTF-8), you can construct the blob from a typed array:

var u8 = new Uint8Array(bytes);
var blob = new Blob([u8]);

That way, you will keep exactly the bytes you want.
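
One quick way to check the difference is to compare blob sizes. This is a minimal sketch, assuming a global Blob (browsers, or Node.js 18 and later). The helper name blobFromBytes and the 'application/octet-stream' type are assumptions added for illustration and are not part of the original answer.

```javascript
// Hypothetical helper: wrap raw byte values in a Blob without re-encoding.
function blobFromBytes(bytes) {
    // Typed-array parts are copied byte for byte; only string parts are UTF-8 encoded.
    // The MIME type is an assumption; it marks the content as raw binary.
    return new Blob([new Uint8Array(bytes)], { type: 'application/octet-stream' });
}

var rawBlob = blobFromBytes([103, 179]);
var textBlob = new Blob([String.fromCharCode(103, 179)]);

console.log(rawBlob.size);  // 2: exactly the original bytes
console.log(textBlob.size); // 3: the string was encoded as UTF-8
```

The blob returned by the helper can be passed to URL.createObjectURL in the same way as in the question, and the downloaded file should then contain only 103 and 179.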
