I know that JavaScript strings are usually stored in an encoding that takes at least two bytes per character (UTF-16 or UCS-2).

However, a different encoding appears to be used when constructing a Blob: when I read the Blob as an ArrayBuffer, the returned buffer is 3 bytes long for a single Euro sign.

var b = new Blob(['€']);

1 Answer

According to the W3C File API specification, strings passed to the Blob constructor are encoded as UTF-8.

Demo:

// Create a Blob containing a Euro sign (U+20AC)
var b = new Blob(['€']);
var fr = new FileReader();

fr.onload = function() {
  // View the result as raw bytes (declared with var to avoid an implicit global)
  var ua = new Uint8Array(fr.result);
  // This will log "3|226|130|172"
  //                  E2  82  AC
  // In UTF-16, it would be only 2 bytes long
  console.log(
    fr.result.byteLength + '|' +
    ua[0] + '|' +
    ua[1] + '|' +
    ua[2]
  );
};
fr.readAsArrayBuffer(b);

Play with that on JSFiddle.
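
The same difference can also be checked synchronously, without a FileReader. This is only a minimal sketch, assuming an environment where TextEncoder and Blob are available as globals (modern browsers or a recent Node.js); the variable names are illustrative and not part of the demo above.

```javascript
// Sketch: compare the string's UTF-16 length with the Blob's UTF-8 byte size.
var euro = '€'; // U+20AC

// String length counts UTF-16 code units (1 unit = 2 bytes for this character).
var utf16Units = euro.length;

// TextEncoder always encodes to UTF-8.
var utf8Bytes = new TextEncoder().encode(euro);

// Blob.size is the number of bytes actually stored in the Blob.
var blobSize = new Blob([euro]).size;

// Logs: 1 [ 226, 130, 172 ] 3 (exact array formatting depends on the console)
console.log(utf16Units, Array.from(utf8Bytes), blobSize);
```

Blob.size matching the TextEncoder output length is consistent with string parts being stored as UTF-8 rather than as the engine's internal UTF-16 representation.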
