Why is this decoding/encoding process giving a different buffer array?

Question

I have the following represented as an ArrayBuffer:

const encryptedMsg = await crypto.subtle.encrypt(algorithm, key, messageUTF8)

The byte length of this value is 28:

encryptedMsg
// ArrayBuffer { byteLength: 28 }

When I convert this into a Uint8Array then I get the following values:

const encryptedMsgArr = new Uint8Array(encryptedMsg)
// Uint8Array(28) [ 237, 243, 213, 127, 248, 55, 37, 237, 209, 21, … ]

I want to convert this into a UTF-8 cyphertext with a standard decoder and later revert that with a standard encoder:

const encoder = new TextEncoder("utf-8");
const decoder = new TextDecoder("utf-8");

When decoding it:

const cypherText = decoder.decode(encryptedMsgArr)
"���\u007f�7%��\u0015\u00113\u0012\u0016�۹o׀.:+=��\u0015\u0015"

But when I try to encode it back into a Uint8Array then it doesn't match up even though utf-8 encoding was specified for both.

In fact the above doesn't even look like it's utf-8, and the byte length does not match up either (now 46 instead of 28):

encoder.encode(cypherText)
// Uint8Array(46) [ 239, 191, 189, 239, 191, 189, 239, 191, 189, 127, … ]

What am I doing wrong here?

Goal

To be able to export the cyphertext so it can be decrypted elsewhere at a later stage. If UTF-8 decoding of the ArrayBuffer doesn't work, the only other option I can think of is to convert the ArrayBuffer into a stringified array of integers and export that string, but that doesn't seem like a very sane approach.

Edit

Actually, just declaring the encoder and decoder without utf-8 encoding fixes the issue, but @ornic has provided a nice base64 encoding/decoding function to use instead.

const encoder = new TextEncoder();
const decoder = new TextDecoder();
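
A quick way to check whether dropping the "utf-8" argument changes anything is to inspect the `encoding` property of both objects and repeat the round trip on a single invalid byte. This is only a minimal sketch: the byte value 237 is taken from the array above, and `roundTrip` is a name made up for the illustration. As far as I can tell, both constructors default to UTF-8, so the result should be the same as before.

```javascript
// Declared without an explicit encoding label, as in the edit above.
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Both report the encoding they actually use.
console.log(encoder.encoding, decoder.encoding); // utf-8 utf-8

// A lone 0xED (237) is not valid UTF-8, so it decodes to U+FFFD,
// which encodes back as the three bytes 239, 191, 189.
const roundTrip = encoder.encode(decoder.decode(new Uint8Array([237])));
console.log(Array.from(roundTrip)); // [ 239, 191, 189 ]
```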

What other encodings are suitable for this operation? The main goal is to be able to generate an exportable string/cyphertext.
– Simpleton, Commented Aug 21, 2019 at 13:02
Base64 would probably work. I don't know if there's a way to do that from a typed array without writing your own code, however.
– Pointy, Commented Aug 21, 2019 at 13:26
There is a library to store bytes in JavaScript strings (since they are UTF-16) with minimal overhead. Very handy for storing bytes in localStorage, since its size is limited. pieroxy.net/blog/pages/lz-string/index.html
– ornic, Commented Aug 21, 2019 at 13:50

ornic · Accepted Answer · 2019-08-21 13:34:19Z

AFAIK the most common way is to encode your bytes as ASCII text (Base64 here), rather than decoding them as UTF-8.

Something like this (the code is from my current project; I found almost all of it here on SO):

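// Convert an ArrayBuffer (or typed array) to a Base64 string.
var bufferToBase64 = function (buffer) {
    var s = '';
    var uintArray = new Uint8Array(buffer);
    // Map each byte to the character with the same code (0-255).
    uintArray.forEach(function (v) { s += String.fromCharCode(v); });
    return window.btoa(s);
};

// Convert a Base64 string back to a Uint8Array.
var bytes = function (text) {
    return new Uint8Array(
        atob(text)
            .split('')
            .map(function (c) {
                return c.charCodeAt(0);
            })
    );
};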

And the example of usage:

test = new Uint8Array([1, 5, 167, 12])
> Uint8Array(4) [1, 5, 167, 12]
test2 = bufferToBase64(test)
> "AQWnDA=="
test3 = bytes(test2)
> Uint8Array(4) [1, 5, 167, 12]
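
To see the difference from the UTF-8 approach in the question, the sketch below runs both round trips on the same bytes. It assumes an environment with global `btoa`, `atob`, `TextEncoder` and `TextDecoder` (browsers, recent Node.js). `sample`, `toBase64`, `fromBase64` and `sameBytes` are names made up for this illustration; `sample` is just the first ten bytes printed in the question.

```javascript
// Hypothetical sample: the first ten bytes shown in the question.
const sample = new Uint8Array([237, 243, 213, 127, 248, 55, 37, 237, 209, 21]);

// Base64 round trip, same technique as the functions above.
function toBase64(bytesIn) {
  let s = '';
  for (const b of bytesIn) s += String.fromCharCode(b);
  return btoa(s);
}
function fromBase64(text) {
  return Uint8Array.from(atob(text), (c) => c.charCodeAt(0));
}
const viaBase64 = fromBase64(toBase64(sample));

// UTF-8 round trip: each invalid byte sequence is replaced by U+FFFD,
// which encodes back as 239, 191, 189, so the original bytes are lost.
const viaUtf8 = new TextEncoder().encode(new TextDecoder().decode(sample));

// Element-wise comparison of two byte arrays.
const sameBytes = (a, b) => a.length === b.length && a.every((v, i) => v === b[i]);

console.log(sameBytes(sample, viaBase64)); // true
console.log(sameBytes(sample, viaUtf8));   // false
console.log(viaUtf8.length);               // 22
```

Six of the ten bytes are not valid UTF-8 in this position, so each becomes a three-byte replacement character, which is why the re-encoded array grows.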


2 Comments

Simpleton Over a year ago

I like this approach. Accepted. Thanks!

Tom Blodget Over a year ago

String.fromCharCode(v): That could be effectively decoding using the ISO-8859-1 character encoding, not ASCII. It is reversible, as you have done, so it has that going for it.
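
The reversibility mentioned in this comment can be checked over every possible byte value. This is a minimal sketch with made-up names (`allBytes`, `asString`, `restored`, `lossless`); it only relies on `String.fromCharCode` mapping each value 0-255 to the code unit with the same number, and on `charCodeAt` reading it back.

```javascript
// All 256 possible byte values.
const allBytes = Uint8Array.from({ length: 256 }, (_, i) => i);

// Byte -> character with the same code (U+0000..U+00FF).
const asString = String.fromCharCode(...allBytes);

// Character -> byte: charCodeAt reverses the mapping exactly.
const restored = Uint8Array.from(asString, (c) => c.charCodeAt(0));

// Compare every restored byte with the original.
const lossless = restored.every((v, i) => v === allBytes[i]);
console.log(asString.length, lossless); // 256 true
```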
