
In my JS application I'm uploading documents to a server. The documents are stored in Uint8Arrays, so each document is represented as an array of integers in the range 0-255.

Before uploading a document, I serialize it with JSON.stringify. Here is an example:

  var doc = [0,5,5,7,2,234,1,4,2,4];
  JSON.stringify(doc);
  // => "[0,5,5,7,2,234,1,4,2,4]"

Then I send the JSON representation to the server.

My problem is that the JSON representation is much larger than the original integer array. I guess that's because the array is converted to a JSON string. I send much more data to the server than needed. How can I store the data more compactly in JSON?

I thought the JSON representation might be smaller if I first convert the array to Base64 and then to JSON.

What are your ideas? Thanks

  • It sounds like you don't actually want JSON. Commented Oct 9, 2016 at 16:50
  • Have you tried BSON? Commented Oct 9, 2016 at 16:54
  • What are the contents of the array? Would you have any value in the 0-255 range in about equal proportions, or would you have more ASCII-like values? Base64 will give you a better ratio than the above JSON array representation: one byte will result in an average of 1.33 bytes, while the array representation will result in 3.57 bytes. But if you have mostly ASCII-like data, you could go down to close to 1 byte... Commented Oct 9, 2016 at 17:00
  • There is the lz-string compression library for JS, which reportedly compresses a string to 25% of its original size. It might be useful if you have large JSON data. Commented Oct 9, 2016 at 18:33

1 Answer


The JSON integer-array representation averages about 3.57 output bytes per input byte, assuming all byte values (0-255) are equally likely:

  • 10 values (0-9) are represented by 2 bytes (one digit, one comma)
  • 90 values (10-99) by 3 bytes (two digits, one comma)
  • 156 values (100-255) by 4 bytes (three digits, one comma)
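
The 3.57 figure can be checked with a few lines of JavaScript. This is only a sketch: the function name is made up for illustration, and it assumes every value from 0 to 255 occurs equally often, which real documents may not satisfy.

```javascript
// Average number of characters the JSON array form spends per byte,
// assuming each value 0..255 is equally likely (an assumption).
function averageJsonCharsPerByte() {
  var total = 0;
  for (var value = 0; value <= 255; value++) {
    // decimal digits of the value plus one separating comma
    total += String(value).length + 1;
  }
  return total / 256;
}

console.log(averageJsonCharsPerByte().toFixed(2)); // 3.57
```

The two enclosing brackets and the missing trailing comma are ignored here, since they do not matter for large arrays.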

On the other hand, Base64 results in a ratio of about 1.333... (every 3 bytes are encoded as 4 characters).
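
A minimal sketch of the Base64 route, assuming an environment where the global btoa and atob functions exist (browsers and recent Node.js versions). The helper names bytesToBase64 and base64ToBytes and the "data" property are hypothetical, chosen only for this example.

```javascript
// Encode a Uint8Array (or plain array of bytes) as a Base64 string.
function bytesToBase64(bytes) {
  var binary = "";
  for (var i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]); // one character per byte (0..255)
  }
  return btoa(binary);
}

// Decode a Base64 string back into a Uint8Array.
function base64ToBytes(base64) {
  var binary = atob(base64);
  var bytes = new Uint8Array(binary.length);
  for (var i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

var doc = new Uint8Array([0, 5, 5, 7, 2, 234, 1, 4, 2, 4]);
var payload = JSON.stringify({ data: bytesToBase64(doc) });
console.log(payload);

// The receiving side would reverse the steps.
var restored = base64ToBytes(JSON.parse(payload).data);
console.log(Array.from(restored).join(","));
```

Base64 uses only characters that JSON never needs to escape, so the encoded string costs just two extra bytes for the surrounding quotes.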

If you have mostly ASCII-like characters in your array (i.e. in the range 32-126), you would probably be better off just sending them as strings (with a few characters escaped), but not if you have random 8-bit data.
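
To illustrate the string approach, here is a rough sketch with a hypothetical helper, bytesToJsonString, that maps each byte to the character with the same code and lets JSON.stringify do the escaping. It also shows why this only pays off for ASCII-like data: most control characters are escaped as six characters (\u00XX), and characters above 127 take two bytes each once the JSON is sent as UTF-8.

```javascript
// Map each byte to the character with the same code, then JSON-escape it.
function bytesToJsonString(bytes) {
  var text = "";
  for (var i = 0; i < bytes.length; i++) {
    text += String.fromCharCode(bytes[i]);
  }
  return JSON.stringify(text);
}

// The bytes of: Hello, "JS"
var asciiBytes = [72, 101, 108, 108, 111, 44, 32, 34, 74, 83, 34];

console.log(bytesToJsonString(asciiBytes).length); // 15 (11 chars, 2 quotes, 2 escapes)
console.log(JSON.stringify(asciiBytes).length);    // 38 for the integer-array form

// A single zero byte becomes a six-character escape inside the quotes.
console.log(bytesToJsonString([0]).length);        // 8
```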

You could use some kind of base94 representation to get a better ratio over base64, but is it really worth the cost?

Also note that you may want to consider compressing the data.
