FileReader Chunking and Base64 DataURLs

Photo by Joshua Sukoff on Unsplash

In a hurry? You can now use our HUp jQuery plugin to read files in a chunked fashion as data URLs. Hooray!

Got a minute or two? Let’s talk about file read chunking, data URLs and base64.

If you’ve been looking forward to the previously promised discussion about file reading/downloading to/uploading from IndexedDB – well, keep looking forward, it’s on the way. In the meantime, however, let’s take a quick look at a problem, and its quick and easy solution, that emerged out of making file reading chunkable for the HUp plugin.

The Problem

The HUp plugin has, as one of its goals, sensible defaults – such that a user can just call it against whatever element they want to make into a file reader/uploader drag-and-drop point, and presto! Everything works, and in a reasonable manner.

When I added the ability to read files in chunks, to mirror and complement our ability to upload files in chunks, we hit a snag with regard to the above goal. Generally speaking, the expectation is that developers using the plugin to read files in will want to use said files in the browser in some fashion, and one of the simplest representations that can be used directly in the browser is a base64-encoded data URL. You can pop it into the src attribute of a number of elements in a modern browser and do something with it right away. So, defaulting our read_method to readAsDataURL makes sense.

However! Now that we’re chunking file reads by default (since it offers benefits, some of which were discussed last time), if the file to be read is larger than our chunk_size (which defaults to 1MiB), then it will be read as a number of chunks, which will need to be reassembled to be used in a src attribute as a data URL. Small bit of extra work for the developer, but a little string manipulation and you’re done, right?

Not quite.

It’s common knowledge that encoding binary data as base64 will result in an overall increase in size of about 33%. This is because, with an alphabet of 64 different values, base64 can encode only 6 bits per character. Each output character takes up eight bits (in, for example, UTF-8) but represents just 6 bits of the source binary, so every 3 bytes of input become 4 characters of output.
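The arithmetic can be checked with a small sketch. The helper names base64Length and toBase64 are made up for this illustration and are not part of HUp; the encoder falls back to Node's Buffer where btoa is unavailable.

```javascript
// Base64 maps each 3-byte group (24 bits) to 4 characters (6 bits each),
// padding a short final group with "=" so the output length is a multiple of 4.
function base64Length(byteLength) {
  return Math.ceil(byteLength / 3) * 4;
}

// Illustrative encoder: btoa in browsers (and recent Node), Buffer otherwise.
function toBase64(bytes) {
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return typeof btoa === 'function'
    ? btoa(binary)
    : Buffer.from(binary, 'binary').toString('base64');
}

const sample = new Uint8Array(300);          // 300 zero bytes
console.log(toBase64(sample).length);        // 400
console.log(base64Length(1024 * 1024));      // 1398104 characters for a 1MiB chunk
```

Four characters for every three bytes is where the roughly 33% overhead comes from.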

So what’s the problem? We can end up with chunks that can’t be trivially recombined, and the details we just discussed regarding how many bits per character we get in base64 encoding explain why – if we’re not careful about where we slice the file chunks, each base64 chunk after the first ends up out of alignment with the binary source. Attempting to simply concatenate them and use them will fail.
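Here is a minimal sketch of the misalignment, using a tiny 7-byte input in place of a real file. The helpers toBase64 and encodeInChunks are hypothetical and exist only for this demonstration.

```javascript
// Illustrative encoder: btoa in browsers (and recent Node), Buffer otherwise.
function toBase64(bytes) {
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return typeof btoa === 'function'
    ? btoa(binary)
    : Buffer.from(binary, 'binary').toString('base64');
}

// Encode each chunk separately, then naively concatenate the results.
function encodeInChunks(bytes, chunkSize) {
  let out = '';
  for (let i = 0; i < bytes.length; i += chunkSize) {
    out += toBase64(bytes.subarray(i, i + chunkSize));
  }
  return out;
}

const source = Uint8Array.from([72, 101, 108, 108, 111, 33, 33]); // "Hello!!"
const whole = toBase64(source);
const misaligned = encodeInChunks(source, 4); // 4 bytes: not a multiple of 3
const aligned = encodeInChunks(source, 6);    // 6 bytes: a multiple of 3

console.log(whole);                // SGVsbG8hIQ==
console.log(misaligned === whole); // false: "=" padding lands mid-string
console.log(aligned === whole);    // true
```

With 4-byte chunks the first chunk ends in a partial 3-byte group, so the encoder pads it and every later character is shifted relative to the whole-file encoding.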

The Workaround

Recently, a user of the plugin brought up an unrelated problem, regarding the mime-type on the produced data URLs. In the process of fixing this, I started thinking about how chunking might affect the data URLs, and a few quick experiments showed me I was correct – chunking the files and encoding them as base64 caused problems. I didn’t have the time to fix the issue at that moment, however, so I mentioned a workaround to said user.

You, lucky reader, don’t need this workaround, since this issue is now addressed – however, it might be of future interest, particularly if you need/prefer to work with Array Buffers.

Since using an Array Buffer means we don’t need to worry about base64 encoding a file, we can slice and dice each chunk however we please, and concatenating the resulting array buffers poses no issue. So, instead of using readAsDataURL as our read_method, I suggested this user employ the readAsArrayBuffer method instead.
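If a single buffer is needed, joining the chunks is a plain byte copy. A minimal sketch, where concatArrayBuffers is a hypothetical helper and not something the plugin provides:

```javascript
// Hypothetical helper: copies ArrayBuffers, in order, into one new ArrayBuffer.
function concatArrayBuffers(buffers) {
  const total = buffers.reduce((sum, buf) => sum + buf.byteLength, 0);
  const joined = new Uint8Array(total);
  let offset = 0;
  for (const buf of buffers) {
    joined.set(new Uint8Array(buf), offset);
    offset += buf.byteLength;
  }
  return joined.buffer;
}

// Chunk boundaries can fall anywhere; the bytes line up regardless.
const parts = [
  Uint8Array.from([1, 2, 3, 4]).buffer,
  Uint8Array.from([5]).buffer,
  Uint8Array.from([6, 7]).buffer,
];
console.log(new Uint8Array(concatArrayBuffers(parts))); // bytes 1 through 7, in order
```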

What if he wants to be able to display or otherwise use the read-in file in the browser? Well, we could convert the concatenated array buffers into base64, but a much more straightforward method is to create a new Blob from the array buffers, and get an object URL from said blob to display.

This would look something like the following, assuming for the sake of example that you have three array buffers which we’ll creatively name arr1, arr2 and arr3:

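// 'mime/type' is a placeholder; substitute the file's actual MIME type.
var blob = new Blob([arr1, arr2, arr3], {type: 'mime/type'}),
    // The object URL can go straight into a src attribute.
    url = URL.createObjectURL(blob);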

Pretty easy, right?

The Solution

I wasn’t entirely happy with having to suggest this workaround, however, so when I had an hour or so free to revisit the question, I tried a little experiment. Could it really be as simple, I wondered, as making sure our chunks were aligned to the nearest multiple of 6 (remembering, as mentioned above, that we get 6 bits per character)?

Yes, yes it could be.

So, the defaults for HUp can now remain blissfully unchanged, and chunking works just fine – when readAsDataURL is specified as the read_method, the plugin will transparently alter the total size of each chunk to the nearest multiple of 6 until we reach the end of the file, ensuring that the resulting base64 remains aligned with the binary source, and allowing trivial recombination of the data URLs.
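A sketch of what that alignment might look like follows. The function names are hypothetical, and rounding down is an assumption made here; the plugin's actual rounding may differ. Base64 works in 3-byte groups, so any chunk size that is a multiple of 3 bytes stays aligned, and 6 is one such multiple.

```javascript
// Assumption: round down, so a chunk never exceeds the requested size,
// but never drop below one full multiple.
function alignChunkSize(chunkSize, multiple = 6) {
  const aligned = Math.floor(chunkSize / multiple) * multiple;
  return Math.max(aligned, multiple);
}

// Byte ranges for slicing a file; only the final range may be unaligned.
function chunkRanges(fileSize, chunkSize) {
  const step = alignChunkSize(chunkSize);
  const ranges = [];
  for (let start = 0; start < fileSize; start += step) {
    ranges.push([start, Math.min(start + step, fileSize)]);
  }
  return ranges;
}

console.log(alignChunkSize(1024 * 1024)); // 1048572
console.log(chunkRanges(20, 8));          // [[0, 6], [6, 12], [12, 18], [18, 20]]
```

Each [start, end] pair could be passed to file.slice(start, end) before the slice is read with readAsDataURL. Only the last chunk can end in "=" padding, which is exactly where padding belongs.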

I’ve also added a small convenience function to handle said recombination – check out the GitHub repo for the details.
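For a rough idea of what recombination involves, here is a sketch. It is not the plugin's own function; joinDataUrls is a hypothetical name, and the sketch assumes every chunk is a base64 data URL whose header matches the first chunk's.

```javascript
// Hypothetical helper: keeps the first chunk's "data:<mime>;base64," header
// and appends the base64 payload of every chunk in order.
// Assumes each entry is a base64 data URL produced from aligned chunks.
function joinDataUrls(dataUrls) {
  if (dataUrls.length === 0) return '';
  const marker = ';base64,';
  const first = dataUrls[0];
  const header = first.slice(0, first.indexOf(marker) + marker.length);
  const payload = dataUrls
    .map((url) => url.slice(url.indexOf(marker) + marker.length))
    .join('');
  return header + payload;
}

const chunkUrls = [
  'data:text/plain;base64,SGVsbG8h', // "Hello!" (6 bytes, aligned)
  'data:text/plain;base64,IQ==',     // "!" (final chunk, padded)
];
console.log(joinDataUrls(chunkUrls)); // data:text/plain;base64,SGVsbG8hIQ==
```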

The What and Why of JavaScript Frameworks

Photo by Raphael Nogueira on Unsplash

 

As has been previously discussed, JavaScript has the propensity to be very untidy if you let it be. This isn’t a problem unique to JavaScript, of course – many other languages suffer from a lack of native organization, especially for specific tasks.

It is for this reason that frameworks exist – to give us some structure upon which to build, the foundation for our edifice of code. Let’s dig a little deeper into what a JavaScript framework looks like, and why you might want to use one.


2014 Review: Day 7

As 2014 winds down, we’ll take an opportunity to look back at some of our most-read posts from this year, in case you missed them the first time. 

Software screen capture

Via JMack on StackOverflow

Christopher Keefer is back, showing a useful technique for selectively hiding option values from inside of select controls.


It’s a (jQuery-style) Promise

Photo of pinky finger. Pinky Promises by marissavoo on Flickr

Way back when, I brought up the topic of promises (particularly jQuery Deferred), and I promised we would come back to the topic someday.

Well, that promise has finally resolved, and this is the done block. Don’t get it? Don’t worry, all shall be explained. If you do get it, and wish there were a done block in the Promise spec… well, read on.


Hidden Options: A Workaround

Photo by Kelly Repreza on Unsplash

Here’s the situation:

You’ve got a select. Maybe a whole bunch of selects, with a ton of options each (metric ton – let’s keep our imaginary hyperbolic units straight here); and these are meant to be complex interactive elements, with options made visible or not as some programmatic condition dictates.

Traditionally, if you wanted to selectively display options, you had to do it the hard way – remove the non-visible option nodes entirely. What, did you want to filter on some state information stored on the node? Too bad – you’ll have to keep track of the full structure outside of the DOM and filter on that, inserting or removing elements as needed.

This is sub-optimal. It’s much tidier if we can just set display:none on our option elements, and have them hidden like any other DOM element:

option[disabled]{
    display:none;
}
