HTML 5 Web Workers and image processing

One of the more exiting things we are getting via the new HTML specification is Web Workers. If you have no idea what Web Workers (lets call them Worker from now on) are you can basically think of them as "sandboxed" threads. By threads I do refer to threads as in «multithreading», an architectural feature of modern CPUs and operating systems. Workers allow us to run tasks separately from the main thread where all the drawing, animation and DOM manipulation is going on. This allows us to do calculations without leaving the impression on the user that the browser is "hanging". If you have worked with other "thread wrapping" technologies like Grand Central Dispatch (GCD) you will get Workers without fuzz, however Workers do have some limitations that other similar technologies don't have.

Workers limitations and usage
First of all, Workers are separate code blocks of JavaScript not found in the spawning script, kinda.. In most cases a Worker is a separate JavaScript file which you DO NOT refer to in you HTML script tags in the HTML document. However, Workers can also be defined inline through the BlobBuilder interface. In this post I will use only external Workers. Also, there are two kinds of Workers, «Shared» and «Dedicated». I will only use «Dedicated Workers» in this post.

Workers cannot access the following:
  • The DOM or the DOM APIs
  • The window object
  • The document and the parent object. 
A worker can access:
  • The «navigator» object
  • «XMLHttpRequest»
  • A read-only version of the «location» object
  • The Application cache
  • setTimeout(), clearTimeout() and setInterval(), clearInterval()
A Worker can also spawn other Workers and import external scripts via «importScripts()».

The demo (aka. the fun part)
For the demo this time I have created a simple web-app which loads and displays 3 pictures n times. You can randomize the position of the images by clicking the randomize button. You can pick the number of images displayed as well as their size. The two image manipulation buttons will grayscale and invert the images, if you check the checkbox the images will perform the randomize animation concurrently with the image processing. It's this latter part which calls for the usage of Workers.

The demo will show you how to spawn a separate Worker for each of the images, thereby allowing for concurrent processing and animation without much degradation in performance. I won't explicitly go through all the layout and animation code here, rather focusing on the usage of the Workers.

var worker = new Worker('DWGrayscale.js');
worker.addEventListener('message', function(e){
   ctxAr[e.data.index].context.putImageData(e.data.imagedata,0,0);
});
worker.postMessage(...imagedata...);

On line 126 - 133 within the grayscale function we create a Worker for each of the images, then we add a listener for the «onmessage» event allowing the Worker to communicate with the main thread. We finish off by posting a message to the worker using «postMessage» and passing the image data. These messages are the only way to communicate with a Worker. This means that we also need to implement this interface in the Worker itself.
addEventListener('message', function(e){
 var imageData = e.data.imagedata;
 //....process image data....
 postMessage(...imagedata...);
});
In the Worker «DWGrayscale.js» we grab the passed image data from the «data» property of the event. We then go on processing the image data, when done we post a message back to the main thread passing the image data which then in turn can be used to update the canvas element on screen.

Why not pass the Canvas element or at least the Canvas context?
Workers do not have access to the DOM, because it would not be thread safe. Workers are limited to passing object which can be serialized into JSON, which does not support cyclic objects. The Canvas element is a DOM element and the context is a cyclic object, hence we need to get the actual pixel array which we can pass back and forth to the Worker.

Take a look at the demo and the source code to learn in more detail how the demo was created.
Note also that you should probably run the demo in Safari as it has no limitation on the number of Workers concurrently running like Chrome has, also the animations uses the WebKit prefix and will not work in non WebKit browsers. So, please don't use this code in an production environment!

References:

4 comments:

OrNot said...

Hi,
So far the demo, the size of the image data passed between the front page and the web worker is not very big. but if a picture with 3000*2000 resolution, the data clone might fail. Do you have some sort of better solution ?

Thanks in advance
OrNot

Unknown said...

You are completely right, a large image might fail in both the Canvas methods and the WebWorkers interface. I imagine that the size limitations is dependent on the amount of memory available to the process, something which is decided by the user agent.

OrNot said...

Furthermore, the overhead for cloning big size of data Can not be ignored.
web worker is still not ready for image processing then?

Unknown said...

You might be missing the point here. It would be far better to e.g. turn an image into grayscale using a worker instead of performing this operation on the main thread, in which case you would be blocking all other operations like UI animations or other calculations. However, this method might not always be preferable since you will need to work with a copy of the image data.