Engineering
October 1, 2015

Calipers: The Fastest Way to Measure Image Dimensions in Node

by Marcus Gartner

Every JPG, PNG, and PDF file passed to Lob’s Print & Mail API is measured to guarantee that the dimensions of the provided file match the dimensions of the medium. For example, a 4.25" x 6.25" postcard requires an image of at least 1275 x 1875 pixels, and a letter requires a PDF document with a page size of 8.5" x 11". A common way to retrieve the dimensions of images and PDFs is ImageMagick’s identify tool, part of a widely used suite for working with images. However, the huge volume of files we receive every day revealed performance issues with ImageMagick. To improve the performance of our API, we built Calipers, a simple and performant Node module for measuring the dimensions of JPG, PNG, and PDF files.
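The postcard figures follow from the print size and the print resolution: 1275 / 4.25 and 1875 / 6.25 both come to 300 pixels per inch. A minimal sketch of that arithmetic is below. It assumes the 300 pixels-per-inch figure inferred from the example, and minimumPixels is a hypothetical helper name, not part of Calipers or the Lob API.

```javascript
// Hypothetical helper, not part of Calipers or the Lob API.
// 300 pixels per inch is inferred from the postcard example (1275 / 4.25).
var PIXELS_PER_INCH = 300;

// Returns the minimum pixel dimensions for a medium measured in inches.
function minimumPixels(widthInches, heightInches) {
  return {
    width: Math.round(widthInches * PIXELS_PER_INCH),
    height: Math.round(heightInches * PIXELS_PER_INCH)
  };
}

var postcard = minimumPixels(4.25, 6.25);
console.log(postcard.width + ' x ' + postcard.height); // prints "1275 x 1875"
```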

These measurements take place during the life cycle of a single API request, so their speed is a top priority for keeping API response times as low as possible. Measuring images cannot be deferred to a background worker because the API must return a 200 status code if the file is the correct size, and a 422 if it is not.

Calipers Source Code

Benchmarks

Simple benchmarks of Calipers show how much more performant it is than spawning an additional process such as ImageMagick’s identify. These benchmarks compare the time taken to shell out to identify and pdfinfo against the time taken by Calipers. Each test measured a file 500 times with a maximum of 50 concurrent measurements. They were run on a Mid-2014 13" MacBook Pro with a 2.6 GHz Intel Core i5. In the PDF shell-out benchmark, Poppler’s pdfinfo is shown instead of ImageMagick’s identify, because it is significantly faster and a more realistic comparison.
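For readers who want to run a similar test, here is a rough sketch of a harness that performs a fixed number of measurements with a cap on how many run at once. It is an illustration only, not the harness used for the numbers in this post. runBenchmark is a hypothetical name, and measure stands for any function that returns a promise.

```javascript
// Hypothetical harness; `measure` is any function that returns a promise.
// Resolves with the elapsed time in milliseconds once `total` calls finish.
function runBenchmark(measure, total, concurrency) {
  var started = 0;
  var finished = 0;
  var startTime = Date.now();

  return new Promise(function (resolve, reject) {
    function next() {
      if (finished === total) {
        return resolve(Date.now() - startTime);
      }
      // Keep launching work until the concurrency limit is reached.
      while (started < total && started - finished < concurrency) {
        started++;
        measure().then(function () {
          finished++;
          next();
        }, reject);
      }
    }
    next();
  });
}

// Example (requires the calipers module and a real file):
// runBenchmark(function () {
//   return calipers.measure('/path/to/image.png');
// }, 500, 50).then(function (ms) { console.log(ms + ' ms'); });
```

The same function can wrap a shell-out to identify or pdfinfo, as long as the wrapper returns a promise, so both approaches are timed under the same concurrency limit.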

ImageMagick Performance

Two main factors limited performance when shelling out to ImageMagick. The first is that ImageMagick’s identify does more than determine the dimensions of a file. It reads in the entire file and can provide more thorough data about each image. While this is definitely important in certain applications, our only concern is the dimensions of the file, and we have no need for the other information.

The other bottleneck we saw when using ImageMagick was that frequent shell calls in Node can significantly reduce the performance of the event loop. Even though each spawned process is free to run on a separate core, we noticed that the overhead of initiating a shell call is significant. Under high traffic in which the API shelled out multiple times per request, the event loop would become almost unresponsive, fully utilizing a single CPU core while the rest of the cores remained comfortably under-utilized.

How it Works

Calipers gets around these issues by not relying on ImageMagick. For JPG and PNG files, Calipers reads only a handful of bytes to determine the pixel dimensions. This makes the measurement fast because it does no extra work beyond determining the dimensions, and because it minimizes disk reads. Much of the code for measuring JPG and PNG files was borrowed from the image-size module. (We initially considered adding PDF support to that module, but opted not to because it would have drastically changed the image-size API.)
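To illustrate why so few bytes are needed, consider the PNG format: every PNG starts with an 8-byte signature followed by the IHDR chunk, which stores the width and height as big-endian 32-bit integers at byte offsets 16 and 20. The sketch below reads just the first 24 bytes of a file. It is a simplified, synchronous illustration, not the Calipers or image-size source, and parsePngHeader and measurePng are hypothetical names.

```javascript
var fs = require('fs');

// The 8-byte signature that begins every PNG file.
var PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Parses width and height from the first 24 bytes of a PNG.
function parsePngHeader(header) {
  if (header.length < 24 || !header.slice(0, 8).equals(PNG_SIGNATURE)) {
    throw new Error('not a PNG file');
  }
  // The first chunk after the signature must be IHDR.
  if (header.toString('ascii', 12, 16) !== 'IHDR') {
    throw new Error('missing IHDR chunk');
  }
  return {
    width: header.readUInt32BE(16),
    height: header.readUInt32BE(20)
  };
}

// Reads only the first 24 bytes from disk instead of the whole file.
// Synchronous calls are used here for brevity only.
function measurePng(path) {
  var header = Buffer.alloc(24);
  var fd = fs.openSync(path, 'r');
  try {
    var bytesRead = fs.readSync(fd, header, 0, 24, 0);
    return parsePngHeader(header.slice(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}
```

JPG files need a little more work, because the dimensions sit in a frame header that can appear after other segments, but the idea of reading only as far as necessary is the same.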

For PDFs, Calipers relies on the poppler-simple Node module, which binds natively to the performance-focused Poppler library and avoids spawning additional processes in Node.

How to use Calipers

Using Calipers is easy. Simply provide the measure function with the absolute path to a file. When using Calipers, you can use either callbacks or Bluebird promises.

Callback

var calipers = require('calipers');

calipers.measure('/path/to/image.png', function (err, result) {
  // Handle failures such as a missing or unsupported file.
  if (err) {
    return console.error(err);
  }
  console.log("this " + result.type + " file is " + result.pages[0].width + "x" + result.pages[0].height);
});

Promise

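var calipers = require('calipers');

calipers.measure('/path/to/file.png')
.then(function (result) {
  console.log("this " + result.type + " file is " + result.pages[0].width + "x" + result.pages[0].height);
})
.catch(function (err) {
  // Handle failures such as a missing or unsupported file.
  console.error(err);
});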

The result contains the fields shown below. For JPG and PNG files, the dimensions are given in pixels; for PDFs, they are given in PostScript points (1/72 of an inch).

{
 type: 'png',
 pages: [
   {
     width: 450,
     height: 670
   }
 ]
}
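Because PDF results are in points, comparing them to a physical page size takes one division by 72. The sketch below shows that conversion for a letter-sized page. pageSizeInInches is a hypothetical helper, not part of Calipers, and the page object is hand-written to mirror the shape of one entry in the pages array.

```javascript
// PDF dimensions are in PostScript points; 72 points make one inch.
var POINTS_PER_INCH = 72;

// Hypothetical helper: converts a measured PDF page to inches.
function pageSizeInInches(page) {
  return {
    width: page.width / POINTS_PER_INCH,
    height: page.height / POINTS_PER_INCH
  };
}

// An 8.5" x 11" letter page measures 612 x 792 points.
var letter = pageSizeInInches({ width: 612, height: 792 });
console.log(letter.width + '" x ' + letter.height + '"'); // prints 8.5" x 11"
```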

Lob has been using Calipers in production for several months, and we feel that it is very stable. That being said, there are surely improvements that can be made, and we welcome contributions!

We're Hiring

If working on problems like this interests you, you're in luck: we're hiring! Check out our careers page for more information.

Lob Careers