by Patrice Neff
At Squirro we work a lot with images. For every story that we collect we try to extract a meaningful picture. That image is shown in the application and in the digest emails we send out. Of course we need the image in various sizes to account for the different needs on desktop, tablet, mobile, email clients, etc.
As there are so many changing requirements, we didn’t want to pre-render all the different image sizes. Instead we create the thumbnails in real time and then cache them.
For example the image below is an on-the-fly rendition of the original image on Flickr.
We are big fans of REST services here at Squirro. So we created a service called Thumbler that creates the thumbnails. It’s nothing fancy, just a relatively simple wrapper for ImageMagick. It first downloads the original image, usually from Amazon S3 where we store it when we get the story. Then it applies the configuration present in the URL to resize the image.
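To make the "configuration present in the URL" idea concrete, here is a minimal sketch in Python. The URL layout (`/<width>x<height>/<image key>`), the host name, and both function names are assumptions for illustration, not Thumbler's actual scheme. The sketch only computes the target dimensions; in a real service the resize itself would be handed to ImageMagick. The "fit inside the box, never upscale" behaviour is also an assumption.

```python
from urllib.parse import urlparse


def parse_thumbnail_url(url):
    """Split a hypothetical /<width>x<height>/<image key> path into its parts."""
    path = urlparse(url).path.lstrip("/")
    size, _, image_key = path.partition("/")
    width, _, height = size.partition("x")
    return int(width), int(height), image_key


def fit_within(orig_w, orig_h, max_w, max_h):
    """Scale to fit inside the requested box, keeping the aspect ratio.

    The scale factor is capped at 1.0 so small originals are not enlarged.
    """
    scale = min(max_w / orig_w, max_h / orig_h, 1.0)
    return max(1, round(orig_w * scale)), max(1, round(orig_h * scale))


# Hypothetical request for a 200x200 box around a 1600x1200 original.
width, height, image_key = parse_thumbnail_url(
    "https://thumbler.example.com/200x200/stories/abc123.jpg"
)
target = fit_within(1600, 1200, width, height)
```

Because every size is expressed in the URL, each rendition gets its own cache key for free, which is what makes the caching described next straightforward.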
The deployment is where it gets really interesting. We put Thumbler behind Amazon CloudFront, which is Amazon’s Content Delivery Network. Amazon distributes its caching servers all over the globe. That way a user who is in South America will probably connect to the server in São Paulo, whereas a request from a user sitting in Palo Alto will only have to travel within the city to reach the closest server.
When a request arrives at a CloudFront server, the server checks its local cache. If another user in the area has recently requested the same thumbnail, the cached response is delivered. Otherwise CloudFront contacts our Thumbler service and gets a freshly rendered thumbnail.
Thumbler itself has an additional caching layer using Varnish. So if two users request the same thumbnail from different regions, we won’t have to render it twice.
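The two caching layers can be modelled with a small Python sketch. This is a toy under stated assumptions: plain in-memory dictionaries stand in for CloudFront's edge caches and for Varnish, there is no expiry or eviction, and the class names, counters, and URL are hypothetical.

```python
class Origin:
    """Stands in for Thumbler: a shared cache in front of the renderer."""

    def __init__(self):
        self.cache = {}
        self.render_count = 0

    def get(self, url):
        if url not in self.cache:
            # Cache miss at the origin: this is the only place we render.
            self.render_count += 1
            self.cache[url] = f"thumbnail for {url}"  # placeholder for real rendering
        return self.cache[url]


class Edge:
    """Stands in for one CDN edge location with its own local cache."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}
        self.origin_requests = 0

    def get(self, url):
        if url not in self.cache:
            # Cache miss at the edge: fall back to the origin.
            self.origin_requests += 1
            self.cache[url] = self.origin.get(url)
        return self.cache[url]


origin = Origin()
sao_paulo = Edge(origin)
palo_alto = Edge(origin)
url = "/200x200/stories/abc123.jpg"

sao_paulo.get(url)  # edge miss, origin miss: thumbnail is rendered
sao_paulo.get(url)  # edge hit: the origin is not contacted
palo_alto.get(url)  # edge miss, origin hit: no second render
```

In this model each edge contacts the origin once, but the thumbnail is rendered only once overall, which is the effect the shared caching layer is meant to have.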
The full architecture of our thumbnailing setup is shown in the following diagram.