Image Caching Server
Another project I worked on at E-Commerce Solutions was a product
search engine with data supplied from many partner websites. One
problematic aspect of the search engine was trying to display
product thumbnails on the results page.
Along with the product's information, each partner site also
sent us the URL to a full-size picture of the product. Image size
varied greatly from site to site, but we wanted to display small
thumbnails all formatted to similar dimensions. The first solution
was to set the height and width attributes of the image tag, but
unfortunately we did not know the original
dimensions ahead of time, so attempting to hardcode dimensions
resulted in horribly distorted pictures. The second solution used
JavaScript to scale the pictures, but it only worked in Internet
Explorer 4.0 and above, and would sometimes generate JavaScript
errors if there was a problem downloading the image. Both solutions
also required the client to download the entire original picture
before the browser could scale it down, and some of these photos
were full-screen.
My solution was to create an intermediate proxy server that would
download the images from the partner sites as they were requested
and then scale the pictures down on the fly before sending them
to the client browser. For efficiency, once an image was scaled,
it was saved in a cache on the server so future requests did not
require repeat trips to the partner servers.
I wrote the proof of concept in Python using the Python standard
library and the Python Imaging Library (PIL). The URL of a requested
image was sent to the proxy server, which used Python's URL handling
libraries to pull the image from the partner site. PIL loaded the
image and extracted its dimensions. With this information, the
server was able to properly calculate the image's height and width
to fit the search results page. PIL scaled down the image and saved
a copy to disk, and the server kept a list of scaled images in
memory. The scaled image was then sent back to the client's browser.
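The dimension calculation described above can be sketched roughly as
follows. This is an illustration in modern Python, not the original
proof-of-concept code: the function name fit_within and the bounding-box
parameters are assumptions, and the choice never to enlarge small images
is mine. In PIL, the input size would come from an opened image's size
attribute and the result would be passed to its resize method.

```python
def fit_within(width, height, max_width, max_height):
    """Return (w, h) that fits inside the box and keeps the aspect ratio.

    Hypothetical helper: the original server's exact rules are not known.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Use the smaller of the two ratios so that both sides fit the box.
    # Capping at 1.0 means small images are never enlarged (an assumption).
    scale = min(max_width / width, max_height / height, 1.0)
    # Round to whole pixels, but never collapse a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


# A full-screen landscape photo squeezed into a 100x100 thumbnail slot.
thumb_size = fit_within(1600, 1200, 100, 100)
```

Because one scale factor is applied to both sides, the thumbnail keeps
the proportions of the original, which is what hardcoding the height
and width attributes could not guarantee.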
The next time the same image was requested, the server simply
redirected the client to the local copy sitting on the server's
hard drive. If the requested image could not be found on the partner's
site, then the client was redirected to a generic "Picture
Not Available" graphic.
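The request flow described above might look something like the sketch
below. It is a guess at the control flow only: the class name
ThumbnailIndex, the /cache/ URL prefix, the placeholder path, and the
hashed file names are all hypothetical, and fetch_and_scale stands in
for the download-and-resize step, which is passed in as a callable.

```python
import hashlib

# Hypothetical location of the generic "Picture Not Available" graphic.
PLACEHOLDER_URL = "/images/not_available.gif"


class ThumbnailIndex:
    """In-memory record of which partner images have been scaled."""

    def __init__(self, fetch_and_scale):
        # fetch_and_scale(image_url, filename) downloads the image,
        # scales it, and saves it under filename. It is expected to
        # raise OSError on failure (urllib.error.URLError is a subclass).
        self._fetch_and_scale = fetch_and_scale
        self._cached = {}  # partner image URL -> local file name

    def lookup(self, image_url):
        """Return ("serve", filename) or ("redirect", location)."""
        if image_url in self._cached:
            # Seen before: point the client at the local copy.
            return ("redirect", "/cache/" + self._cached[image_url])
        # Hash the URL to get a safe file name (extension is assumed).
        name = hashlib.sha1(image_url.encode("utf-8")).hexdigest() + ".jpg"
        try:
            self._fetch_and_scale(image_url, name)
        except OSError:
            # Image missing or partner unreachable: use the placeholder.
            return ("redirect", PLACEHOLDER_URL)
        self._cached[image_url] = name
        return ("serve", name)
```

Keeping the index in a dictionary matches the in-memory list mentioned
earlier, though it also means the index would be lost on restart unless
it were rebuilt from the files on disk.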
The solution was dubbed the Image Caching Server, and a full
implementation was written by one of my coworkers in C as a custom
Apache module running on a Linux server.