Monday, November 4, 2013

Setting up a CDN using Amazon EC2 Instances

Today's topic is setting up a content delivery network (CDN) for map tiles, using Amazon EC2 instances.

Why a CDN?


This particular set of map tiles is in use in dozens of maps that we know of (literally, about 50). The map tiles are very high quality, 20 - 30 KB apiece, and tiled maps being what they are, there can be 16 of them loading at a time for the base layer and 16 more for the labels layer.
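To put a number on that, here's a quick back-of-the-envelope calculation in shell, using only the figures quoted above (16 base tiles, 16 label tiles, 20 to 30 KB each). The variable names are just for illustration.

```shell
#!/bin/sh
# Rough payload for one map view, from the figures above:
# 16 base-layer tiles plus 16 label-layer tiles, at 20 to 30 KB apiece.
tiles_per_view=$((16 + 16))
min_kb=$((tiles_per_view * 20))
max_kb=$((tiles_per_view * 30))

echo "tiles per view: $tiles_per_view"
echo "payload per view: $min_kb to $max_kb KB"
```

So a single map view can pull somewhere between 640 and 960 KB of tiles, and every pan or zoom does it again.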

They're being served from TileStache, so of course they're cached, and our 10 Gbps network continues to impress... but still, our server has other things to do, and nobody complains when map tiles load too quickly, do they?

The simplest approach would be the most expensive: get another server, set it up with TileStache, etc. That means 500 GB of storage, decent CPU and RAM, and hosting expense when we just want to host tiles. (Sorry, but we can't simply put the PNGs on S3: that presumes that all tiles are already generated, which isn't true.)


Much cheaper is a simple content delivery network (CDN) setup based on Varnish.


What's Varnish?


Varnish is a caching proxy server. You set it up on a server that isn't running a web server, configure it to point at a back-end web server (the real web server where your content resides), then point your browser at Varnish instead of your real web server. Varnish will silently cache things and serve them from its cache, refreshing from the back end web server as necessary.

Varnish can take a lot of tuning if you're outside its presumed default use case, but it's worth the trouble. Any item served from the cache isn't served from your web server, and that saves throughput, disk IO, firewall traffic, etc. for other things.

So the basic idea is:
  • Get a server with moderate stats: fast network, 1 GB of RAM, and a chunk of disk space. The network is the only part that needs to be really performant.
  • Set it up with Varnish, configured to point at the TileStache server. The configuration is minimal, just cache everything since the PNG and JPEG files really are everything.
  • Update my map apps to point to the new CDN URLs, and let them populate the caches. Tiles are served from the cheap server, sparing load on our big server.

The Amazon EC2 Instance


If you'll be running this cache for a while, consider paying for a Reserved Instance, as it's a significant discount.

Go for a Small instance, as you don't need a lot of CPU and RAM. A Small provides 1.5 GB of RAM which is great, and a decent CPU to do the job. Of course, if you want to raise the bar for more RAM, go right ahead! For this plan, we're looking at 50 GB of cached tiles accumulating over the months, so the difference between 1 GB of tiles in RAM and 2 GB of tiles in RAM isn't significant.

Don't go for a Micro instance, though the free tier is tempting. They're very low performance, particularly network throughput which is your premium here.

I gave the instance 60 GB of standard EBS storage (in addition to its 8 GB root filesystem), set to delete on termination. This will form the cache files. It's rather excessive, but our intent is that this cache go literally months without throwing out anything. For your case, you may want to go with a lot less disk space, maybe only 5 or 10 GB.
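If you're trying to pick a cache size for your own case, a rough estimate of how many tiles fit is a decent starting point. This is only a sketch: the 25 KB average is my assumption (the midpoint of the 20 to 30 KB range mentioned earlier), the function name is made up, and it ignores Varnish's own per-object overhead.

```shell
#!/bin/sh
# tiles_that_fit CACHE_GB AVG_TILE_KB
# Roughly how many tiles of the given average size fit into the cache.
# Treats 1 GB as 1024 * 1024 KB and ignores per-object overhead.
tiles_that_fit() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

tiles_that_fit 50 25    # the 50 GB cache used in this article
tiles_that_fit 10 25    # a smaller 10 GB cache
```

At 50 GB that works out to about 2 million tiles, and a 10 GB cache would still hold roughly 400,000.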

Choice of OS is up to you. We went with 64-bit Amazon Linux, which is a Red Hat-style distribution that uses yum, so the instructions below may differ for you as far as pathnames and package management.

Give it a while, log in, run the recommended yum update, and let's get started.


Installing Varnish


Yay for repos, right? This creates the varnish user and group, which will be useful later on when we create the cache files.
sudo yum install varnish

Setting up the EBS Volume


This is the standard procedure when adding a new, unformatted disk volume to your server. These commands partition the disk, format it as ext2 (the -i 4096 option sets the bytes-per-inode ratio, one inode per 4 KB of space), register it in fstab, and mount it to get you started.
sudo bash
    cfdisk /dev/sdb

    mke2fs -i 4096 -L CACHE /dev/sdb1
    echo 'LABEL=CACHE   /cache    ext2    defaults    1    1' >> /etc/fstab
    mkdir /cache 
    mount /cache
    df -hT
exit
On the cache volume, we want to dedicate all of the space to Varnish. To reduce disk fragmentation, let's preallocate all of the space now. This creates a single 55 GB file on the 60 GB volume. Below we'll configure Varnish for 50 GB; the extra 5 GB is for overhead.
sudo dd if=/dev/zero of=/cache/varnish.bin bs=1G count=55
sudo chown varnish:varnish /cache/varnish.bin
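The dd takes a while, so it's worth confirming afterward that the file really came out at the full size. Here's a small sketch of that check. The helper names are my own, and it assumes GNU stat (which is what Amazon Linux ships); it quietly does nothing if the cache file isn't there.

```shell
#!/bin/sh
# file_size_bytes FILE: print the file's size in bytes (GNU stat syntax).
file_size_bytes() {
    stat -c %s "$1"
}

# expected_bytes GIB: size in bytes of a file written by dd with bs=1G count=GIB.
expected_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

cache_file=/cache/varnish.bin
if [ -e "$cache_file" ]; then
    if [ "$(file_size_bytes "$cache_file")" -eq "$(expected_bytes 55)" ]; then
        echo "OK: $cache_file is the full 55 GB"
    else
        echo "WARNING: $cache_file is not the expected size" >&2
    fi
    ls -l "$cache_file"    # owner and group should both be varnish
fi
```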

Setting up tmpfs


A Small instance has 1.5 GB of RAM, which allows us to optimize Varnish a bit. Specifically, we can have Varnish write its non-persistent logfiles to a tmpfs, which means less disk IO.

While we're at it, Amazon Linux comes with a tmpfs already set up: about 800 MB and under /dev/shm. Let's get rid of that, as we won't be using it.

sudo vi /etc/fstab
    # comment out the line for /dev/shm on tmpfs
    # add this line:
    tmpfs       /var/lib/varnish    tmpfs   size=100M    0   0

sudo umount /dev/shm

sudo mkdir -p /var/lib/varnish
sudo mount /var/lib/varnish
You should now see your tmpfs listed, showing 100 MB of available space:
df -hT
Why 100 MB? Because Varnish's shared memory log takes up about 85 MB of space. There's no point in creating a larger tmpfs, but a little bit extra won't hurt.
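Since a typo in fstab can bite you at the next reboot, you may want to double-check the edit. The following is a minimal sketch with a made-up helper, fstab_type, which prints the filesystem type that an fstab-format file assigns to a mount point, skipping commented-out lines.

```shell
#!/bin/sh
# fstab_type FSTAB_FILE MOUNT_POINT
# Print the filesystem type (third column) of the active entry for
# MOUNT_POINT. Commented-out lines are ignored; prints nothing if
# there is no active entry.
fstab_type() {
    awk -v mp="$2" '$1 !~ /^#/ && $2 == mp { print $3 }' "$1"
}

if [ -r /etc/fstab ]; then
    echo "/var/lib/varnish: $(fstab_type /etc/fstab /var/lib/varnish)"
    echo "/dev/shm: $(fstab_type /etc/fstab /dev/shm)"
fi
```

After the edits described above, the first line should report tmpfs and the second should come back empty.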



Setting Up Varnish


There are two files of interest:
  • /etc/sysconfig/varnish -- The startup configuration file. This is read by the service command and defines most of the defaults, such as the HTTP port on which Varnish should listen and which VCL file to load.
  • /etc/varnish/default.vcl -- This configuration file instructs Varnish on what to cache and how to cache it. There's a lot of documentation on it, but it's still complicated.
The content of /etc/sysconfig/varnish is as follows:

# Configuration file for varnish
# /etc/init.d/varnish expects the variable $DAEMON_OPTS to be set
# from this shell script fragment.

# Maximum number of open files (for ulimit -n)
NFILES=131072

# Locked shared memory (for ulimit -l)  Default log size is 82MB + header
MEMLOCK=82000

# simple configuration
# a good chunk of our RAM (we have 1.5 GB, give it 800 MB) and that big cache file on the EBS volume
DAEMON_OPTS="-a XXX.XXX.XXX.XXX:80 \
             -f /etc/varnish/default.vcl \
             -T localhost:8080 \
             -u varnish -g varnish \
             -p thread_pool_min=200 \
             -p thread_pool_max=4000 \
             -s malloc,800M \
             -s file,/cache/varnish.bin,50G"
The content of /etc/varnish/default.vcl is as follows:
# Varnish VCL file for caching out TileStache tiles
# one named back-end, and some extreme caching for PNG and JPEG files

# our back end TileStache server
backend default {
  .host = "tilestache.myserver.com";
  .port = "http";
}

# extreme TTL! PNGs and JPEGs stay fresh for 30 days, plus a 365-day grace period,
# on the grounds that the tile set only changes once or twice per year so is really never stale,
# and that we would restart Varnish anyway (clearing the cache) when we eventually update
sub vcl_fetch {
    if (beresp.status == 301 || beresp.status == 302 || beresp.status == 404 || beresp.status == 503 || beresp.status == 500) {
        return (hit_for_pass);
    }
    if (beresp.http.cache-control !~ "s-maxage" && (req.url ~ "\.jpg$" || req.url ~ "\.png$")) {
        set beresp.ttl   = 30d;
        set beresp.grace = 365d;
    }

    return (deliver);
}

# add an extra header to the HTTP responses, simply indicating a Hit or Miss
# this is useful for diagnostics when we're wondering whether this is REALLY caching anything
sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "Hit";
    } else {
        set resp.http.X-Cache = "Miss";
    }
}

# Varnish won't cache effectively if there's a cookie, and a lot of our websites do use cookies.
# For tile caching purposes, ditch the cookies.
# Also, standardize the hostname: there are DNS aliases for tilestache, tilestache-1 through tilestache-4, etc.
# If we don't standardize the hostname, Varnish will cache multiple copies, from these multiple hosts.
sub vcl_recv {
    if (req.http.Cookie) {
        remove req.http.Cookie;
    }

    # we use one static hostname for all of these, but may be called as tilestache-1 through -4
    # standardize the hostname so it all goes into the one bucket
    set req.http.Host = "tilestache.myserver.com";
}
In the sysconfig, you need to either enter your own IP address for the XXXs, or else leave it blank for it to listen on all interfaces. On my system when I left it blank, someone hit port 80 on the other interface (on the 10.0.0.0/8 network) causing Varnish to start a second shared memory log. Not only is that useless work, but if you use tmpfs like I described, this second log wouldn't fit into the space available.

In the VCL, we strip out cookies. We can do that in this case, since we specifically only care to cache PNGs and JPEGs, and to cache them indefinitely without regard to sessions or the like. We also standardize the hostname, so Varnish won't cache multiple copies of the same tile for tilestache-1.myserver.com, tilestache-2.myserver.com, tilestache-3.myserver.com, and tilestache-4.myserver.com.

The VCL also sets some extremely long caching for the PNGs and JPEGs, since our explicit goal here is cached images that won't be expiring for several months. And we add a custom HTTP header X-Cache which indicates Hit or Miss, which I find useful for debugging whether we're in fact caching anything.
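VCL is easy to get subtly wrong, so it can help to have Varnish compile the file before you start or restart the service. The sketch below wraps varnishd's -C option (compile the VCL given with -f to C, print it, and exit) in a hypothetical check_vcl helper. I'm assuming that varnishd exits non-zero when compilation fails, so treat this as a convenience and read the error output rather than trusting the exit code blindly.

```shell
#!/bin/sh
# check_vcl VCL_FILE
# Ask varnishd to compile the VCL without starting a daemon.
# Returns non-zero if varnishd is missing, the file is unreadable,
# or (assumed) the VCL fails to compile.
check_vcl() {
    if ! command -v varnishd >/dev/null 2>&1; then
        echo "varnishd not found in PATH" >&2
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 1
    fi
    # -C prints the generated C code; we only care whether it compiles.
    varnishd -C -f "$1" >/dev/null
}

# Example use, once Varnish is installed:
#   check_vcl /etc/varnish/default.vcl && echo "VCL compiles"
```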


Start It Up!


You should now be able to start the service, and set it to start when the server boots:
sudo service varnish start
sudo chkconfig varnish on
If you run a ps you should see two instances of varnishd, and netstat will show your new network listener on port 80.

Give it a test: Get the URL of something that you think should be cached, e.g. one of the tile PNGs. Point your browser at the original and make sure you got it right. Now change the URL so the hostname points to your Amazon EC2 instance, and it should still work, having automatically fetched the remote content.

Bring up Firebug or some other tool that shows HTTP headers, load up a few cached items, and look for the X-Cache header in the response. Some should say Miss, but subsequent loads of the content should start showing Hit. This X-Cache header is added by our VCL file, as described above, and is a good indicator that you're actually caching.
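If you'd rather check from the command line, the same header can be pulled out of any HTTP response. Here's a small sketch: x_cache_of is a made-up helper that reads response headers on standard input and prints the X-Cache value. The sample response below is invented for the demonstration, not captured from a real server.

```shell
#!/bin/sh
# x_cache_of: read HTTP response headers on stdin, print the X-Cache value.
# Strips the carriage returns that HTTP uses for line endings and matches
# the header name case-insensitively.
x_cache_of() {
    tr -d '\r' | awk -F': *' 'tolower($1) == "x-cache" { print $2 }'
}

# Demonstration with an invented response:
printf 'HTTP/1.1 200 OK\r\nContent-Type: image/png\r\nX-Cache: Miss\r\n\r\n' | x_cache_of

# Against the real cache (the URL is a placeholder for one of your tiles):
#   curl -s -o /dev/null -D - http://YOUR-EC2-HOST/path/to/tile.png | x_cache_of
```

Run the curl line twice for the same tile: the first request would normally report Miss and the second Hit.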

If you're interested in nuts and bolts, run varnishstat and varnishhist and look for your hit ratio. It will be very low at first, of course, because your cache is empty. But over time, the cache will fill and the hit ratio should go up.
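If you just want the one number, you can compute the ratio from varnishstat's one-shot output (varnishstat -1). A rough sketch follows. The hit_ratio helper is my own, and it assumes the counters are named cache_hit and cache_miss (newer Varnish versions prefix them, e.g. MAIN.cache_hit, which the pattern also accepts). The sample counters in the demonstration are invented.

```shell
#!/bin/sh
# hit_ratio: read "varnishstat -1" style output on stdin and print
# hits / (hits + misses) as a percentage, or "n/a" if there's no traffic yet.
hit_ratio() {
    LC_ALL=C awk '
        $1 ~ /(^|\.)cache_hit$/  { hit  = $2 }
        $1 ~ /(^|\.)cache_miss$/ { miss = $2 }
        END {
            total = hit + miss
            if (total == 0) { print "n/a"; exit }
            printf "%.1f%%\n", 100 * hit / total
        }'
}

# Demonstration with invented counters:
printf 'cache_hit 300 0.10 Cache hits\ncache_miss 100 0.03 Cache misses\n' | hit_ratio

# On the cache server itself:
#   varnishstat -1 | hit_ratio
```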



And You're Done!


Looking good, huh? Great. Set up some DNS for your Amazon EC2 instance, so you can refer to it by a static hostname, then start updating your applications to use this new caching server. Bit by bit, the caches will fill, the hit ratio will increase, load on your main server will be reduced, and response times of the site will decrease.


About the Persistent Storage Back-End


An important note: the file storage backend purges the cache whenever you run service varnish restart. Naturally, with the cache empty after the restart, your hit rate is going to be awful until the cache fills again.

If you're feeling adventurous, you can edit the sysconfig file and change -s file to -s persistent, and Varnish will keep the cached files in between restarts. I didn't go with this because:
  • It's experimental, which doesn't give me a good feeling. Sorry to cop out on the community, though.
  • The store won't show up in varnishstat, so I can't get statistics on it.
  • These servers do nothing but Varnish, so should reboot annually at most. And giving it a restart is a super easy way to clear the cache, when we do make changes.

Elastic Computing At Its Finest


Some time back, I posted how we moved our web server off of Amazon EC2 and were enjoying improved performance and all that stuff. That's all still true.

But in our need for a remote caching scenario, a content delivery network where network speed was paramount and CPU & disk IO weren't strong needs, we have found an excellent use case well matched to EC2's (mediocre) hard disks and CPUs and (quite good) network.