Gregor The Map Guy

Monday, January 13, 2014

ArcGIS API, Highcharts, and kickbutt elevation profiles - 1. Intro to ArcGIS JS API

My previous post was a teaser for this one: part 1 in a 4-part series on creating interactive elevation profile charts, using Highcharts and the ArcGIS JavaScript API. This series will refer often to this live demo, so you may want to have it open as you read this.

I was originally going to just post a few bites of code, and skip over the basics of the API. But I realized that my demo app isn't overly complicated, and can double as a tutorial on the ArcGIS JS API. So here you go: a little more complicated than Hello World, but a little more functional than "just use an ArcGIS Online application template".

Getting Started

If you're not familiar with it, you'll want to read up on the ArcGIS JavaScript API documentation. It's a surprisingly functional API, with access to a lot of cool services such as ESRI's geocoder, ESRI's elevation service, arbitrary ArcGIS geoprocessing endpoints, some good basemaps, etc.

Your basic map would come in these two parts, an HTML file and a JavaScript file. The code below covers version 3.8 of the API, which is current as of January 2014. 3.8 makes some minor changes from 3.7, most notably to dojo.require -- it now accepts the whole list of dependencies, and runs a callback after they're loaded.

The HTML

<!DOCTYPE HTML>
<html>
<head>
    
    <link rel="stylesheet" href="http://js.arcgis.com/3.8/js/dojo/dijit/themes/claro/claro.css">
    <link rel="stylesheet" type="text/css" href="http://js.arcgis.com/3.8/js/esri/css/esri.css">
    <script src="//js.arcgis.com/3.8/"></script>

    
    <script type="text/javascript" src="index.js"></script>
    <style type="text/css">
    #map {
        width:5in;
        height:5in;
        border:1px solid black;

        margin:0 auto 0 auto;
    }
    </style>
</head>
<body class="claro">

    <div id="map"></div>

</body>
</html>

The JavaScript (index.js)

var MAP, OVERLAY_TRAILS;

var START_W = -106.88042;
var START_E = -106.79802;
var START_S =   39.16306;
var START_N =   39.22692;

var ARCGIS_URL = "http://205.170.51.182/arcgis/rest/services/PitkinBase/Trails/MapServer";
var LAYERID_TRAILS = 0;

require([
    "esri/map",
    "esri/geometry/Extent",
    "esri/layers/ArcGISDynamicMapServiceLayer",
    "dojo/domReady!"
], function() {
    // the basic map, with a global reference of course
    MAP = new esri.Map("map", {
        extent: new esri.geometry.Extent({xmin:START_W,ymin:START_S,xmax:START_E,ymax:START_N,spatialReference:{wkid:4326}}),
        basemap: "streets"
    });

    // add the trails overlay to the map
    OVERLAY_TRAILS = new esri.layers.ArcGISDynamicMapServiceLayer(ARCGIS_URL);
    OVERLAY_TRAILS.setVisibleLayers([ LAYERID_TRAILS ]);
    MAP.addLayer(OVERLAY_TRAILS);
});

This is pretty minimal: load up Dojo and have it load its dependencies, and when that's done the callback will create a new esri.Map bound to the DIV with id="map", with a specific starting zoom area (the extent).

It does go one extra step, though, and add a layer above the basemap. In this case, it's an ArcGIS REST service which will display hiking trails in the same area to which the map is zoomed. Just these two pieces, and you have your very first ArcGIS API map.

What? That's it?

I hate Hello World type applications, which claim to walk you through the API but which just hand you a working map and nothing else to go on. But you know what, in this case that really is it... for now. The next posting covers click events and the Identify task.

Wednesday, January 8, 2014

ArcGIS API, Highcharts, and kickbutt elevation profiles - Teaser

A recent website for a Bay Area parks agency has some very good-looking elevation profiles. You pick a trail from the list, and an elevation profile appears: this great line chart going up and down to show the elevation as you would walk its length, AND as you mouse over the chart it draws a marker on the map to really give an idea of where you're talking about. Here's a clip from my version of it:

And a live demo courtesy of Github Pages:
http://gregallensworth.github.io/ArcGISElevationProfileCharting/

This article will span a few postings, and will describe step by step how I developed this. But for a spoiler, here's the big punchline: it doesn't use ArcGIS Online application templates, because we weren't able to get the extreme flexibility we needed for our demanding clients. Instead I did it with these parts:

- a basic map written in ArcGIS JS API
- a Geoprocessing call to ESRI's elevation service
- jQuery and the Highcharts charting library

The next few postings will be from the ground up: creating a basic map using the ArcGIS JS API, adding click event handlers and Identify tasks, and ultimately calling the elevation service and rendering a very dynamic chart.

Monday, December 16, 2013

A PHP proxy to do both GET and POST, and the limitations of JSONP

It's been busy the last few months! Working on a few projects that aren't ready for release yet, so not a lot I'll say about them. But here's something that may come in handy.

The Problem & Solution

This web application needs to fetch data from a remote data source. It's not based on the map per se, just a few dozen checkboxes to filter by this-n-that, search terms, some date filters, all submitted to the server and we get back some JSON. Pretty ordinary.

Problem is: it's on another server, so we have cross-domain issues. The browser successfully makes the GET request, but the response body is blank and the XHR returns an error status.

The common solution is to set up a proxy server. That is: it's a program hosted on the same site as your web application. You make your GET and POST requests to this program instead of the remote one on another site, and it does the GET or POST to the remote server and returns the content. This proxy server being on the website's domain, your cross-origin problem is solved.

Three simple steps.

#1, create this PHP program and call it proxy.php

<?php
// a proxy to call the remote ReefCheck server
// we were doing well with JSONP until we tried using too many
// checkboxes, generating URLs that were too long for GET,
// and JSONP doesn't happen over POST
$PROXIES = array(
    '/count' => 'http://other.remote.site.com/query/endpoint.php.php',
    '/search' => 'http://my.remote.site/some/endpointc.cgi',
);
$url = @$PROXIES[$_SERVER['PATH_INFO']];

if (!$url) die("Bad proxy indicator: {$_SERVER['PATH_INFO']}");

// are we GETting or POSTing? makes a difference...
if (@$_POST) {
    // compose a cURL request, being sure to capture the output
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_POST, TRUE);
    curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($_POST) );
    curl_setopt($curl, CURLOPT_HEADER, FALSE);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
    $output = curl_exec($curl);
    curl_close($curl);
    print $output;
} else {
    // easy way out: the base URL plus ? plus params
    $url = sprintf("%s?%s", $url, $_SERVER['QUERY_STRING'] );
    readfile($url);
}

#2, Adjust the $PROXIES to match the endpoints you need. Pretty obvious: each entry is an alias and a real URL, such as /search to http://some.site.com/api/v1/search.json

#3, change your JavaScript code to use your new proxy.

// old

$.get('http://some.site.com/api/v1/search.json', params, function(reply) { });

// new, note the /search matching the /search in $PROXIES

$.get('proxy.php/search', params, function(reply) { });

Dead simple. You should be able to make your queries, using exactly the same POST or GET parameters, and get back your reply exactly as if you had made the request to the real server, except without the blank response body and error status.

Features & Shortcomings

This proxy script supports both GET and POST, which right off is a great start. And it supports multiple endpoints. And the GET stuff is done as a URL which makes debugging easy (cURL can do GET of course, but I find debugging simpler if it can dump a URL).

It does not support headers, in either direction: no enctype, and no Content-type headers from the remote source. If you're using jQuery, you'll definitely want to use the dataType parameter (the 4th parameter to $.get and $.post, forcing the interpretation of the data type). Personally I consider that 4th param a good practice anyway... And for file uploads, the missing enctype may be relevant but there are iframe-based fixes for that anyway such as ajaxForm().

And keep in mind the most basic issue of proxies like this: it's effectively triple-transiting traffic: browser calls your server, your server calls the remote API, your server gets the data back, your server spits that data out to the browser. If your server's network throughput is a concern, adding someone else's API to your server's responsibilities may not be a pleasant necessity.

A quick recap: JSONP and the need for this proxy

As we started development of this application, we used JSONP instead. To recap if you're not familiar with JSONP:

You're probably used to returning JSON, which is a representation of a structure such as { title:'My Stuff', size:100 }. JSONP takes this a step further, and wraps that structure into a function call, forming executable JavaScript, like this: handleResults({ title:'My Stuff', size:100 }). The name of the function being invoked is defined by the &callback= parameter which you sent along with your request, so you can in fact name the function that will be used, e.g. &callback=handleResults is simply one more param in your usual GET request.

This does presume that the API endpoint is programmed to handle the &callback= parameter and wrap the JSON output, and that you're willing to specify this one extra parameter in your request. (server-side: this really is simple to implement: you're about to spit out json_encode()'d data anyway, put if @$_GET['callback'] and change the output slightly if so) (client-side: if you use jQuery's $.get function, it can create a random function name for you, and bind your callback to it, and supply the callback param for you; very little labor here)
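To make the wrapping concrete, here is a minimal sketch in JavaScript of what a JSONP-aware endpoint does with its output. The function name wrapJsonp and the check on the callback name are hypothetical, made up for illustration; a real endpoint would do the equivalent in its own server-side language.

```javascript
// Hypothetical helper: wrap a data structure as JSONP if a callback name was given,
// otherwise return plain JSON. Names here are for illustration only.
function wrapJsonp(data, callbackName) {
    var json = JSON.stringify(data);
    if (!callbackName) return json;

    // only allow names that look like plain JavaScript identifiers,
    // so the caller can't inject arbitrary script via &callback=
    if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
        throw new Error("Invalid callback name");
    }
    return callbackName + "(" + json + ")";
}

console.log(wrapJsonp({ title: "My Stuff", size: 100 }, "handleResults"));
// prints: handleResults({"title":"My Stuff","size":100})
```

Without a callback name the same function returns ordinary JSON, which is why supporting JSONP costs an endpoint so little.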

As long as your endpoint is JSONP-aware, and will accept a &callback= parameter and wrap the content in it, this is a great way to make your browser do the work itself without involving your proxy. Slightly faster transfer times, my server not needing to double-transit traffic, everybody wins...

...until the GET params become too much, and we must use POST.

But JSONP doesn't happen over POST!

You see, the spec changed at one point and we had to include A LOT of checkboxes and other such parameters, and the client's endpoint uses text names instead of primary keys, so it was entirely normal to construct a URL like this:

http://example.com/endpoint.json?counties[]=Santa Rosa&counties[]=Alameda&counties[]=San Mateo&species[]=Catfish&species[]=Salmon&species[]=Goldfish&species[]=Dogfish&species[]=Mulkey's Pollock

The URL params were now too long for GET, so remote servers started truncating our queries, hanging up on us, etc. so we had to use POST. But JSONP works by having the browser load the URL in a script tag, and a script tag can only make GET requests, so JSONP can't work over POST; if you try to use JSONP with POST in jQuery it will automatically be changed to a GET.
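To get a feel for how quickly these URLs grow, here is a rough sketch that serializes array parameters the way the URL above does and reports the length. The helper name buildQuery is hypothetical, and how long is "too long" depends entirely on the servers and proxies in between.

```javascript
// Hypothetical helper: serialize array parameters PHP-style (name[]=value),
// percent-encoding each name and value as a browser would.
function buildQuery(params) {
    var pairs = [];
    Object.keys(params).forEach(function (name) {
        params[name].forEach(function (value) {
            pairs.push(encodeURIComponent(name + "[]") + "=" + encodeURIComponent(value));
        });
    });
    return pairs.join("&");
}

var query = buildQuery({
    counties: ["Santa Rosa", "Alameda", "San Mateo"],
    species: ["Catfish", "Salmon", "Goldfish", "Dogfish", "Mulkey's Pollock"]
});
console.log(query.length + " characters: " + query);
```

Every ticked checkbox adds another name[]=value pair, so a few dozen checkboxes with text names add up fast.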

As such, as much as I was fond of using JSONP while it lasted... ultimately we had to go for a PHP proxy.

Wednesday, December 4, 2013

PHP: Calculate the centroid of a polygon

A project I've been working on generates point markers onto a map. But the interesting part is how I populate that database of points. It's a series of "drivers" for connecting to ArcGIS REST API, OGC WFS, CartoDB, and so on. More on that as it develops...

A need that came up today was that this particular layer, being served from ArcGIS REST, is polygons and we need points. Normally I would calculate the centroid and use that... but this specific software is being developed on a plain web host: no GEOS or OGR, no PostGIS... just plain ol' PHP.

Some Reading... then Writing

So I did some reading:

http://en.wikipedia.org/wiki/Centroid#Centroid_of_polygon

And wrote this, a pure PHP function for calculating the signed area of a polygon, and then the centroid of the polygon:

https://github.com/gregallensworth/PHP-Geometry/blob/master/Polygon.php
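For a quick look at the math without opening the file, here is a minimal sketch of the signed-area and centroid formulas from that Wikipedia article, written in JavaScript. It is an illustration of the formulas for a single ring, not a port of Polygon.php, and the function names are my own for this example.

```javascript
// Sketch of the signed-area and centroid formulas for a single, non-self-intersecting ring.
// ring is an array of [x, y] pairs; it may or may not repeat its first vertex at the end.
function signedArea(ring) {
    var sum = 0;
    for (var i = 0; i < ring.length; i++) {
        var a = ring[i], b = ring[(i + 1) % ring.length];
        sum += a[0] * b[1] - b[0] * a[1];
    }
    return sum / 2;
}

function centroid(ring) {
    var area = signedArea(ring);
    if (area === 0) return null; // degenerate ring: no area, so the formula does not apply
    var cx = 0, cy = 0;
    for (var i = 0; i < ring.length; i++) {
        var a = ring[i], b = ring[(i + 1) % ring.length];
        var cross = a[0] * b[1] - b[0] * a[1];
        cx += (a[0] + b[0]) * cross;
        cy += (a[1] + b[1]) * cross;
    }
    return [cx / (6 * area), cy / (6 * area)];
}

// a 4 x 2 rectangle, counter-clockwise
var rect = [[0, 0], [4, 0], [4, 2], [0, 2]];
console.log(signedArea(rect)); // 8
console.log(centroid(rect));   // centroid is x=2, y=1
```

The area comes out negative for a clockwise ring, but the sign cancels in the division, so the centroid is the same either way.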

Caveats

As described in the file, this is specific to our sorts of use cases: park boundaries, city boundaries, and the like. If the polygon is self-intersecting (that's a no-no) it may be wrong. If it's an L shape it may come up wrong too. And if it's a multipolygon and those rings overlap, it'll double-count the area, as it's not smart enough to find intersecting area between two rings and subtract it out (and this incorrect area may affect the centroid placement).

But, it does what we need it to do, and it may work for you. Enjoy!

Monday, November 4, 2013

Setting up a CDN using Amazon EC2 Instances

Today's topic is setting up a content delivery network (CDN) for map tiles, using Amazon EC2 instances.

Why a CDN?

This particular set of map tiles is in use in dozens of maps that we know of (literally, about 50). The map tiles are very high quality, 20 - 30 KB apiece, and tiled maps being what they are, there can be 16 of them loading at a time for the base layer and 16 more for the labels layer.

They're being served from TileStache, so of course they're cached, and our 10 Gbps network continues to impress... but still, our server has other things to do, and nobody complains when map tiles load too quickly, do they?

The simplest approach would be the most expensive: get another server, set it up with TileStache, etc. That means 500 GB of storage, decent CPU and RAM, and hosting expense when we just want to host tiles. (sorry but we can't do a simple S3 of the PNGs, that presumes that all tiles are already generated, which isn't true)

Much cheaper is a simple content delivery network (CDN) setup based on Varnish.

What's Varnish?

Varnish is a caching proxy server. You set it up on a server that isn't running a web server, configure it to point at a back-end web server (the real web server where your content resides), then point your browser at Varnish instead of your real web server. Varnish will silently cache things and serve them from its cache, refreshing from the back end web server as necessary.

Varnish can take a lot of tuning if you're outside their presumed default use case, but it's worth the trouble. Any item served from the cache isn't served from your web server, and that saves throughput, disk IO, firewall traffic, etc. for other things.

So the basic idea is:

Get a server with moderate stats: fast network, 1 GB of RAM, and a chunk of disk space. The network is the only part that needs to be really performant.
Set it up with Varnish, configured to point at the TileStache server. The configuration is minimal, just cache everything since the PNG and JPEG files really are everything.
Update my map apps to point to the new CDN URLs, and let them populate the caches. Tiles are served from the cheap server, sparing load on our big server.

The Amazon EC2 Instance

If you'll be running this cache for a while, consider paying for a Reserved Instance, as it's a significant discount.

Go for a Small instance, as you don't need a lot of CPU and RAM. A Small provides 1.5 GB of RAM which is great, and a decent CPU to do the job. Of course, if you want to raise the bar for more RAM, go right ahead! For this plan, we're looking at 50 GB of cached tiles accumulating over the months, so the difference between 1 GB of tiles in RAM and 2 GB of tiles in RAM isn't significant.

Don't go for a Micro instance, though the free tier is tempting. They're very low performance, particularly network throughput which is your premium here.

I gave the instance 60 GB of standard EBS storage (in addition to its 8 GB root filesystem), set to delete on termination. This will form the cache files. It's rather excessive, but our intent is that this cache go literally months without throwing out anything. For your case, you may want to go with a lot less disk space, maybe only 5 or 10 GB.

Choice of OS is up to you. We went with Amazon Linux 64-bit, which is RPM-based and uses yum; if you pick something else such as Ubuntu, the instructions below may differ for you, as far as pathnames and package management.

Give it a while, log in, run the recommended yum update, and let's get started.

Installing Varnish

Yay for repos, right? This creates the varnish user and group, which will be useful later on when we create the cache files.

sudo yum install varnish

Setting up the EBS Volume

This is the standard procedure when adding a new, unformatted disk volume to your server. These commands partition the disk, format it (the -i 4096 asks for one inode per 4 KB of disk space), register it into fstab, and mount it to get you started.

sudo bash
    cfdisk /dev/sdb
    mke2fs -i 4096 -L CACHE /dev/sdb1
    echo 'LABEL=CACHE   /cache    ext2    defaults    1    1' >> /etc/fstab
    mkdir /cache
    mount /cache
    df -hT
exit

On the cache volume, we want to dedicate all of the space to Varnish. In order to reduce disk fragmentation, let's preallocate all of the space now. This creates a single file, 55 GB in size out of the 60 GB volume. Below we'll configure Varnish for 50 GB, the extra 5 GB is for overhead.

sudo dd if=/dev/zero of=/cache/varnish.bin bs=1G count=55
sudo chown varnish:varnish /cache/varnish.bin

Setting up tmpfs

A Small instance has 1.5 GB of RAM, which allows us to optimize Varnish a bit. Specifically, we can have Varnish write its non-persistent logfiles to a tmpfs, which means less disk IO.

While we're at it, Amazon Linux comes with a tmpfs already set up: about 800 MB and under /dev/shm. Let's get rid of that, as we won't be using it.

sudo vi /etc/fstab
# comment out the line for /dev/shm on tmpfs
# add this line
tmpfs /var/lib/varnish tmpfs size=100M 0 0

sudo umount /dev/shm

sudo mkdir -p /var/lib/varnish
sudo mount /var/lib/varnish

You should now see your tmpfs listed, showing 100 MB of available space:

df -hT

Why 100 MB? Because Varnish's shared memory log takes up about 85 MB of space. There's no point in creating a larger tmpfs, but a little bit extra won't hurt.

Setting Up Varnish

There are two files of interest:

/etc/sysconfig/varnish -- The startup configuration file. This is read by the service command and defines most of the startup defaults, such as the HTTP port on which Varnish should listen and which VCL file to load.
/etc/varnish/default.vcl -- This configuration file instructs Varnish on what to cache and how to cache it. There's a lot of documentation on it, but it's still complicated.

The content of /etc/sysconfig/varnish is as follows:

# Configuration file for varnish
# /etc/init.d/varnish expects the variable $DAEMON_OPTS to be set
# from this shell script fragment.

# Maximum number of open files (for ulimit -n)
NFILES=131072

# Locked shared memory (for ulimit -l) Default log size is 82MB + header
MEMLOCK=82000

# simple configuration
# a good chunk of our RAM (we have 1.5 GB, give it 800 MB) and that big cache file on the EBS volume
DAEMON_OPTS="-a XXX.XXX.XXX.XXX:80 \
             -f /etc/varnish/default.vcl \
             -T localhost:8080 \
             -u varnish -g varnish \
             -p thread_pool_min=200 \
             -p thread_pool_max=4000 \
             -s malloc,800M \
             -s file,/cache/varnish.bin,50G"

The content of /etc/varnish/default.vcl is as follows:

# Varnish VCL file for caching out TileStache tiles
# one named back-end, and some extreme caching for PNG and JPEG files

# our back end TileStache server
backend default {
.host = "tilestache.myserver.com";
.port = "http";
}

# extreme caching! PNGs and JPEGs get a 30-day TTL plus a 365-day grace period,
# on grounds that it only changes once or twice per year so is really never stale,
# and that we would restart Varnish anyway (clearing the cache) when we eventually update
sub vcl_fetch {

    if (beresp.status == 301 || beresp.status == 302 || beresp.status == 404 || beresp.status == 503 || beresp.status == 500) {
        return (hit_for_pass);
    }

    if (beresp.http.cache-control !~ "s-maxage" && (req.url ~ "\.jpg$" || req.url ~ "\.png$")) {
        set beresp.ttl   = 30d;
        set beresp.grace = 365d;
    }

    return (deliver);
}

# add an extra header to the HTTP responses, simply indicating a Hit or Miss
# this is useful for diagnostics when we're wondering whether this is REALLY caching anything
sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "Hit";
    } else {
        set resp.http.X-Cache = "Miss";
    }
}

# Varnish won't cache effectively if there's a cookie, and a lot of our websites do use cookies.
# For tile caching purposes, ditch the cookies.
# Also, standardize the hostname: there are DNS aliases for tilestache, tilestache-1 through tilestache-4, etc.
# If we don't standardize the hostname, Varnish will cache multiple copies, from these multiple hosts.
sub vcl_recv {
    if (req.http.Cookie) {
        remove req.http.Cookie;
    }

    # we use one static hostname for all of these, but may be called as tilestache-1 through -4
    # standardize the hostname so it all goes into the one bucket
    set req.http.Host = "tilestache.myserver.com";
}

In the sysconfig, you need to either enter your own IP address for the XXXs, or else leave it blank for it to listen on all interfaces. On my system when I left it blank, someone hit port 80 on the other interface (on the 10.0.0.0/8 network) causing Varnish to start a second shared memory log. Not only is that useless work, but if you use tmpfs like I described, this second log wouldn't fit into the space available.

In the VCL, we strip out cookies. We can do that in this case, since we specifically only care to cache PNGs and JPEGs, and to cache them indefinitely without regard to sessions or the like. We also standardize the hostname, so Varnish won't cache multiple copies of the same tile for tilestache-1.myserver.com, tilestache-2.myserver.com, tilestache-3.myserver.com, and tilestache-4.myserver.com.

The VCL also sets some extremely long caching for the PNGs and JPEGs, since our explicit goal here is cached images that won't be expiring for several months. And we add a custom HTTP header X-Cache which indicates Hit or Miss, which I find useful for debugging whether we're in fact caching anything.

Start It Up!

You should now be able to start the service, and set it to start when the server boots:

sudo service varnish start
sudo chkconfig varnish on

If you run a ps you should see two instances of varnishd, and netstat will show your new network listener on port 80.

Give it a test: Get the URL of something that you think should be cached, e.g. one of the tile PNGs. Point your browser at the original and make sure you got it right. Now change the URL so the hostname points to your Amazon EC2 instance, and it should still work, having automatically fetched the remote content.

If you bring up Firebug or some other tool that shows HTTP headers, load up a few cached items, and look for the X-Cache header in the response. Some should say Miss, but subsequent loads of the content should start showing Hit. This X-Cache header is added by our VCL file, as described above, and is a good indicator that you're actually caching.

If you're interested in nuts and bolts, run varnishstat and varnishhist and look for your hit ratio. It will be very low at first, of course, because your cache is empty. But over time, the cache will fill and the hit ratio should go up.

And You're Done!

Looking good, huh? Great. Set up some DNS for your Amazon EC2 instance, so you can refer to it by a static hostname, then start updating your applications to use this new caching server. Bit by bit, the caches will fill, the hit ratio will increase, load on your main server will be reduced, and response times of the site will decrease.

About the Persistent Storage Back-End

An important note: the file storage backend purges the cache whenever you give a service varnish restart. Naturally, after this restart and your cache being empty, your hit rate is going to be awful until the cache fills again.

If you're feeling adventurous, you can edit the sysconfig file and change -s file to -s persistent, and Varnish will keep the cached files in between restarts. I didn't go with this because:

It's experimental, which doesn't give me a good feeling. Sorry to cop out on the community, though.
The store won't show up in varnishstat, so I can't get statistics on it.
These servers do nothing but Varnish, so should reboot annually at most. And giving it a restart is a super easy way to clear the cache, when we do make changes.

Elastic Computing At Its Finest

Some time back, I posted how we moved our web server off of Amazon EC2 and were enjoying improved performance and all that stuff. That's all still true.

But in our need for a remote caching scenario, a content delivery network where network speed was paramount and CPU & disk IO weren't strong needs, we have found an excellent use case well matched to EC2's (mediocre) hard disks and CPUs and (quite good) network.

Thursday, October 24, 2013

Weather Underground for historical tides & weather

In a recent (current) project, volunteers enter the results of surveys along the beach. As part of the survey, they are to note the weather conditions: visibility clean or limited, cloud cover percentage, temperature, precipitation yes/no, etc. They are also to note the height of the tide at that time.

Big bonus points, if I can make it look up the data at that time, date, and location, and have it auto-fill the boxes for them.

Most APIs are for Forecasts

It's easy enough to find a weather forecasting API. NOAA is one of several, and the GFS GRIB2 files can be had if you're hardcore. But those are low-resolution forecasts: what about yesterday's weather or the day before, and for a specific time instead of morning/mid/evening breakdowns?

And what about tides? Tide forecasts exist, but typically as full reports and not a readily-usable API. And you have to request a specific tide station, while we have simply a raw lat & long and no ready way to detect the nearest tide station. Besides, these are tide forecasts when we need yesterday's observations.

Weather Underground

Weather Underground has an API, and unlike others theirs goes into the past. Awesome. They also offer both weather observations and tide observations. And they have a free tier, limited to 500 hits per day. For our use case, that's far more than we would realistically use, as these are office staff entering forms, not the general public hammering us with every hit.

So, step 1: sign up for an API key. Sign up for one that includes tides, and be sure to enable the History option when you sign up.

Making a Request / Creating an AJAX Endpoint

A request looks like this:

http://api.wunderground.com/api/APIKEY/history_YYYYMMDD/q/LAT,LON.json

Slot in the date, lat & lon, and your API key, and get back JSON. Dead simple.

In our case, I set up an AJAX endpoint written in PHP (CodeIgniter). It looks like this:

public function ajax_conditions($placeid,$date,$starttime) {
    header('Content-type: text/javascript');

    // validation: make sure they have access to this place ID, date and time filled in, etc.
    if (! preg_match('/^\d{8}$/', $date) )        return print json_encode(array('error'=>'Invalid date'));
    if (! preg_match('/^\d{4}$/', $starttime) )   return print json_encode(array('error'=>'Invalid starting time'));

    // code here translates between a $placeid and a lat / lon
    // yours will be very specific to your application

    // make the requests to wunderground for weather conditions and tide conditions
    $output = array();
    $latlon = $place->centroid();
    $time   = mktime((integer) substr($starttime,0,2), (integer) substr($starttime,2,2), 0, (integer) substr($date,4,2), (integer) substr($date,6,2), (integer) substr($date,0,4) );

    $weather_url = sprintf("http://api.wunderground.com/api/%s/history_%s/q/%f,%f.json", $this->config->item('wunderground_api_key'), $date, $latlon->lat, $latlon->lon );
    $weather_info = @json_decode(file_get_contents($weather_url));
    if (! @$weather_info->history) return print json_encode(array('error'=>'Could not get weather history. Sorry.'));

    $tides_url = sprintf("http://api.wunderground.com/api/%s/rawtide_%s/q/%f,%f.json", $this->config->item('wunderground_api_key'), $date, $latlon->lat, $latlon->lon );
    $tides_info = @json_decode(file_get_contents($tides_url));
    if (! @$tides_info->rawtide) return print json_encode(array('error'=>'Could not get tide history. Sorry.'));

    // weather as $weather_observation
    // go over the observations, find the one closest to the given $time
    // step 1: go over them, add a timedelta attribute, push onto a list
    $observations = array();
    foreach ($weather_info->history->observations as $observation) {
        $year = (integer) $observation->date->year;
        $mon = (integer) $observation->date->mon;
        $mday = (integer) $observation->date->mday;
        $hour = (integer) $observation->date->hour;
        $min = (integer) $observation->date->min;

        $obstime = mktime($hour, $min, 0, $mon, $mday, $year);
        $observation->timedelta = abs($obstime - $time);
        $observations[] = $observation;
    }
    // step 2: sort by timedelta, best observation is element [0] from the sorted list
    usort($observations,array($this,'_sort_by_timedelta'));
    $weather_observation = $observations[0];

    // tides as $tide_observation
    // step 1: go over them, add a timedelta attribute, push onto a list
    $observations = array();
    foreach ($tides_info->rawtide->rawTideObs as $observation) {
        $obstime = (integer) $observation->epoch;
        $observation->timedelta = abs($obstime - $time);
        $observations[] = $observation;
    }
    // step 2: sort by timedelta, best observation is element [0] from the sorted list
    usort($observations,array($this,'_sort_by_timedelta'));
    $tide_observation = $observations[0];

    // ta-da, we now have one observation for tides and one for weather
    // and they're the closest ones we have to the stated time

    // see below for code which massages the data into the desired output format

    // all set, hand it back!
    return print json_encode($output);
}

Some neat points here:

As is my usual fashion, I use sprintf() and preg_match() extensively for validating the input, and check for errors. Otherwise, some wise guy can supply invalid params and make nasty-looking requests to wunderground on my behalf (hack attempts from my server? no thanks!), or even generate an error which causes PHP to tell him what URL was used... including my API key.
The return from wunderground is in JSON, and that's just super simple to parse. The return to the client is also in JSON, because it's super simple to generate.
The trick to finding the correct observation for the time I have in mind is to figure out the "time delta" between each observation and the target time. One can then use usort() to sort by time delta, and slice off the first element of the array. That being the lowest time delta, it's the closest to the target time.
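The _sort_by_timedelta callback isn't shown in this post, so here is a small sketch of the same time-delta trick in JavaScript. The sample data, the epoch field on every observation, and the function name closestObservation are assumptions made for the illustration.

```javascript
// Hypothetical data shape: each observation carries an epoch timestamp in seconds.
function closestObservation(observations, targetTime) {
    // tag each observation with its distance from the target time
    var tagged = observations.map(function (obs) {
        return { obs: obs, timedelta: Math.abs(obs.epoch - targetTime) };
    });
    // smallest timedelta first
    tagged.sort(function (p, q) { return p.timedelta - q.timedelta; });
    return tagged.length ? tagged[0].obs : null;
}

var observations = [
    { epoch: 1000, height: 1.2 },
    { epoch: 1600, height: 0.4 },
    { epoch: 2200, height: -0.3 }
];
console.log(closestObservation(observations, 1500).height); // 0.4
```

Sorting is more work than a single pass looking for the minimum, but with a day's worth of observations the difference doesn't matter, and the sorted list is handy if you ever want the runner-up.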

A Little More On The Endpoint

Now, the endpoint does go a step further. The browser end of the app doesn't want the raw numbers, per se, but the simplified, digested version. They want the following:

air temperature in F
a simple yes/no about precipitation
a simple perfect/limited for visibility
a percentage cloud cover, even if estimated
the Beaufort measurement of the wind
the height of the tide in feet, including a prefixed + if it's >0

In the code above, you see the placeholder for the "code which massages" the data. Well, here it is:

    // compose output: weather
    $output['weather'] = array();
    $output['weather']['airtemperature'] = round( (float) $weather_observation->tempi );
    $output['weather']['precipitation'] = 'no';
    if ( (integer) $weather_observation->fog ) $output['weather']['precipitation'] = 'yes';
    if ( (integer) $weather_observation->rain) $output['weather']['precipitation'] = 'yes';
    if ( (integer) $weather_observation->snow) $output['weather']['precipitation'] = 'yes';
    if ( (integer) $weather_observation->hail) $output['weather']['precipitation'] = 'yes';
    $output['weather']['visibility'] = 'perfect';
    if ( (integer) $weather_observation->fog ) $output['weather']['visibility'] = 'limited';
    $output['weather']['clouds'] = 'clear';
    if ((string) $weather_observation->icon == 'mostlysunny') $output['weather']['clouds'] = '20% cover';
    if ((string) $weather_observation->icon == 'partlycloudy') $output['weather']['clouds'] = '30% cover';
    if ((string) $weather_observation->icon == 'partlysunny') $output['weather']['clouds'] = '50% cover';
    if ((string) $weather_observation->icon == 'mostlycloudy') $output['weather']['clouds'] = '80% cover';
    if ((string) $weather_observation->icon == 'cloudy')       $output['weather']['clouds'] = '100% cover';
    $output['weather']['beaufort'] = '1';
    if ( (float) $weather_observation->wspdi >= 4.0) $output['weather']['beaufort'] = '2';
    if ( (float) $weather_observation->wspdi >= 8.0) $output['weather']['beaufort'] = '3';
    if ( (float) $weather_observation->wspdi >= 13.0) $output['weather']['beaufort'] = '4';
    if ( (float) $weather_observation->wspdi >= 18.0) $output['weather']['beaufort'] = '5';
    if ( (float) $weather_observation->wspdi >= 25.0) $output['weather']['beaufort'] = '6';
    if ( (float) $weather_observation->wspdi >= 31.0) $output['weather']['beaufort'] = '7';
    if ( (float) $weather_observation->wspdi >= 39.0) $output['weather']['beaufort'] = '8';
    if ( (float) $weather_observation->wspdi >= 47.0) $output['weather']['beaufort'] = '9';
    if ( (float) $weather_observation->wspdi >= 55.0) $output['weather']['beaufort'] = '10';
    if ( (float) $weather_observation->wspdi >= 64.0) $output['weather']['beaufort'] = '11';
    if ( (float) $weather_observation->wspdi >= 74.0) $output['weather']['beaufort'] = '12';

    // compose output: tides
    // be sure to format it with a + and - sign as is normal for tide levels
    $output['tide'] = array();
    $output['tide']['time'] = date('G:ia', (integer) $tide_observation->epoch );
    $output['tide']['height'] = (float) $tide_observation->height;
    $output['tide']['height'] = sprintf("%s%.1f", $output['tide']['height'] < 0 ? '-' : '+', abs($output['tide']['height']) );
    $output['tide']['site']   = (string) $tides_info->rawtide->tideInfo[0]->tideSite;

The end result is exactly the fields they want, corresponding to the fields in the form.
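
The long run of if statements for the Beaufort number can also be written as a lookup table. The sketch below does that in JavaScript, purely as an illustration: the thresholds are copied from the PHP above (with wspdi assumed to be in mph), while the names BEAUFORT_THRESHOLDS, beaufortFromMph and formatTideHeight are hypothetical and not part of the real endpoint.

```javascript
// [minimum wind speed in mph, Beaufort number], thresholds copied from the PHP above
var BEAUFORT_THRESHOLDS = [
    [4, '2'], [8, '3'], [13, '4'], [18, '5'], [25, '6'], [31, '7'],
    [39, '8'], [47, '9'], [55, '10'], [64, '11'], [74, '12']
];

// Hypothetical helper: walk the table and keep the last threshold we reach.
function beaufortFromMph(mph) {
    var result = '1'; // same default as the PHP
    for (var i = 0; i < BEAUFORT_THRESHOLDS.length; i++) {
        if (mph >= BEAUFORT_THRESHOLDS[i][0]) result = BEAUFORT_THRESHOLDS[i][1];
    }
    return result;
}

// Hypothetical helper: one decimal place, always with an explicit sign.
// Like the PHP, a height of exactly 0 comes out with a + sign.
function formatTideHeight(feet) {
    return (feet < 0 ? '-' : '+') + Math.abs(feet).toFixed(1);
}
```

With these, beaufortFromMph(12.9) gives '3' and formatTideHeight(-0.4) gives '-0.4'.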

Speaking of the Form...

Using jQuery, the additions to the form are relatively simple. It's a simple GET request, with the URL constructed to contain the /placeid/date/starttime parameters.

// make an AJAX call to fetch the weather conditions at the given place, date, and times
// along with disclaimer and credits per wunderground's TOU
function fetchWeatherConditions() {
    // remove the : from HH:MM and the - from YYYY-MM-DD
    var starttime = jQuery('#form input[name="time_start"]').val().replace(/:/g,'');
    var date      = jQuery('#form input[name="date"]').val().replace(/\-/g,'');
    var placeid   = jQuery('#form select[name="site"]').val();
    var url = BASE_URL + 'ajax/fetch_conditions/' + placeid + '/' + date + '/' + starttime;

    jQuery('#dialog_waiting').dialog('open');
    jQuery.get(url, {}, function (reply) {
        jQuery('#dialog_waiting').dialog('close');
        if (! reply) return alert("Error");
        if (reply.error) return alert(reply.error);

        jQuery('#form select[name="clouds"]').val(reply.weather.clouds);
        jQuery('#form select[name="precipitation"]').val(reply.weather.precipitation);
        jQuery('#form input[name="airtemperature"]').val(reply.weather.airtemperature);
        jQuery('#form select[name="beaufort"]').val(reply.weather.beaufort);
        jQuery('#form input[name="tidelevel"]').val(reply.tide.height);
        jQuery('#form select[name="visibility"]').val(reply.weather.visibility);

        // show the attribution & disclaimer
        jQuery('#dialog_wunderground_tideinfo').text('Tide information: ' + reply.tide.site + ' @ ' + reply.tide.time);
        jQuery('#dialog_wunderground').dialog('open');
    }, 'json').fail(function () { // .fail() replaces the deprecated .error()
        jQuery('#dialog_waiting').dialog('close');
        alert("Could not contact the server to load conditions.\nMaybe you have lost data connection?");
    });
}

Notes here:

I like to open a "please wait" dialog, because it can take 2-3 seconds before we get back a response.
The fields returned exactly fit those in the form. It's quite nice.
After populating the fields, I open a jQuery UI Dialog showing an attribution to wunderground, and mentioning where the tide data comes from since it may be several miles away from the actual location.
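
If you want to check the URL-building step without a browser, it can be pulled out into a small function of its own. This is only a sketch: buildConditionsUrl is a made-up name, the digit checks are an extra client-side sanity test that the original function doesn't do, and it assumes the form supplies the date as YYYY-MM-DD and the time as HH:MM.

```javascript
// Hypothetical helper: build the /placeid/date/starttime URL from raw form values.
// Returns null if the date or time doesn't look right after stripping separators.
function buildConditionsUrl(baseUrl, placeId, date, startTime) {
    var cleanDate = String(date).replace(/-/g, '');      // YYYY-MM-DD -> YYYYMMDD
    var cleanTime = String(startTime).replace(/:/g, ''); // HH:MM -> HHMM

    // cheap sanity check only; the server still validates everything itself
    if (!/^\d{8}$/.test(cleanDate) || !/^\d{4}$/.test(cleanTime)) return null;

    return baseUrl + 'ajax/fetch_conditions/' + encodeURIComponent(placeId) +
        '/' + cleanDate + '/' + cleanTime;
}

// made-up example values
var exampleUrl = buildConditionsUrl('/app/', 12, '2013-10-04', '09:30');
```

Here exampleUrl comes out as /app/ajax/fetch_conditions/12/20131004/0930, and a blank date or time gives null so the caller can skip the request.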

Conclusion

Unlike most other weather and tide prediction APIs, Weather Underground keeps historical records, and supplies both tides and weather in one simple API. And with their free tier, they really made my day... and our clients'.

Friday, October 4, 2013

javascript:void(0)

A common need that we come across (like, multiple times daily) is to have a button or hyperlink which opens a dialog or does some other action in JavaScript.

Way back in the late 90s, our technique was this:

<a href="javascript:void(0)" onClick="doWhatever()">Click me</a>

But over the years, this was pointed out as being not entirely a good thing. It's a hyperlink to nothing, it can confuse screen readers (really? who has those, and do they really read out the entire URL?), and it's more semantically correct to do this:

<span class="fakelink" onClick="doWhatever();">Click me</span>

So I adopted this technique. A class called fakelink can be constructed which looks like other hyperlinks (cursor:pointer; text-decoration:underline; color:blue;) and now we don't have these semantically-incorrect null-links, just DIVs and SPANs with event handlers.

But, enter mobile...

As you have probably noticed, mobile devices won't necessarily detect these "hotspots", and they give strong preference to hyperlinks. On my six-month-old telephone running Android 2.1, for example, I have a menu of 3 links:

<li><a href="/postings">Postings</a></li>
<li><span class="fakelink" onClick="openSignupDialog();">Sign Up</span></li>
<li><a href="/catalog">Our Catalog</a></li>

On the desktop, this works great: three links, one of which opens the popup dialog. On mobile, not so much: it's impossible to click the middle link. It seems the phone detects the tap and picks the nearest hotspot, so you end up tapping one of the two outside links. Even without a menu of other hyperlinks, the tap events often "just didn't work".

So, back to the old ways...

<li><a href="/postings">Postings</a></li>
<li><a href="javascript:void(0);" onClick="openSignupDialog();">Sign Up</a></li>
<li><a href="/catalog">Our Catalog</a></li>

The desktop doesn't really care about this sort of semantic violation, and it works on phones. And I'd rather have working links, than win an argument about semantics.