Monday, December 16, 2013

A PHP proxy to do both GET and POST, and the limitations of JSONP

It's been busy the last few months! Working on a few projects that aren't ready for release yet, so not a lot I'll say about them. But here's something that may come in handy.

The Problem & Solution


This web application needs to fetch data from a remote data source. It's not based on the map per se, just a few dozen checkboxes to filter by this-n-that, search terms, some date filters, all submitted to the server and we get back some JSON. Pretty ordinary.

Problem is: it's on another server, so we have cross-domain issues. The browser successfully makes the GET request, but the response body is blank and the XHR returns an error status.

The common solution is to set up a proxy server. That is, a program hosted on the same site as your web application. You make your GET and POST requests to this program instead of to the remote one on another site; it does the GET or POST to the remote server and returns the content. Because this proxy server is on the website's own domain, your cross-origin problem is solved.

Three simple steps.

#1, create this PHP program and call it proxy.php
<?php
// a proxy to call the remote ReefCheck server
// we were doing well with JSONP until we tried using too many
// checkboxes, generating URLs that were too long for GET,
// and JSONP doesn't happen over POST

$PROXIES = array(
    '/count'   => 'http://other.remote.site.com/query/endpoint.php.php',
    '/search'     => 'http://my.remote.site/some/endpointc.cgi',
);

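// look up the alias (e.g. /search) given after proxy.php in the URL
$alias = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
$url   = isset($PROXIES[$alias]) ? $PROXIES[$alias] : null;
if (!$url) die("Bad proxy indicator: " . htmlspecialchars($alias));

// are we GETting or POSTing? makes a difference...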
if (@$_POST) {
    // compose a cURL request, being sure to capture the output
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_POST, TRUE);
    curl_setopt($curl, CURLOPT_POSTFIELDS,  http_build_query($_POST) );
    curl_setopt($curl, CURLOPT_HEADER, FALSE);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
    $output = curl_exec($curl);
    curl_close($curl);

    print $output;
} else {
    // easy way out: the base URL plus ? plus params
    $url = sprintf("%s?%s", $url, $_SERVER['QUERY_STRING'] );
    readfile($url);
}
#2, Adjust the $PROXIES to match the endpoints you need. Pretty obvious: each entry is an alias and a real URL, such as /search to http://some.site.com/api/v1/search.json

#3, change your JavaScript code to use your new proxy.
// old
$.get('http://some.site.com/api/v1/search.json', params, function(reply) { });
// new, note the /search matching the /search in $PROXIES
$.get('proxy.php/search', params, function(reply) { });

Dead simple. You should be able to make your queries, using exactly the same POST or GET parameters, and get back your reply exactly as if you had made the request to the real server, minus the blank response body and the error status.
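To make the alias lookup from steps #1 and #2 concrete, here is a minimal sketch of just the URL-resolving part as a standalone function. The function name resolve_proxy_url is made up for illustration and is not part of proxy.php as posted.

```php
<?php
// Hypothetical helper (not part of proxy.php above): given the alias table,
// the PATH_INFO alias and the raw query string, work out the upstream URL.
function resolve_proxy_url(array $proxies, $alias, $queryString = '') {
    // unknown alias: refuse, so the proxy can't be pointed at arbitrary sites
    if (!isset($proxies[$alias])) return null;

    // GET case: the base URL plus ? plus params (skipping the ? if there are none)
    if ($queryString === '') return $proxies[$alias];
    return $proxies[$alias] . '?' . $queryString;
}

$PROXIES = array(
    '/search' => 'http://some.site.com/api/v1/search.json',
);

// a request to proxy.php/search?species=Salmon would resolve like this
print resolve_proxy_url($PROXIES, '/search', 'species=Salmon') . "\n";
```

Only aliases listed in $PROXIES resolve to anything, which is what keeps the script from becoming an open proxy for any URL a visitor cares to type.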

Features & Shortcomings

This proxy script supports both GET and POST, which right off is a great start. And it supports multiple endpoints. And the GET stuff is done as a URL, which makes debugging easy (cURL can do GET of course, but I find debugging simpler if it can dump a URL).

It does not support headers in either direction: no enctype going out, and no Content-type headers coming back from the remote source. If you're using jQuery, you'll definitely want to use the dataType parameter (the 4th parameter to $.get and $.post, which forces the interpretation of the response's data type). Personally I consider that 4th param a good practice anyway... And for file uploads, the missing enctype may be relevant, but there are iframe-based fixes for that anyway, such as ajaxForm()
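If the missing Content-type does matter to you, one possible workaround is to ask cURL what the remote server said and repeat it to the browser. This is only a sketch under assumptions: both function names are invented here, it has not been run against the real endpoints, and the application/json fallback is a guess that happens to suit this use case.

```php
<?php
// Hypothetical add-on, not part of proxy.php as posted.

// Build the Content-Type header line to send on to the browser.
// $remoteType is whatever cURL reported; it can be null, false or empty if
// the remote server didn't send one, so fall back to an assumed default.
function content_type_header($remoteType, $fallback = 'application/json') {
    $type = (is_string($remoteType) && $remoteType !== '') ? $remoteType : $fallback;
    return 'Content-Type: ' . $type;
}

// Fetch a URL with cURL and return array(body, content type).
// Pass an array of fields to POST them; pass null for a plain GET.
function fetch_with_type($url, $postFields = null) {
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_HEADER, FALSE);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
    if ($postFields !== null) {
        curl_setopt($curl, CURLOPT_POST, TRUE);
        curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }
    $body = curl_exec($curl);
    $type = curl_getinfo($curl, CURLINFO_CONTENT_TYPE);
    curl_close($curl);
    return array($body, $type);
}

// usage inside the proxy, in place of "print $output":
//   list($output, $type) = fetch_with_type($url, $_POST ? $_POST : null);
//   header(content_type_header($type));
//   print $output;
```

Even with this in place I'd still pass the data type explicitly on the jQuery side, since not every remote server labels its JSON correctly.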

And keep in mind the most basic issue of proxies like this: it's effectively double-transiting traffic: browser calls your server, your server calls the remote API, your server gets the data back, your server spits that data out to the browser. If your server's network throughput is a concern, adding someone else's API to your server's responsibilities may not be a pleasant necessity.

A quick recap: JSONP and the need for this proxy

As we started development of this application, we used JSONP instead. To recap if you're not familiar with JSONP:
You're probably used to returning JSON, which is a representation of a structure such as { "title":"My Stuff", "size":100 }. JSONP takes this a step further and wraps that structure in a function call, forming executable JavaScript, like this: handleResults({ "title":"My Stuff", "size":100 }). The browser loads that URL through a <script> tag, which is not subject to the cross-domain restriction, and the function runs with your data. The name of the function being invoked is defined by the &callback= parameter which you send along with your request, so you can in fact name the function that will be used; &callback=handleResults is simply one more param in your usual GET request.
This does presume that the API endpoint is programmed to handle the &callback= parameter and wrap the JSON output, and that you're willing to specify this one extra parameter in your request. Server-side, this really is simple to implement: you're about to spit out json_encode()'d data anyway, so check for @$_GET['callback'] and change the output slightly if it's there. Client-side, if you use jQuery's $.get function with the jsonp data type, it can create a random function name for you, bind your callback to it, and supply the callback param for you; very little labor here.
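For the server-side half, a minimal sketch of what "change the output slightly" could look like. The function name json_or_jsonp is hypothetical, and the check on the callback name is a precaution added here, not something the remote endpoint is known to do.

```php
<?php
// Hypothetical endpoint helper: return plain JSON, or JSONP if a callback
// name was supplied. The name check is an extra precaution added in this
// sketch so that arbitrary script can't be injected through &callback=
function json_or_jsonp($data, $callback = null) {
    $json = json_encode($data);
    if ($callback === null || $callback === '') return $json;

    // accept only plain function names such as handleResults or jQuery1234_5678
    if (!preg_match('/^[A-Za-z_$][A-Za-z0-9_$]*$/', $callback)) return $json;

    // wrap the structure in a function call, forming executable JavaScript
    return $callback . '(' . $json . ');';
}

$data = array('title' => 'My Stuff', 'size' => 100);
$callback = isset($_GET['callback']) ? $_GET['callback'] : null;
print json_or_jsonp($data, $callback) . "\n";
```

Called with &callback=handleResults this prints handleResults({"title":"My Stuff","size":100}); and without it, just the JSON.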
As long as your endpoint is JSONP-aware, and will accept a &callback= parameter and wrap the content in it, this is a great way to make your browser do the work itself without involving a proxy. Slightly faster transfer times, your server not needing to double-transit traffic, everybody wins...

...until the GET params become too much, and we must use POST.
But JSONP doesn't happen over POST!

You see, the spec changed at one point and we had to include A LOT of checkboxes and other such parameters, and the client's endpoint uses text names instead of primary keys, so it was entirely normal to construct a URL like this:
http://example.com/endpoint.json?counties[]=Santa Rosa&counties[]=Alameda&counties[]=San Mateo&species[]=Catfish&species[]=Salmon&species[]=Goldfish&species[]=Dogfish&species[]=Mulkey's Pollock
The URL params were now too long for GET: remote servers started truncating our queries, hanging up on us, etc. So we must use POST. But JSONP doesn't work over POST: it works by loading a <script> tag, and a script tag can only make GET requests. If you try to use JSONP with POST in jQuery, it will automatically be changed to a GET.
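To get a feel for how quickly such a query string grows, here is a quick sketch in PHP (the same arithmetic applies on the jQuery side). The helper name choose_method and its 2000-character default are assumptions made for illustration; the real cutoff depends on the browsers and servers involved.

```php
<?php
// Hypothetical helper: pick GET or POST depending on how long the encoded
// query string would be. The 2000-character default is an assumed,
// conservative cutoff, not a figure from any spec or from the remote server.
function choose_method(array $params, $maxGetLength = 2000) {
    $query = http_build_query($params);
    return strlen($query) > $maxGetLength ? 'POST' : 'GET';
}

$params = array(
    'counties' => array('Santa Rosa', 'Alameda', 'San Mateo'),
    'species'  => array('Catfish', 'Salmon', 'Goldfish', 'Dogfish', "Mulkey's Pollock"),
);

// every checkbox adds another name=value pair, and URL-encoding the
// brackets, spaces and apostrophes makes each pair longer still
print strlen(http_build_query($params)) . " characters\n";
print choose_method($params) . "\n";
```

Eight checkboxes is still small; with a few dozen of them plus search terms and dates, the string gets long in a hurry.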
As such, as much as I was fond of using JSONP while it lasted... ultimately we had to go for a PHP proxy.

Wednesday, December 4, 2013

PHP: Calculate the centroid of a polygon

A project I've been working on generates point markers onto a map. But the interesting part is how I populate that database of points. It's a series of "drivers" for connecting to ArcGIS REST API, OGC WFS, CartoDB, and so on. More on that as it develops...

A need that came up today was that this particular layer, being served from ArcGIS REST, is polygons and we need points. Normally I would calculate the centroid and use that... but this specific software is being developed on a plain web host: no GEOS or OGR, no PostGIS... just plain ol' PHP.

Some Reading... then Writing


So I did some reading:
http://en.wikipedia.org/wiki/Centroid#Centroid_of_polygon

And wrote this, a pure PHP function for calculating the signed area of a polygon, and then the centroid of the polygon:
https://github.com/gregallensworth/PHP-Geometry/blob/master/Polygon.php
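For readers who just want the gist of the formulas on that Wikipedia page, here is a minimal sketch for a single ring. It is not the code from Polygon.php; the function names and the ring format (an array of array(x, y) vertices, with or without the first vertex repeated at the end) are assumptions made for this example.

```php
<?php
// A sketch of the signed-area and centroid formulas from the Wikipedia page.
// NOT the code from Polygon.php; names and ring format are assumptions.

// make sure the ring ends on the vertex it started with
function close_ring(array $ring) {
    $first = $ring[0];
    $last  = $ring[count($ring) - 1];
    if ($first[0] != $last[0] || $first[1] != $last[1]) $ring[] = $first;
    return $ring;
}

// A = 1/2 * sum( x[i]*y[i+1] - x[i+1]*y[i] )
// positive for counter-clockwise rings, negative for clockwise ones
function ring_signed_area(array $ring) {
    if (count($ring) < 3) return 0.0;
    $ring = close_ring($ring);
    $sum  = 0.0;
    for ($i = 0; $i < count($ring) - 1; $i++) {
        $sum += $ring[$i][0] * $ring[$i+1][1] - $ring[$i+1][0] * $ring[$i][1];
    }
    return $sum / 2.0;
}

// Cx = 1/(6A) * sum( (x[i] + x[i+1]) * cross ), and likewise for Cy
function ring_centroid(array $ring) {
    $area = ring_signed_area($ring);
    if ($area == 0) return null;   // degenerate ring, no centroid to speak of
    $ring = close_ring($ring);
    $cx = 0.0;
    $cy = 0.0;
    for ($i = 0; $i < count($ring) - 1; $i++) {
        $cross = $ring[$i][0] * $ring[$i+1][1] - $ring[$i+1][0] * $ring[$i][1];
        $cx += ($ring[$i][0] + $ring[$i+1][0]) * $cross;
        $cy += ($ring[$i][1] + $ring[$i+1][1]) * $cross;
    }
    return array($cx / (6.0 * $area), $cy / (6.0 * $area));
}

$square = array(array(0,0), array(4,0), array(4,4), array(0,4));
list($x, $y) = ring_centroid($square);
printf("area %.1f, centroid %.1f, %.1f\n", ring_signed_area($square), $x, $y);
```

Because the signed area appears in both the numerator and the denominator, the centroid comes out the same whether the ring is wound clockwise or counter-clockwise.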

Caveats


As described in the file, this is specific to our sorts of use cases: park boundaries, city boundaries, and the like. If the polygon is self-intersecting (that's a no-no) it may be wrong. If it's an L shape it may come up wrong too, since the centroid of a concave shape can land outside the polygon itself. And if it's a multipolygon and those rings overlap, it'll double-count the area, as it's not smart enough to find the intersecting area between two rings and subtract it out (and this incorrect area may affect the centroid placement).
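The double-counting is easy to see with two overlapping squares. This is an illustration only, assuming the per-ring areas are simply added together, which is how the caveat reads; the function names are made up and this is not the code from Polygon.php.

```php
<?php
// Illustration only: shoelace area per ring, summed naively across rings.
function ring_area(array $ring) {
    $n   = count($ring);
    $sum = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $j = ($i + 1) % $n;   // wrap around to the first vertex
        $sum += $ring[$i][0] * $ring[$j][1] - $ring[$j][0] * $ring[$i][1];
    }
    return abs($sum) / 2.0;
}

function naive_multipolygon_area(array $rings) {
    $total = 0.0;
    foreach ($rings as $ring) $total += ring_area($ring);
    return $total;
}

// two 2x2 squares that overlap in a 1x2 strip
$a = array(array(0,0), array(2,0), array(2,2), array(0,2));
$b = array(array(1,0), array(3,0), array(3,2), array(1,2));

// the naive sum is 8, although together the squares only cover 6 units
print naive_multipolygon_area(array($a, $b)) . "\n";
```

Any centroid weighted by those per-ring areas gets pulled toward the overlap accordingly.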

But, it does what we need it to do, and it may work for you. Enjoy!