Wednesday, March 13, 2013

CodeIgniter sessions: timeouts, AJAX, and Remember Me

I love CodeIgniter. It's among the three most popular PHP frameworks, though the ordering of those three is a matter of opinion and taste.

Back in February I finally got to research and work around an issue that had been driving me nuts. I have an app which requires login and uses CI sessions. On one particular page, after a while my AJAX calls would fail because I was no longer logged in. Reload the page, and yep, I'm logged out.

Turns out, this is a known issue, going way back:
CodeIgniter regenerates the session ID periodically, to prevent session ID hijacking (session fixation attacks). At some point my session ID changes, and my session effectively expires because the AJAX requests aren't setting the new session ID in my browser.

https://github.com/EllisLab/CodeIgniter/issues/154

This goes beyond AJAX, though. If you want the session to stay open over time, e.g. for a Keep Me Logged In feature, you will see the same problem: when you open the site tomorrow, your session ID is out of date and you're not being kept logged in at all.


Fortunately, the fix for this is not difficult.

Session Expiration & Session ID Regeneration


Open up config.php and look for these two session variables. The sess_expiration setting defines how long a session is good (idle time since the last hit). The sess_time_to_update setting defines how often your session ID is regenerated.

$config['sess_expiration'] = 86400 * 30; // 30 days
$config['sess_time_to_update'] = 300; // 5 minutes
Our problem is the session ID regeneration. We can effectively disable this by making sess_time_to_update very, very long:

// sessions are good for 30 days, and IDs will change every year
$config['sess_time_to_update']    = 86400 * 365;




Problem solved. I only need to log in after 30 days of idle time, and even AJAX-heavy pages no longer log me out unexpectedly.

Consequence: Session Fixation

Disabling session ID regeneration does have a consequence for security, though. Now that your session ID doesn't change, it's hypothetically possible for someone to snag your session ID and use it indefinitely (well, for a year if you set it as above).

This can be mitigated by a few other arrangements:

Shorten the session expiration and regeneration time. Do they really need a session to be good for 30 days? Would 7 days suffice? How about 3 days? The shorter it is, the less time you give a sniffer to try hijacking your session before finding it already expired.

Encrypt the cookies! Set sess_encrypt_cookie to TRUE and set a random encryption_key, so the session cookie's contents can't simply be read off the wire. Do a Google search for "codeigniter encryption key generator" and take your pick.


Enable user agent matching. The sess_match_useragent setting enables a check that each request comes from the same browser as the one which opened the session. User agent strings can be faked, but this does increase the "cost" of brute forcing or hijacking sessions, as the attacker must retry the session key while guessing browsers.

Enable IP address matching. The sess_match_ip setting enables a check that your IP address hasn't changed since your session was last used. This one may not be appropriate if your ISP routes traffic through changing proxy servers (satellite Internet, America Online), and most of us have dynamic IP addresses that change every few days. But if your use case is access from a fixed IP address such as an office intranet, or you can tolerate logging back in after you reset your home router, this can provide a significant layer of security.

Use SSL. This isn't always possible, but when it is, sessions can't be sniffed off the wire in the first place. Combine it with cookie encryption and you have two strong layers against interception.
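Put together, those mitigations amount to a few more lines in config.php. A sketch (the encryption_key value below is a placeholder; generate your own random key, and adjust the lifetimes to taste):

```php
// config.php -- a sketch of the session-hardening settings described above

// Shorter lifetimes: sessions good for 7 days, IDs regenerated daily
$config['sess_expiration']     = 86400 * 7;
$config['sess_time_to_update'] = 86400;

// Encrypt the session cookie so its contents can't simply be read
// (use your own random key here, not this placeholder)
$config['sess_encrypt_cookie'] = TRUE;
$config['encryption_key']      = 'replace-with-your-own-random-key';

// Tie the session to the browser's user agent and (optionally) IP address;
// leave sess_match_ip FALSE if your users' IP addresses change often
$config['sess_match_useragent'] = TRUE;
$config['sess_match_ip']        = FALSE;
```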

Conclusion

This solution to session timeouts works fine, but it opens up a security consideration. Fortunately, with cookie encryption and the other mitigations above, that consideration doesn't need to become a security problem, and your users can Keep Me Logged In without having their sessions hijacked.

Monday, March 11, 2013

MobileMapStarter update: Tile caching and Chrome compatibility

Yesterday I put the finishing touches onto a major revision of MobileMapStarter.

- The previous version used imgcache.js to passively cache map tiles as you browse. The new version uses a new system, allowing you to download a specific area for offline use. No more panning and zooming repeatedly -- just zoom out, center the map, and hit Download. It even has a nice progress readout, and skips over tiles you already have.

- I spent a lot of time getting the caching to work in Chrome too, so you should be able to prototype your app on the desktop.

- There were also some other minor things: typos, minor adjustments, and changing out the Streets basemap for the Mapbox Satellite basemap.

Go git it!
https://github.com/gregallensworth/MobileMapStarter/





Thursday, March 7, 2013

Automating CartoDB and Google Fusion Spreadsheets on GoDaddy

A whole month without a posting. I've been busy on some fun stuff here. Here's a peek.

A client wanted something simple. They have a registration form which logs submissions regarding trees to a Google Fusion Spreadsheet. They want a choropleth map of Census Block Groups (CBGs) indicating how many trees have been registered in each. But it's not that simple: they don't want to pay for geospatial hosting; they want to stick with their $8 GoDaddy hosting.


So, our basic needs are:
  1. Update the Google spreadsheet of trees, populating the lat and lon fields for any which don't have them.
  2. Have a PostGIS table of CBGs someplace for free.
  3. Perform a spatial join between the CBG polygons and the tree points.
  4. Render it as a Leaflet map, using only client-side tech, knowing that the CBGs are too much to be communicated as raw vectors and rendered as a L.Polygon layer.
Our resources are:
  1. The Google Spreadsheet.
  2. A website at GoDaddy which supports PHP, but not many modules.
  3. Plenty of disk space.
  4. A free account at CartoDB.
It all worked out beautifully, using client-side-only mapping, and PHP with no special configurations, suitable for hosting on GoDaddy's low-priced web hosting. No server-side mapping infrastructure!

An annotated version of the app's source is available here:
http://tilelab.greeninfo.org/~gregor/cartodb_google/


Step 1: GoogleFusionTable.php


Step 1 is comparatively easy. Google Docs already has a web API for doing SQL-like queries to spreadsheets, and a PHP library already exists for this.

https://github.com/gregallensworth/CloudServices/blob/master/GoogleFusionTable.php

This part was simple: Google's geocoder service, the GFT PHP library, and a few dozen lines of code. Now we just visit the URL and all non-geocoded registrations have their Lat and Lon fields updated.
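The geocoding pass boils down to something like the sketch below. The Google Geocoding API call and JSON parsing are real; fetch_ungeocoded_rows() and save_latlon() are hypothetical stand-ins for the GoogleFusionTable.php library's own query and update methods.

```php
<?php
// Sketch of the nightly geocoding pass: fetch rows missing lat/lon,
// geocode each address via Google, and save the coordinates back.

function parse_geocoder_reply($json) {
    // Pull lat/lon out of a Google Geocoding API (v3, JSON) reply
    $reply = json_decode($json, TRUE);
    if (!$reply or $reply['status'] != 'OK') return NULL;
    $loc = $reply['results'][0]['geometry']['location'];
    return array($loc['lat'], $loc['lng']);
}

function geocode_address($address) {
    $url = 'https://maps.googleapis.com/maps/api/geocode/json?sensor=false&address=' . urlencode($address);
    return parse_geocoder_reply(file_get_contents($url));
}

// Hypothetical stand-ins; the real methods live in the GFT PHP library
function fetch_ungeocoded_rows() { return array(); }
function save_latlon($rowid, $lat, $lon) { }

foreach (fetch_ungeocoded_rows() as $row) {
    $latlon = geocode_address($row['address']);
    if ($latlon) save_latlon($row['rowid'], $latlon[0], $latlon[1]);
    sleep(1); // be gentle with the geocoder's rate limits
}
```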



Step 2: Uploading Google to CartoDB


We had been itching to try out CartoDB's offerings, and this was a great opportunity.

CartoDB is a website which offers an easy-to-use interface to PostGIS tables. You upload a spreadsheet or a shapefile, they grok it and load it, and you get a nice table UI and a panel for entering SQL. A step further, they have a web API for performing these SQL queries, so you can submit a query via GET or POST and get back a JSON object full of rows.

The heart of accessing it via PHP boils down to this wrapper function I whipped up:
https://github.com/gregallensworth/CloudServices/blob/master/CartoDB.php

And the implementation is just a tie-together of Google's SQL API for Fusion Docs, and CartoDB's SQL API to your table:
https://github.com/gregallensworth/CloudServices/blob/master/google2cartodb.php

Remember: Each query really is a hit to CartoDB's web service. If you have 12,000 rows to insert, you really will make 12,000 hits to CartoDB. Be kind and thoughtful. Use a common key between your two tables, so you don't need to re-insert every single record if it's already there. And consider doing the initial load via CartoDB's dashboard.
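For reference, such a wrapper amounts to little more than a request to CartoDB's SQL endpoint (/api/v2/sql, with q and api_key parameters). A minimal sketch, where 'youraccount' and 'YOURKEY' are placeholders for your own credentials:

```php
<?php
// Minimal sketch of a CartoDB SQL API call: POST a query, get JSON rows back

function cartodb_sql_url($account) {
    return "https://{$account}.cartodb.com/api/v2/sql";
}

function cartodb_query($account, $apikey, $sql) {
    $post    = http_build_query(array('q' => $sql, 'api_key' => $apikey));
    $context = stream_context_create(array('http' => array(
        'method'  => 'POST',
        'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
        'content' => $post,
    )));
    $reply = file_get_contents(cartodb_sql_url($account), FALSE, $context);
    return json_decode($reply, TRUE); // array with 'rows', 'total_rows', etc.
}

// e.g.
// $result = cartodb_query('youraccount', 'YOURKEY', 'SELECT COUNT(*) AS n FROM registrations');
// print $result['rows'][0]['n'];
```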


Step 3: Spatial join via CartoDB's web SQL API


Pretty simple here, now that you've already seen the CartoDB web SQL API.
Just run these two queries:

DELETE FROM census_block_groups_treecounts;

INSERT INTO census_block_groups_treecounts (
    the_geom,
    treecount
)
SELECT
    blocks.the_geom,
    SUM(treeshere) AS treecount
FROM census_block_groups AS blocks
JOIN registrations AS trees
ON ST_Contains(blocks.the_geom, trees.the_geom)
GROUP BY blocks.cartodb_id;


Step 4: Client side map... CartoDB again


Further still, CartoDB created a JavaScript library which bundles Leaflet, their web SQL API, mouse behaviors, CartoCSS parsing, and a dozen other great things. So getting your data back out of CartoDB and into Leaflet requires very little code.

http://developers.cartodb.com/documentation/cartodb-js.html

http://tilelab.greeninfo.org/~gregor/cartodb_google/index.js

Not a lot to say here. The docs are less than rich in real-use-case examples, and CartoDB.js brings together a lot of moving parts. But once you figure it out, it's really something quite nice.

BONUS: CartoDB layers can have their CartoCSS and their SQL changed at runtime. This app doesn't make use of that, but in theory filtering features or changing color schemes has never been easier.


Step 5: Nightly Automation


GoDaddy has a facility for setting up cronjobs, right there in the control panel. At 1am it runs the geocode PHP. At 2am it runs the google2cartodb PHP. At 3am it runs the spatial join. Pretty groovy.


Trouble and Surprises


Not a lot in the way of trouble and surprises.

GoDaddy's PHP has an execution time limit. It seems to be 5 minutes, and ini_set() and set_time_limit() don't seem to change it. As such, the steps really are handled as separate PHP scripts instead of one monolithic script to geocode, upload, and join. This is probably for the best anyway, as it's easier to debug the individual parts this way.

CartoDB's free account limits you to 5 MB of storage. In this client's case, the registrations came to 1 MB even when stripped down to only the Lat & Lon, and they are expecting plenty more registrations. The set of CBGs comes to 2 MB for the SF Bay Area, and that's after I stripped out areas unlikely to get registrations, then simplified and clipped to fit land. The joined layer displayed on the map varies in size with the number of matching CBGs; there's a maximum of 2 MB here too, if some day every single CBG has at least one registered tree. As such, hitting the 5 MB limit seems very likely as registrations accumulate, and they may end up upgrading anyway.

CartoDB's web SQL API cannot create tables, only populate them. I had to upload a spreadsheet of the first few registrations, for example, and had to manually run the spatial join SQL in CartoDB's dashboard to create the joined table. After the table structure was there, a DELETE FROM followed by later INSERTs worked just perfectly.

CartoDB.js involves a lot of parts, and the docs are good but not always great. Though the final map involves fairly little code, figuring out exactly which code took some time. I had to read up on CartoCSS a bit, and had to read a few times to figure out that CartoDB's API won't return the_geom; you must instead query for "ST_AsGeoJSON(the_geom) AS json_geom" to get it, e.g. if you want to highlight features.


Conclusion


A client wanted a geospatial app with some moderate back-end complexity, but without paying for a geo infrastructure. Using a Google-hosted contact "database", CartoDB's free tier plan, and a pinch of PHP to tie them together, it all came together beautifully.

In the meantime, I learned a lot about CartoDB and really admire what they've come up with. The $30 price tier is a bit steep for intermediate use cases: this client may go up to 15 MB of usage, over the free tier but well under the need for the Magellan tier. If they had a smaller paid plan, that would be really excellent.