July 16, 2010

HTML5 History API – an accident waiting to happen?

Writing about web page http://www.w3.org/TR/html5/history.html

HTML5 has a very fancy new feature, which allows pages to manipulate the browser’s history via JavaScript.

This is designed as a replacement for all those horrible hacks that use #fragments at the end of URLs to hold state on pages that use AJAX to load content. However, as currently implemented, it looks like a bit of a disaster for sites with a lot of editors.

window.history.replaceState(data, title [, url ] ) 

allows you to replace the current entry in the history with an arbitrary URL. Flickr use this in their Lightbox interface – if you go to http://www.flickr.com/photos/chrismay/4797065896/in/photostream/lightbox/ in a browser that supports it (Chrome or Safari) then you can see that “next/previous” change the URL bar, but if you look at the network traffic you’ll see that there’s never a request for the new URLs, nor is there any kind of loading event fired.

What this means is that if I can write JavaScript on a webpage, then I can change the apparent URL of that webpage to whatever I want, whether or not that page really exists.

Per the spec, I can change the path of the URL to whatever I like, so long as I keep the same protocol, host and port (in other words, the same origin).
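As a rough illustration of that constraint, here is a minimal sketch of the same-origin test a browser applies to the URL passed to replaceState. The helper name wouldBeAllowed and the example post path are hypothetical, and the check is simplified; a real browser rejects a disallowed URL by throwing an error rather than returning false.

```javascript
// Simplified sketch of the same-origin rule for replaceState's URL argument.
// `wouldBeAllowed` is a hypothetical helper, not a browser API.
function wouldBeAllowed(currentUrl, requestedUrl) {
  const current = new URL(currentUrl);
  // Relative URLs are resolved against the current document's URL.
  const requested = new URL(requestedUrl, current);
  // URL.origin combines protocol, host and port.
  return requested.origin === current.origin;
}

// Hypothetical blog post URL, used only for illustration.
const page = "http://blogs.warwick.ac.uk/someuser/entry/some_post/";

const results = {
  // Same origin, different path: permitted.
  fakeLogin: wouldBeAllowed(page, "/login"),
  // Different host: refused.
  otherHost: wouldBeAllowed(page, "https://websignon.warwick.ac.uk/"),
  // Same host but different protocol: refused.
  otherScheme: wouldBeAllowed(page, "https://blogs.warwick.ac.uk/login"),
};
console.log(results);
```

The point of the sketch is that the rule only guards the origin: any path on the same site, real or not, passes.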

So, if I wanted to, I could quite easily do the following:
  • Publish a TinyURL link to a blog post I’d written
  • Change the URL on that post to “http://blogs.warwick.ac.uk/login”
  • Replace the page’s content with something that looked a bit like Web Sign-on, and said “Sorry, your session has expired. Please re-enter your password”
  • Collect usernames and passwords, and do evil stuff.

Now, smart people like you and me would spot that all logins to WarwickBlogs happen via https://websignon.warwick.ac.uk, and thereby smell a rat. But the vast majority of users wouldn’t pick up on this; an official-looking URL and a familiar-looking page would be enough to catch them.

Or how about this: I make a page on the university website that looks like the official news page, and says “University closed today due to zombie attack”. Then I use history manipulation to change its URL to http://www2.warwick.ac.uk/newsflash, and put out a Twitter message along the lines of “RT @warwickuni: University closed today http://is.gd/uowzombies”. No-one comes to work and the university is in trouble.

It’s clear to me that this API is a potential problem for sites like ours, where we have large numbers of contributors (about 5000 active editors in our case) able to create content that includes javascript. At present, the only defence is a policy-based one; anyone who chose to take advantage of these features would rapidly find themselves with a great deal of free time to re-read our A.U.P. and Computer Misuse regulations*. But if I were in charge of the HTML5 spec, I’d like to add in a few additional constraints:

  • I’d like a way to turn off history-manipulation on a domain-by-domain basis. Some kind of HTTP header would be fine, or alternatively a file under the root a la robots.txt (“features.txt” if you will) that could offer advice to browsers about whether or not they should enable features like this one.

Most egregious of all is that this breaks a very well-established principle of browser behaviour: that you can’t change the URL without reloading the page. A lot of sites have based a lot of security assumptions around this principle, and since Chrome and Safari implement these APIs against any doctype (not just HTML5), those assumptions are all now broken.

Ah well. I look forward to seeing how this plays out on large sites like ours. Who’s going to get caught out first?

* You Have Been Warned. Don’t be silly.

- 5 comments

  1. Mathew Mannion

    Turning off history manipulation via some policy file would be good, but being forced to turn on history manipulation would be even better. Developers who know what they’re doing are forced to consider the risk before turning it on.

    16 Jul 2010, 18:42

  2. Mike Willis

    Could you redefine window.history.replaceState() in the Sitebuilder/Blogs Javascript to something that just returns true? I suppose if you could, then someone could redefine it again if they were sufficiently determined.

    19 Jul 2010, 12:48

  3. Chris May

    Hmm, nice idea. You can, and (at least from my testing) it’s quite hard to redefine it back again, which is good. However, in the general case (i.e. outside of Sitebuilder and Blogs), as Mat points out above, it’s still opt-out rather than opt-in, so not ideal.

    19 Jul 2010, 14:24

  4. Andy

    The fact that someone can write arbitrary JavaScript somewhere on your website is already a threat (if you don’t trust them); consider them sending login cookies somewhere far away. Even creating a fake “login page” will fool most users, whether or not you change the URL with the History API.

    19 Oct 2010, 15:27

  5. Chris May

    You can fix the login-cookie issue by using HttpOnly cookies for session/login information – javascript can’t access these.

    However, you’re absolutely right about how disappointingly easy it is to phish users in general; a login page with an obviously-bogus URL will still catch a reasonable proportion of users. And my predictions of doom have so far not come to pass, so maybe it’s OK.

    19 Oct 2010, 15:56
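The override discussed in comments 2 and 3 might look something like the sketch below. It is an illustration under stated assumptions, not the code actually used in Sitebuilder or Blogs: disableHistoryRewriting and fakeHistory are hypothetical names, fakeHistory stands in for window.history so the sketch runs outside a browser, and covering pushState as well as replaceState is an addition not mentioned in the comments.

```javascript
// Sketch: replace the history-rewriting methods on a history-like object
// with no-ops that cannot be reassigned afterwards.
// In a browser you would pass window.history instead of the stand-in below.
function disableHistoryRewriting(historyObject) {
  for (const name of ["replaceState", "pushState"]) {
    Object.defineProperty(historyObject, name, {
      value: function () { return true; }, // does nothing, just returns true
      writable: false,     // plain assignment can no longer replace it
      configurable: false, // it cannot be redefined or deleted either
    });
  }
}

// Stand-in for window.history that records the URLs it is asked to show.
const calls = [];
const fakeHistory = {
  replaceState(data, title, url) { calls.push(url); },
  pushState(data, title, url) { calls.push(url); },
};

disableHistoryRewriting(fakeHistory);
fakeHistory.replaceState(null, "", "/login"); // ignored by the no-op
console.log(calls.length); // nothing was recorded
```

One caveat on the design: in a real browser this only shadows the methods on the window.history object itself. The originals are defined on History.prototype, so a sufficiently determined script could still reach them there unless those are overridden too, which is why this is a mitigation rather than a guarantee.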
