November 07, 2007

Using someone else's infrastructure: utilising Amazon S3 and EC2

  • starting point: one machine, running the whole stack
  • scaling from 1->n app servers is hard
    – backing up large datasets is hard
    – scaling filesystems to large numbers of clients is hard
    – inter-app communication is hard
    – managing traffic spikes is hard
    – managing load fluctuations is hard
  • Amazon offerings: S3 at 15c/GB; EC2 at 10c per CPU-hour
  • S3: redundant storage, as much as you like. 5GB limit per object (larger files have to be split). APIs are HTTP or BitTorrent
  • S3 buckets are a single-level hierarchy (each bucket must have a unique name). Each bucket contains key-value tuples
  • APIs for most languages
  • EC2: Linux 2.6.16 Xen images. Servers come in small, medium or large; large is 4 dual-core processors, 15GB RAM, 1.6TB storage. Storage is not persistent: when the instance is spun down, the storage is lost
  • ACLs for host/port access
  • command-line tools for stopping/starting instances
  • SQS: reliable messaging service. 256KB message payload. Pay per message (10c/1000 messages). Basic permissions model
  • Uses:
    – backup (s3sync.rb: like rsync)
    – S3 asset host: Use S3 to hold large files rather than serving them locally.
    – S3 authenticated downloads: S3 can make files publicly available for a short period of time (~2 s) if the user presents a specific token
    – Rails: attachment_fu does this for you automatically
    – load fluctuation: start additional servers with cron for busy periods, or even use a monitoring system to detect high load and spin up new machines
    – to make storage persistent, have a slave database which syncs periodically to S3
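
The authenticated-download idea above (a file becomes fetchable for a short window if the request carries a valid token) can be sketched as an HMAC-signed URL with an expiry timestamp. This is only an illustration modelled on the query-string signing scheme S3 documented around the time of these notes; the function name, bucket, key and credentials are all made up, and current S3 uses a newer signature scheme, so don't treat it as a working client.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote, urlencode


def signed_download_url(bucket, key, access_key, secret_key, ttl_seconds, now=None):
    """Build a GET URL that stops being valid ttl_seconds after `now`.

    Hypothetical helper: the layout follows the legacy S3 query-string
    style (AWSAccessKeyId / Expires / Signature), assumed for illustration.
    """
    start = int(now if now is not None else time.time())
    expires = start + ttl_seconds
    path = f"/{bucket}/{quote(key)}"
    # Sign the method, two empty header slots (Content-MD5, Content-Type),
    # the expiry time and the resource path. Changing any of them, including
    # the expiry, invalidates the signature.
    string_to_sign = f"GET\n\n\n{expires}\n{path}"
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(), hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode()
    query = urlencode(
        {"AWSAccessKeyId": access_key, "Expires": expires, "Signature": signature}
    )
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}?{query}"


# Example with placeholder credentials: a link that is valid for 20 seconds.
url = signed_download_url("my-bucket", "videos/big.mov", "AKIDEXAMPLE", "not-a-real-secret", 20)
```

The server holding the same secret recomputes the signature and rejects the request if it does not match or if the expiry time has passed, so the app can hand out links without proxying the file itself.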

    – Problems: dynamic IP addresses make it hard to manage load balancing; can’t specify the DB server very well.
    – Solution: run the DB server in-house and use EC2 only for app servers that provide surplus capacity, with a VPN from your DC to EC2. Latency between your DC and EC2 is still a problem
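
The load-fluctuation approach (a monitoring system notices high load and starts extra app servers, then stops them again when things calm down) boils down to a small sizing decision. A minimal sketch, assuming a single load figure in requests per minute; the function names, the per-server capacity, the headroom and the limits are all invented for illustration, and actually starting or stopping instances is left out.

```python
import math


def desired_app_servers(requests_per_min, capacity_per_server,
                        min_servers=1, max_servers=20, headroom=0.25):
    """How many app servers to run for the observed load.

    `headroom` adds spare capacity so a spike does not immediately saturate
    the pool; `max_servers` caps the bill if the monitoring goes haywire.
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    needed = math.ceil(requests_per_min * (1 + headroom) / capacity_per_server)
    return max(min_servers, min(max_servers, needed))


def scaling_action(current, desired):
    """Translate the target size into a ("start" | "stop" | "hold", count) pair."""
    delta = desired - current
    if delta > 0:
        return ("start", delta)
    if delta < 0:
        return ("stop", -delta)
    return ("hold", 0)


# 1500 requests/minute, each server assumed to handle 500: run 4 (with headroom).
target = desired_app_servers(1500, 500)
print(scaling_action(current=2, desired=target))  # prints ('start', 2)
```

A cron-driven version for known busy periods is the same thing with the load figure replaced by a timetable.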

  • – entirely built with AWS. No datacentre of their own
  • High quality and low cost; occasional problems with latency/integrity and vendor lock-in
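
To put the "low cost" claim in numbers, here is a back-of-envelope calculation using only the prices quoted in these notes. It assumes the S3 figure is per GB stored per month, and it ignores anything the notes don't mention (bandwidth, request charges), so it is a rough lower bound rather than a real bill. The function name is made up.

```python
# Prices in US cents, as quoted in the notes above.
S3_CENTS_PER_GB = 15            # assumed to mean per GB stored per month
EC2_CENTS_PER_CPU_HOUR = 10
SQS_CENTS_PER_1000_MESSAGES = 10


def estimate_monthly_cost(storage_gb, cpu_hours, messages):
    """Return an estimated monthly cost in dollars for S3 + EC2 + SQS."""
    cents = (
        storage_gb * S3_CENTS_PER_GB
        + cpu_hours * EC2_CENTS_PER_CPU_HOUR
        + messages * SQS_CENTS_PER_1000_MESSAGES / 1000
    )
    return round(cents / 100, 2)


# One always-on server (24 * 30 hours), 100GB stored, 50,000 queue messages.
print(estimate_monthly_cost(100, 24 * 30, 50_000))  # prints 92.0
```

Most of that figure is the always-on server; capacity that only runs during busy periods costs proportionally less.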

3 comments

  1. Rami

    You can check out how the Global Hosted Operating System is using AWS at currently provide a free 3GB of online storage.


    08 Nov 2007, 09:34

  2. ben

    The line “run DB server in-house, just use EC2 for app servers” raises a smile.

    I’ve never worked on a high traffic dynamic site where the DB was not the bottleneck so it makes EC2 pretty useless.

    21 Jan 2008, 17:10

  3. Chris May

    “I’ve never worked on a high traffic dynamic site where the DB was not the bottleneck so it makes EC2 pretty useless.”

    Possibly, although I think there may be some cases where that’s not true.

    Picking a few large sites: LiveJournal, Flickr (you’ll have to dig around for that info, or pay to see the “how we built flickr” talk), and Twitter each describe going through a phase where they were bottlenecked on app servers. They’ve all outgrown that phase now, although it’s instructive to note that each of them uses many, many more app server nodes than DB server nodes.

    My own experience lies with much more middle-sized sites in traffic terms (the largest peaking at about 1500 dynamic requests/minute, translating into about 3000-4000 DB transactions/minute), but even here, DB server resource requirements are small compared to the app servers’ needs (typically, the app servers need about 5x more CPU than the DB). Right now I’m in the happy position of having enough server cycles to go round, but if things started to scale, I’d need more app servers a long time before I needed more DBs.

    So whilst it’s technically true that most high-traffic sites will ultimately bottleneck on the DB (in fact, you could probably go one stage further and state that they’ll bottleneck on storage IO), that doesn’t detract from the usefulness of being able to rapidly turn on extra hardware in the app server tier. It’s worth noting that both Twitter did exactly this, using Joyent’s accelerators rather than EC2, to buy time while they optimised their middle tier. iLike, meanwhile, didn’t, and had to race madly around town in a truck buying every server they could get their hands on.

    Of course, as I noted in the original post, the latency issue will kill this approach for a lot of users anyway. However, there are still quite a few use cases that don’t require such a low-latency connection. Two that spring to mind immediately are data consolidation (we have a server which slices, dices and rolls up several million rows of Apache access log data every night, which requires several hours of processing on an otherwise-idle box) and content indexing for searching (similarly, we have a box which spends most of every Sunday generating full-text indexes of 200GB of HTML, and then sits idle for the rest of the week). Those kinds of things could easily be farmed out to “the cloud”, freeing us from the need to run the hardware.

    21 Jan 2008, 19:31
