Secret Plans and Clever Tricks

November 13, 2011

New winter kit.

My new, shorter commute means I can now largely do away with my bike-specific kit, and ride in to work in “normal” clothes. Doubtless a relief for those who don’t have to suffer the sight of my lycra-centric outfits any more, but it has required some thought as to exactly what to replace them with.

So I was very pleased when the nice folk at Berghaus asked me to review some outdoor clothing which is just the ticket for a few miles riding through town on an autumn morning. First up, this:
Ardennes softshell
A splendid softshell jacket. It’s warm, windproof, waterproof-ish (the seams aren’t sealed, so I wouldn’t rely on it for a long time in really foul weather, but it’s stood up to 20 minutes or so of steady drizzle just fine). The cut is slightly more relaxed than a dedicated biking jacket (meaning it looks just fine down the pub), but has enough length in the arms and the back to avoid any annoying cold spots when leant over the handlebars. If I were being really picky, I could note that it doesn’t have pit-zips or any other kind of vents, which means it can get a bit warm if you’re working hard (e.g. when racing roadies on the way into work) – so far the weather’s still too warm to wear it for “proper” long rides. It’s also not very reflective, so you need something high-vis to slip over the top now the nights are dark.
But for riding to work it’s been ideal, and it’s also been great for standing on the sidelines of windswept football pitches watching the kids – which at this time of year is a pretty stern test of any fleece!

Item #2 on the list is a new rucksack – specifically, a Terabyte 25l.
Terabyte 25l
As the name suggests, this is optimised for carrying your laptop to work, rather than your ice axes into the northern Cairngorms or your inner tubes round Glentress. It features an impressively solid-feeling padded sleeve which will comfortably take anything from an iPad to a big-ish laptop and hold it securely, as well as a main compartment big enough for a packed lunch and my running kit, and the usual assortment of well-laid-out pockets and attachments. I particularly like the little strap on the back of the pack for attaching an extra bike light to. It’s comfortable even when loaded up, and plenty stable enough to bike with. Highly recommended.

October 28, 2011

Scala: 3 months in

So, I’ve been using scala in production for a few months now. Is it everything I expected it to be?

First, a little bit of background. The project I’m working on is a medium-sized app, with a Scala back-end and a WPF/C# fat client. The interface layer is a set of classes autogenerated from a rubyish DSL, communicating via a combination of client-initiated JSON-RPC over HTTP, and JSON-over-AMQP messages from server to client. It’s split into about half a dozen projects, each with its own team. (I should point out that I have little involvement in the architecture at this level. It’s a given; I just implement on top of it.)

The server side is probably a few thousand classes in all, but each team only builds 500 or so classes in its own project. It integrates with a bunch of COTS systems and other in-house legacy apps.

In functionality terms, it’s really fairly standard enterprisey stuff. Nothing is that complicated, but there’s a lot of stuff, and a lot of corner-cases and “oh yeah, we have to do it like that because they work differently in Tokyo” logic.

So, Scala. Language: Generally lovely. Lends itself to elegant solutions that are somehow satisfying to write in a way which java just isn’t. Everything else? Not so much. Here are my top-5 peeves:

  • The compiler is so slow. For a long time, the Fast(er) Scala Compiler simply wouldn’t compile our sources. It threw exceptions (no, it didn’t flag errors, it just died) on perfectly legitimate bits of code. This meant using the slow compiler, which meant that if you changed one line of code and wanted to re-run a test, you had to wait while your 8-core i7 churned for 2-3 minutes rebuilding the whole world. In a long-past life, I cut my teeth writing mainframe cobol with batch compilation. This was like that, all over again. TDD? Not so much. Recently the FSC has improved to the point where it can actually compile code, which is nice, but sometimes it gets confused and just forgets to recompile a class, which is hella confusing when you’re trying to work out why your fix doesn’t seem to be having any effect.
    I would gladly exchange a month of writing WPF automation code (that’s a big sacrifice, for those who haven’t endured it) for the ability to have eclipse’s hot-code-replace, unreliable as it is, working with scala.
  • The IDEs blow. Eclipse + ScalaIDE just plain doesn’t work – apart from choking on our maven config, even if you manually build up the project dependencies, it will randomly refuse to compile classes, highlight errors that don’t exist (highlighting errors on lines that contain no code is a personal favourite), and ignore actual errors. IDEA + Scala Plugin is better – it’s now my day-to-day IDE, despite my starting out as a dyed-in-the-wool Eclipse fanboy – but it’s slow and clunky, requires vast amounts of heap, and the code completion varies from arbitrary to capricious.
  • Debugging is a pain. Oh yes,
    data.head.zipWithIndex.map((t) => data.map((row: List[Int]) => row(t._2)).reverse)    

feels great when you write it, but just try stepping through that sucker to find out what’s not working. You end up splitting it up into so many temporary variables and function calls that you might as well have just written a big nested for-loop with a couple of if-elses in it. Sure, you can take them all out again when you’ve finished, but are you really helping the next guy who’s going to have to maintain that line?
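For illustration, here’s that one-liner unrolled into named steps over a hypothetical 2×2 grid; each intermediate gets a name you can actually inspect in a debugger:

```scala
// A tiny stand-in for the real grid
val data = List(List(1, 2), List(3, 4))

// Step 1: pair each element of the first row with its column index
val indexed = data.head.zipWithIndex // List((1,0), (2,1))

// Step 2: for each column index, gather that column and reverse it
val columns = indexed.map { case (_, col) =>
  data.map(row => row(col)).reverse
}
// columns == List(List(3, 1), List(4, 2)): the grid rotated 90 degrees
```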

  • The java interop is a mixed blessing. Yes, it’s nice that you have access to all that great java code that’s out there, but the impedance mismatch is always lurking around causing trouble. Of recent note:
    – Scala private scope is not sufficiently similar to java private scope for hibernate field-level access to work reliably
    – Velocity can’t iterate scala collections, nor access scala maps.
    – Mockito can’t mock a method signature with default parameter values.

The list goes on. None of these are blockers in any sense – once you’ve learned about them, they’re all pretty easy to work around or avoid. But each one represents an hour or so’s worth of “WTF”-ing and hair-pulling, and these are just the ones that are fresh enough to remember. Scala-Java interop is a constant stream of these little papercuts.
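To make one of those papercuts concrete, here’s a sketch (a hypothetical `Pricer` class, not anything from the real app) of why default parameter values confuse Java-side tooling: each default compiles to an extra synthetic method (`price$default$2` and friends) rather than a Java-visible overload.

```scala
class Pricer {
  // Compiles to price(double, double) plus a synthetic
  // price$default$2() method that supplies the 1.0
  def price(amount: Double, fxRate: Double = 1.0): Double = amount * fxRate
}

val p = new Pricer
val domestic = p.price(100.0)       // Scala callers get the default for free
val foreign  = p.price(100.0, 1.25) // Java callers must always pass both
```

Anything Java-side that reflects over the class sees only the two-argument method, with no idea that the synthetic helper exists.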

  • Trivial parallelisation is of minimal value to me. Sure, there are loads of cases where being able to spread a workload out over multiple threads is useful. But, like a great many enterprise apps, this one is largely not CPU-bound, and where it is, it needs to operate on a bunch of persistent objects within the context of a JTA transaction. That means staying on one thread. Additionally, since we’re running on VMWare, we don’t have large numbers of cores to spread onto, so parallelisation wouldn’t buy us more than a factor of 2 or 3 in the best case. In a related vein, immutable classes are all well and good, but JPA is all about mutable entities. There have been a few bits of the design where baked-in immutability has led to a cleaner model, but they’re surprisingly few and far between.
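The JPA-versus-immutability tension in miniature (a hypothetical `Trade`, again not anything from the real app):

```scala
// Idiomatic Scala: an immutable case class whose "mutation" is a copy
case class Trade(id: Long, qty: Int) {
  def amend(newQty: Int): Trade = copy(qty = newQty)
}

// What JPA actually wants: a no-args-constructible bean it can mutate in place
class TradeEntity {
  var id: Long = 0L
  var qty: Int = 0
}

val t = Trade(1L, 100)
val amended = t.amend(150) // t itself is untouched
```

Lovely for reasoning about the domain; not so lovely when your ORM needs to hydrate and mutate entities field-by-field.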

Not that long after I started work on this project, there was a video doing the rounds, showing how Notch, the creator of Minecraft, used Eclipse to create a pretty functional Wolfenstein clone in java in a 48-hour hackathon. Whilst that kind of sustained awesomeness is out of my reach, it illustrates the degree to which the incredibly sophisticated tooling that’s available for Java can compensate for the language’s shortcomings. If you haven’t watched it, I recommend skimming through to get a sense of what it means to have a toolset that’s completely reliable, predictable, and capable of just getting out of the way and letting you create.

If I were starting this project again, I’d swap the elegance and power of the scala language for the flow of java with a good IDE. Even if it did mean having to put up with more FooManagerStrategyFactory classes.

September 07, 2011

5 Operations metrics that Java web developers should care about

Sometimes it’s nice, as a developer, to ignore the world outside and focus in on “I’m going to implement what the specification says I should, and anything else is somebody else’s problem”. And sometimes that’s the right thing to do. But if you really want to make your product better, then real production data can provide valuable insights into whether you’re building the right thing, rather than just building the thing right. Operations groups gather this kind of stuff as part of normal business*, so here are a handful of ops. metrics that I’ve found particularly useful from a Java Web Application point of view. Note that I’m assuming here that you as the developer aren’t actually responsible for running the application in production – rather, you just want the data so you can make better informed decisions about design and functionality.

How should you expose this information? Well, emailing round a weekly summary is definitely the wrong thing to do. The agile concept of “information radiators” applies here: Big Visible Charts showing this kind of data in real time will allow it to seep into the subconscious of everyone who passes by.

Request & Error rates

How many requests per second does your application normally handle? How often do you get a 5-fold burst in traffic? How often do you get a 50-fold burst? How does your application perform when this happens? Knowing this kind of thing allows you to make better decisions about which parts of the application should be optimised, and which don’t need it yet.
Requests per minute graph
Request rates for one node on a medium-scale CMS. Note the variance throughout the day. The 5AM spike is someone’s crazy web crawler spidering more than it ought to.

Error rates – be they counting the number of HTTP 500 responses, or the number of stack traces in your logs, are extremely useful for spotting those edge-cases that you thought should never happen, but actually turn out to be disappointingly common.
Error rates graph
Whoa. What are all those spikes? Better go take a look in the logs…

GC Performance

GC is a very common cause of application slowdowns, but it’s also not unusual to find people blaming GC for problems which are entirely unrelated. Using GC logging, and capturing the results, will allow you to get a feel for what “normal” means in the context of your application, which can help both in identifying GC problems, and also in excluding it from the list of suspects. The most helpful metrics to track, in my experience, are the minimum heap size (which will vary, but should drop down to the same value after each full GC) and the frequency of full GCs (which should be low and constant).
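For the record, capturing those logs on a HotSpot JVM of this vintage is just a matter of startup flags (a sketch; the flag names changed completely in JDK 9+, and the log path here is made up):

```shell
java -verbose:gc \
     -XX:+PrintGCDetails \
     -XX:+PrintGCTimeStamps \
     -Xloggc:/var/log/myapp/gc.log \
     -jar myapp.jar
```

The overhead is negligible, so there’s no good reason not to have this switched on in production all the time.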

weekly GC performance
A healthy-looking application. Full GCs (the big drops) are happening about once per hour at peak, and although there’s a bit of variance in the minimum heap levels, there’s no noticeable upwards trend.

Request duration

Request duration is a fairly standard non-functional requirement for most web applications. What tends not to be measured and tracked quite so well is the variance in request times in production. It’s not much good having an average page-load time of 50 milliseconds if 20% of your requests are taking 10 seconds to load. Facebook credit their focus on minimising variance as a key enabler for their ability to scale to the sizes they have.
Render performance graph
The jitter on this graph gives a reasonable approximation of the variance, though it’s not quite as good as a proper statistical measure. Note that the requests have been split into various different kinds, each of which has different performance characteristics.
smokeping screenshot
Request speed with proper variance overlaid

Worker pool sizes

How many apache processes does your application normally use? How often do you get to within 10% of the apache worker limit? What about tomcat worker threads? Pooled JDBC connections? Asynchronous worker pools? All of these rate-limiting mechanisms need careful observation, or you’re likely to run into hard-to-reproduce performance problems. Simply increasing pool sizes is inefficient, more likely than not will just move the bottleneck somewhere else, and will leave you with less protection against request-flooding denial of service attacks (deliberate or otherwise).

Apache workers graph
Apache worker pool, split by worker state. Remember that enabling HTTP keepalive on your front-end makes a huge difference to client performance, but will require a significantly larger pool of connections (most of which will be idle for most of the time)

If you have large numbers of very long-running requests bound by the network speed to your clients (e.g. large file downloads), consider offloading them to your web tier using mod_x_sendfile or similar. For long-running requests that are bound by server-side performance (giant database queries or complex computations), consider making them asynchronous, and having the client poll periodically for status.

Helpdesk call statistics

The primary role of a helpdesk is to ensure that second and third-tier support isn’t overwhelmed by the volume of calls coming in. And many helpdesks measure performance in terms of the number of calls which agents are able to close without escalation. This can sometimes create a perverse incentive, whereby you release a new, buggy version of your software, users begin calling to complain, but the helpdesk simply issue the same workaround instructions over and over again – after all, that’s what they’re paid for. If only you’d known, you could have rolled back, or rolled out a patch there and then, instead of waiting for the fortnightly call volumes meeting to highlight the issue. If you can get near-real-time statistics for the volume of calls (ideally, just those related to your service) you can look for sudden jumps, and ask yourself what might be causing them.

Helpdesk call volumes

* “But but… my operations team doesn’t have graphs like this!” you say? Well, if only there were some kind of person who could build such a thing… hold on a minute, that’s you, isn’t it? Building the infrastructure to collect and manage this kind of data is pretty much a solved problem nowadays, and ensuring that it’s all in place really should be part and parcel of the overall development effort for any large-scale web application project. Of course, if operations don’t want to have access to this kind of information, then you have a different problem – which is a topic for a whole ’nother blog post some other time.

August 05, 2011

There's no such thing as bad weather…

...only the wrong clothes. Continuing the camping-kit theme, let’s talk about waterproofs. If you’re going camping in the UK, it’s going to rain sooner or later. There are a few things you can do to make this not be a problem:

- a tarp, or gazebo, or event shelter, or other tent-without-a-floor-or-walls, allows you to sit around, cook, and generally be outside without getting rained on or feeling quite as penned-in as sitting inside the tent does. Watch out in high winds though.

- Wet grass will soak trainers and leather boots. Get some wellies, some crocs, or some flip-flops

- An umbrella is pretty handy

- Most importantly, get some decent waterproof clothing. For summer camping, I like a lightweight jacket rather than a big heavy winter gore-tex – drying stuff out in a tent is hard (especially if it’s damp outside), so the less fabric there is to dry, the easier it’ll be. My jacket of choice at the moment is a Montane Atomic DT, but my lovely wife has been testing a The North Face Resolve.
It uses a 2-layer construction with a mesh drop liner, dries fast (and the mesh liner means it feels dry even when slightly damp), breathes well, and is slightly fitted so it doesn’t look too much like a giant sack. Packs down nice and small, and, of course, it keeps the rain out. It’s cut fairly short, so if you’re out in the rain for a long time you’ll either need waterproof trousers or a tolerance for wet legs. For 3-season use, I’d say it’s ideal.

Update: We’ve had a bit longer to test the TNF Resolve jacket, and so far (after 3 months) the signs are still good. It’s stood up to some pretty torrential rain without showing any signs of weakness, and I am informed that the colour is excellent (I wouldn’t know about that, obviously). The DWR coating is still working well, which is always a good sign. The previously-mentioned fitted cut means that it doesn’t flap about when it’s blowy, but it’s not tight or restrictive, even when wearing a fleece underneath. The only shortcoming, which is common to a lot of lightweight jackets, is that the peak on the hood isn’t stiffened at all, so if it’s really windy, it tends to get blown onto your face a bit. This isn’t really an issue unless you’re up in the hills in really foul weather, though, and of course adding a wire to the hood would make the jacket a lot less packable, so for a 3-season jacket it’s a worthwhile trade-off.

August 04, 2011

How to back up your gmail on ubuntu

Quick-and-dirty solution:

1. install python and getmail
2. make a getmail config like this:

[retriever]
type = SimplePOP3SSLRetriever
server = pop.gmail.com
username = your_user_name@gmail.com
password = your_password

[destination]
type = Mboxrd
path = ~/gmail-archive/gmail-backup.mbox

[options]
# print messages about each action (verbose = 2)
# Other options:
# 0 prints only warnings and errors
# 1 prints messages about retrieving and deleting messages only
verbose = 2
message_log = ~/.getmail/gmail.log

3. enable pop on your gmail account
4. add a cronjob like this:

13,33,53 * * * * /usr/bin/getmail -q -r /path/to/your/getmail/config

You’ll end up with a .mbox file which grows and grows over time. Mine is currently at 5GB. I have no idea whether thunderbird or evolution can open such a big file, but mutt can (use “mutt -f ~/gmail-archive/gmail-backup.mbox -R”, unless you really like waiting whilst mutt rewrites the mbox on each save), or it’s easy enough to grep through if you just need to find the text of a specific message. If you needed to break it into chunks, you could always just use split, and accept that you’ll lose a message where the split occurs (or use split, and then patch together the message that gets split in half).

July 15, 2011

Gnome 3 / gnome-shell on Ubuntu Natty

OK, so I’m not the first to do this by a long chalk, but here’s what worked for me:

1: Install the gnome 3 PPAs (c.f. http://nirajsk.wordpress.com/category/gnome-3/) :

sudo add-apt-repository ppa:gnome3-team/gnome3
sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install gnome-shell
sudo apt-get install gnome-shell-extensions-user-theme

For some reason, the default theme (Adwaita) was missing, as was the default font (Cantarell). (Could be because I didn’t install gnome-themes and gnome-themes-extra – see here.) You can download Adwaita from here. Get Cantarell from here, copy it to /usr/share/fonts/ttf, and run sudo fc-cache -rv to update.
I wasn’t too keen on the fat titlebars in adwaita, so I used this to shrink them down:

sed -i "/title_vertical_pad/s/value=\"[0-9]\{1,2\}\"/value=\"0\"/g" /usr/share/themes/Adwaita/metacity-1/metacity-theme-3.xml

You can set the theme and the font using gnome-tweak-tool (aptitude install it if it didn’t arrive with gnome-shell). I’m still looking for a nice icon set; the ubuntu unity ones are a bit orange, and the gnome ones are an uninspiring shade of khaki. For now I’ve settled on the KDE-inspired Oxygen (aptitude install oxygen-icon-theme) which is OK, but still doesn’t quite look right.

There’s an adwaita theme for chrome which is nice, and makes everything a bit more consistent.

The London Smoke gnome-shell theme is really nice

Switching from pidgin to empathy gets you nice, clickable message notifications, although I’d rather they were at the top of the screen than the bottom.

Aero-style snap-to-window-edge works fine, except that for nvidia graphics cards with TwinView, you can’t snap to the edges that are between two monitors. Right-click context menus look weird in a way that I can’t quite put my finger on, but they function as expected.

Other than that, it pretty much just works. The only glitch I’ve yet to work out is that spawning a new gnome-terminal window freezes the whole UI for a second or so. Not got to the bottom of why that might be yet; if I find it I’ll post something. In gnome 2 there was a similar problem with the nvidia drivers, which could be “solved” by disabling compiz, but that’s not an option here. Update: it seems to be connected to using a semi-transparent background for the terminal; if I use a solid background the problem goes away.

Things I like better than Unity, after a week of playing:
– Clickable notifications from empathy. The auto-hiding notification bar at the bottom of the screen is great, although I’ve found it can sometimes be a bit reluctant to come out of hiding when you mouse over it.
– alt-tab that shows you the window title (good when you’ve got 20 terminals open). I like the ability to tab through multiple instances of the same app in order, too. To get alt-shift-tab to work, I used these instructions
– menus on the application. Fitts’ law can go hang; I like my menus to be connected to the window they control.

July 06, 2011

Camping time!

It’s summer time, and that means it’s camping season again. Camping is ace, and it’s doubly ace if you have kids. Almost without exception, kids love weaselling about in the countryside, so once you’ve got them onto the campsite, they’ll amuse themselves and you can get on with the serious business of drinking beer, talking rubbish and playing with fires. What could be finer?

There’s a curious kind of Moore’s Law at the moment, as far as tents are concerned.
Every year, technology trickles down from the top-end, so low and mid-range tents get better and better. My first tent, long long ago, was pretty much bottom-of-the-range and cost about £50 (which was a lot of money for a fourteen-year-old in nineteen-eighty-something). It weighed approximately a tonne, leaked like a sieve, and stood me in good stead for a few years’ worth of adventures. Now, for half of that price you can get one of these pop-up tents:

You don’t so much “pitch” it as just take it out of the bag and stand back. It sleeps two close friends, or one with too much gear, it’s pretty waterproof, and if you peg out the guy-ropes it’s surprisingly sturdy in the wind.

Downsides? It doesn’t pack down small, and I’m not sure it would be my first choice for a week in the Lake District in November – in cold, wet conditions the single-skin design means you’ll end up damp from condensation on the inside of the tent – but for summer weekend trips it’s brilliant; fun, easy, and it costs roughly 1/25th as much as an Ultra Quasar (though if you do have £600 to spare, I can highly recommend one of those as an alternative!).

ProTip: If you want to extend the range of weather you can use this in, get a tarp and pitch it over the top of the tent, overhanging the front by a meter or two. You get an extra layer of waterproofing, and a porch so you don’t have to bring your soggy boots into the tent.

One-liner of the day: quick-and-dirty network summary

If you’ve got a solaris box with a load of zones on, you might sometimes have an issue whereby you can see that the box is pushing a lot of traffic over the network, but you’re unsure which zone(s) are responsible. Here’s a super-hacky way to get an overview:

 snoop -c 1000 | awk '{print $1}' | sort | uniq -c | sort -n

Basically: catch the first 1000 packets, find the source of each one (assuming most of your traffic is outbound; if it’s inbound then print $3), and then count the distinct hosts (i.e. your zone names) and list them.

If you have a slow nameserver you may want to add “-r” to the snoop command and then map IPs to names afterwards.

June 26, 2011

Scala: fun with Lists

So, I’m slowly working my way through some of the programming exercises at Project Euler as a means of learning scala (and hopefully a little bit of FP at the same time).
Problem #11 is my most recent success, and it’s highlighted a number of interesting points. In brief, the problem provides you with a 20×20 square of 2-digit integers, and asks for the largest product of any 4 adjacent numbers in any direction (up, down, left, right, diagonally). Kind of like a word-search, but with extra maths.

For no better reason than the afore-mentioned desire to get a bit more acquainted with a functional style of programming, I chose to implement it with no mutable state. I’m not sure if that was really sensible or not. It led to a solution that’s almost certainly less efficient, but potentially simpler than some of the other solutions I’ve seen.

So, step one: Given that we’re going to represent the grid as a List[List[Int]], let’s see if we can get the set of 4-element runs going horizontally left-to-right:

    def toQuadList() = {
      def lrQuadsFromLine(line: List[Int], accumulator: List[List[Int]]): List[List[Int]] = {
        if (line.length < 4)
          accumulator
        else
          lrQuadsFromLine(line.tail, accumulator :+ line.slice(0, 4))
      }
      data.flatMap(line => lrQuadsFromLine(line, List.empty))
    }

(“data” is the List[List[Int]] representing the grid). Here we define a recursive function that takes a List[Int], obtains the first 4-number run (line.slice), adds it to an accumulator, and then pops the first item from the list and calls itself with the remainder.
Then we just call flatMap() on the original List-of-Lists to obtain all such quadruples. We could call map(), but that would give us a list of lists of quadruples – flatMap mushes it down into one list.
It would be nice to find a way to make the if/else go away, but other than just transliterating it into a match, I can’t see a way to do it.
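The map-versus-flatMap difference in miniature:

```scala
val nested = List(List(1, 2), List(3, 4))

// map preserves the nesting: a list of lists of results
val mapped = nested.map(line => line.map(_ * 10))
// List(List(10, 20), List(30, 40))

// flatMap concatenates the inner results into a single flat list
val flat = nested.flatMap(line => line.map(_ * 10))
// List(10, 20, 30, 40)
```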

Given that list, finding the largest product is trivial.

   def findMaxProduct = data.map((x: List[Int]) => x.reduce(_ * _)).max

- take each Quadruple in turn and transform it into a single number by multiplying each element with the next. Then call .max to find the largest.

So now we can do one of the 6 directions. However, a bit of thought at this point can save us some time: the Right-to-Left result is guaranteed to be the same as the left-to-right, since multiplication doesn’t care about order (the set of quadruples will be the same in each direction). Similarly, the downwards result will be the same as the upwards one.

Next time-saver: Calculating the downwards result is the same as rotating the grid by 90 degrees and then calculating the left-to-right result. So we just need a rotate function, and we get the vertical results for free:

   def rotate90Deg() = {
      data.head.zipWithIndex.map((t) => data.map((row: List[Int]) => row(t._2)).reverse)
   }

Doing this without mutable state took me some pondering, but the solution is reasonably concise. Doing zipWithIndex on an arbitrary row of the grid (I used .head because it’s easy to access) gives us each element paired with its index number, so we can now build up a List containing the _n_th element from each of the original lists. (The outer map() iterates over the columns in the grid, the inner one over the rows.)

So now we have the horizontal and vertical totals, we need to do the diagonals. It would be nice if we could once again re-use the left-to-right quadruple-finding code, so we need to get the grid into a form where going left to right gets the sequences that would previously have been diagonals. We can do this by offsetting subsequent rows by one each time, then rotating; like this:

    1 2 3           ? ? ? 1 2 3
    4 5 6    →      ? ? 4 5 6 ?
    7 8 9           ? 7 8 9 ? ?

...and then rotated 90 degrees:

    ? ? ?
    7 ? ?
    8 4 ?
    9 5 1
    ? 6 2
    ? ? 3
You can see that the horizontal sequences are now the original diagonals. Initially, I started using Option[Int] for the items in the grid, so I could use None for the “question marks”. However, after more time than it ought to have taken me, I realised that using zero for those would work perfectly, as any zero in a quadruple will immediately set that quad’s product to zero, thus excluding it from our calculation (which is what we want).
The function to do the offsetting is still rather complex, but not too bad (it owes a lot to the recursive toQuadList function above):

    def diagonalize() = {
      def shiftOneRow(rowsRemaining: List[List[Int]], lPad: List[Int], rPad: List[Int], rowsDone: List[List[Int]]): List[List[Int]] = {
        rowsRemaining match {
          case Nil => rowsDone
          case _ => {
            val newRow: List[Int] = lPad ::: rowsRemaining.head ::: rPad
            shiftOneRow(rowsRemaining.tail,
              lPad.tail,
              0 :: rPad,
              rowsDone ::: List(newRow))
          }
        }
      }
      shiftOneRow(data, List.fill(data.size)(0), List.empty, List.empty)
    }

We define a recursive function that takes a list of rows yet to be padded, a list of zeros to pad the left-hand-side with, another for the right-hand-side, and an accumulator of rows already padded. As we loop through, we remove from the remaining rows, add to the done rows, remove from the left-padding, and add to the right padding. It kind of feels like there ought to be a neater way to do this, but I’m not quite sure what yet.
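On a hypothetical 2×2 grid, the shift-then-rotate dance looks like this (the padded rows are written out by hand here, rather than by calling diagonalize):

```scala
val data = List(List(1, 2),
                List(3, 4))

// What the shifting produces: each row padded with zeros, offset by one
val shifted = List(List(0, 0, 1, 2),
                   List(0, 3, 4, 0))

// Rotating 90 degrees turns columns into rows...
val rotated = shifted.head.zipWithIndex.map { case (_, col) =>
  shifted.map(row => row(col)).reverse
}
// ...and the diagonal 1,4 of the original now sits in a single row:
// List(List(0, 0), List(3, 0), List(4, 1), List(0, 2))
```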

To get the “other” diagonals, just flip the grid left-to-right before diagonalizing it.

Once that’s done, all that remains is to glue it together. Because I’ve been thinking in OO for so long, I chose to wrap this behaviour in a couple of classes; one for the list-of-quadruples, and one for the grid itself. Then I can finish the job with:

      List(new Grid(data).toQuadList.findMaxProduct,
           new Grid(data).rotate90Deg.toQuadList.findMaxProduct,
           new Grid(data).diagonalize.rotate90Deg.toQuadList.findMaxProduct,
           new Grid(data).flipLR.diagonalize.rotate90Deg.toQuadList.findMaxProduct).max

June 20, 2011

The one where I fail to go on any exciting expeditions

Writing about web page http://www.gooutdoors.co.uk/thermarest-prolite-small-p147594

So, in the spirit of previous product reviews, this should have been an entry that described various recent feats of derring-do, and casually slipped in a plug for the latest bit of camping equipment that I’ve been sent to review. Unfortunately, the last few weeks have been characterised by some especially foul weather. Last week’s planned camping trip to do the Torq rough ride was abandoned, in favour of an early morning drive down on Sunday, followed by three-and-a-half hours of squelching round the “short” route in driving rain.
Then tonight’s Second Annual Summer-Solstice-Mountain-Bike-Bivi-Trip was abandoned when it became apparent that the chances of a scenic sunrise were pretty much zero, whereas the chance of a night in a plastic bag in the rain on a hill were looking close to 100%.

All of which means that my shiny new toy has so far not seen much action outside of the back garden. But it seems churlish not to write something about it; so here goes. It’s a sleeping mat, specifically a Thermarest Pro-Lite. Mats might seem like a pretty mundane item, but if you’re trying to travel light, either running or on a bike, then they’re a tricky thing to get right. Too minimalist and you don’t get any sleep, and then you feel like crap the next morning. Too heavy, and they start to make a serious contribution to your pack weight, which matters a lot if you’re having to run up a hill or ride down one.
For a long time, my preferred option was a cut-down foam karrimat, which was a reasonable compromise, but suffered from being a bit on the bulky side, and also not terribly warm. I have an old thermarest as well, which is fabulously comfy – great for winter camping, but far too bulky for fast & light summer trips.

There will be some pics here, when it stops raining for long enough… for now, here’s a stock photo…

So, the pro-lite: Point 1: I got the small one; it’s very small. (note the artful perspective on the photo!) If you want something that will reach all the way down to your toes (or even your knees) this isn’t it; buy the large size. I don’t mind this, though; in the summertime I’m happy with something that just reaches from head-to-thigh. Equally, it’s quite slim. My shoulders are fairly scrawny, and this just-about reaches from one to the other. If you’ve been spending longer in the gym (or the cake shop) than on the track or the turbo, then you might want a bigger size.

Point 2: It is just unbelievably compact. Really. It rolls up into something about the size of a 750ml drink bottle. Foam karrimats can’t come near to this. This makes more of a difference than you might think, because it means you can get away with a pack that’s 5L or so smaller (and therefore lighter), and still fit the mat inside (keeping your mat inside your pack is a winning plan, because you can keep it dry). It’s also great if you’re backpacking on a bike, because the smaller your pack, the less it affects the bike’s handling.

Point 3: It’s as light as a foam mat. Unlike my old thermarest, there’s not a lot inside this mat, so it’s super-lightweight. Mine weighed in at a smidge over 300g according to the kitchen scales.

Point 4: Back-garden tests strongly suggest that it’s a lot more comfy than my old foam mat. I’ll report back once it’s stopped raining long enough to try it out for real!

Update #1 : Still not had the chance to take this backpacking, but a car-camping trip confirms that it’s very comfortable indeed – just as good as my old thermarest, though in a big tent you have to be careful not to roll off it in the night!

June 16, 2011

Partial and Curried Functions

On with the scala show! Partially-applied functions and curried functions took me a while to get my head around, but really they’re quite simple, and partials at least are super-useful.

Partially-applied functions are just functions where you pre-bind one of the parameters. e.g.

scala> val message:(String,String,String)=>String = "Dear " + _ +", " + _ + " from " + _
message: (String, String, String) => String = <function3>

scala> message("Alice","hello","Bob")
res24: String = Dear Alice, hello from Bob

scala> val helloMessage=message(_:String,"hello",_:String)
helloMessage: (String, String) => String = <function2>

scala> helloMessage("Alice","Bob")
res25: String = Dear Alice, hello from Bob

scala> val aliceToBobMessage=message("Alice",_:String,"Bob")
aliceToBobMessage: (String) => String = <function1>

scala> aliceToBobMessage("greetings")
res27: String = Dear Alice, greetings from Bob

What’s happening here is reasonably self-explanatory. We create a function “message” which takes 3 parameters, then we create two more functions, “helloMessage” and “aliceToBobMessage”, which are just aliases to the “message” function with some of the parameters pre-filled.

Since functions are first-class objects that you can pass around just like anything else, this means you can do stuff like

scala> val times:(Int,Int)=>Int = _*_
times: (Int, Int) => Int = <function2>

scala> val times2=times(2,_:Int)
times2: (Int) => Int = <function1>

scala> (1 to 10) map times2
res38: scala.collection.immutable.IndexedSeq[Int] = Vector(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)

Currying is, at first sight, just a different (more complicated) way to achieve the same thing:

scala> val message:(String)=>(String)=>(String)=>String = (to:String)=>(message:String)=>(from:String)=>{ "Dear " + to +", " + message + " from " + from}
message: (String) => (String) => (String) => String = <function1>

scala> message("Alice")("Hello")("Bob")
res28: String = Dear Alice, Hello from Bob

scala> val aliceToBobMessage=message("Alice")(_:String)("Bob")
aliceToBobMessage: (String) => String = <function1>

scala> aliceToBobMessage("greetings")
res29: String = Dear Alice, greetings from Bob

What that giant line of verbiage at the start is doing is creating a function which takes one string and returns a function that takes another string, which returns a function that takes another string, which returns a string. Phew.
This mapping from a function that takes n parameters, to a chain of n functions that each take 1 parameter, is known as “currying” (after Haskell Curry). Why on earth would you want to do this, rather than the much simpler partial application example above?
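Incidentally, you don’t have to write that chain of nested function literals by hand; as far as I can tell, every FunctionN in the standard library has a curried method that derives the chained form for you:

```scala
val message = (to: String, msg: String, from: String) =>
  "Dear " + to + ", " + msg + " from " + from

// curried turns (String, String, String) => String
// into String => String => String => String
val curriedMessage = message.curried

curriedMessage("Alice")("hello")("Bob")  // Dear Alice, hello from Bob
```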
Several of the core scala API methods use curried functions – for example foldLeft in the collections classes

def foldLeft[B](z: B)(op: (B, A) => B): B = 
        foldl(0, length, z, op)

- here we’re exposing a curried function, and then proxying on to a non-curried implementation. Why do it like this?

It turns out that curried functions have an advantage (see update below) that partial ones don’t. If I curry a function, and then bind some variables to form what is effectively the same as a partial function, I can also modify the type of the remaining unbound parameters. Example required!

scala> val listMunger:(String)=>(List[Any])=>String = (prefix:String)=>(contents:List[Any])=>prefix + (contents.reduce(_.toString +_.toString))
listMunger: (String) => (List[Any]) => String = <function1>

scala> val ints = List(1,2,3)
ints: List[Int] = List(1, 2, 3)

scala> listMunger("ints")(ints)
res31: String = ints123

scala> val strings = List("a","b","c")
strings: List[java.lang.String] = List(a, b, c)

scala> listMunger("strings")(strings)
res32: String = stringsabc

scala> val stringListMunger=listMunger("strings")(_:List[String])
stringListMunger: (List[String]) => String = <function1>

scala> stringListMunger(strings)
res33: String = stringsabc

scala> stringListMunger(ints)
<console>:11: error: type mismatch;
 found   : List[Int]
 required: List[String]

That last error is the important bit. We’ve specialized the parameter type for the unbound parameter in stringListMunger so it won’t accept a list of anything other than Strings. Note that you can’t arbitrarily re-assign the type; it has to be a subtype of the original (otherwise the implementation might fail).
OK; so now all I have to do is think of a real-world example where this would be useful…

Update: Gah, I was wrong. You can do exactly the same type-specialization with a partial:

scala> val x:(Int,List[Any])=>Int = (_,_)=>1
x: (Int, List[Any]) => Int = <function2>

scala> val y:(List[Int])=>Int = x(1,_)
y: (List[Int]) => Int = <function1>

So now I still have no idea when you’d want to curry a function, rather than just leaving it with multiple arguments and partially applying when required. This blog entry suggests that it really exists to support languages like OCaml or Haskell that only allow one parameter per function – so maybe it’s only in scala to allow people to use that style if they like it. But then what’s it doing in the APIs?
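For what it’s worth, one concrete difference I’ve since noticed (my own observation, so treat it as a sketch rather than gospel): scala infers types one parameter list at a time, so with multiple parameter lists the types in a function-literal argument can be deduced from the arguments supplied in earlier lists, where a single-parameter-list version can’t manage it:

```scala
// Two home-made folds, one with multiple parameter lists, one without
def curriedFold[A, B](xs: List[A])(z: B)(op: (B, A) => B): B =
  xs.foldLeft(z)(op)

def uncurriedFold[A, B](xs: List[A], z: B, op: (B, A) => B): B =
  xs.foldLeft(z)(op)

// B is pinned to Int by the (0) parameter list before the compiler
// reaches the lambda, so the underscore shorthand just works:
curriedFold(List(1, 2, 3))(0)(_ + _)  // 6

// With a single parameter list the compiler can't use z to type the
// lambda, so the parameter types have to be spelled out by hand:
uncurriedFold(List(1, 2, 3), 0, (acc: Int, x: Int) => acc + x)  // 6
```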

June 11, 2011

Further Scala: Implicit conversions

My attempts to learn scala continue…

Implicit conversions are a cool, if slightly unsettling (from a Java programmer’s POV), scala feature. If I have an instance of one class, and I try and call a method on it which is defined in a different class, then if there’s an “implicit” method in scope which will convert between the two, scala will silently use it.

scala> var x = 12
x: Int = 12

scala> x.substring(1,2)
<console>:9: error: value substring is not a member of Int

scala> implicit def foo(i:Int):String={i.toString}
foo: (i: Int)String

scala> 12.substring(1,2)
res10: java.lang.String = 2


This lends itself to a very, very useful trick: the ability to enhance classes with additional methods. Say you had a java Map class, and you wanted the ability to merge it with another Map according to some sort of merge function on the values. You’d probably do it like this:

class MergeableMap implements Map {
    private final Map delegate;

    public MergeableMap(Map delegate) {
        this.delegate = delegate;
    }

    public Map merge(Map otherMap, ValueMergingFunction mergeFunction) {
        // ... merge implementation here ...
    }

    // ... delegate implementations of all Map methods here ...
}

Trouble is, (a) writing all the delegate methods is tedious, and (b) every time you want to use it, you’ve got to do

MergeableMap m = new MergeableMap(myMapVariable)

Implicits in scala make this a lot easier:

class MergeableMap[A, B](self: Map[A, B]) {
  def merge(other: Map[A, B], merger: (B, B) => B): Map[A, B] = {
    // ... implementation here ...
  }
}

implicit def map2mergeableMap[A,B](m:Map[A,B]):MergeableMap[A,B] = new MergeableMap(m)

myMap.merge(myOtherMap, myMergeFunction)

there’s no need to implement the other delegate methods, since we can just call them on the original Map class – but when we call merge() compiler-based voodoo works out that we want a mergeable map, and swaps it in for us. Magical.
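In case the elided bits above leave too much to the imagination, here’s a minimal end-to-end version that works in the REPL. The merge semantics (combine the two values where a key appears in both maps, keep whichever value exists otherwise) are my own invention for illustration, not a library API:

```scala
class MergeableMap[A, B](self: Map[A, B]) {
  // Where a key appears in both maps, combine the two values with
  // `merger`; otherwise keep whichever value exists.
  def merge(other: Map[A, B], merger: (B, B) => B): Map[A, B] =
    (self.keySet ++ other.keySet).map { k =>
      val v = (self.get(k), other.get(k)) match {
        case (Some(a), Some(b)) => merger(a, b)
        case (Some(a), None)    => a
        case (None, b)          => b.get // key can only have come from `other`
      }
      k -> v
    }.toMap
}

implicit def map2mergeableMap[A, B](m: Map[A, B]): MergeableMap[A, B] =
  new MergeableMap(m)

Map("a" -> 1, "b" -> 2).merge(Map("b" -> 10, "c" -> 3), _ + _)
// Map(a -> 1, b -> 12, c -> 3)
```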

June 09, 2011

Long Ranges in Scala

So, following on from yesterday’s prime-factoring annoyance; a more basic requirement: How to iterate from 1 to ~10^17? (i.e. close to the limit of Long)

A Range won’t cut it…

scala> val big_long=12345678901234567L
big_long: Long = 12345678901234567

scala> (1L to big_long)
java.lang.IllegalArgumentException: 1 to 12345678901234567 by 1: seqs cannot contain more than Int.MaxValue elements.

How about a BigIntRange?

scala> (BigInt(1) to BigInt(big_long))
java.lang.IllegalArgumentException: 1 to 12345678901234567 by 1: seqs cannot contain more than Int.MaxValue elements.
    at scala.collection.immutable.NumericRange$.count(NumericRange.scala:229)

OK, how about a stream?

scala>  lazy val naturals: Stream[Long] = Stream.cons(1, naturals.map(_ + 1))
naturals: Stream[Long] = <lazy>
scala>  naturals.takeWhile(_<big_long).find(_ == -1L)
Exception in thread "Poller SunPKCS11-Darwin" java.lang.OutOfMemoryError: Java heap space

hmm… running out of options a bit now. Maybe if I could construct a Stream without using the map() call (since I suspect that’s what’s eating up the heap), or use for-comprehension with an efficient generator?
Or…hang on…

scala> var i=0L
i: Long = 0
scala> while (i < big_long){i = i+1L}
{...time passes...exceptions are not thrown...}

So, it turns out that “the java way” works. Which, I guess, is the benefit of a mixed language; you can always fall back to the tried-and-tested solutions if the clever bits fail. And of course, you can hide the clunkiness pretty well:

object LongIter extends Iterator[Long]{
  var l:Long = 0
  def hasNext:Boolean = true
  def next:Long = { l = l + 1; l }
}
object LongIterable extends Iterable[Long]{
  val iter = LongIter
  def iterator = iter
}

//ugliness ends, now we can pretend that LongIterable was there all along...

   LongIterable.find(_ == 12345678901234567L)

I suspect that this doesn’t really conform to the iterator/iterable contract (obviously, a hasNext that always returns true is cheating), but it does appear to work tolerably well. Well enough, in fact, that I’m surprised I’ve not yet found a more idiomatic syntax for creating an iterator whose next() function is some arbitrary function on the previous value.

...reads source of Iterator ...

  Iterator.iterate(0L)(_+1L).find(_ == 123456701234567L)

phew. Finally.

June 08, 2011

Learning Scala, a step at a time

So, I have decided to make a more serious effort to learn scala. It fits almost all of my ‘ideal language’ features; it’s object-oriented and it’s statically typed (like java), it’s reasonably terse (unlike java) and it has good support for a more functional style of programming, without forcing you to use it all the time. It also runs on the JVM, which is a good thing if you care about observability, manageability, and all the other stuff that ops people care about.

The downsides, as far as I’ve found so far, are that (a) the tooling is nothing like as good as it is for java. The Eclipse Scala plugin appears to be the best of the current bunch, but even there, support for things like refactoring (which is kind of the point of a statically-typed language) is pretty sketchy. And (b) it’s a bit of an unknown quantity in production. Yes, there are some big, high-scale users (Twitter spring to mind), but they’re the kind that can afford to have the best engineers in the world. They could make anything scale. Maybe scala will scale just as well in a more “average” operational environment, but there doesn’t seem to be a huge amount of evidence one way or another at the moment, which may make it a hard sell to a more conservative operations group.

On with the show. The best resources I’ve found so far for learning the language are:

- Daniel Spiewak’s series Scala for Java Refugees. Absolutely brilliant series of articles that do exactly what they say on the tin – put scala firmly into the domain of existing java programmers. Which brings me to….

- Tony Morris’ Scala Exercises for beginners. I had a hard time with these, not so much because they were difficult, but because coming from a non-comp-sci background, I found it hard to motivate myself to complete exercises as dry as “implement add(x: Int, y: Int) without using the ’+’ operator”. I also found the slightly evangelical tone of some of the other blog entries (TL;DR: “All you java proles are dumb”) a bit annoying. But working through the exercises undoubtedly improved my understanding of Scala’s support for FP, so it was worth it. On the Intermediate exercises, though, I don’t even know where to start.

- Akka – an actor framework for scala and java, so it’s not really a scala resource per se, but it does such a great job of explaining actor-based concurrency that it’s worth reading through and playing with, because you’ll come away with a much clearer idea of how to use either Akka or Scala’s own Actors.

- Project Euler – A series of maths-based programming challenges – find the prime factors of a number, sum the even Fibonacci numbers less than a million, and so on. Being maths-ish, they’re a good match for a functional style of programming, so I find it helpful to start by just hacking together a java-ish solution, and then seeing how much smaller and more elegant I can make it. It’s a great way to explore both the syntax, and the Collections APIs, which are at the heart of most of scala’s FP support. It’s also a nice way to learn ScalaTest (the unit testing framework for scala), since my hack-and-refactor approach is reliant on having a reasonable set of tests to catch my misunderstandings.
It’s also exposed some of my concerns about the operational readiness of scala. I came up with what I thought was a fairly neat implementation of a prime-factoring function; obtain the lowest factor of a number (ipso facto prime), divide the original by that number, and repeat until you can’t factor any further:

lazy val naturals: Stream[Long] = Stream.cons(1, naturals.map(_ + 1))

def highestFactor(compound: Long): Long = {
  naturals.drop(1).takeWhile(_ < compound / 2).find(
    compound % _ == 0).map(
      factor => highestFactor(compound / factor)).getOrElse(compound)
}

– which works beautifully for int-sized numbers, but give it a prime towards the upper bounds of a Long and it runs out of heap space in the initial takeWhile (before it’s even had a chance to start recursively calling itself). It seems that Stream.takeWhile doesn’t permit the taken elements to be garbage-collected, which is counter-intuitive. I wouldn’t be at all surprised to find that I’m Doing It Wrong, but then again, shouldn’t the language be trying hard to stop me?
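One way to sidestep the problem (a sketch of my own, only tested on small inputs, so caveat emptor): use an Iterator instead of a Stream, since an Iterator doesn’t memoize the elements it has already produced, so consumed values can be garbage-collected straight away:

```scala
// Find the lowest factor of compound (which is therefore prime), if any.
// Iterator produces elements one at a time and doesn't hold on to them,
// so the heap stays flat even for very long runs.
def lowestFactor(compound: Long): Option[Long] =
  Iterator.iterate(2L)(_ + 1).takeWhile(_ <= compound / 2).find(compound % _ == 0)

// Divide out the lowest factor and recurse until nothing factors any further.
def primeFactors(compound: Long): List[Long] =
  lowestFactor(compound) match {
    case Some(f) => f :: primeFactors(compound / f)
    case None    => List(compound) // no factor below compound/2 => prime
  }

primeFactors(13195L)  // List(5, 7, 13, 29)
```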

Java String.substring() heap leaks

Here’s an interesting little feature that we ran into a short while ago…

Suppose I have something like a Map which will exist for a long time (say, an in-memory cache), and a large, but short-lived String. I want to extract a small piece of text from that long string, and use it as a key in my map:

Map<String,Integer> longLivedMap = ...
String veryLongString = ...
String shortString = veryLongString.substring(5, 8);

Question: How much heap space have we just consumed by adding “abc”=>123 into our map? You might think that it would be just a handful of bytes – the 3-character String, the Integer, plus the overhead for the types. But you would be entirely wrong. Java Strings are backed by char arrays, and whilst the String.substring() method returns a new String, it is backed by the same char array as the originating String. So now the entire veryLongString char[] has a long-lived reference and can’t be garbage collected, even though only 3 chars of it are actually accessible. Rapid heap-exhaustion, coming right up!

The solution is pretty straightforward; if you want to hold a long-lived reference to a string, call new String(string) first. Something like

String shortString = veryLongString.substring(5, 8);
longLivedMap.put(new String(shortString),123);

It would be counter-productive to do this on every substring, since most of the time the substring and the original source will go out of scope at the same time, so sharing the underlying char[] is a sensible way to reduce overhead.

Since you (hopefully) won’t have many separate places in your code where you’re storing long-lived references like this (just cache implementations of one sort or another), you can create the new strings inside the implementation of those classes instead. Now your caches are all light, and your heap is all right. Good Night.

March 10, 2011

QCon day 1

A lot of good stuff today, but I’m just going to jot down a couple of my favourite points:

Craig Larman talked about massive-scale Scrum-based projects. I don’t suppose I’m going
to be running (or even part of) a 1500-person dev team very often, but some of his points
are applicable at any scale:
  • The only job title on the team is “Team Member”. There might be people with specialist skills,
    but no-one can say “it’s not my job to fix that”
  • If you don’t align your dev. teams’ organisation with your customers, then Conway’s law means your
    architecture will not align with your customers either, and you won’t be able to react when their needs change
  • Don’t have management-led transformation projects. How will you know when they’re done? Instead,
    management’s role is just to remove impediments that the dev team runs up against – the “servant-leader”
Dan North spoke about how he moved from what he thought was a pretty cutting edge, agile environment
(Thoughtworks, consulting to large organisations starting to become leaner/more agile) to a really agile
environment (DRW, releasing trading software 10s of times per day), and how if you have a team that
is technically strong, empowered, and embedded in the domain (i.e. really close to the users), you can do
away with many of the traditional rules of Agile. A couple that really struck me were:
  • Assume your code has a half-life. Don’t be afraid to just rewrite it, or bin it. The stuff that stays in
    can get better over time, but it doesn’t have to be perfect from day 1
  • Don’t get emotionally attached to the software you create. Get attached to the capabilities you enable
    for your users
  • Remember, anything is better than nothing.
Juergen Hoeller talked about what’s new in Spring 3.1. No amazing surprises here, but some nice stuff:
  • Environment-specific beans – avoid having to munge together different config files for system-test vs.
    pre-production vs. live, have a single context with everything defined in it (Even nicer, arguably, when you
    do it via Java Config and the @environment annotation)
  • c: namespace for constructor args. Tasty syntactic sugar for your XML, and the hackery they had to go through
    to get it to work is impressive (and explains why it wasn’t there from the start)
  • @Cacheable annotation, with bindings for EHCache and GemFire (not for memcached yet, which is a bit of a surprise)
Liz Keogh talked about perverse incentives. Any time you have a gap between the perceived value that a metric
measures, and the actual value that you want to create, you make an environment where arbitrage can occur. People can’t
help but take advantage of the gap, even when they know at some level that they’re doing the “wrong thing”.
  • Focus on the best-performing parts of the organisation as well as the worst-performing. Don’t just say “This
    project failed; what went wrong?”; make sure you also say “This project succeeded better than all the others; what went right?”
  • Don’t try and create solutions to organisation problems, or you’ll inevitably make perverse incentives. Instead,
    make Systems (Systems-thinking, not computer programs) that allow those solutions to arise.
Chris Read and Dan North talked about Agile operations. Surprisingly for me, there wasn’t a great deal of novel stuff
here, but there were a couple of interesting points:
  • Apply an XP-ish approach to your organisational/process issues: Pick the single biggest drag on you delivering value and
    do the simplest thing to fix it. Then iterate.
  • Fast, reliable deploys are a big force multiplier for development. If you can deploy really fast, with low risk,
    then you’ll do it more often, get feedback faster, allow more experimentation, and generally waste less. The stuff that Dan
    and Chris work on (trading data) gets deployed straight from dev workstations to production in seconds; automated testing
    happens post-deploy

Yoav Landman talked about Module repositories. Alas, this was not the session I had hoped for; I was hoping for some
takeaways that we could apply to make our own build processes better, but this was really just a big plug for Artifactory,
which looks quite nice but really seems to solve a bunch of problems that I don’t run into on a daily basis. I’ve never needed
to care about providing fine-grained LDAP authorisation to our binary repo, nor to track exactly which version of hibernate was
used to build my app 2 years ago. The one problem I do have in this space (find every app which uses httpclient v3.0,
upgrade, and test it) is made somewhat easier by a tool like Artifactory, but that problem very rarely crops up, so it
doesn’t seem worth the effort of installing a repo manager to solve it. Also it doesn’t integrate with any SCM except Subversion,
which makes it pretty useless for us.

March 08, 2011

Designing Software, Drawing Pictures

Not a huge amount of new stuff in this session, but a couple of useful things:

The goal of architecture, in particular up-front architecture, is first and foremost to communicate the vision for the system, and secondly to reduce the risk of poor design decisions having expensive consequences.

The Context->Container->Component diagram hierarchy:

The Context diagram shows a system, and the other systems with which it interacts (i.e. the context in which the system operates). It makes no attempt to detail the internal structure of any of the systems, and does not specify any particular technologies. It may contain high-level information about the interfaces or contracts between systems, if relevant.

The Container diagram introduces a new abstraction, the Container, which is a logical unit that might correspond to an application server, a (J)VM, a database, or some other well-isolated element of a system. The container diagram shows the containers within the system, as well as those immediately outside it (from the context diagram), and details the communication paths, data flows, and dependencies between them.

The Component diagram looks within each container at the individual components, and outlines the responsibilities of each. At the component level, techniques such as state/activity diagrams start to become useful in exploring the dynamic behaviour of the system.

(There’s a fourth level of decomposition, the class diagram, at which we start to look at the high-level classes that make up a component, but I’m not sure I really regard this as an architectural concern)

The rule-of-thumb for what is and what isn’t architecture:

All architecture is design, but design is only architecture if it’s costly to change, poorly understood, or high risk. Of course, this means that “the architecture” is a moving target; if we can reduce the cost of change, develop a better understanding, and reduce the risk of an element then it can cease to be architecture any more and simply become part of the design.

March 07, 2011

Five things to take away from Nat Pryce and Steve Freeman's "TDD at the system scale" talk

  • When you run your system tests, build as much as possible of the environment from scratch.
    At the very least, build and deploy the app, and clear out the database before each run
  • For testing assemblies that include an asynchronous component, you want to wrap
    your assertions in a function that will repeatedly “probe” for the state you want
    until either it finds it, or it times out. Something like this

    Wrap the probe() function into a separate class that has access to the objects you
    want to probe to simplify things.
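I didn’t manage to note down their actual code, but the shape of it is roughly this (the names, timings, and default values are mine, not from the talk):

```scala
// Poll `condition` until it becomes true, or give up after `timeoutMillis`.
// Useful when asserting on an asynchronous system where the interesting
// state change only happens "eventually".
def probe(timeoutMillis: Long = 5000, intervalMillis: Long = 100)
         (condition: => Boolean): Unit = {
  val deadline = System.currentTimeMillis + timeoutMillis
  while (!condition) {
    if (System.currentTimeMillis > deadline)
      throw new AssertionError("probe timed out after " + timeoutMillis + "ms")
    Thread.sleep(intervalMillis)
  }
}

// e.g. block until the async worker has drained the queue, or fail:
// probe() { queue.isEmpty }
```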

  • Don’t use the logging APIs directly for anything except low-level debug() messages, and maybe
    not even then. Instead, have a “Monitoring” topic, and push structured messages/objects onto
    that queue. Then you can separate out production of the messages from routing, handling, and
    persisting them. You can also have your system tests hook into these messages to detect hard-to-observe state changes
  • For system tests, build a “System Driver” that can act as a facade to the real system, giving
    test classes easy access to a properly-initialised test environment – managing the creation and
    cleanup of test data, access to monitoring queues, wrappers for probes, etc.
  • We really need to start using a proper queueing provider

February 24, 2011

Solaris IGB driver LSO latency.

Yes, it’s another google-bait entry, because I found it really difficult to find any useful information about this online. Hopefully it’ll help someone else find a solution faster than I did.

We migrated one of our applications from an old Sun V40z server, to a newer X4270. The migration went very smoothly, and the app (which is CPU-bound) was noticeably faster on the shiny new server. All good.

Except, that when I checked nagios, to see what the performance improvement looked like, I saw that every request to the server was taking exactly 3.4 seconds. Pingdom said the same thing, but a simple “time curl …” for the same URL came back in about 20 milliseconds. What gives?
More curiously still, if I changed the URL to one that didn’t return very much content, then the delay went away. Only a page that had more than a few KBs worth of HTML would reveal the problem.

Running “strace” on the nagios check_http command line showed the client receiving all of the data, but then just hanging for a while on the last read(). The apache log showed the request completing in 0 seconds (and the log line was printed as soon as the command was executed).
A wireshark trace, though, showed a 3-second gap between packets at the end of the conversation:

23    0.028612    TCP    46690 > http [ACK] Seq=121 Ack=13033 Win=31936 Len=0 TSV=176390626 TSER=899836429
24    3.412081    TCP    [TCP segment of a reassembled PDU]
25    3.412177    TCP    46690 > http [ACK] Seq=121 Ack=14481 Win=34816 Len=0 TSV=176391472 TSER=899836768
26    3.412746    HTTP    HTTP/1.1 200 OK  (text/html)
27    3.412891    TCP    46690 > http [FIN, ACK] Seq=121 Ack=15517 Win=37696 Len=0 TSV=176391472 TSER=899836768

For comparison, here’s the equivalent packets from a “curl” request for the same URL (which didn’t suffer from any lag)

46    2.056284    TCP    49927 > http [ACK] Seq=159 Ack=15497 Win=37696 Len=0 TSV=172412227 TSER=898245102
47    2.073105    TCP    49927 > http [FIN, ACK] Seq=159 Ack=15497 Win=37696 Len=0 TSV=172412231 TSER=898245102
48    2.073361    TCP    http > 49927 [ACK] Seq=15497 Ack=160 Win=49232 Len=0 TSV=898245104 TSER=172412231
49    2.073414    TCP    http > 49927 [FIN, ACK] Seq=15497 Ack=160 Win=49232 Len=0 TSV=898245104 TSER=172412231

And now, it’s much more obvious what the problem is. Curl is counting the bytes received from the server, and when it’s got as many as the content-length header said it should expect, the client is closing the connection (packet 47, sending FIN). Nagios, meanwhile, isn’t smart enough to count bytes, so it waits for the server to send a FIN (packet 27), which is delayed by 3-and-a-bit seconds. Apache sends that FIN immediately, but for some reason it doesn’t make it to the client.

Armed with this information, a bit more googling picked up this mailing list entry from a year ago. This describes exactly the same set of symptoms. Apache sends the FIN packet, but it’s caught and buffered by the LSO driver. After a few seconds, the LSO buffer is flushed, the client gets the FIN packet, and everything closes down.
Because LSO is only used for large segments, requesting a page with only a small amount of content doesn’t trigger this behaviour, and we get the FIN immediately.

How to fix? The simplest workaround is to disable LSO:

#  ndd -set /dev/ip ip_lso_outbound 0

(n.b. I’m not sure whether that persists over reboots – it probably needs adding to a file in /kernel/drv somewhere). LSO is beneficial on network-bound servers, but ours isn’t, so we’re OK there.

An alternative is to modify the application code to set the TCP PSH flag when closing the connection, but (a) I’m not about to start hacking with apache’s TCP code, and (b) it’s not clear to me that this is the right solution, anyway.

A third option, specific to HTTP, is just to use an HTTP client that does it right. Neither nagios nor (it seems) pingdom appear to know how to count bytes and close the connection themselves, but curl does, and so does every browser I’ve tested. So you might just conclude that there’s no need to fix the server itself.

February 16, 2011

Deploying LAMP websites with git and gitosis

Requirement of the day: I’m providing a LAMP environment to some other developers elsewhere in the organisation. They’ll do all the PHP programming, but I’ll look after the server, keep it patched, upgrade it when necessary, and so on.

At some point in the future, we’ll doubtless need to rebuild the server (new OS version, hardware disaster, need to clone it for 10 other developers to each have their own, etc…), so we want as little configuration as possible on the server. Everything should be built from configuration that lives somewhere else.

So, the developers basically need to be able to do 3 things:
– update apache content – PHP code, CSS/JS/HTML, etc.
– update apache VHost config – a rewrite here, a location there. They don’t need to touch the “main” apache config (modules, MPM settings, etc)
– do stuff (load data, run queries, monkey about) with mysql.

Everything else (installation of packages, config files, cron jobs, yada yada) is mine, and is managed by puppet.

So, we decided to use git and gitosis to accomplish this. We’re running on Ubuntu Lucid, but this approach should translate pretty easily to any unix-ish server.

1: Install git and gitosis. Push two repositories – apache-config and apache-htdocs

2: As the gitosis user, clone the apache-config repository into /etc/apache2/git-apache-conf, and the apache-htdocs repository into /usr/share/apache2/git-apache-htdocs

3: Define a pair of post-receive hooks to update the checkouts when updates are pushed to gitosis.
The htdocs one is simple:

cd /usr/share/apache2/git-apache-htdocs && env -i git pull --rebase

the only gotcha here is that because the GIT_DIR environment variable is set in a post-receive hook, you must unset it with “env -i” before trying to pull, else you’ll get the “fatal: Not a git repository: ’.’” error.

The apache config one is a bit longer but hopefully self-explanatory:

cd /etc/apache2/git-apache-conf && env -i git pull --rebase && sudo /usr/sbin/apache2ctl configtest && sudo /usr/sbin/apache2ctl graceful

Add a file into /etc/apache2/conf.d with the text “Include /etc/apache2/git-apache-conf/*” so that apache picks up the new config.

We run a configtest before restarting apache to get more verbose errors in the event of an invalid config. Unfortunately if the config is broken, then the broken config will stay checked-out – it would be nice (but much more complex) to copy the rest of the config somewhere else, check out the changes in a commit hook, and reject the commit if the syntax is invalid.

And that’s it! Make sure that the gitosis user has rights to sudo apachectl, and everything’s taken care of. Except, of course, mysql – developers will still need to furtle in the database, and there’s not much we can do about that except for making sure we have good backups.

You might be wondering why we chose to involve gitosis at all, and why we didn’t just let the developers push directly to the clone repositories in /etc/apache2 and /usr/share/apache2/htdocs. That would have been a perfectly workable approach, but my experience is that in team-based (as opposed to truly decentralised) development, it’s helpful to have a canonical version of each repository somewhere, and it’s helpful if that isn’t on someone’s desktop PC. Gitosis provides that canonical source.
Otherwise, one person pushes some changes to test, some more to live, and then goes on holiday for a week. Their team-mate is left trying to understand and merge together test and live with their own, divergent, repo before they can push a two-line fix.
More experienced git users than I might be quite comfortable with this kind of workflow, but to me it still seems a bit scary. My years of CVS abuse mean I like to have a repo I can point to and say “this is the truth, and the whole truth, and nothing but the truth” :-)