Solaris SMF and HAProxy won't play nicely
Following on from the problems we had with mod_jk a while back, we’ve been using HAProxy as a replacement balancer on our newer apps, and for the most part it works very well. However, I’ve come up against what seems to be an irreconcilable difference between HAProxy and Solaris SMF, which is making me think about re-evaluating load balancers.
SMF manages services with what it terms a ‘contract’. A contract is basically a process, or group of processes, which SMF will look after for you. If one fails, SMF will restart it automatically, and it will make sure that when your app stops, all the processes in the contract stop, and so on.
Now, HAProxy has one feature which is unusual amongst load balancers. Once you’ve started an HAProxy instance, you can’t modify its configuration. So you can’t drop one server out of a running instance and replace it with another (of course, it can automatically detect a dead server and remove that, but that’s a different problem). Instead, what you can do is tell the running instance to stop listening on its TCP port, but finish processing any active requests. You can start a new instance, with a new configuration, as soon as the old one has stopped listening, and the old process will run on in the background until such time as it completes all its requests; then it will exit.
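For illustration, here’s a rough Python sketch of how that soft restart might be driven. It only builds the command line; it doesn’t run anything. The `-f`, `-p` and `-sf` flags are HAProxy’s own (config file, PID file, and "ask these PIDs to finish up"), but the file paths, the example PID and the function names are assumptions made up for this example, so adjust them to your own layout.

```python
import shlex


def read_pids(pid_file_text):
    """Parse the contents of a PID file (one PID per line) into integers."""
    return [int(token) for token in pid_file_text.split()]


def soft_restart_command(config_path, pid_file, old_pids):
    """Build the argv for a new HAProxy instance that retires the old one.

    With -sf, the new process signals the listed PIDs to stop listening
    and exit once their active connections have completed.
    """
    cmd = ["haproxy", "-f", config_path, "-p", pid_file]
    if old_pids:
        cmd.append("-sf")
        cmd.extend(str(pid) for pid in old_pids)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and PID, purely for demonstration.
    old = read_pids("1234\n")
    cmd = soft_restart_command("/etc/haproxy/haproxy.cfg",
                               "/var/run/haproxy.pid", old)
    print(shlex.join(cmd))
```

In a real script you would read the PID file from disk and hand the list to something like `subprocess.run`; add `-D` if your config doesn’t already daemonise the process.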
Of course, this totally doesn’t work with SMF. SMF can’t cope with the idea that a process that was once part of a contract is no longer part of the contract. So, if you try to restart HAProxy, SMF will send the appropriate kill signal to tell the old instance to die, but it won’t start the new instance until the old one goes away. In the meantime the old process has already stopped listening, so nothing is accepting new connections until it finishes draining. No use at all.
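To make the clash concrete, here’s a toy Python model of the behaviour described above. It is emphatically not SMF’s API: `ToyContract` and `try_restart` are invented names, and the only rule modelled is the one that matters here, namely that the new instance can’t start while any old process is still in the contract.

```python
class ToyContract:
    """A stand-in for an SMF contract: just a set of live PIDs."""

    def __init__(self):
        self.members = set()

    def add(self, pid):
        self.members.add(pid)

    def exited(self, pid):
        # A process leaves the contract only by exiting.
        self.members.discard(pid)

    def is_empty(self):
        return not self.members


def try_restart(contract, new_pid):
    """Start new_pid only if every old process has gone; report success."""
    if not contract.is_empty():
        return False  # old instance still draining, so the restart is blocked
    contract.add(new_pid)
    return True


if __name__ == "__main__":
    contract = ToyContract()
    contract.add(100)                   # old HAProxy, still finishing requests
    print(try_restart(contract, 200))   # blocked while PID 100 lives
    contract.exited(100)                # old instance finally exits
    print(try_restart(contract, 200))   # now the new instance can start
```

The gap between those two calls is exactly the window in which nothing is listening on the port.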
So we have 2 options:
1) Remove HAProxy from SMF, and just use the traditional init-script approach. Not a bad idea, but we lose the ability to have the process auto-restarted if it dies in the night.
2) Use another proxy: perlbal, pen, or even Apache mod_balance.
Ho hum. Time to download and play…
UPDATE 7/11/2009: 2 years on, and we’re still happily using HAProxy with SMF, just not using soft-restarts. Here’s a blog entry with a bit more detail.