All entries for Wednesday 22 November 2006
November 22, 2006
A little while ago, following a new code release, our average render time for HTML pages jumped from about 100ms to about 140ms. It did this very consistently, but there was nothing obvious in the release that could have caused such a jump. About the only thing of any significance was that we’d added a new column (a Number(1) datatype) to one of our tables, a table that was queried at least once for every render.
We puzzled over this for a while, then noticed that at the same time, Oracle was reporting a sudden jump in a particular kind of wait event, specifically “SQL*Net more data from client” waits.
Did you hear that, Google? I’ll say it again: “SQL*Net more data from client”, Wait class Network.
Google knows next to nothing about the causes of this wait event, because as far as DBAs are concerned, this is an ‘idle wait’, AKA ‘someone-else’s-problem’ kind of wait. There’s nothing that can be done within the database to tune away such waits, so no-one writes much about it. It’s caused by the client application not sending data fast enough to the DB. Fix it by making your app faster.
Anyway, considerable digging later revealed a possible explanation. One scenario that can generate the kind of waits we saw is the following.
If you have a huge SQL statement (select field1, field2, … field999 from table1, table2, … table999 where clause1 and clause2 … and clause999), then it may get too big to fit into a single SQL*Net packet from the client to the server, in which case it is fragmented into as many packets as needed.
Now, if you go from a statement that fits into one SQL*Net packet to one that doesn’t, you double the number of round trips between client and server. This has a very large relative effect on the amount of “SQL*Net more data from client” wait time spent on the server. If the query in question is executed a lot, this can become very noticeable.
And of course, this is a threshold thing. If your query is 1 byte shorter than the packet size, you get no wait. 1 byte longer, and you get loads. Could it be that that extra column we’d added had pushed us over the edge?
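The arithmetic behind that threshold is easy to sketch. This is a toy illustration, not the driver’s actual behaviour: the SQL*Net packet payload size is negotiated per connection, and the 2048 bytes assumed here is just the classic old default.

```java
// Toy model of SQL*Net statement fragmentation. ASSUMED_SDU_BYTES is a
// guess at the negotiated packet payload size (2KB was the old default);
// the real value depends on client and server configuration.
public class SqlNetPackets {
    static final int ASSUMED_SDU_BYTES = 2048;

    /** Packets (and hence extra round trips) needed to ship a statement of this size. */
    static int packetsFor(int statementBytes) {
        return (statementBytes + ASSUMED_SDU_BYTES - 1) / ASSUMED_SDU_BYTES;
    }

    public static void main(String[] args) {
        // One byte either side of the boundary doubles the packet count.
        System.out.println(packetsFor(2048)); // 1
        System.out.println(packetsFor(2049)); // 2
    }
}
```

Under this (assumed) packet size, adding a single column name to a generated select could be enough to tip a statement over the boundary.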
Well, we never quite found out for sure. We upgraded from the Oracle classes12 (Java 1.2+, Oracle 8i+) JDBC drivers to the ojdbc14 (Java 1.4, Oracle 10g) drivers; the wait completely disappeared, and the render times dropped back down again.
Whether ojdbc14 supports larger packets, compression, AIO, or some other optimisation, something has changed which makes the whole problem go away. So we don’t need to start optimising our select statements, which is a good thing, because most of them are generated by Hibernate, and I’d really rather not start messing with its generation strategies.
Once again, I find myself glaring balefully at garbage collection logs and wondering where my CPU is going. Sitebuilder2 has a very different GC profile to most of our apps, and whilst it’s not causing user-visible problems, it’s always good to have these things under control.
So, SB2 has an interesting set of requirements. Simplistically, we can say it does 3 things:
1) Serve HTML pages to users
2) Serve files to users
3) Let users edit HTML/Files/etc
These 3 things have interestingly different characteristics. HTML requests generate a moderate amount of garbage, but almost always execute much quicker than the gap between minor collections. So, in principle, as long as our young generation is big enough, we should get hardly any old gen. garbage from them. Additionally, HTML requests need to execute quickly, else users will get bored and go elsewhere.
Requests for small files are rather similar to the HTML requests, but most of our file serving time is spent drip-feeding whacking great files (10MB and up) to slow clients. This kind of file-serving generates quite a lot of garbage, and it looks as if a lot of it sticks around for long enough that it ends up in the old gen. Certainly the requests themselves take much longer than the time between minor collects, so any objects which have a lifetime of the HTTP request will end up as heap garbage. Large file serving, though, is mostly unaffected by the odd GC pause. If your 50MB download hangs for a second or two halfway through, you most likely won’t notice.
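To make that concrete, here’s a minimal sketch of a drip-feed copy loop (a hypothetical helper, not our actual serving code). Anything allocated here lives as long as the request does, so for a multi-minute download it will survive plenty of minor collections and be promoted to the old generation before it finally dies:

```java
import java.io.*;

// Sketch of a drip-feed serving loop. The buffer is a request-lifetime
// object: for a big, slow download it outlives many minor GC cycles,
// so it ends up in the old generation rather than dying young.
public class DripFeed {
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // lives for the whole request
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```

Multiply that by however many concurrent downloads are in flight, plus whatever per-request state the container itself holds, and the old generation fills up steadily even though none of it is truly long-lived.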
Edit requests are a bit of a mishmash. Some are short and handle only a little data, others (uploading the aforementioned big files, for instance) are much longer running. But again, the odd pause here and there doesn’t really matter. There are orders of magnitude fewer edit requests than page/file views.
So, the VM is in something of a quandary. It needs a large heap to manage the large amounts of garbage generated from having multiple file-serving requests going on at any given time. And it needs to minimise the number of full GCs so as to minimise pauses for the HTML server. But the cost of doing a minor collection scales with the amount of old generation allocated, so a big, full heap implies a lot of CPU sucked up by the (parallel) minor collectors. It also means longer-running minor collections, and a greater chance of an unsuccessful minor collect, leading to a full GC.
(For reference, on our 8-way (4 proc) Opteron box, a minor collect takes about 0.05s with 100MB of heap allocated, and about 0.7s with 1GB of heap allocated.)
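Those numbers make the overhead easy to estimate. Assuming, purely for illustration, a minor collection every couple of seconds under load, the difference between the two pause times is the difference between a rounding error and a third of your CPU:

```java
// Back-of-envelope minor-GC overhead using the measured pause times above.
// The 2-second collection interval is an assumption for illustration only.
public class GcOverhead {
    /** Percentage of wall-clock time lost to pauses, given one pause per interval. */
    static double overheadPercent(double pauseSeconds, double intervalSeconds) {
        return 100.0 * pauseSeconds / intervalSeconds;
    }

    public static void main(String[] args) {
        System.out.println(overheadPercent(0.05, 2.0)); // 100MB heap: ~2.5%
        System.out.println(overheadPercent(0.7, 2.0));  // 1GB heap: ~35%
    }
}
```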
So, an obvious solution presents itself. Divide and Conquer.
Have a VM (or several) dedicated to serving HTML. These should have a small heap, and a large young generation, so that parallel GCs are generally fast, and even a full collection is not going to take too long. This VM will be very consistent, since pauses should be minimal.
Secondly, have a VM for serving big files. This needs a relatively big heap, but it can be instructed to do full GCs fairly frequently to keep things under control. There will be the occasional pause, but it doesn’t matter too much. Minor collections on this box will become rather irrelevant, since most requests will outlive the minor GC interval.
Finally, have a VM for edit sessions. This needs a whacking big heap, but it can tolerate pauses as and when required. Since the frequency of editor operations is low, the frequency of minor collects (and hence their CPU overhead) is also low.
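A first cut at the three sets of VM flags might look something like this. These values are illustrative guesses rather than tuned settings, and `sitebuilder.jar` is a stand-in name; the flag names are standard HotSpot options, but the numbers would need validating against real GC logs.

```shell
# HTML VM: small heap, proportionally large young gen, fast collections
java -Xms256m -Xmx256m -XX:NewSize=128m -XX:MaxNewSize=128m \
     -XX:+UseParallelGC -verbose:gc -jar sitebuilder.jar

# File-serving VM: bigger heap; tolerate the occasional full GC
java -Xms768m -Xmx768m -XX:NewSize=128m -XX:MaxNewSize=128m \
     -XX:+UseParallelGC -verbose:gc -jar sitebuilder.jar

# Edit VM: whacking big heap; pauses don't matter much here
java -Xms1024m -Xmx1024m -XX:+UseParallelGC -verbose:gc -jar sitebuilder.jar
```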
The only downside is that we go from having 2 active app server instances to 6 (each function has a pair of VMs so we can take one down without affecting service). But that really only represents a few hundred extra MB of memory footprint, and a couple of dozen more threads on the box. It should, I hope, be a worthwhile trade-off.