February 24, 2011

Solaris IGB driver LSO latency.

Yes, it’s another google-bait entry, because I found it really difficult to find any useful information about this online. Hopefully it’ll help someone else find a solution faster than I did.

We migrated one of our applications from an old Sun V40z server to a newer X4270. The migration went very smoothly, and the app (which is CPU-bound) was noticeably faster on the shiny new server. All good.

Except that, when I checked Nagios to see what the performance improvement looked like, I saw that every request to the server was taking exactly 3.4 seconds. Pingdom said the same thing, but a simple “time curl …” for the same URL came back in about 20 milliseconds. What gives?
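
(For the record, the manual check was nothing fancier than the following; the hostname and path here are placeholders, not our real ones.)

$ time curl -s -o /dev/null http://app-server.example.com/big-page.html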
More curiously still, if I changed the URL to one that didn’t return very much content, the delay went away. Only a page with more than a few KB of HTML would reveal the problem.

Running “strace” on the Nagios check_http command showed the client receiving all of the data, but then just hanging for a while on the last read(). The Apache log showed the request completing in 0 seconds (and the log line was printed as soon as the command was executed).
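
(The strace incantation was something like this, assuming the stock Nagios plugin location; -tt timestamps each syscall, which is what made the hang on the final read() obvious. The path and URL are placeholders.)

$ strace -tt /usr/lib/nagios/plugins/check_http -H app-server.example.com -u /big-page.html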
A Wireshark trace, though, showed a 3.4-second gap between packets at the end of the conversation:

23    0.028612    137.205.194.43    137.205.243.76    TCP    46690 > http [ACK] Seq=121 Ack=13033 Win=31936 Len=0 TSV=176390626 TSER=899836429
24    3.412081    137.205.243.76    137.205.194.43    TCP    [TCP segment of a reassembled PDU]
25    3.412177    137.205.194.43    137.205.243.76    TCP    46690 > http [ACK] Seq=121 Ack=14481 Win=34816 Len=0 TSV=176391472 TSER=899836768
26    3.412746    137.205.243.76    137.205.194.43    HTTP    HTTP/1.1 200 OK  (text/html)
27    3.412891    137.205.194.43    137.205.243.76    TCP    46690 > http [FIN, ACK] Seq=121 Ack=15517 Win=37696 Len=0 TSV=176391472 TSER=899836768

For comparison, here are the equivalent packets from a “curl” request for the same URL (which didn’t suffer from any lag):

46    2.056284    137.205.194.43    137.205.243.76    TCP    49927 > http [ACK] Seq=159 Ack=15497 Win=37696 Len=0 TSV=172412227 TSER=898245102
47    2.073105    137.205.194.43    137.205.243.76    TCP    49927 > http [FIN, ACK] Seq=159 Ack=15497 Win=37696 Len=0 TSV=172412231 TSER=898245102
48    2.073361    137.205.243.76    137.205.194.43    TCP    http > 49927 [ACK] Seq=15497 Ack=160 Win=49232 Len=0 TSV=898245104 TSER=172412231
49    2.073414    137.205.243.76    137.205.194.43    TCP    http > 49927 [FIN, ACK] Seq=15497 Ack=160 Win=49232 Len=0 TSV=898245104 TSER=172412231

And now it’s much more obvious what the problem is. Curl counts the bytes received from the server, and once it has as many as the Content-Length header said to expect, it closes the connection itself (packet 47, sending FIN). Nagios, meanwhile, isn’t smart enough to count bytes, so it waits for the server to finish the conversation; the tail of the response (packets 24 and 26) arrives three-and-a-bit seconds late, and only then does the client close (packet 27). Apache sends that final data, and its FIN, immediately, but for some reason they don’t make it to the client.
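
(You can watch curl doing this byte-counting for yourself: run it with -v against the same URL and you’ll see it note the Content-Length as the headers arrive and then finish the moment the body is complete, without waiting for the server to hang up. Placeholder URL again.)

$ curl -v -o /dev/null http://app-server.example.com/big-page.html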

Armed with this information, a bit more googling turned up this mailing list entry from a year earlier, which describes exactly the same set of symptoms: Apache sends the FIN packet, but it’s caught and buffered by the LSO code in the igb driver. After a few seconds the LSO buffer is flushed, the client gets the FIN packet, and everything closes down.
Because LSO only kicks in for large segments, requesting a page with only a small amount of content doesn’t trigger this behaviour, and the FIN arrives immediately.
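
(You can confirm that LSO is in play by querying the ip module’s tunable; as far as I can tell it reports 1 while LSO is enabled and 0 once it’s off.)

# ndd -get /dev/ip ip_lso_outbound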

How to fix? The simplest workaround is to disable LSO:

#  ndd -set /dev/ip ip_lso_outbound 0

(N.B. I’m not sure whether that persists over reboots – it probably needs adding to a file in /kernel/drv somewhere.) LSO is beneficial on network-bound servers, but ours isn’t, so we’re OK there.
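
(If it doesn’t persist, one approach that ought to work, assuming a Solaris 10-style init setup, is a small rc script that re-applies the setting at boot. Treat this as a sketch rather than gospel; the script name is my own invention.)

# cat > /etc/rc2.d/S99lso-off <<'EOF'
#!/sbin/sh
# Re-apply the LSO tunable at boot; plain ndd settings are runtime-only.
ndd -set /dev/ip ip_lso_outbound 0
EOF
# chmod 744 /etc/rc2.d/S99lso-off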

An alternative is to modify the application code to set the TCP PSH flag when closing the connection, but (a) I’m not about to start hacking on Apache’s TCP code, and (b) it’s not clear to me that this is the right solution anyway.

A third option, specific to HTTP, is just to use an HTTP client that does it right. Neither Nagios nor (it seems) Pingdom knows how to count bytes and close the connection itself, but curl does, and so does every browser I’ve tested. So you might just conclude that there’s no need to fix the server itself.

