Solaris IGB driver LSO latency.
Yes, it’s another google-bait entry, because I found it really difficult to find any useful information about this online. Hopefully it’ll help someone else find a solution faster than I did.
We migrated one of our applications from an old Sun V40z server to a newer X4270. The migration went very smoothly, and the app (which is CPU-bound) was noticeably faster on the shiny new server. All good.
Except that when I checked nagios to see what the performance improvement looked like, I saw that every request to the server was taking almost exactly 3.4 seconds. Pingdom said the same thing, but a simple “time curl …” for the same URL came back in about 20 milliseconds. What gives?
More curiously still, if I changed the URL to one that didn’t return very much content, the delay went away. Only a page with more than a few KB of HTML would reveal the problem.
Running “strace” on the nagios check_http command line showed the client receiving all of the data, but then just hanging for a while on the last read(). The apache log showed the request completing in 0 seconds (and the log line was printed as soon as the command was executed).
A wireshark trace, though, showed a 3-second gap between packets at the end of the conversation:
23 0.028612 184.108.40.206 220.127.116.11 TCP 46690 > http [ACK] Seq=121 Ack=13033 Win=31936 Len=0 TSV=176390626 TSER=899836429
24 3.412081 18.104.22.168 22.214.171.124 TCP [TCP segment of a reassembled PDU]
25 3.412177 126.96.36.199 188.8.131.52 TCP 46690 > http [ACK] Seq=121 Ack=14481 Win=34816 Len=0 TSV=176391472 TSER=899836768
26 3.412746 184.108.40.206 220.127.116.11 HTTP HTTP/1.1 200 OK (text/html)
27 3.412891 18.104.22.168 22.214.171.124 TCP 46690 > http [FIN, ACK] Seq=121 Ack=15517 Win=37696 Len=0 TSV=176391472 TSER=899836768
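To put a number on that gap, here is a small Python sketch that takes the packet numbers and timestamps from the trace above and finds the widest pause between consecutive packets. It assumes the second column is seconds since the start of the capture; the `largest_gap` helper is just a name I made up for illustration.

```python
# (packet number, seconds since capture start), copied from the nagios trace
packets = [
    (23, 0.028612),
    (24, 3.412081),
    (25, 3.412177),
    (26, 3.412746),
    (27, 3.412891),
]


def largest_gap(packets):
    """Return (gap_seconds, earlier_packet, later_packet) for the widest pause."""
    gaps = [
        (later_t - earlier_t, earlier_n, later_n)
        for (earlier_n, earlier_t), (later_n, later_t) in zip(packets, packets[1:])
    ]
    return max(gaps)


gap, before, after = largest_gap(packets)
print(f"{gap:.3f} s between packets {before} and {after}")
# prints: 3.383 s between packets 23 and 24
```

So nearly all of the 3.4 seconds that nagios reported is one silent pause before packet 24; everything after that arrives within a millisecond.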
For comparison, here are the equivalent packets from a “curl” request for the same URL (which didn’t suffer from any lag):
46 2.056284 126.96.36.199 188.8.131.52 TCP 49927 > http [ACK] Seq=159 Ack=15497 Win=37696 Len=0 TSV=172412227 TSER=898245102
47 2.073105 184.108.40.206 220.127.116.11 TCP 49927 > http [FIN, ACK] Seq=159 Ack=15497 Win=37696 Len=0 TSV=172412231 TSER=898245102
48 2.073361 18.104.22.168 22.214.171.124 TCP http > 49927 [ACK] Seq=15497 Ack=160 Win=49232 Len=0 TSV=898245104 TSER=172412231
49 2.073414 126.96.36.199 188.8.131.52 TCP http > 49927 [FIN, ACK] Seq=15497 Ack=160 Win=49232 Len=0 TSV=898245104 TSER=172412231
And now it’s much more obvious what the problem is. Curl counts the bytes received from the server, and once it has as many as the Content-Length header told it to expect, it closes the connection itself (packet 47, sending FIN). Nagios, meanwhile, isn’t smart enough to count bytes, so it waits for the server to send a FIN (packet 27), which is delayed by 3-and-a-bit seconds. Apache sends that FIN immediately, but for some reason it doesn’t make it to the client straight away.
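The difference between the two clients can be reproduced on a single machine. The following is a rough Python sketch, not the actual curl or check_http logic: a toy server sends a complete response and then sits on the connection for a while before closing it (a stand-in for the delayed FIN), and a toy client either counts bytes against Content-Length or waits for the server to close. All of the function names, the 15000-byte body and the 0.5-second delay are made up for illustration.

```python
import socket
import threading
import time


def parse_content_length(head):
    """Pull the Content-Length value out of a raw header block, if present."""
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            return int(value.strip())
    return None


def serve_once(listener, body, fin_delay):
    """Toy server: send the whole response, then delay closing the socket."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(65536)  # read and ignore the request
        head = b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
        conn.sendall(head + body)
        time.sleep(fin_delay)  # the FIN only goes out when the socket closes


def fetch(port, count_bytes):
    """Return (body, elapsed_seconds) for one GET against localhost."""
    start = time.monotonic()
    data = b""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        while True:
            if count_bytes and b"\r\n\r\n" in data:
                head, _, body = data.partition(b"\r\n\r\n")
                length = parse_content_length(head)
                if length is not None and len(body) >= length:
                    break  # curl-style: we have everything, close now
            chunk = sock.recv(65536)
            if not chunk:
                break  # server closed the connection (its FIN arrived)
            data += chunk
    return data.partition(b"\r\n\r\n")[2], time.monotonic() - start


def run_demo(fin_delay=0.5):
    """Fetch once with byte counting and once without; return both results."""
    results = {}
    for count_bytes in (True, False):
        listener = socket.socket()
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        server = threading.Thread(
            target=serve_once, args=(listener, b"x" * 15000, fin_delay)
        )
        server.start()
        results[count_bytes] = fetch(port, count_bytes)
        server.join()
        listener.close()
    return results


if __name__ == "__main__":
    for count_bytes, (body, elapsed) in run_demo().items():
        print(f"count_bytes={count_bytes}: {len(body)} bytes in {elapsed:.3f} s")
```

Both clients receive the same body, but only the one that waits for the server to close pays for the delay, which is the same pattern that nagios and curl showed.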
Armed with this information, a bit more googling picked up this mailing list entry from a year ago. This describes exactly the same set of symptoms. Apache sends the FIN packet, but it’s caught and buffered by the LSO driver. After a few seconds, the LSO buffer is flushed, the client gets the FIN packet, and everything closes down.
Because LSO is only used for large segments, requesting a page with only a small amount of content doesn’t trigger this behaviour, and we get the FIN immediately.
How to fix? The simplest workaround is to disable LSO:
# ndd -set /dev/ip ip_lso_outbound 0
(N.B. I’m not sure whether that persists over reboots. Settings made with ndd are normally lost at reboot, so it probably needs adding to a file in /kernel/drv somewhere.) LSO is beneficial on network-bound servers, but ours isn’t, so we’re OK there.
An alternative is to modify the application code to set the TCP PSH flag when closing the connection, but (a) I’m not about to start hacking around in apache’s TCP code, and (b) it’s not clear to me that this is the right solution anyway.
A third option, specific to HTTP, is just to use an HTTP client that does it right. Neither nagios nor (it seems) pingdom appears to know how to count bytes and close the connection itself, but curl does, and so does every browser I’ve tested. So you might just conclude that there’s no need to fix the server itself.