2008-03-30 05:53:27

by Deomid Ryabkov

Subject: Send-Q on UDP socket growing steadily - why?

This has started recently and I'm at a loss as to why.
Send-Q on a moderately active UDP socket keeps growing steadily until it
reaches ~128K (wmem_max?) at which point socket writes start failing.
The application in question is standard ntpd from Fedora 7, kernel is
the latest available for the distro, that is
2.6.23.15-80.fc7 #1 SMP Sun Feb 10 16:52:18 EST 2008 x86_64

BIND, running on the same machine, does not exhibit this problem, but
that may be because it does not get nearly as much load as ntpd,
which is part of pool.ntp.org. That said, the load is really not very
high, on the order of 10 QPS, and the machine is 99+% idle.
ntpd seems to be doing its usual select-recvmsg-sendto routine, nothing
out of the ordinary.
And yet, Send-Q keeps growing at _exactly_ 360 bytes every 10 seconds,
here's a sample of output shortly after ntpd restart:

# while sleep 1; do netstat -na | grep 177:123; done
udp 0 17280 89.111.168.177:123 0.0.0.0:*
udp 0 17280 89.111.168.177:123 0.0.0.0:*
udp 0 17280 89.111.168.177:123 0.0.0.0:*
udp 0 17280 89.111.168.177:123 0.0.0.0:*
udp 0 17280 89.111.168.177:123 0.0.0.0:*
udp 0 17280 89.111.168.177:123 0.0.0.0:*
udp 0 17280 89.111.168.177:123 0.0.0.0:*
udp 0 17280 89.111.168.177:123 0.0.0.0:*
-------> +360 bytes
udp 0 17640 89.111.168.177:123 0.0.0.0:*
udp 0 17640 89.111.168.177:123 0.0.0.0:*
udp 0 17640 89.111.168.177:123 0.0.0.0:*
udp 0 17640 89.111.168.177:123 0.0.0.0:*
udp 0 17640 89.111.168.177:123 0.0.0.0:*
udp 0 17640 89.111.168.177:123 0.0.0.0:*
udp 0 17640 89.111.168.177:123 0.0.0.0:*
udp 0 17640 89.111.168.177:123 0.0.0.0:*
udp 0 17640 89.111.168.177:123 0.0.0.0:*
udp 0 17640 89.111.168.177:123 0.0.0.0:*
-------> +360 bytes, 10 seconds later
udp 0 18000 89.111.168.177:123 0.0.0.0:*
udp 0 18000 89.111.168.177:123 0.0.0.0:*
udp 0 18000 89.111.168.177:123 0.0.0.0:*
udp 0 18000 89.111.168.177:123 0.0.0.0:*
udp 0 18000 89.111.168.177:123 0.0.0.0:*
udp 0 18000 89.111.168.177:123 0.0.0.0:*
udp 0 18000 89.111.168.177:123 0.0.0.0:*
udp 0 18000 89.111.168.177:123 0.0.0.0:*
udp 0 18000 89.111.168.177:123 0.0.0.0:*
udp 0 18000 89.111.168.177:123 0.0.0.0:*
-------> +360 bytes, 10 seconds later
udp 0 18360 89.111.168.177:123 0.0.0.0:*
[...]
etc, etc.
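
To make the creep easier to spot than in a wall of identical lines, the
netstat output can be piped through a small filter that prints only the
changes. This is just a sketch: the function name sendq_changes is made
up for illustration, and it assumes the netstat column layout shown
above, with Send-Q in the third column.

```sh
# Read netstat-style lines on stdin and report only Send-Q changes.
# Assumes lines of the form: udp <Recv-Q> <Send-Q> <local> <remote>
sendq_changes() {
    awk '
        $1 == "udp" {
            # Print the old value, the new value and the signed difference
            if (seen && $3 != prev)
                printf "%s -> %s (%+d bytes)\n", prev, $3, $3 - prev
            prev = $3
            seen = 1
        }'
}

# Possible usage with the loop from above:
#   while sleep 1; do netstat -na | grep 177:123; done | sendq_changes
```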

My understanding is that a non-empty send queue on a UDP socket should be
a very rare occurrence, maybe seen under extreme load. And then there's
this steady creep...
What's going on? It almost looks like something is leaking somewhere.
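
For a sense of scale, the numbers above give a rough time until writes
start failing. The sketch below is plain integer arithmetic; the helper
name seconds_until_full is hypothetical, and the 131072-byte limit is
only the ~128K guess from this message, not a verified buffer size.

```sh
# seconds_until_full CURRENT LIMIT STEP_BYTES STEP_SECONDS
# Estimate how long until Send-Q reaches LIMIT at a constant growth rate.
seconds_until_full() {
    current=$1 limit=$2 step_bytes=$3 step_seconds=$4
    # Remaining bytes divided by the growth rate (integer division)
    echo $(( (limit - current) * step_seconds / step_bytes ))
}

# With the values observed above (17280 bytes, +360 bytes per 10 s):
#   seconds_until_full 17280 131072 360 10    # prints 3160
```

That is a little under an hour from the first sample, assuming the
growth stays constant.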

--
Deomid Ryabkov aka Rojer
[email protected]
[email protected]
ICQ: 8025844


2008-03-30 22:03:18

by Denys Vlasenko

Subject: Re: Send-Q on UDP socket growing steadily - why?

On Sunday 30 March 2008 07:43, Deomid Ryabkov wrote:
> This has started recently and i'm at a loss as to why.
> Send-Q on a moderately active UDP socket keeps growing steadily until it
> reaches ~128K (wmem_max?) at which point socket writes start failing.
> The application in question is standard ntpd from Fedora 7, kernel is
> the latest available for the distro, that is
> 2.6.23.15-80.fc7 #1 SMP Sun Feb 10 16:52:18 EST 2008 x86_64
>
> BIND, running on the same machine, does not exhibit this problem, but
> that may be because it does not get nearly as much load as ntpd,
> which is part of the pool.ntp.org. That said, load is really not very
> high, on the order of 10 QPS, and machine is 99+% idle.
> ntpd seems to be doing its usual select-recvmsg-sendto routine, nothing
> out of the ordinary.

Where does it (try to) send these packets?

I managed to reproduce something like this by sending UDP packets
to a nonexistent host on the local subnet. The kernel tries to find it
and emits ARP probes, but no reply comes. As long as the kernel
doesn't know how to send the queued UDP packet, I see a nonempty
queue.

However, in my simple case the kernel decides that it is a lost cause
within a few seconds and drops the packets (queue length 0).
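
One way to check for the unanswered-ARP case described above is to look
for neighbour entries that never resolved. The following is a sketch
only: the function name unresolved_neighbors is invented here, and it
assumes the usual "ip neigh show" output where the state (for example
INCOMPLETE or FAILED) is the last field on each line.

```sh
# Read "ip neigh show" output on stdin and list neighbours whose
# link-layer address could not be resolved.
unresolved_neighbors() {
    awk '$NF == "FAILED" || $NF == "INCOMPLETE" { print $1, $NF }'
}

# Possible usage:
#   ip neigh show | unresolved_neighbors
```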

I imagine that with routing table tricks and/or iptables/arptables
you may end up in a situation where the kernel is stuck in
"I don't know how to send these packets" mode forever.

You can strace ntpd, get a list of IPs it is trying to send packets
to, and then do "echo TEST | nc -u <ip> 123" for each of these.
Will nc's queue become nonempty (at least for some IPs)?
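
The strace-then-nc procedure could be scripted roughly as follows. This
is a sketch under assumptions: dest_ips is a made-up helper name, the
trace file path is arbitrary, and the pattern relies on strace printing
destinations as inet_addr("a.b.c.d") in its sendto() lines. The nc
timeout option may differ between netcat variants.

```sh
# Read strace output on stdin and print each distinct IPv4 address
# that appears as the destination of a sendto() call.
dest_ips() {
    grep 'sendto(' |
        sed -n 's/.*inet_addr("\([0-9.]*\)").*/\1/p' |
        sort -u
}

# Possible usage (stop strace with Ctrl-C after collecting for a while):
#   strace -e trace=sendto -o /tmp/ntpd.trace -p "$(pidof ntpd)"
#   for ip in $(dest_ips < /tmp/ntpd.trace); do
#       echo TEST | nc -u -w 1 "$ip" 123
#   done
# While the loop runs, watch netstat -na for a nonempty Send-Q on nc's socket.
```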
--
vda