I've noticed that old versions of traceroute no longer work properly
with the latest kernel (2.6.14). 2.6.13.4 is OK. I ran traceroute under
strace on both kernels and am posting the differences here. These are
from a 64-bit kernel using traceroute 0.6.2 as shipped with most
versions of SuSE. I have confirmed that the same problem is present in
a 32-bit kernel on a different machine. A later traceroute, 1.4a12,
works properly.

2.6.13.4:
poll([{fd=6, events=POLLERR, revents=POLLERR}, {fd=3, events=POLLERR},
{fd=4, events=POLLERR}, {fd=5, events=POLLERR}], 4, 1) = 1
recvmsg(6, {msg_name(16)={sa_family=AF_INET, sin_port=htons(33443),
sin_addr=inet_addr("69.10.132.115")}, msg_iov(0)=[], msg_controllen=80,
{cmsg_len=32, cmsg_level=SOL_SOCKET, cmsg_type=0x1d /* SCM_??? */, ...},
msg_flags=MSG_ERRQUEUE}, MSG_ERRQUEUE) = 0

2.6.14:
poll([{fd=3, events=POLLERR, revents=POLLERR}, {fd=4, events=POLLERR},
{fd=5, events=POLLERR}, {fd=6, events=POLLERR}, {fd=7, events=POLLERR},
{fd=8, events=POLLERR}], 6, 1) = 1
recvmsg(3, 0x7fffffa30960, MSG_ERRQUEUE) = -1 EAGAIN (Resource
temporarily unavailable)
....
....
poll([{fd=3, events=POLLERR, revents=POLLERR}, {fd=4, events=POLLERR,
revents=POLLERR}, {fd=5, events=POLLERR}, {fd=6, events=POLLERR}, {fd=7,
events=POLLERR}, {fd=8, events=POLLERR}, {fd=9, events=POLLERR}], 7, 1) = 2
recvmsg(3, 0x7fffffa30960, MSG_ERRQUEUE) = -1 EAGAIN (Resource
temporarily unavailable)
recvmsg(4, 0x7fffffa30960, MSG_ERRQUEUE) = -1 EFAULT (Bad address)
I'm up for more diagnostics if necessary.
Cheers
David
David R wrote:
> I've noticed that old versions of traceroute no longer work properly
> with the latest kernel. 2.6.13.4 is OK.
> [...]
Smells suspiciously similar to the BIND trouble that's been reported
here in the last few days:
http://lkml.org/lkml/2005/10/29/247
http://lkml.org/lkml/2005/10/30/32
You might like to look into it on the netdev list.
HTH,
Chris
--
Chris Boot
[email protected]
http://www.bootc.net/
Hi David,
* On Mon, Oct 31, 2005 at 07:30 PM (+0000), David R wrote:
> I've noticed that old versions of traceroute no longer work properly
> with the latest kernel. 2.6.13.4 is OK. I've done a bit of strace and am
> posting the differences here. These are from a 64 bit kernel using
> traceroute 0.6.2 as shipped with most versions of SuSE.
I am seeing exactly the same problem with traceroute-0.6.2 on
SuSE 10.0 with kernel 2.6.14. The machine is a dual Opteron with
single-core CPUs. So far I have tested this kernel version on that
machine only.
I've also tried the latest traceroute version (1.0.2) from
ftp://ftp.lst.de/pub/people/okir/traceroute/
and experienced the same behaviour.
Olaf Kirch has just sent me a patch against 2.6.14. It has also
been discussed on the netdev list.
This fixed it for me:
--- a/net/core/datagram.c	2005-11-01 11:38:31.000000000 +0100
+++ b/net/core/datagram.c	2005-11-01 11:38:45.000000000 +0100
@@ -213,6 +213,10 @@
 {
 	int i, err, fraglen, end = 0;
 	struct sk_buff *next = skb_shinfo(skb)->frag_list;
+
+	if (!len)
+		return 0;
+
 next_skb:
 	fraglen = skb_headlen(skb);
 	i = -1;
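
To illustrate why an early return for len == 0 can matter, here is a toy user-space model. It is not the kernel function. It assumes a copy loop that only reports success from inside the loop, once the remaining length has dropped to zero, and that treats running out of data as a fault. The struct chunk type, the copy_chunks() name and the guard flag are all made up for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a fragmented packet: a list of byte chunks. */
struct chunk {
    const char *data;
    size_t len;
};

/* Copy len bytes starting at offset out of the chunks into dst.
 * Returns 0 on success, -EFAULT if the data runs out first.
 * guard != 0 enables the early return for zero-length requests. */
int copy_chunks(const struct chunk *c, size_t n, size_t offset,
                char *dst, size_t len, int guard)
{
    size_t i, end = 0;

    if (guard && len == 0)
        return 0;               /* nothing requested: trivially done */

    for (i = 0; i < n; i++) {
        size_t start = end;

        end += c[i].len;
        if (end > offset) {
            size_t o = offset - start;  /* position inside this chunk */
            size_t copy = end - offset; /* bytes available from there */

            if (copy > len)
                copy = len;
            memcpy(dst, c[i].data + o, copy);
            dst += copy;
            offset += copy;
            len -= copy;
            if (len == 0)
                return 0;       /* success is only reported here */
        }
    }
    return -EFAULT;             /* fell off the end of the data */
}
```

In this model, a zero-length request against an empty chunk never satisfies end > offset, so without the guard it falls through to -EFAULT even though there was nothing to copy. With the guard it returns 0 straight away, which is the behaviour the recvmsg() call with an empty iovec relies on.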
Bye,
Steffen