2005-10-30 02:35:37

by Steinar H. Gunderson

[permalink] [raw]
Subject: BIND hangs with 2.6.14

[Please Cc me on any replies, I'm not subscribed]

Hi,

We upgraded one of our servers (single Opteron, running 64-bit kernel but
32-bit userland) from 2.6.11.9 to 2.6.14 (with the additional NFS patches,
but that shouldn't really matter) today, and now BIND seems to hang every few
hours. (Everything on the machine except for the kernel is Debian sarge, so
we're using BIND 9.2.4 and glibc 2.3.2, with NPTL.) I'm unsure what's really
happening, but it doesn't respond to any requests at all, a plain strace on
the process gives nothing, ltrace gives nothing, and it doesn't use any CPU.

gdb shows four threads, one of them in sigsuspend, another in select, a third
in __JCR_LIST__ and the fourth just showing garbage. I'm sorry I can't be
more specific here -- I can't find a reliable way to provoke it into this
hanging behaviour, but I've got an strace running now to at least have _some_
tracking information when it goes awry.

Does anybody have a clue as of what might break it in this way? I've skimmed
the changelogs, but couldn't find anything obvious.

/* Steinar */
--
Homepage: http://www.sesse.net/


2005-10-30 10:11:52

by bert hubert

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Sun, Oct 30, 2005 at 03:35:57AM +0100, Steinar H. Gunderson wrote:

> We upgraded one of our servers (single Opteron, running 64-bit kernel but
> 32-bit userland) from 2.6.11.9 to 2.6.14 (with the additional NFS patches,
> but that shouldn't really matter) today, and now BIND seems to hang every few
> hours. (Everything on the machine except for the kernel is Debian sarge, so
> we're using BIND 9.2.4 and glibc 2.3.2, with NPTL.) I'm unsure what's really
> happening, but it doesn't respond to any requests at all, a plain strace on
> the process gives nothing, ltrace gives nothing, and it doesn't use any CPU.

Is BIND touching anything on NFS?

--
http://www.PowerDNS.com Open source, database driven DNS Software
http://netherlabs.nl Open and Closed source services

2005-10-30 10:44:39

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Sun, Oct 30, 2005 at 02:44:15AM +0000, Paul Erkkila wrote:
> You check dmesg for kernel tracebacks in udp6_receive? or somewhere
> there about? I've hit a bug there I havn't had a chance to debug yet.

No kernel tracebacks, sorry.

/* Steinar */
--
Homepage: http://www.sesse.net/

2005-10-30 10:45:05

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Sun, Oct 30, 2005 at 11:11:48AM +0100, bert hubert wrote:
>> We upgraded one of our servers (single Opteron, running 64-bit kernel but
>> 32-bit userland) from 2.6.11.9 to 2.6.14 (with the additional NFS patches,
>> but that shouldn't really matter) today, and now BIND seems to hang every few
>> hours. (Everything on the machine except for the kernel is Debian sarge, so
>> we're using BIND 9.2.4 and glibc 2.3.2, with NPTL.) I'm unsure what's really
>> happening, but it doesn't respond to any requests at all, a plain strace on
>> the process gives nothing, ltrace gives nothing, and it doesn't use any CPU.
> Is BIND touching anything on NFS?

No, it isn't.

/* Steinar */
--
Homepage: http://www.sesse.net/

2005-10-30 11:36:27

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Sun, Oct 30, 2005 at 12:00:21PM +0100, bert hubert wrote:
> Ok - no further ideas then. I think people would be interested in the strace
> you describe as 'garbage'.

The strace is a little big (1.5GB), but sure, people can have it if they're
interested. OTOH, I just noticed that syslogd shows:

Oct 30 09:55:50 cirkus named[13364]: errno2result.c:109: unexpected error:
Oct 30 09:55:50 cirkus named[13364]: unable to convert errno to isc_result: 14: Bad address
Oct 30 09:55:50 cirkus named[13364]: UDP client handler shutting down due to fatal receive error: unexpected error

...and after that, everything seems to crash and burn. The related strace
call is:

[pid 13365] recvmsg(22, 0x561329b0, 0) = -1 EFAULT (Bad address)

Might this be a BIND bug instead? In that case, why doesn't it show up with
2.6.11.9? I've restarted BIND now without NPTL, to see if it might be
thread-related.

> Alternatively, give PowerDNS 2.9.19 a try :-) It reads most bind named.conf
> files directly, especially if you don't do views, dynamic updates etc.

We're not going to change DNS server software anytime soon on our production
servers, but thanks for the tip :-)

/* Steinar */
--
Homepage: http://www.sesse.net/

2005-10-30 11:45:41

by bert hubert

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Sun, Oct 30, 2005 at 12:36:51PM +0100, Steinar H. Gunderson wrote:
> [pid 13365] recvmsg(22, 0x561329b0, 0) = -1 EFAULT (Bad address)
>
> Might this be a BIND bug instead? In that case, why doesn't it show up with
> 2.6.11.9? I've restarted BIND now without NPTL, to see if it might be
> thread-related.

Check if the address passed, 0x561329b0, is very different from addresses
passed during regular operations. The error the kernel returns basically
says that this address is bogus, which it might be, but if more or less the
same address worked previously chances are that the kernel is confused

Switching to/from NPTL might very well change the address btw.

Good luck.

--
http://www.PowerDNS.com Open source, database driven DNS Software
http://netherlabs.nl Open and Closed source services

2005-10-30 11:54:07

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Sun, Oct 30, 2005 at 12:45:38PM +0100, bert hubert wrote:
> Check if the address passed, 0x561329b0, is very different from addresses
> passed during regular operations. The error the kernel returns basically
> says that this address is bogus, which it might be, but if more or less the
> same address worked previously chances are that the kernel is confused

Yes, I know what EFAULT means. The interesting part is that I can find one
single call to recvmsg() in the UNIX part of BIND, and there the argument is
on the stack...

strace doesn't show the address except when something fails, but these are
the distinct calls to recvmsg with given addresses (sorted):

[pid 13365] recvmsg(20, 0x56131f60, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13365] recvmsg(22, 0x56131f60, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13365] recvmsg(22, 0x56132980, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13365] recvmsg(22, 0x561329b0, 0) = -1 EFAULT (Bad address)
[pid 13365] recvmsg(23, 0x56131ef0, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13365] recvmsg(23, 0x56132540, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13365] recvmsg(23, 0x56132870, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13365] recvmsg(23, 0x561328a0, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13365] recvmsg(23, 0x56132940, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13365] recvmsg(23, 0x561329b0, 0) = -1 ECONNRESET (Connection reset by peer)
[pid 13365] recvmsg(26, 0x56132910, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13365] recvmsg(28, 0x561324f0, 0) = -1 EFAULT (Bad address)
[pid 13365] recvmsg(28, 0x56132840, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13365] recvmsg(28, 0x56132840, 0) = -1 EFAULT (Bad address)
[pid 13365] recvmsg(29, 0x56132840, 0) = -1 EAGAIN (Resource temporarily unavailable)

Note that there's multiple EFAULTs in there, and some even go against an
address that was valid at some point... I think there's something fishy in
here.

/* Steinar */
--
Homepage: http://www.sesse.net/

2005-10-30 14:22:29

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Sun, Oct 30, 2005 at 12:45:38PM +0100, bert hubert wrote:
> Switching to/from NPTL might very well change the address btw.

FWIW, I tried without NPTL, and it gave the same problem after an hour or so.

I've compiled some extra debugging code (printing the msg_hdr structure if
recvmsg() should fail with a hard error) into named, so I'm waiting for the
problem to manifest itself again.

/* Steinar */
--
Homepage: http://www.sesse.net/

2005-10-30 14:49:04

by bert hubert

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Sun, Oct 30, 2005 at 03:22:23PM +0100, Steinar H. Gunderson wrote:

> I've compiled some extra debugging code (printing the msg_hdr structure if
> recvmsg() should fail with a hard error) into named, so I'm waiting for the
> problem to manifest itself again.

Sounds like a wise move, the msg_hdr structure again contains pointers which
might point to invalid memory.

What you could try to do is master git and get it to output a diff of
relevant files from your previous version to 2.6.14. Sadly, that might be a
lot of files.. If you start tracing from udp_recvmsg you quickly end up in a
heap of files which might be generating EFAULT.

Good luck!

--
http://www.PowerDNS.com Open source, database driven DNS Software
http://netherlabs.nl Open and Closed source services

2005-10-30 18:54:41

by Ray Lee

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

> We upgraded one of our servers (single Opteron, running 64-bit kernel but
> 32-bit userland) from 2.6.11.9 to 2.6.14 (with the additional NFS patches,
> but that shouldn't really matter) today, and now BIND seems to hang every few
> hours. (Everything on the machine except for the kernel is Debian sarge, so
> we're using BIND 9.2.4 and glibc 2.3.2, with NPTL.) I'm unsure what's really
> happening, but it doesn't respond to any requests at all, a plain strace on
> the process gives nothing, ltrace gives nothing, and it doesn't use any CPU.

It seems a not unreasonable assumption that 2.6.13 works okay, or
there would have been reports of unhappiness (though that is a pure
assumption). Since it only takes a few hours to get the problem to
occur, is it feasible to try to bisect the problem space by testing
some snapshots between 2.6.13 and 2.6.14?

Ray

2005-10-30 19:05:42

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Sun, Oct 30, 2005 at 11:54:37AM -0700, Ray Lee wrote:
> It seems a not unreasonable assumption that 2.6.13 works okay, or
> there would have been reports of unhappiness (though that is a pure
> assumption). Since it only takes a few hours to get the problem to
> occur, is it feasible to try to bisect the problem space by testing
> some snapshots between 2.6.13 and 2.6.14?

Unfortunately, the machine does quite a bit of other work apart from BIND, so
unless somebody can reproduce this on another machine, it will be a bit
difficult.

I have a run going in valgrind now to see if it can find anything bad about
the pointers in the msg_hdr structure (the structure itself appears to be OK,
judging from my printf-debugging); it's been going for a few hours, so I hope
it will be entering its zombie mode now soon :-)

/* Steinar */
--
Homepage: http://www.sesse.net/

2005-10-30 21:40:55

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Sun, Oct 30, 2005 at 08:05:38PM +0100, Steinar H. Gunderson wrote:
> I have a run going in valgrind now to see if it can find anything bad about
> the pointers in the msg_hdr structure (the structure itself appears to be OK,
> judging from my printf-debugging); it's been going for a few hours, so I hope
> it will be entering its zombie mode now soon :-)

I finally caught it with gdb, after inserting some debug probes. Excerpts
(removed a few syntax errors):

(gdb) bt
#0 0xffffe405 in __kernel_vsyscall ()
#1 0x55840885 in raise () from /lib/tls/i686/cmov/libc.so.6
#2 0x55842002 in abort () from /lib/tls/i686/cmov/libc.so.6
#3 0x557e1383 in doio_recv (sock=0x80d1230, dev=0x8197ac8) at socket.c:917
#4 0x557e41d5 in internal_recv (me=0x81975a0, ev=0x80d1284) at socket.c:2012
#5 0x557d6259 in dispatch (manager=0x8094960) at task.c:855
#6 0x557d64c7 in run (uap=0x8094960) at task.c:998
#7 0x5580cca3 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#8 0x558eff5a in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb) up
#1 0x55840885 in raise () from /lib/tls/i686/cmov/libc.so.6
(gdb)
#2 0x55842002 in abort () from /lib/tls/i686/cmov/libc.so.6
(gdb)
#3 0x557e1383 in doio_recv (sock=0x80d1230, dev=0x8197ac8) at socket.c:917
917 abort();
(gdb) print msghdr
$1 = {msg_name = 0x8197b14, msg_namelen = 28, msg_iov = 0x561519e0, msg_iovlen = 1, msg_control = 0x809a810, msg_controllen = 52, msg_flags = 0}
(gdb) print msghdr.msg_name
$2 = (void *) 0x8197b14
(gdb) print (char *)msghdr.msg_name
$3 = 0x8197b14 ""
(gdb) print ((char *)msghdr.msg_name)[0]
$4 = 0 '\0'
(gdb) print ((char *)msghdr.msg_name)[27]
$5 = 0 '\0'
(gdb) print ((char *)msghdr.msg_control)[0]
$6 = 20 '\024'
(gdb) print ((char *)msghdr.msg_control)[51]
$7 = -66 '¾'
(gdb) print *(msghdr.msg_iov)
$9 = {iov_base = 0x8171208, iov_len = 4096}
(gdb) print ((char*)msghdr.msg_iov.iov_base)[0]
$10 = -30 'â'
(gdb) print ((char*)msghdr.msg_iov.iov_base)[4095]
$11 = -66 '¾'
(gdb) print sock->fd
$12 = 22
(gdb) print recvmsg(sock->fd, &msghdr, 0)
$14 = 42

IOW, the call that just failed suddenly worked in the debugger. I can't
really believe this is a BIND bug anymore... I'm lost here. Anyone?

/* Steinar */
--
Homepage: http://www.sesse.net/

2005-10-31 07:20:28

by Lionel Elie Mamane

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Sun, Oct 30, 2005 at 12:54:30PM +0100, Steinar H. Gunderson wrote:
> On Sun, Oct 30, 2005 at 12:45:38PM +0100, bert hubert wrote:

>> Check if the address passed, 0x561329b0, is very different from addresses
>> passed during regular operations.

> strace doesn't show the address except when something fails, but these are
> the distinct calls to recvmsg with given addresses (sorted):

Try "strace -e verbose=none" to always get an address.

--
Lionel

2005-10-31 10:18:06

by Stefan Schmidt,,,

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

In linux.kernel, you wrote:
> Unfortunately, the machine does quite a bit of other work apart from BIND, so
> unless somebody can reproduce this on another machine, it will be a bit
> difficult.

I saw the same error with 2.6.14-rc5 and filed bug #5505 against it. I
was also able to reproduce the behaviour using the same kernel and same
library/compiler versions on another machine using the same set of zones
and tcpreplay to simulate the query load. Our machine gets around 10
queries per second at ~40k slave zones. BIND version is 9.3.2v1 which
runs smoothly on our in-production nameserver with kernel 2.6.13.4 now.

netdev and the usual suspects are aware of the problem.

Stefan

2005-10-31 10:28:10

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Mon, Oct 31, 2005 at 08:20:20AM +0100, Lionel Elie Mamane wrote:
>> strace doesn't show the address except when something fails, but these are
>> the distinct calls to recvmsg with given addresses (sorted):
> Try "strace -e verbose=none" to always get an address.

Well, the address was definitely valid (as the gdb session showed). Anyhow,
the machine is down to 2.6.13.4 now (otherwise same .config file, etc.), and
it seems to have fixed the problem (>10 hours with no problem so far).

/* Steinar */
--
Homepage: http://www.sesse.net/

2005-10-31 16:52:31

by Andrew Walrond

[permalink] [raw]
Subject: Re: BIND hangs with 2.6.14

On Monday 31 October 2005 10:17, Stefan Schmidt,,, wrote:
> In linux.kernel, you wrote:
> > Unfortunately, the machine does quite a bit of other work apart from
> > BIND, so unless somebody can reproduce this on another machine, it will
> > be a bit difficult.
>
> I saw the same error with 2.6.14-rc5 and filed bug #5505 against it. I

I'm suprised 2.6.14 went out the door without this being resolved. I'm glad I
spotted this thread in time to avoid upgrading my servers! I guess it was
reported on the cusp of the release though.

> was also able to reproduce the behaviour using the same kernel and same
> library/compiler versions on another machine using the same set of zones
> and tcpreplay to simulate the query load. Our machine gets around 10
> queries per second at ~40k slave zones. BIND version is 9.3.2v1 which
> runs smoothly on our in-production nameserver with kernel 2.6.13.4 now.
>
> netdev and the usual suspects are aware of the problem.

I guess I'll follow this on bugzilla.

Andrew Walrond