2005-02-24 10:19:02

by Jörn Nettingsmeier

Subject: FUTEX deadlock in ping?

hi !


disclaimer: i'm not a kernel guy ;)

after reading the FUTEX deadlock thread
(http://thread.gmane.org/gmane.linux.kernel/280900), i was wondering:

ever since moving to ldap for passwd/group/shadow/hosts lookup, ping to
a non-reachable host just freezes up and never returns:

spunk:~ # strace ping herrnilsson
execve("/bin/ping", ["ping", "herrnilsson"], [/* 61 vars */]) = 0
uname({sys="Linux", node="spunk", ...}) = 0
brk(0) = 0x8063000
...
...
munmap(0x40504000, 4096) = 0
brk(0x80a5000) = 0x80a5000
uname({sys="Linux", node="spunk", ...}) = 0
futex(0x401540f4, FUTEX_WAIT, 2, NULL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

is this one related to the FUTEX problem olof described?


best,

jörn


ps: i'd appreciate being cc:ed on replies. thanks.



for the record:

spunk:~ # uname -a
Linux spunk 2.6.8-24.11-smp #1 SMP Fri Jan 14 13:01:26 UTC 2005 i686
i686 i386 GNU/Linux

SuSE 9.2

problem happens also on ia32 UP (same version as before) and amd64 UP
(2.6.11-rc4-bk7)

ldap lookup is ok, for instance

spunk:~ # getent hosts herrnilsson
192.168.0.3 herrnilsson.villakunterbunt.netz herrnilsson

traceroute and others work as well.

on an otherwise identical system without ldap, ping correctly gives
"unreachable" messages.


2005-02-24 12:08:52

by bert hubert

Subject: Re: FUTEX deadlock in ping?

On Thu, Feb 24, 2005 at 11:14:45AM +0100, Jörn Nettingsmeier wrote:

> ever since moving to ldap for passwd/group/shadow/hosts lookup, ping to
> a non-reachable host just freezes up and never returns:
>
> spunk:~ # strace ping herrnilsson
> execve("/bin/ping", ["ping", "herrnilsson"], [/* 61 vars */]) = 0
> uname({sys="Linux", node="spunk", ...}) = 0
> brk(0) = 0x8063000
> ...
> ...
> munmap(0x40504000, 4096) = 0
> brk(0x80a5000) = 0x80a5000
> uname({sys="Linux", node="spunk", ...}) = 0
> futex(0x401540f4, FUTEX_WAIT, 2, NULL
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Try ping -n. This is most likely something else.

--
http://www.PowerDNS.com Open source, database driven DNS Software
http://netherlabs.nl Open and Closed source services

2005-02-24 13:44:34

by Joern Nettingsmeier

Subject: Re: FUTEX deadlock in ping?

hi bert!

thanks for your reply.

bert hubert wrote:
> On Thu, Feb 24, 2005 at 11:14:45AM +0100, Jörn Nettingsmeier wrote:
>
>
>>ever since moving to ldap for passwd/group/shadow/hosts lookup, ping to
>>a non-reachable host just freezes up and never returns:
>>
>>spunk:~ # strace ping herrnilsson
>>execve("/bin/ping", ["ping", "herrnilsson"], [/* 61 vars */]) = 0
>>uname({sys="Linux", node="spunk", ...}) = 0
>>brk(0) = 0x8063000
>>...
>>...
>>munmap(0x40504000, 4096) = 0
>>brk(0x80a5000) = 0x80a5000
>>uname({sys="Linux", node="spunk", ...}) = 0
>>futex(0x401540f4, FUTEX_WAIT, 2, NULL
>>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>
> Try ping -n. This is most likely something else.
>

hmm. ping -n would still have to look up the hostname, and indeed the
strace is identical (hanging). but when i ping the ip (this is probably
what you had in mind), it works as expected.

anyway, why would a process block endlessly in a futex?
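
for illustration only: one generic way a process ends up parked in a futex wait with a NULL timeout (i.e. no time limit) is a userspace lock that is never released, for example a thread taking a non-recursive lock it already holds. the sketch below shows that pattern in python. it is an assumption for the sake of the example, not a diagnosis of what ping or the ldap lookup code is doing here, and the function name is made up. a timeout is used so the sketch returns; without it the second acquire would wait forever.

```python
import threading


def second_acquire_succeeds(timeout=0.2):
    """Try to take a non-recursive lock twice from the same thread.

    The second acquire cannot succeed: the only thread that could
    release the lock is the one waiting for it. The timeout exists
    purely so this sketch terminates.
    """
    lock = threading.Lock()
    lock.acquire()                          # first acquire succeeds at once
    got_it = lock.acquire(timeout=timeout)  # second acquire waits, then gives up
    lock.release()                          # undo the first acquire
    return got_it


if __name__ == "__main__":
    print("second acquire succeeded:", second_acquire_succeeds())
```

on linux, running a blocking variant of this under strace would typically show the thread sitting in a futex wait, much like the trace above, though the exact futex operation printed depends on the libc and interpreter in use.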


2005-02-24 15:00:37

by Olof Johansson

Subject: Re: FUTEX deadlock in ping?

On Thu, Feb 24, 2005 at 11:14:45AM +0100, Jörn Nettingsmeier wrote:

> futex(0x401540f4, FUTEX_WAIT, 2, NULL
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> is this one related to the FUTEX problem olof described?

As bert said, it's likely something else. Is the process killable, and
does "ps aux" complete? If so, then this is a different problem.


-Olof
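
A rough way to script the "is it killable" half of that check is to look at the task's state letter in /proc/<pid>/stat: "S" is an interruptible sleep, which signals can break, while "D" is an uninterruptible sleep, which they cannot. The sketch below only parses that field. The helper names are hypothetical, and treating "D" as the sign of a kernel-side hang is a simplification assumed for this example, since a task can also be in "D" briefly during ordinary I/O.

```python
import sys


def proc_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split after the last ')' rather than on whitespace.
    rest = stat_line[stat_line.rindex(")") + 1:]
    return rest.split()[0]


def looks_uninterruptible(stat_line):
    """True if the task is in uninterruptible sleep (state 'D')."""
    return proc_state(stat_line) == "D"


if __name__ == "__main__":
    # Usage: python3 state.py <pid>   (defaults to this process)
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    with open(f"/proc/{pid}/stat") as f:
        line = f.read()
    print(pid, proc_state(line), "uninterruptible:", looks_uninterruptible(line))
```

A ping stuck in a plain userspace futex wait would be expected to show "S" here and to go away on kill, which matches the "different problem" case.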

2005-02-24 15:39:33

by Jörn Nettingsmeier

Subject: Re: FUTEX deadlock in ping?

Olof Johansson wrote:
> On Thu, Feb 24, 2005 at 11:14:45AM +0100, Jörn Nettingsmeier wrote:
>
>
>>futex(0x401540f4, FUTEX_WAIT, 2, NULL
>>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>>is this one related to the FUTEX problem olof described?
>
>
> As bert said, it's likely something else. Is the process killable, and
> does "ps aux" complete?

yes and yes.

> If so, then this is a different problem.

too bad. i thought i had finally found a clue.. sorry for the noise, and
many thanks for explaining!