Hi,
I have a somewhat unclear bug report against the 2.6 kernel, where
the sysrq output seems to indicate there's a deadlock on sv_lock
somewhere. There is a stuck process in tcp_data_ready, but
unfortunately the sysrq output is not complete, so I don't know what
the other CPUs were doing at the time.
Looking at the code, I did notice however that some new code was
added recently (svc_defer, svc_revisit) that uses spin_lock instead of
spin_lock_bh when grabbing the sv_lock.
So it seems there's potential for deadlock if TCP data arrives while
one of these new functions holds sv_lock.
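To make the scenario concrete, here is a small user-space toy model in C.
It is only a sketch, not kernel code: every toy_* name is made up for
illustration, and it assumes the data-ready callback runs in softirq
context on the same CPU and needs the same lock. A real spinlock would
spin forever where the model merely records a deadlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive spinlock (hypothetical, not the kernel type). */
struct toy_lock { bool held; };

/* Toy model of one CPU's bottom-half state. */
struct toy_cpu {
    bool bh_disabled;      /* models bottom halves being disabled */
    bool softirq_pending;  /* a softirq was raised while BHs were off */
    bool deadlocked;       /* the handler found the lock already held */
};

/* Models the data-ready callback: it needs the lock. */
static void toy_data_ready(struct toy_cpu *cpu, struct toy_lock *l)
{
    if (l->held) {
        /* The real handler would spin here forever on the same CPU. */
        cpu->deadlocked = true;
        return;
    }
    l->held = true;   /* take the lock, do the work ... */
    l->held = false;  /* ... and release it again */
}

/* Models TCP data arriving: run the handler now, or defer it. */
static void toy_packet_arrives(struct toy_cpu *cpu, struct toy_lock *l)
{
    if (cpu->bh_disabled)
        cpu->softirq_pending = true;
    else
        toy_data_ready(cpu, l);
}

/* Plain variant: takes the lock but leaves bottom halves enabled. */
static void toy_lock_plain(struct toy_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void toy_unlock_plain(struct toy_lock *l)
{
    l->held = false;
}

/* _bh variant: disables bottom halves before taking the lock. */
static void toy_lock_bh(struct toy_cpu *cpu, struct toy_lock *l)
{
    cpu->bh_disabled = true;
    assert(!l->held);
    l->held = true;
}

/* Releases the lock, re-enables bottom halves, runs deferred work. */
static void toy_unlock_bh(struct toy_cpu *cpu, struct toy_lock *l)
{
    l->held = false;
    cpu->bh_disabled = false;
    if (cpu->softirq_pending) {
        cpu->softirq_pending = false;
        toy_data_ready(cpu, l);
    }
}

/* Process context holds the lock while a packet arrives.
 * Returns true if the modelled handler would have deadlocked. */
bool toy_simulate(bool use_bh)
{
    struct toy_lock sv_lock = { false };
    struct toy_cpu cpu = { false, false, false };

    if (use_bh) {
        toy_lock_bh(&cpu, &sv_lock);
        toy_packet_arrives(&cpu, &sv_lock);
        toy_unlock_bh(&cpu, &sv_lock);
    } else {
        toy_lock_plain(&sv_lock);
        toy_packet_arrives(&cpu, &sv_lock);
        toy_unlock_plain(&sv_lock);
    }
    return cpu.deadlocked;
}
```

In this model toy_simulate(false) reports the deadlock, while
toy_simulate(true) defers the handler until the lock has been dropped.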
Comments?
Olaf
--
Olaf Kirch | The Hardware Gods hate me.
[email protected] |
---------------+
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
Hello Olaf,
> I have a somewhat unclear bug report against the 2.6 kernel, where
> the sysrq output seems to indicate there's a deadlock on sv_lock
> somewhere. There is a stuck process in tcp_data_ready, but
> unfortunately the sysrq output is not complete, so I don't know what
> the other CPUs were doing at the time.
>
> Looking at the code, I did notice however that some new code was
> added recently (svc_defer, svc_revisit) that uses spin_lock instead of
> spin_lock_bh when grabbing the sv_lock.
>
> So it seems there's potential for deadlock if TCP data arrives while
> one of these new functions holds sv_lock.
>
> Comments?
When we tried to use 2.6.7, it happened twice that the machine kept running
but the clients simply could not reach the NFS server.
We didn't have much time to look into it and simply rebooted the server;
however, before doing that we issued sysrq+t.
I've attached the full trace; maybe it's related to yours?
Cheers,
Bernd
PS: Unfortunately I can't read those traces, oopses, etc. Is there any
documentation on how to read them?