2006-01-19 13:17:12

by Trond Myklebust

Subject: Lock recursion in rpc_pipefs+auth_gss...

_______________________________________________
NFSv4 mailing list
[email protected]
http://linux-nfs.org/cgi-bin/mailman/listinfo/nfsv4


Attachments:
linux-2.6.16-03-gss_lock_recursion.dif (6.21 kB)
(No filename) (138.00 B)

2006-01-19 19:27:11

by Daniel Phillips

Subject: Re: Lock recursion in rpc_pipefs+auth_gss...

Hi Trond,

You wrote:
> I believe I've finally figured out what is causing the Oopses that Vince
> and Vincent were seeing. It all boils down to a mutex being held after
> the inode is released due to a deadlock situation.

That in itself should not cause an oops. I would think there is a
reference-counting problem still lurking. I'm a little concerned that a
patch like this one may just make the oops a lot rarer without solving it.

Regards,

Daniel


-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems? Stop! Download the new AJAX search engine that makes
searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-01-19 23:09:53

by Myklebust, Trond

Subject: Re: Re: Lock recursion in rpc_pipefs+auth_gss...

On Thu, 2006-01-19 at 11:27 -0800, Daniel Phillips wrote:

> That in itself should not cause an oops. I would think there is a
> reference-counting problem still lurking. I'm a little concerned that a
> patch like this one may just make the oops a lot rarer without solving it.

The lock recursion is _real_: I've been able to trigger it on
2.6.16-rc1. The only question here is therefore whether or not it
suffices to explain the Oops.

Cheers,
Trond



2006-01-20 01:18:53

by Daniel Phillips

Subject: Re: Re: Lock recursion in rpc_pipefs+auth_gss...

Trond Myklebust wrote:
> On Thu, 2006-01-19 at 11:27 -0800, Daniel Phillips wrote:
> > That in itself should not cause an oops. I would think there is a
> > reference-counting problem still lurking. I'm a little concerned that a
> > patch like this one may just make the oops a lot rarer without solving it.
>
> The lock recursion is _real_: I've been able to trigger it on
> 2.6.16-rc1. The only question here is therefore whether or not it
> suffices to explain the Oops.

I don't see how it could.

I do not doubt that the lock recursion is real, but holding the i_sem/mutex
for a long time seems to be one of the things we need to do to trigger the
oops, so this deadlock is our friend. Does it always trigger the oops? Do
you have a recipe handy so we can try it here?

The symptoms we see suggest the pipe inode was freed while somebody was
waiting to get the i_mutex. What prevents this?

Regarding the lock recursion itself, is there any good reason to serialize
upcalls against downcalls in the first place?

Regards,

Daniel
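
The question above about what keeps the pipe inode alive while somebody waits on its mutex is, in general terms, a reference-counting question. The sketch below shows one common user-space pattern, assuming a hypothetical `struct demo_obj` in place of the inode: a waiter takes its own reference before it may sleep on the lock, so the object cannot be freed underneath it. It illustrates the general technique only and makes no claim about how rpc_pipefs does or should handle this.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object standing in for the pipe inode. */
struct demo_obj {
    atomic_int refcount;
    pthread_mutex_t lock;
};

/* Number of objects actually freed; exists only so the sketch can be checked. */
static atomic_int demo_frees;

/* Allocate an object holding one reference, owned by the caller. */
static struct demo_obj *demo_alloc(void)
{
    struct demo_obj *obj = malloc(sizeof(*obj));

    if (!obj)
        return NULL;
    atomic_init(&obj->refcount, 1);
    pthread_mutex_init(&obj->lock, NULL);
    return obj;
}

/* Take an extra reference. The caller must already hold a valid one. */
static void demo_get(struct demo_obj *obj)
{
    atomic_fetch_add(&obj->refcount, 1);
}

/* Drop a reference; the object is freed only when the last one goes away. */
static void demo_put(struct demo_obj *obj)
{
    if (atomic_fetch_sub(&obj->refcount, 1) == 1) {
        pthread_mutex_destroy(&obj->lock);
        free(obj);
        atomic_fetch_add(&demo_frees, 1);
    }
}

/* Pin the object for the whole time we may sleep on, or hold, its lock. */
static void demo_locked_op(struct demo_obj *obj)
{
    demo_get(obj);
    pthread_mutex_lock(&obj->lock);
    /* ... work on obj while holding the lock ... */
    pthread_mutex_unlock(&obj->lock);
    demo_put(obj);
}
```

With this pattern, an owner calling `demo_put()` while a waiter still holds its own reference only decrements the count; the memory is released by whichever `demo_put()` call drops the last reference.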

