2014-01-03 19:16:16

by Knut Petersen

Subject: [BUG 3.13.0-rc6] reiserfs possible circular locking dependency

Rebooting after a power failure on an openSuSE 13.1 system
with kernel 3.13.0-rc6 triggered the attached lockdep warning.

cu,
Knut


Attachments:
lockdep_reiser (7.94 kB)

2014-01-03 19:46:33

by Linus Torvalds

Subject: Re: [BUG 3.13.0-rc6] reiserfs possible circular locking dependency

On Fri, Jan 3, 2014 at 11:16 AM, Knut Petersen
<[email protected]> wrote:
> Rebooting after a power failure on an openSuSE 13.1 system
> with kernel 3.13.0-rc6 triggered the attached lockdep warning.

Hmm. It seems to me that the *normal* sequence should be:

- get i_mutex, call lookup, which gets sbi->lock (reiserfs_write_lock)

but in the mounting path, we have special circumstances.

That finish_unfinished() function does

- reiserfs_write_lock_nested()
- remove_save_link
- iput(inode) with the write lock held

and that can apparently end up taking i_mutex in open_xa_dir (and then
recursively the write lock, but that's an explicitly recursive lock,
so that part should be ok).
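
As a rough illustration of the two orderings described above, here is a
minimal userspace sketch of an order tracker in the spirit of lockdep. It
is not kernel code and not how lockdep is implemented: the tracker, its
function names and the two lock classes are assumptions made up for this
example, and it only catches a direct A-then-B versus B-then-A inversion,
not longer dependency chains.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes; only the lock names come from the thread. */
enum lock_class { I_MUTEX, WRITE_LOCK, NR_CLASSES };

/* before[a][b] becomes true once b has been taken while a was held. */
struct order_tracker {
    bool before[NR_CLASSES][NR_CLASSES];
    bool held[NR_CLASSES];
};

static void tracker_init(struct order_tracker *t)
{
    memset(t, 0, sizeof(*t));
}

/* Record an acquisition; return false if it inverts an order seen earlier. */
static bool tracker_acquire(struct order_tracker *t, enum lock_class c)
{
    bool ok = true;

    for (int h = 0; h < NR_CLASSES; h++) {
        if (!t->held[h] || h == (int)c)
            continue;               /* not held, or a recursive acquire */
        if (t->before[c][h])
            ok = false;             /* c -> h was recorded, now h -> c */
        t->before[h][c] = true;
    }
    t->held[c] = true;
    return ok;
}

static void tracker_release(struct order_tracker *t, enum lock_class c)
{
    t->held[c] = false;
}

/* Replay both paths; returns true when the mount path is flagged. */
static bool replay_paths(void)
{
    struct order_tracker t;
    tracker_init(&t);

    /* Normal path: i_mutex first, then the write lock in lookup. */
    bool ok = tracker_acquire(&t, I_MUTEX);
    ok = tracker_acquire(&t, WRITE_LOCK) && ok;
    tracker_release(&t, WRITE_LOCK);
    tracker_release(&t, I_MUTEX);
    assert(ok);

    /* Mount path: write lock held, then iput ends up taking i_mutex. */
    ok = tracker_acquire(&t, WRITE_LOCK);
    assert(ok);
    bool inverted = !tracker_acquire(&t, I_MUTEX);
    tracker_release(&t, I_MUTEX);
    tracker_release(&t, WRITE_LOCK);
    return inverted;
}
```

The tracker complains on the second path even though nothing ran
concurrently, which is the same reason a warning can show up at mount
time without an actual deadlock being possible.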

Now, I don't think this can *really* deadlock with the normal order of
operations, because during mounting there is no other process that can
take those in the reverse order (since the filesystem isn't live), but
I do wonder if we should just release the reiserfs write lock over the
iputs. We release it in other parts anyway (like for the quota off).

Jeff, you already touched this exact case in commit d2d0395fd177
("reiserfs: locking, release lock around quota operations") except
that was for those quota operation cases.

Even if it's not a real problem, making lockdep happy sounds like a
good idea. Of course, the trouble is that this code path almost never
gets exercised (which is why this hasn't been noticed earlier), so
testing...

Jeff? Comments?

Linus

2014-01-03 22:04:50

by Jeff Mahoney

Subject: Re: [BUG 3.13.0-rc6] reiserfs possible circular locking dependency

On 1/3/14, 2:46 PM, Linus Torvalds wrote:
> On Fri, Jan 3, 2014 at 11:16 AM, Knut Petersen
> <[email protected]> wrote:
>> Rebooting after a power failure on an openSuSE 13.1 system
>> with kernel 3.13.0-rc6 triggered the attached lockdep warning.
>
> Hmm. It seems to be that the *normal* sequence should be:
>
> - get i_mutex, call lookup, which gets sbi->lock (reiserfs_write_lock)
>
> but in the mounting path, we have special circumstances.
>
> That finish_unfinished() function does
>
> - reiserfs_write_lock_nested() .
> - remove_save_link
> - iput(inode) with the write lock held
>
> and that can apparently end up taking i_mutex in open_xa_dir (and then
> recursively the write lock, but that's an explicitly recursive lock,
> so that part should be ok).
>
> Now, I don't think this can *really* deadlock with the normal order of
> operations, because during mounting there is no other process that can
> take those in the reverse order (since the filesystem isn't live), but
> I do wonder if we should just release the reiserfs write lock over the
> iputs. We release it in other parts anyway (like for the quota off)
>
> Jeff, you already touched this exact case in commit d2d0395fd177
> ("reiserfs: locking, release lock around quota operations") except
> that was for those quota operation cases.
>
> Even if it's not a real problem, making lockdep happy sounds like a
> good idea. Of course, the trouble is that this code path almost never
> gets exercised (which is why this hasn't been noticed earlier), so
> testing...
>
> Jeff? Comments?

If someone ever invents a time machine, I'd go back to 2004 and tell
myself to fight harder to make a reiserfs v3.7 with real extended
attribute items. This code will haunt me to my death.

Anyway, yeah. The right thing here is to drop the lock for the iput.
More than that would be ok too. finish_unfinished happens when the file
system goes read-write and that includes the remount path. There can be
other users of the file system but it would be a recursive acquire so we
wouldn't actually deadlock there.
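
The "drop the lock for the iput" idea can be sketched roughly as follows.
This is a single-threaded userspace model, not the eventual patch: the
struct and every helper name below are invented for the example (loosely
echoing the reiserfs_write_lock_nested() call mentioned earlier in the
thread), and the lock is reduced to a nesting-depth counter so the
save-and-restore of the depth is easy to see.

```c
#include <assert.h>

/* Hypothetical model: only the nesting depth of the lock is tracked. */
struct write_lock {
    int depth;                      /* 0 means not held */
};

static void wl_lock(struct write_lock *l)
{
    l->depth++;
}

static void wl_unlock(struct write_lock *l)
{
    assert(l->depth > 0);
    l->depth--;
}

/* Drop every nesting level at once and report how many were held. */
static int wl_unlock_nested(struct write_lock *l)
{
    int depth = l->depth;

    l->depth = 0;
    return depth;
}

/* Re-take the lock at the depth saved by wl_unlock_nested(). */
static void wl_lock_nested(struct write_lock *l, int depth)
{
    assert(l->depth == 0);
    l->depth = depth;
}

/* Stand-in for iput(): reports whether the lock was held on entry. */
static int put_inode_model(const struct write_lock *l)
{
    return l->depth > 0;
}

/* One cleanup step with the lock dropped around the put. */
static int finish_one(struct write_lock *l)
{
    int depth = wl_unlock_nested(l);
    int held_during_put = put_inode_model(l);

    wl_lock_nested(l, depth);
    return held_during_put;
}

/* Returns 1 if the put ran unlocked and the depth was restored. */
static int demo_depth_preserved(void)
{
    struct write_lock l = { 0 };

    wl_lock(&l);
    wl_lock(&l);                    /* nested acquire, depth is now 2 */
    int held = finish_one(&l);
    int depth_after = l.depth;
    wl_unlock(&l);
    wl_unlock(&l);
    return held == 0 && depth_after == 2 && l.depth == 0;
}
```

Saving the depth matters because the caller may itself hold the lock more
than once; releasing a single level would leave the put running under the
lock anyway.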

I'll work something up over the weekend or on Monday.

-Jeff

--
Jeff Mahoney
SUSE Labs


Attachments:
signature.asc (841.00 B)
OpenPGP digital signature

2014-01-15 23:32:33

by Jeff Mahoney

Subject: Re: [BUG 3.13.0-rc6] reiserfs possible circular locking dependency

On 1/3/14, 5:04 PM, Jeff Mahoney wrote:
> On 1/3/14, 2:46 PM, Linus Torvalds wrote:
>> On Fri, Jan 3, 2014 at 11:16 AM, Knut Petersen
>> <[email protected]> wrote:
>>> Rebooting after a power failure on an openSuSE 13.1 system
>>> with kernel 3.13.0-rc6 triggered the attached lockdep warning.
>>
>> Hmm. It seems to be that the *normal* sequence should be:
>>
>> - get i_mutex, call lookup, which gets sbi->lock (reiserfs_write_lock)
>>
>> but in the mounting path, we have special circumstances.
>>
>> That finish_unfinished() function does
>>
>> - reiserfs_write_lock_nested() .
>> - remove_save_link
>> - iput(inode) with the write lock held
>>
>> and that can apparently end up taking i_mutex in open_xa_dir (and then
>> recursively the write lock, but that's an explicitly recursive lock,
>> so that part should be ok).
>>
>> Now, I don't think this can *really* deadlock with the normal order of
>> operations, because during mounting there is no other process that can
>> take those in the reverse order (since the filesystem isn't live), but
>> I do wonder if we should just release the reiserfs write lock over the
>> iputs. We release it in other parts anyway (like for the quota off)
>>
>> Jeff, you already touched this exact case in commit d2d0395fd177
>> ("reiserfs: locking, release lock around quota operations") except
>> that was for those quota operation cases.
>>
>> Even if it's not a real problem, making lockdep happy sounds like a
>> good idea. Of course, the trouble is that this code path almost never
>> gets exercised (which is why this hasn't been noticed earlier), so
>> testing...
>>
>> Jeff? Comments?
>
> If someone ever invents a time machine, I'd go back to 2004 and tell
> myself to fight harder to make a reiserfs v3.7 with real extended
> attribute items. This code will haunt me to my death.
>
> Anyway, yeah. The right thing here is to drop the lock for the iput.
> More than that would be ok too. finish_unfinished happens when the file
> system goes read-write and that includes the remount path. There can be
> other users of the file system but it would be a recursive acquire so we
> wouldn't actually deadlock there.
>
> I'll work something up over the weekend or on Monday.

As a quick update here, I do have patches to fix this particular issue,
but it's tough to depend on xfstests to detect regressions when xfstests
causes other lockdep issues. I'm taking this as an opportunity to clean
up the locking enough to pass xfstests.

-Jeff

--
Jeff Mahoney
SUSE Labs


Attachments:
signature.asc (841.00 B)
OpenPGP digital signature