LinuxLists.cc - [RFC][PATCH] f2fs: Avoid print false deadlock messages.

2013-05-15 06:59:10

Subject: [RFC][PATCH] f2fs: Avoid print false deadlock messages.

When mounted the f2fs, kernel will print the following messages:

[ 105.533038] =============================================
[ 105.533065] [ INFO: possible recursive locking detected ]
[ 105.533088] 3.10.0-rc1+ #101 Not tainted
[ 105.533105] ---------------------------------------------
[ 105.533127] mount/5833 is trying to acquire lock:
[ 105.533147] (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
[ 105.533204]
[ 105.533204] but task is already holding lock:
[ 105.533228] (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
[ 105.533278]
[ 105.533278] other info that might help us debug this:
[ 105.533305] Possible unsafe locking scenario:
[ 105.533305]
[ 105.533329] CPU0
[ 105.533341] ----
[ 105.533353] lock(&sbi->fs_lock[i]);
[ 105.533373] lock(&sbi->fs_lock[i]);
[ 105.533394]
[ 105.533394] *** DEADLOCK ***
[ 105.533394]
[ 105.533419] May be due to missing lock nesting notation
[ 105.533419]
[ 105.533447] 3 locks held by mount/5833:
[ 105.533463] #0: (&type->s_umount_key#23/1){+.+.+.}, at: [<ffffffff81167b2f>] sget+0x2af/0x620
[ 105.533519] #1: (&sbi->cp_mutex){+.+...}, at: [<ffffffffa0201745>] write_checkpoint+0x55/0xaf0 [f2fs]
[ 105.533573] #2: (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
[ 105.533626]
[ 105.533626] stack backtrace:
[ 105.533647] CPU: 0 PID: 5833 Comm: mount Not tainted 3.10.0-rc1+ #101
[ 105.533673] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080015 11/09/2011
[ 105.533714] ffffffff82240690 ffff8800a6f759d0 ffffffff8168286a ffff8800a6f75ac0
[ 105.533754] ffffffff810a03e3 0000000000000002 ffff8800a1b7a3f0 0000000000000000
[ 105.533794] ffff8800a6f75b28 ffffffff812c376e ffff8800a1b7a3f0 0000000000000000
[ 105.533833] Call Trace:
[ 105.533848] [<ffffffff8168286a>] dump_stack+0x19/0x1b
[ 105.533872] [<ffffffff810a03e3>] __lock_acquire+0x733/0x1ee0
[ 105.533898] [<ffffffff812c376e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 105.533926] [<ffffffff8168c65c>] ? retint_restore_args+0xe/0xe
[ 105.533951] [<ffffffff810a2205>] lock_acquire+0x95/0x130
[ 105.533977] [<ffffffffa02017a6>] ? write_checkpoint+0xb6/0xaf0 [f2fs]
[ 105.534011] [<ffffffff816878da>] mutex_lock_nested+0x7a/0x3c0
[ 105.534011] [<ffffffffa02017a6>] ? write_checkpoint+0xb6/0xaf0 [f2fs]
[ 105.534011] [<ffffffffa02017a6>] ? write_checkpoint+0xb6/0xaf0 [f2fs]
[ 105.534011] [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
[ 105.534011] [<ffffffff8109e35b>] ? mark_held_locks+0xbb/0x140
[ 105.534011] [<ffffffff81688907>] ?__mutex_unlock_slowpath+0xc7/0x170
[ 105.534011] [<ffffffffa020f43c>] recover_fsync_data+0x43c/0x4f0 [f2fs]
[ 105.534011] [<ffffffff812673bb>] ? security_d_instantiate+0x1b/0x30
[ 105.534011] [<ffffffffa0200795>] f2fs_fill_super+0x845/0x870 [f2fs]
[ 105.534011] [<ffffffff81168104>] mount_bdev+0x1b4/0x1f0
[ 105.534011] [<ffffffffa01fff50>] ? validate_superblock+0x170/0x170 [f2fs]
[ 105.534011] [<ffffffffa01fe745>] f2fs_mount+0x15/0x20 [f2fs]
[ 105.534011] [<ffffffff81168e03>] mount_fs+0x43/0x1b0
[ 105.534011] [<ffffffff8112d9d0>] ? __alloc_percpu+0x10/0x20
[ 105.534011] [<ffffffff81183be4>] vfs_kern_mount+0x74/0x120
[ 105.534011] [<ffffffff811862b9>] do_mount+0x259/0xa20
[ 105.534011] [<ffffffff8112890b>] ? strndup_user+0x5b/0x80
[ 105.534011] [<ffffffff81186b0e>] SyS_mount+0x8e/0xe0
[ 105.534011] [<ffffffff81694b02>] system_call_fastpath+0x16/0x1b

By adding some messages, i found this problem because the gcc
optimizing. For those codes:
> for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> mutex_init(&sbi->fs_lock[i]);
The defination of mutex_init is:
> #define mutex_init(mutex)
>do {
>
> static struct lock_class_key __key;
>
>
> __mutex_init((mutex), #mutex, &__key);
>
>} while (0)

Because the optimizing of gcc, there are only one __key rather than
NR_GLOBAL_LOCKS times.

Add there is other problems about lockname.Using 'for()' the lockname is
the same which is '&sbi->fs_lock[i]'.If it met problem about
mutex-operation, it can't find which one.

Although my patch can work,i think it's not best.Because if
NR_GLOBAL_LOCKS changed, we may leak to change this.

BTY, if who know how to avoid optimize, please tell me. Thanks!

Signed-off-by: Jianpeng Ma <[email protected]>
---
fs/f2fs/super.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 8555f7d..ce08b96 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -520,7 +520,6 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
struct buffer_head *raw_super_buf;
struct inode *root;
long err = -EINVAL;
- int i;

/* allocate memory for f2fs-specific super block info */
sbi = kzalloc(sizeof(struct f2fs_sb_info), GFP_KERNEL);
@@ -578,8 +577,16 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
mutex_init(&sbi->gc_mutex);
mutex_init(&sbi->writepages);
mutex_init(&sbi->cp_mutex);
- for (i = 0; i < NR_GLOBAL_LOCKS; i++)
- mutex_init(&sbi->fs_lock[i]);
+
+ mutex_init(&sbi->fs_lock[0]);
+ mutex_init(&sbi->fs_lock[1]);
+ mutex_init(&sbi->fs_lock[2]);
+ mutex_init(&sbi->fs_lock[3]);
+ mutex_init(&sbi->fs_lock[4]);
+ mutex_init(&sbi->fs_lock[5]);
+ mutex_init(&sbi->fs_lock[6]);
+ mutex_init(&sbi->fs_lock[7]);
+
mutex_init(&sbi->node_write);
sbi->por_doing = 0;
spin_lock_init(&sbi->stat_lock);
--
1.8.3.rc1.44.gb387c77

2013-05-15 07:22:30

by Libo Chen

[permalink] [raw]

Subject: Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.

On 2013/5/15 14:58, majianpeng wrote:
> By adding some messages, i found this problem because the gcc
> optimizing. For those codes:
>> > for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>> > mutex_init(&sbi->fs_lock[i]);
> The defination of mutex_init is:
>> > #define mutex_init(mutex)
>> >do {
>> >
>> > static struct lock_class_key __key;
>> >
>> >
>> > __mutex_init((mutex), #mutex, &__key);
>> >
>> >} while (0)
> Because the optimizing of gcc, there are only one __key rather than
> NR_GLOBAL_LOCKS times.
>
> Add there is other problems about lockname.Using 'for()' the lockname is
> the same which is '&sbi->fs_lock[i]'.If it met problem about
> mutex-operation, it can't find which one.
>
> Although my patch can work,i think it's not best.Because if
> NR_GLOBAL_LOCKS changed, we may leak to change this.
>
> BTY, if who know how to avoid optimize, please tell me. Thanks!
>
> Signed-off-by: Jianpeng Ma <[email protected]>
> ---
> fs/f2fs/super.c | 13 ++++++++++---
> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 8555f7d..ce08b96 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -520,7 +520,6 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> struct buffer_head *raw_super_buf;
> struct inode *root;
> long err = -EINVAL;
> - int i;
>
> /* allocate memory for f2fs-specific super block info */
> sbi = kzalloc(sizeof(struct f2fs_sb_info), GFP_KERNEL);
> @@ -578,8 +577,16 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> mutex_init(&sbi->gc_mutex);
> mutex_init(&sbi->writepages);
> mutex_init(&sbi->cp_mutex);
> - for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> - mutex_init(&sbi->fs_lock[i]);
> +

you can try barrier() or mb() to avoid compile optimization.

2013-05-15 07:33:34

by majianpeng

[permalink] [raw]

Subject: Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.

On 05/15/2013 03:21 PM, Libo Chen wrote:
> On 2013/5/15 14:58, majianpeng wrote:
>> By adding some messages, i found this problem because the gcc
>> optimizing. For those codes:
>>>> for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>>>> mutex_init(&sbi->fs_lock[i]);
>> The defination of mutex_init is:
>>>> #define mutex_init(mutex)
>>>> do {
>>>>
>>>> static struct lock_class_key __key;
>>>>
>>>>
>>>> __mutex_init((mutex), #mutex, &__key);
>>>>
>>>> } while (0)
>> Because the optimizing of gcc, there are only one __key rather than
>> NR_GLOBAL_LOCKS times.
>>
>> Add there is other problems about lockname.Using 'for()' the lockname is
>> the same which is '&sbi->fs_lock[i]'.If it met problem about
>> mutex-operation, it can't find which one.
>>
>> Although my patch can work,i think it's not best.Because if
>> NR_GLOBAL_LOCKS changed, we may leak to change this.
>>
>> BTY, if who know how to avoid optimize, please tell me. Thanks!
>>
>> Signed-off-by: Jianpeng Ma <[email protected]>
>> ---
>> fs/f2fs/super.c | 13 ++++++++++---
>> 1 file changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>> index 8555f7d..ce08b96 100644
>> --- a/fs/f2fs/super.c
>> +++ b/fs/f2fs/super.c
>> @@ -520,7 +520,6 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>> struct buffer_head *raw_super_buf;
>> struct inode *root;
>> long err = -EINVAL;
>> - int i;
>>
>> /* allocate memory for f2fs-specific super block info */
>> sbi = kzalloc(sizeof(struct f2fs_sb_info), GFP_KERNEL);
>> @@ -578,8 +577,16 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>> mutex_init(&sbi->gc_mutex);
>> mutex_init(&sbi->writepages);
>> mutex_init(&sbi->cp_mutex);
>> - for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>> - mutex_init(&sbi->fs_lock[i]);
>> +
> you can try barrier() or mb() to avoid compile optimization.
>
Hi,
They aren't work!But thanks very much!

Thanks!
Jianpeng Ma

2013-05-15 08:28:48

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.

On Wed, May 15, 2013 at 02:58:53PM +0800, majianpeng wrote:
> When mounted the f2fs, kernel will print the following messages:
>
> [ 105.533038] =============================================
> [ 105.533065] [ INFO: possible recursive locking detected ]
> [ 105.533088] 3.10.0-rc1+ #101 Not tainted
> [ 105.533105] ---------------------------------------------
> [ 105.533127] mount/5833 is trying to acquire lock:
> [ 105.533147] (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
> [ 105.533204]
> [ 105.533204] but task is already holding lock:
> [ 105.533228] (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
> [ 105.533278]
> [ 105.533278] other info that might help us debug this:
> [ 105.533305] Possible unsafe locking scenario:
> [ 105.533305]
> [ 105.533329] CPU0
> [ 105.533341] ----
> [ 105.533353] lock(&sbi->fs_lock[i]);
> [ 105.533373] lock(&sbi->fs_lock[i]);
> [ 105.533394]
> [ 105.533394] *** DEADLOCK ***
> [ 105.533394]

> By adding some messages, i found this problem because the gcc
> optimizing. For those codes:
> > for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> > mutex_init(&sbi->fs_lock[i]);
> The defination of mutex_init is:
> > #define mutex_init(mutex)
> >do {
> >
> > static struct lock_class_key __key;
> >
> >
> > __mutex_init((mutex), #mutex, &__key);
> >
> >} while (0)
>
> Because the optimizing of gcc, there are only one __key rather than
> NR_GLOBAL_LOCKS times.

Its not a gcc specific optimization, any C compiler would. Its also very
much on purpose.

> Add there is other problems about lockname.Using 'for()' the lockname is
> the same which is '&sbi->fs_lock[i]'.If it met problem about
> mutex-operation, it can't find which one.
>
> Although my patch can work,i think it's not best.Because if
> NR_GLOBAL_LOCKS changed, we may leak to change this.
>
> BTY, if who know how to avoid optimize, please tell me. Thanks!

There isn't. What you typically want to do is annotate the lock site.

In particular it looks like mutex_lock_all() is the offensive piece of
code (horrible function name though; the only redeeming thing being that
f2fs.h isn't likely to be included elsewhere).

One thing you can do here is modify it to look like:

static inline void mutex_lock_all(struct f2fs_sb_info *sbi)
{
int i;

for (i = 0; i < NR_GLOBAL_LOCKS; i++) {
/*
* This is the only time we take multiple fs_lock[]
* instances; the order is immaterial since we
* always hold cp_mutex, which serializes multiple
* such operations.
*/
mutex_lock_nest_lock(&sbi->fs_lock[i], &sbi->cp_mutex);
}
}

That tells the lock validator that it is ok to lock multiple instances
of the fs_lock[i] class because the lock order is guarded by cp_mutex.

While your patch also works, it has multiple down-sides; its easy to get
out of sync when you modify NR_GLOBAL_LOCKS; also it consumes more
static lockdep resources (lockdep has to allocate all its resources
from static arrays since allocating memory also uses locks -- recursive
problem).

2013-05-16 01:16:54

by majianpeng

[permalink] [raw]

Subject: Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.

On 05/15/2013 04:28 PM, Peter Zijlstra wrote:
> On Wed, May 15, 2013 at 02:58:53PM +0800, majianpeng wrote:
>> When mounted the f2fs, kernel will print the following messages:
>>
>> [ 105.533038] =============================================
>> [ 105.533065] [ INFO: possible recursive locking detected ]
>> [ 105.533088] 3.10.0-rc1+ #101 Not tainted
>> [ 105.533105] ---------------------------------------------
>> [ 105.533127] mount/5833 is trying to acquire lock:
>> [ 105.533147] (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
>> [ 105.533204]
>> [ 105.533204] but task is already holding lock:
>> [ 105.533228] (&sbi->fs_lock[i]){+.+...}, at: [<ffffffffa02017a6>] write_checkpoint+0xb6/0xaf0 [f2fs]
>> [ 105.533278]
>> [ 105.533278] other info that might help us debug this:
>> [ 105.533305] Possible unsafe locking scenario:
>> [ 105.533305]
>> [ 105.533329] CPU0
>> [ 105.533341] ----
>> [ 105.533353] lock(&sbi->fs_lock[i]);
>> [ 105.533373] lock(&sbi->fs_lock[i]);
>> [ 105.533394]
>> [ 105.533394] *** DEADLOCK ***
>> [ 105.533394]
>> By adding some messages, i found this problem because the gcc
>> optimizing. For those codes:
>>> for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>>> mutex_init(&sbi->fs_lock[i]);
>> The defination of mutex_init is:
>>> #define mutex_init(mutex)
>>> do {
>>>
>>> static struct lock_class_key __key;
>>>
>>>
>>> __mutex_init((mutex), #mutex, &__key);
>>>
>>> } while (0)
>> Because the optimizing of gcc, there are only one __key rather than
>> NR_GLOBAL_LOCKS times.
> Its not a gcc specific optimization, any C compiler would. Its also very
> much on purpose.
>
>> Add there is other problems about lockname.Using 'for()' the lockname is
>> the same which is '&sbi->fs_lock[i]'.If it met problem about
>> mutex-operation, it can't find which one.
>>
>> Although my patch can work,i think it's not best.Because if
>> NR_GLOBAL_LOCKS changed, we may leak to change this.
>>
>> BTY, if who know how to avoid optimize, please tell me. Thanks!
Thanks your answer! Your patch looks good.
> There isn't. What you typically want to do is annotate the lock site.
> In particular it looks like mutex_lock_all() is the offensive piece of
> code (horrible function name though; the only redeeming thing being that
> f2fs.h isn't likely to be included elsewhere).
>
> One thing you can do here is modify it to look like:
>
> static inline void mutex_lock_all(struct f2fs_sb_info *sbi)
> {
> int i;
>
> for (i = 0; i < NR_GLOBAL_LOCKS; i++) {
> /*
> * This is the only time we take multiple fs_lock[]
> * instances; the order is immaterial since we
> * always hold cp_mutex, which serializes multiple
> * such operations.
> */
> mutex_lock_nest_lock(&sbi->fs_lock[i], &sbi->cp_mutex);
> }
> }
>
> That tells the lock validator that it is ok to lock multiple instances
> of the fs_lock[i] class because the lock order is guarded by cp_mutex.
> While your patch also works, it has multiple down-sides; its easy to get
> out of sync when you modify NR_GLOBAL_LOCKS; also it consumes more
> static lockdep resources (lockdep has to allocate all its resources
> from static arrays since allocating memory also uses locks -- recursive
> problem).
>
Yes, but there is a problem if fs_block[] met deadlock. How to find which one?
Because the lock->name is the same.

Thanks!
Jianpeng Ma

2013-05-16 08:43:33

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.

On Thu, May 16, 2013 at 09:16:45AM +0800, majianpeng wrote:

> > There isn't. What you typically want to do is annotate the lock site.
> > In particular it looks like mutex_lock_all() is the offensive piece of
> > code (horrible function name though; the only redeeming thing being that
> > f2fs.h isn't likely to be included elsewhere).
> >
> > One thing you can do here is modify it to look like:
> >
> > static inline void mutex_lock_all(struct f2fs_sb_info *sbi)
> > {
> > int i;
> >
> > for (i = 0; i < NR_GLOBAL_LOCKS; i++) {
> > /*
> > * This is the only time we take multiple fs_lock[]
> > * instances; the order is immaterial since we
> > * always hold cp_mutex, which serializes multiple
> > * such operations.
> > */
> > mutex_lock_nest_lock(&sbi->fs_lock[i], &sbi->cp_mutex);
> > }
> > }
> >
> > That tells the lock validator that it is ok to lock multiple instances
> > of the fs_lock[i] class because the lock order is guarded by cp_mutex.
> > While your patch also works, it has multiple down-sides; its easy to get
> > out of sync when you modify NR_GLOBAL_LOCKS; also it consumes more
> > static lockdep resources (lockdep has to allocate all its resources
> > from static arrays since allocating memory also uses locks -- recursive
> > problem).
> >
> Yes, but there is a problem if fs_block[] met deadlock. How to find which one?
> Because the lock->name is the same.

The most useful part of the lockdep report are the call traces that tell you
where locks where acquired; the names are secondary. That is, while they are at
times helpful in finding the right lock site, they're rarely _that_ important.

Remember, your code will very likely not have the exact number hardcoded either.
It'll be a variable. So having the number in the lockdep output will not help
you find the offending code any sooner.

Suppose there's another site that acquires two fs_block[] locks; currently this
would generate another such warning as this thread started with -- lockdep
doesn't look at lock instances but at classes; and it cannot differentiate
between two locks of the same class and thus reports the possible deadlock.

The way to find the offending code is to look at the "locks held" section of
the lockdep report along with the call traces.

Once you find the way in which the two locks nest the specific numbers are
irrelevant. Your aim then is to prove your locking scheme is indeed deadlock
free and then tell lockdep about it.

2013-05-16 11:34:32

by majianpeng

[permalink] [raw]

Subject: Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.

On 05/16/2013 04:41 PM, Peter Zijlstra wrote:
> On Thu, May 16, 2013 at 09:16:45AM +0800, majianpeng wrote:
>
>>> There isn't. What you typically want to do is annotate the lock site.
>>> In particular it looks like mutex_lock_all() is the offensive piece of
>>> code (horrible function name though; the only redeeming thing being that
>>> f2fs.h isn't likely to be included elsewhere).
>>>
>>> One thing you can do here is modify it to look like:
>>>
>>> static inline void mutex_lock_all(struct f2fs_sb_info *sbi)
>>> {
>>> int i;
>>>
>>> for (i = 0; i < NR_GLOBAL_LOCKS; i++) {
>>> /*
>>> * This is the only time we take multiple fs_lock[]
>>> * instances; the order is immaterial since we
>>> * always hold cp_mutex, which serializes multiple
>>> * such operations.
>>> */
>>> mutex_lock_nest_lock(&sbi->fs_lock[i], &sbi->cp_mutex);
>>> }
>>> }
>>>
>>> That tells the lock validator that it is ok to lock multiple instances
>>> of the fs_lock[i] class because the lock order is guarded by cp_mutex.
>>> While your patch also works, it has multiple down-sides; its easy to get
>>> out of sync when you modify NR_GLOBAL_LOCKS; also it consumes more
>>> static lockdep resources (lockdep has to allocate all its resources
>>> from static arrays since allocating memory also uses locks -- recursive
>>> problem).
>>>
>> Yes, but there is a problem if fs_block[] met deadlock. How to find which one?
>> Because the lock->name is the same.
> The most useful part of the lockdep report are the call traces that tell you
> where locks where acquired; the names are secondary. That is, while they are at
> times helpful in finding the right lock site, they're rarely _that_ important.
>
> Remember, your code will very likely not have the exact number hardcoded either.
> It'll be a variable. So having the number in the lockdep output will not help
> you find the offending code any sooner.
>
> Suppose there's another site that acquires two fs_block[] locks; currently this
> would generate another such warning as this thread started with -- lockdep
> doesn't look at lock instances but at classes; and it cannot differentiate
> between two locks of the same class and thus reports the possible deadlock.
>
> The way to find the offending code is to look at the "locks held" section of
> the lockdep report along with the call traces.
>
> Once you find the way in which the two locks nest the specific numbers are
> irrelevant. Your aim then is to prove your locking scheme is indeed deadlock
> free and then tell lockdep about it.
>
>
Thanks very much! I'll take times to understand.
Can you send a patch about this?

Thanks!
Jianpeng Ma

2013-05-16 18:05:00

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [RFC][PATCH] f2fs: Avoid print false deadlock messages.

On Thu, May 16, 2013 at 07:34:15PM +0800, majianpeng wrote:

> Thanks very much! I'll take times to understand.
> Can you send a patch about this?

You mean the below or something else?

---
Subject: f2fs, lockdep: Annotate mutex_lock_all()

Majianpeng reported a lockdep splat for f2fs. It turns out mutex_lock_all()
acquires an array of locks (in global/local lock style).

Any such operation is always serialized using cp_mutex, therefore there is no
fs_lock[] lock-order issue; tell lockdep about this using the
mutex_lock_nest_lock() primitive.

Reported-by: majianpeng <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
---
fs/f2fs/f2fs.h | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 20aab02..8454277 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -495,9 +495,17 @@ static inline void clear_ckpt_flags(struct f2fs_checkpoint *cp, unsigned int f)

static inline void mutex_lock_all(struct f2fs_sb_info *sbi)
{
- int i = 0;
- for (; i < NR_GLOBAL_LOCKS; i++)
- mutex_lock(&sbi->fs_lock[i]);
+ int i;
+
+ for (i = 0; i < NR_GLOBAL_LOCKS; i++) {
+ /*
+ * This is the only time we take multiple fs_lock[]
+ * instances; the order is immaterial since we
+ * always hold cp_mutex, which serializes multiple
+ * such operations.
+ */
+ mutex_lock_nest_lock(&sbi->fs_lock[i], &sbi->cp_mutex);
+ }
}

static inline void mutex_unlock_all(struct f2fs_sb_info *sbi)