2019-11-06 22:06:15

by Xiaochen Shen

[permalink] [raw]
Subject: [PATCH] x86/resctrl: Fix potential lockdep warning

rdtgroup_cpus_write() and mkdir_rdt_prepare() call
rdtgroup_kn_lock_live() -> kernfs_to_rdtgroup() to get 'rdtgrp', and
then call rdt_last_cmd_xxx() functions which will check if
rdtgroup_mutex is held/requires its caller to hold rdtgroup_mutex.
But if 'rdtgrp' returned from kernfs_to_rdtgroup() is NULL,
rdtgroup_mutex is not held and calling rdt_last_cmd_xxx() will result
in a lockdep warning.

Remove rdt_last_cmd_xxx() in these two paths. Just returning error
should be sufficient to report to the user that the entry doesn't exist
any more.

Fixes: 94457b36e8a5 ("x86/intel_rdt: Add diagnostics when writing the cpus file")
Fixes: cfd0f34e4cd5 ("x86/intel_rdt: Add diagnostics when making directories")
Signed-off-by: Xiaochen Shen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index a46dee8e78db..2e3b06d6bbc6 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -461,10 +461,8 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
}

rdtgrp = rdtgroup_kn_lock_live(of->kn);
- rdt_last_cmd_clear();
if (!rdtgrp) {
ret = -ENOENT;
- rdt_last_cmd_puts("Directory was removed\n");
goto unlock;
}

@@ -2648,10 +2646,8 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
int ret;

prdtgrp = rdtgroup_kn_lock_live(prgrp_kn);
- rdt_last_cmd_clear();
if (!prdtgrp) {
ret = -ENODEV;
- rdt_last_cmd_puts("Directory was removed\n");
goto out_unlock;
}

--
1.8.3.1


2019-11-13 11:48:24

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/resctrl: Fix potential lockdep warning

On Thu, Nov 07, 2019 at 06:36:36AM +0800, Xiaochen Shen wrote:
> rdtgroup_cpus_write() and mkdir_rdt_prepare() call
> rdtgroup_kn_lock_live() -> kernfs_to_rdtgroup() to get 'rdtgrp', and
> then call rdt_last_cmd_xxx() functions which will check if

Write those names like this:

rdt_last_cmd_{clear,puts,...} but not with an "xxx" which confuses
people unfamiliar with the code.

> rdtgroup_mutex is held/requires its caller to hold rdtgroup_mutex.
> But if 'rdtgrp' returned from kernfs_to_rdtgroup() is NULL,
> rdtgroup_mutex is not held and calling rdt_last_cmd_xxx() will result
> in a lockdep warning.

That's more of a self-incurred lockdep warning. You can't be calling
lockdep_assert_held() after a function which doesn't always grab the
mutex. Looks like the design needs changing here...

> Remove rdt_last_cmd_xxx() in these two paths. Just returning error
> should be sufficient to report to the user that the entry doesn't exist
> any more.

... or that.

In any case, you should consider fixing such patterns in the code as it
looks sub-optimal from where I'm standing.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Subject: [tip: x86/urgent] x86/resctrl: Fix potential lockdep warning

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: c8eafe1495303bfd0eedaa8156b1ee9082ee9642
Gitweb: https://git.kernel.org/tip/c8eafe1495303bfd0eedaa8156b1ee9082ee9642
Author: Xiaochen Shen <[email protected]>
AuthorDate: Thu, 07 Nov 2019 06:36:36 +08:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 13 Nov 2019 12:34:44 +01:00

x86/resctrl: Fix potential lockdep warning

rdtgroup_cpus_write() and mkdir_rdt_prepare() call
rdtgroup_kn_lock_live() -> kernfs_to_rdtgroup() to get 'rdtgrp', and
then call the rdt_last_cmd_{clear,puts,...}() functions which will check
if rdtgroup_mutex is held/requires its caller to hold rdtgroup_mutex.

But if 'rdtgrp' returned from kernfs_to_rdtgroup() is NULL,
rdtgroup_mutex is not held and calling rdt_last_cmd_{clear,puts,...}()
will result in a self-incurred, potential lockdep warning.

Remove the rdt_last_cmd_{clear,puts,...}() calls in these two paths.
Just returning error should be sufficient to report to the user that the
entry doesn't exist any more.

[ bp: Massage. ]

Fixes: 94457b36e8a5 ("x86/intel_rdt: Add diagnostics when writing the cpus file")
Fixes: cfd0f34e4cd5 ("x86/intel_rdt: Add diagnostics when making directories")
Signed-off-by: Xiaochen Shen <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index a46dee8..2e3b06d 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -461,10 +461,8 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
}

rdtgrp = rdtgroup_kn_lock_live(of->kn);
- rdt_last_cmd_clear();
if (!rdtgrp) {
ret = -ENOENT;
- rdt_last_cmd_puts("Directory was removed\n");
goto unlock;
}

@@ -2648,10 +2646,8 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
int ret;

prdtgrp = rdtgroup_kn_lock_live(prgrp_kn);
- rdt_last_cmd_clear();
if (!prdtgrp) {
ret = -ENODEV;
- rdt_last_cmd_puts("Directory was removed\n");
goto out_unlock;
}

2019-11-16 16:16:05

by Xiaochen Shen

[permalink] [raw]
Subject: Re: [PATCH] x86/resctrl: Fix potential lockdep warning

Hi Boris,

Thank you for your kind code review. Please find my comments inline.

On 11/13/2019 19:44, Borislav Petkov wrote:
> On Thu, Nov 07, 2019 at 06:36:36AM +0800, Xiaochen Shen wrote:
>> rdtgroup_cpus_write() and mkdir_rdt_prepare() call
>> rdtgroup_kn_lock_live() -> kernfs_to_rdtgroup() to get 'rdtgrp', and
>> then call rdt_last_cmd_xxx() functions which will check if
>
> Write those names like this:
>
> rdt_last_cmd_{clear,puts,...} but not with an "xxx" which confuses
> people unfamiliar with the code.

OK. I got it. rdt_last_cmd_{clear,puts,printf}().

>
>> rdtgroup_mutex is held/requires its caller to hold rdtgroup_mutex.
>> But if 'rdtgrp' returned from kernfs_to_rdtgroup() is NULL,
>> rdtgroup_mutex is not held and calling rdt_last_cmd_xxx() will result
>> in a lockdep warning.
>
> That's more of a self-incurred lockdep warning. You can't be calling
> lockdep_assert_held() after a function which doesn't always grab the
> mutex. Looks like the design needs changing here...

Actually this fix covers all the cases of an audit of the calling paths
of rdt_last_cmd_{clear,puts,printf}(), to make sure we only have the
lockdep_assert_held() in places where we are sure that it must be held.
Please find more background details as below.

>
>> Remove rdt_last_cmd_xxx() in these two paths. Just returning error
>> should be sufficient to report to the user that the entry doesn't exist
>> any more.
>
> ... or that.
>
> In any case, you should consider fixing such patterns in the code as it
> looks sub-optimal from where I'm standing.

I would like to provide more of the background details in the commit
comment in v2 patch:

-------------------
x86/resctrl: Fix potential lockdep warning

rdt_last_cmd_{clear,puts,printf}() call lockdep_assert_held() to assert
that rdtgroup_mutex is held.

During internal review of some other changes we found that there are
code paths that call rdt_last_cmd_{clear,puts}() when the rdtgroup_mutex
is not held.

An audit of calling sequences identified two different cases in
rdtgroup_kn_lock_live() which both returning NULL:
1.'rdtgrp' returned from kernfs_to_rdtgroup() is NULL, rdtgroup_mutex
is not held.
2.'rdtgrp' is being deleted, rdtgroup_mutex is held.

Checking all call sites of rdt_last_cmd_{clear,puts,printf}() found two
code paths where rdtgroup_mutex is not held: rdtgroup_cpus_write() and
mkdir_rdt_prepare().

Fix by removing rdt_last_cmd_{clear,puts}() in these two paths. Just
returning error should be sufficient to report to the user that the
entry doesn't exist any more.

Fixes: 94457b36e8a5 ("x86/intel_rdt: Add diagnostics when writing the
cpus file")
Fixes: cfd0f34e4cd5 ("x86/intel_rdt: Add diagnostics when making
directories")
Signed-off-by: Xiaochen Shen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
-------------------
Updated commit comment to provide additional context on how these were
found.

>
> Thx.
>

--
Best regards,
Xiaochen

2019-11-18 15:04:40

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/resctrl: Fix potential lockdep warning

On Sun, Nov 17, 2019 at 12:13:20AM +0800, Xiaochen Shen wrote:
> Actually this fix covers all the cases of an audit of the calling paths
> of rdt_last_cmd_{clear,puts,printf}(), to make sure we only have the
> lockdep_assert_held() in places where we are sure that it must be held.

That's kinda what I suggested, isn't it?

All I meant was, not to have a

rdtgroup_kn_lock_live()

call in the code as this function does *not* unconditionally grab the
rdtgroup_mutex. And then call a function which unconditionally checks
whether the mutex is held.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2019-11-19 07:46:30

by Xiaochen Shen

[permalink] [raw]
Subject: Re: [PATCH] x86/resctrl: Fix potential lockdep warning

On 11/18/2019 23:02, Borislav Petkov wrote:
> On Sun, Nov 17, 2019 at 12:13:20AM +0800, Xiaochen Shen wrote:
>> Actually this fix covers all the cases of an audit of the calling paths
>> of rdt_last_cmd_{clear,puts,printf}(), to make sure we only have the
>> lockdep_assert_held() in places where we are sure that it must be held.
>
> That's kinda what I suggested, isn't it?
>
> All I meant was, not to have a
>
> rdtgroup_kn_lock_live()
>
> call in the code as this function does *not* unconditionally grab the
> rdtgroup_mutex. And then call a function which unconditionally checks
> whether the mutex is held.
>

Hi Boris,

Thank you for your good suggestion. I will try to follow up if we could
improve the code in call sites of rdtgroup_kn_lock_live() in separate patch.

In my opinion, the potential lockdep issues in all call sites of
rdt_last_cmd_{clear,puts,...}() have been fixed in this patch.

Thank you.

--
Best regards,
Xiaochen