rdtgroup_cpus_write() and mkdir_rdt_prepare() call
rdtgroup_kn_lock_live() -> kernfs_to_rdtgroup() to get 'rdtgrp', and
then call rdt_last_cmd_xxx() functions which will check if
rdtgroup_mutex is held/requires its caller to hold rdtgroup_mutex.
But if 'rdtgrp' returned from kernfs_to_rdtgroup() is NULL,
rdtgroup_mutex is not held and calling rdt_last_cmd_xxx() will result
in a lockdep warning.
Remove rdt_last_cmd_xxx() in these two paths. Just returning error
should be sufficient to report to the user that the entry doesn't exist
any more.
Fixes: 94457b36e8a5 ("x86/intel_rdt: Add diagnostics when writing the cpus file")
Fixes: cfd0f34e4cd5 ("x86/intel_rdt: Add diagnostics when making directories")
Signed-off-by: Xiaochen Shen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index a46dee8e78db..2e3b06d6bbc6 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -461,10 +461,8 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 	}
 	rdtgrp = rdtgroup_kn_lock_live(of->kn);
-	rdt_last_cmd_clear();
 	if (!rdtgrp) {
 		ret = -ENOENT;
-		rdt_last_cmd_puts("Directory was removed\n");
 		goto unlock;
 	}
@@ -2648,10 +2646,8 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 	int ret;
 	prdtgrp = rdtgroup_kn_lock_live(prgrp_kn);
-	rdt_last_cmd_clear();
 	if (!prdtgrp) {
 		ret = -ENODEV;
-		rdt_last_cmd_puts("Directory was removed\n");
 		goto out_unlock;
 	}
--
1.8.3.1
On Thu, Nov 07, 2019 at 06:36:36AM +0800, Xiaochen Shen wrote:
> rdtgroup_cpus_write() and mkdir_rdt_prepare() call
> rdtgroup_kn_lock_live() -> kernfs_to_rdtgroup() to get 'rdtgrp', and
> then call rdt_last_cmd_xxx() functions which will check if
Write those names like this:
rdt_last_cmd_{clear,puts,...} but not with an "xxx" which confuses
people unfamiliar with the code.
> rdtgroup_mutex is held/requires its caller to hold rdtgroup_mutex.
> But if 'rdtgrp' returned from kernfs_to_rdtgroup() is NULL,
> rdtgroup_mutex is not held and calling rdt_last_cmd_xxx() will result
> in a lockdep warning.
That's more of a self-incurred lockdep warning. You can't be calling
lockdep_assert_held() after a function which doesn't always grab the
mutex. Looks like the design needs changing here...
> Remove rdt_last_cmd_xxx() in these two paths. Just returning error
> should be sufficient to report to the user that the entry doesn't exist
> any more.
... or that.
In any case, you should consider fixing such patterns in the code as it
looks sub-optimal from where I'm standing.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: c8eafe1495303bfd0eedaa8156b1ee9082ee9642
Gitweb: https://git.kernel.org/tip/c8eafe1495303bfd0eedaa8156b1ee9082ee9642
Author: Xiaochen Shen <[email protected]>
AuthorDate: Thu, 07 Nov 2019 06:36:36 +08:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 13 Nov 2019 12:34:44 +01:00
x86/resctrl: Fix potential lockdep warning
rdtgroup_cpus_write() and mkdir_rdt_prepare() call
rdtgroup_kn_lock_live() -> kernfs_to_rdtgroup() to get 'rdtgrp', and
then call the rdt_last_cmd_{clear,puts,...}() functions which will check
if rdtgroup_mutex is held/requires its caller to hold rdtgroup_mutex.
But if 'rdtgrp' returned from kernfs_to_rdtgroup() is NULL,
rdtgroup_mutex is not held and calling rdt_last_cmd_{clear,puts,...}()
will result in a self-incurred, potential lockdep warning.
Remove the rdt_last_cmd_{clear,puts,...}() calls in these two paths.
Just returning error should be sufficient to report to the user that the
entry doesn't exist any more.
[ bp: Massage. ]
Fixes: 94457b36e8a5 ("x86/intel_rdt: Add diagnostics when writing the cpus file")
Fixes: cfd0f34e4cd5 ("x86/intel_rdt: Add diagnostics when making directories")
Signed-off-by: Xiaochen Shen <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index a46dee8..2e3b06d 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -461,10 +461,8 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 	}
 	rdtgrp = rdtgroup_kn_lock_live(of->kn);
-	rdt_last_cmd_clear();
 	if (!rdtgrp) {
 		ret = -ENOENT;
-		rdt_last_cmd_puts("Directory was removed\n");
 		goto unlock;
 	}
@@ -2648,10 +2646,8 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 	int ret;
 	prdtgrp = rdtgroup_kn_lock_live(prgrp_kn);
-	rdt_last_cmd_clear();
 	if (!prdtgrp) {
 		ret = -ENODEV;
-		rdt_last_cmd_puts("Directory was removed\n");
 		goto out_unlock;
 	}
Hi Boris,
Thank you for your kind code review. Please find my comments inline.
On 11/13/2019 19:44, Borislav Petkov wrote:
> On Thu, Nov 07, 2019 at 06:36:36AM +0800, Xiaochen Shen wrote:
>> rdtgroup_cpus_write() and mkdir_rdt_prepare() call
>> rdtgroup_kn_lock_live() -> kernfs_to_rdtgroup() to get 'rdtgrp', and
>> then call rdt_last_cmd_xxx() functions which will check if
>
> Write those names like this:
>
> rdt_last_cmd_{clear,puts,...} but not with an "xxx" which confuses
> people unfamiliar with the code.
OK. I got it. rdt_last_cmd_{clear,puts,printf}().
>
>> rdtgroup_mutex is held/requires its caller to hold rdtgroup_mutex.
>> But if 'rdtgrp' returned from kernfs_to_rdtgroup() is NULL,
>> rdtgroup_mutex is not held and calling rdt_last_cmd_xxx() will result
>> in a lockdep warning.
>
> That's more of a self-incurred lockdep warning. You can't be calling
> lockdep_assert_held() after a function which doesn't always grab the
> mutex. Looks like the design needs changing here...
Actually this fix covers all the cases of an audit of the calling paths
of rdt_last_cmd_{clear,puts,printf}(), to make sure we only have the
lockdep_assert_held() in places where we are sure that it must be held.
Please find more background details as below.
>
>> Remove rdt_last_cmd_xxx() in these two paths. Just returning error
>> should be sufficient to report to the user that the entry doesn't exist
>> any more.
>
> ... or that.
>
> In any case, you should consider fixing such patterns in the code as it
> looks sub-optimal from where I'm standing.
I would like to provide more background details in the commit message
of the v2 patch:
-------------------
x86/resctrl: Fix potential lockdep warning
rdt_last_cmd_{clear,puts,printf}() call lockdep_assert_held() to assert
that rdtgroup_mutex is held.
During internal review of some other changes we found that there are
code paths that call rdt_last_cmd_{clear,puts}() when the rdtgroup_mutex
is not held.
An audit of calling sequences identified two different cases in
rdtgroup_kn_lock_live() which both return NULL:
1. 'rdtgrp' returned from kernfs_to_rdtgroup() is NULL, rdtgroup_mutex
is not held.
2. 'rdtgrp' is being deleted, rdtgroup_mutex is held.
Checking all call sites of rdt_last_cmd_{clear,puts,printf}() found two
code paths where rdtgroup_mutex is not held: rdtgroup_cpus_write() and
mkdir_rdt_prepare().
Fix by removing rdt_last_cmd_{clear,puts}() in these two paths. Just
returning an error should be sufficient to report to the user that the
entry doesn't exist any more.
Fixes: 94457b36e8a5 ("x86/intel_rdt: Add diagnostics when writing the cpus file")
Fixes: cfd0f34e4cd5 ("x86/intel_rdt: Add diagnostics when making directories")
Signed-off-by: Xiaochen Shen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
-------------------
Updated commit comment to provide additional context on how these were
found.
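The two NULL-return cases listed in the proposed commit message can be
modelled in user space roughly as follows. This is only a sketch: the
names model_group, model_mutex_held, model_lock_live() and model_unlock()
are hypothetical stand-ins, a plain bool replaces the real mutex, and
nothing here is the actual kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a resource group; not the kernel struct. */
struct model_group {
	bool deleted;			/* models the "being deleted" state */
};

static bool model_mutex_held;		/* models whether the mutex is held */

/*
 * Models the behaviour described above: the mutex is taken only when the
 * lookup found a group, yet NULL is returned in two different cases.
 */
static struct model_group *model_lock_live(struct model_group *found)
{
	if (!found)
		return NULL;		/* case 1: mutex NOT held */

	model_mutex_held = true;	/* stands in for taking the mutex */

	if (found->deleted)
		return NULL;		/* case 2: mutex held */

	return found;
}

/* Models the unlock side: release only if the mutex was taken. */
static void model_unlock(struct model_group *found)
{
	if (!found)
		return;
	model_mutex_held = false;
}
```

A caller that only sees the NULL return value cannot tell case 1 from
case 2, which is why a helper that asserts the mutex is held is unsafe on
that path.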
>
> Thx.
>
--
Best regards,
Xiaochen
On Sun, Nov 17, 2019 at 12:13:20AM +0800, Xiaochen Shen wrote:
> Actually this fix covers all the cases of an audit of the calling paths
> of rdt_last_cmd_{clear,puts,printf}(), to make sure we only have the
> lockdep_assert_held() in places where we are sure that it must be held.
That's kinda what I suggested, isn't it?
All I meant was, not to have a
rdtgroup_kn_lock_live()
call in the code as this function does *not* unconditionally grab the
rdtgroup_mutex. And then call a function which unconditionally checks
whether the mutex is held.
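A minimal user-space sketch of that pattern, assuming a bool in place of
the mutex and a counter in place of the lockdep warning. All sketch_*
names are hypothetical and only illustrate the ordering problem and the
ordering after the patch; this is not the resctrl code itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

static bool sketch_mutex_held;	/* models the mutex state */
static int sketch_violations;	/* counts failed "lock held" checks */

/* Models a helper that unconditionally checks that the mutex is held. */
static void sketch_last_cmd_clear(void)
{
	if (!sketch_mutex_held)
		sketch_violations++;	/* lockdep would warn here */
}

/* Models a lock function that takes the mutex only if the lookup succeeds. */
static bool sketch_lock_live(bool group_exists)
{
	if (group_exists)
		sketch_mutex_held = true;
	return group_exists;
}

static void sketch_unlock(bool group_exists)
{
	if (group_exists)
		sketch_mutex_held = false;
}

/* Problematic ordering: the checking helper runs before the NULL test. */
static int sketch_write_before(bool group_exists)
{
	int ret = 0;
	bool ok = sketch_lock_live(group_exists);

	sketch_last_cmd_clear();
	if (!ok)
		ret = -ENOENT;
	sketch_unlock(group_exists);
	return ret;
}

/* Ordering after the patch: the failure path only returns an error. */
static int sketch_write_after(bool group_exists)
{
	int ret = 0;

	if (!sketch_lock_live(group_exists))
		ret = -ENOENT;
	else
		sketch_last_cmd_clear();	/* safe: mutex is held here */
	sketch_unlock(group_exists);
	return ret;
}
```

With a missing group, sketch_write_before() trips the check once, while
sketch_write_after() returns -ENOENT without touching the helper.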
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 11/18/2019 23:02, Borislav Petkov wrote:
> On Sun, Nov 17, 2019 at 12:13:20AM +0800, Xiaochen Shen wrote:
>> Actually this fix covers all the cases of an audit of the calling paths
>> of rdt_last_cmd_{clear,puts,printf}(), to make sure we only have the
>> lockdep_assert_held() in places where we are sure that it must be held.
>
> That's kinda what I suggested, isn't it?
>
> All I meant was, not to have a
>
> rdtgroup_kn_lock_live()
>
> call in the code as this function does *not* unconditionally grab the
> rdtgroup_mutex. And then call a function which unconditionally checks
> whether the mutex is held.
>
Hi Boris,
Thank you for your good suggestion. I will follow up on whether we can
improve the code at the call sites of rdtgroup_kn_lock_live() in a
separate patch.
In my opinion, the potential lockdep issues in all call sites of
rdt_last_cmd_{clear,puts,...}() have been fixed in this patch.
Thank you.
--
Best regards,
Xiaochen