2020-10-30 18:40:55

by Xiaochen Shen

[permalink] [raw]
Subject: [PATCH 1/3] x86/resctrl: Remove superfluous kernfs_get() calls to prevent refcount leak

Willem reported growing of kernfs_node_cache entries in slabtop when
repeatedly creating and removing resctrl subdirectories as well as when
repeatedly mounting and unmounting resctrl filesystem.

On resource group (control as well as monitoring) creation via a mkdir
an extra kernfs_node reference is obtained to ensure that the rdtgroup
structure remains accessible for the rdtgroup_kn_unlock() calls where it
is removed on deletion. The kernfs_node reference count is dropped by
kernfs_put() in rdtgroup_kn_unlock().

With the above explaining the need for one kernfs_get()/kernfs_put()
pair in resctrl there are more places where a kernfs_node reference is
obtained without a corresponding release. The excessive amount of
reference count on kernfs nodes will never be dropped to 0 and the
kernfs nodes will never be freed in the call paths of rmdir and umount.
It leads to reference count leak and kernfs_node_cache memory leak.

Remove the superfluous kernfs_get() calls and expand the existing
comments surrounding the remaining kernfs_get()/kernfs_put() pair that
remains in use.

Superfluous kernfs_get() calls are removed from two areas:

(1) In call paths of mount and mkdir, when kernfs nodes for "info",
"mon_groups" and "mon_data" directories and sub-directories are
created, the reference count of newly created kernfs node is set to 1.
But after kernfs_create_dir() returns, superfluous kernfs_get() are
called to take an additional reference.

(2) kernfs_get() calls in rmdir call paths.

Cc: [email protected]
Fixes: 17eafd076291 ("x86/intel_rdt: Split resource group removal in two")
Fixes: 4af4a88e0c92 ("x86/intel_rdt/cqm: Add mount,umount support")
Fixes: f3cbeacaa06e ("x86/intel_rdt/cqm: Add rmdir support")
Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data")
Fixes: c7d9aac61311 ("x86/intel_rdt/cqm: Add mkdir support for RDT monitoring")
Fixes: 5dc1d5c6bac2 ("x86/intel_rdt: Simplify info and base file lists")
Fixes: 60cf5e101fd4 ("x86/intel_rdt: Add mkdir to resctrl file system")
Fixes: 4e978d06dedb ("x86/intel_rdt: Add "info" files to resctrl file system")
Reported-by: Willem de Bruijn <[email protected]>
Signed-off-by: Xiaochen Shen <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 35 ++--------------------------------
1 file changed, 2 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index af323e2e3100..2ab1266a5f14 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1769,7 +1769,6 @@ static int rdtgroup_mkdir_info_resdir(struct rdt_resource *r, char *name,
if (IS_ERR(kn_subdir))
return PTR_ERR(kn_subdir);

- kernfs_get(kn_subdir);
ret = rdtgroup_kn_set_ugid(kn_subdir);
if (ret)
return ret;
@@ -1792,7 +1791,6 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
kn_info = kernfs_create_dir(parent_kn, "info", parent_kn->mode, NULL);
if (IS_ERR(kn_info))
return PTR_ERR(kn_info);
- kernfs_get(kn_info);

ret = rdtgroup_add_files(kn_info, RF_TOP_INFO);
if (ret)
@@ -1813,12 +1811,6 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
goto out_destroy;
}

- /*
- * This extra ref will be put in kernfs_remove() and guarantees
- * that @rdtgrp->kn is always accessible.
- */
- kernfs_get(kn_info);
-
ret = rdtgroup_kn_set_ugid(kn_info);
if (ret)
goto out_destroy;
@@ -1847,12 +1839,6 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
if (dest_kn)
*dest_kn = kn;

- /*
- * This extra ref will be put in kernfs_remove() and guarantees
- * that @rdtgrp->kn is always accessible.
- */
- kernfs_get(kn);
-
ret = rdtgroup_kn_set_ugid(kn);
if (ret)
goto out_destroy;
@@ -2139,13 +2125,11 @@ static int rdt_get_tree(struct fs_context *fc)
&kn_mongrp);
if (ret < 0)
goto out_info;
- kernfs_get(kn_mongrp);

ret = mkdir_mondata_all(rdtgroup_default.kn,
&rdtgroup_default, &kn_mondata);
if (ret < 0)
goto out_mongrp;
- kernfs_get(kn_mondata);
rdtgroup_default.mon.mon_data_kn = kn_mondata;
}

@@ -2499,11 +2483,6 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn,
if (IS_ERR(kn))
return PTR_ERR(kn);

- /*
- * This extra ref will be put in kernfs_remove() and guarantees
- * that kn is always accessible.
- */
- kernfs_get(kn);
ret = rdtgroup_kn_set_ugid(kn);
if (ret)
goto out_destroy;
@@ -2838,8 +2817,8 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
/*
* kernfs_remove() will drop the reference count on "kn" which
* will free it. But we still need it to stick around for the
- * rdtgroup_kn_unlock(kn} call below. Take one extra reference
- * here, which will be dropped inside rdtgroup_kn_unlock().
+ * rdtgroup_kn_unlock(kn) call. Take one extra reference here,
+ * which will be dropped inside rdtgroup_kn_unlock().
*/
kernfs_get(kn);

@@ -3049,11 +3028,6 @@ static int rdtgroup_rmdir_mon(struct kernfs_node *kn, struct rdtgroup *rdtgrp,
WARN_ON(list_empty(&prdtgrp->mon.crdtgrp_list));
list_del(&rdtgrp->mon.crdtgrp_list);

- /*
- * one extra hold on this, will drop when we kfree(rdtgrp)
- * in rdtgroup_kn_unlock()
- */
- kernfs_get(kn);
kernfs_remove(rdtgrp->kn);

return 0;
@@ -3065,11 +3039,6 @@ static int rdtgroup_ctrl_remove(struct kernfs_node *kn,
rdtgrp->flags = RDT_DELETED;
list_del(&rdtgrp->rdtgroup_list);

- /*
- * one extra hold on this, will drop when we kfree(rdtgrp)
- * in rdtgroup_kn_unlock()
- */
- kernfs_get(kn);
kernfs_remove(rdtgrp->kn);
return 0;
}
--
1.8.3.1


2020-10-30 19:02:14

by Xiaochen Shen

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/resctrl: Remove superfluous kernfs_get() calls to prevent refcount leak

Sorry, please ignore this patch.

The correct version of patchset:
https://lkml.org/lkml/2020/10/30/998

On 10/31/2020 3:03, Xiaochen Shen wrote:
> Willem reported growing of kernfs_node_cache entries in slabtop when
> repeatedly creating and removing resctrl subdirectories as well as when
> repeatedly mounting and unmounting resctrl filesystem.
>
> On resource group (control as well as monitoring) creation via a mkdir
> an extra kernfs_node reference is obtained to ensure that the rdtgroup
> structure remains accessible for the rdtgroup_kn_unlock() calls where it
> is removed on deletion. The kernfs_node reference count is dropped by
> kernfs_put() in rdtgroup_kn_unlock().
>
> With the above explaining the need for one kernfs_get()/kernfs_put()
> pair in resctrl there are more places where a kernfs_node reference is
> obtained without a corresponding release. The excessive amount of
> reference count on kernfs nodes will never be dropped to 0 and the
> kernfs nodes will never be freed in the call paths of rmdir and umount.
> It leads to reference count leak and kernfs_node_cache memory leak.
>
> Remove the superfluous kernfs_get() calls and expand the existing
> comments surrounding the remaining kernfs_get()/kernfs_put() pair that
> remains in use.
>
> Superfluous kernfs_get() calls are removed from two areas:
>
> (1) In call paths of mount and mkdir, when kernfs nodes for "info",
> "mon_groups" and "mon_data" directories and sub-directories are
> created, the reference count of newly created kernfs node is set to 1.
> But after kernfs_create_dir() returns, superfluous kernfs_get() are
> called to take an additional reference.
>
> (2) kernfs_get() calls in rmdir call paths.
>
> Cc: [email protected]
> Fixes: 17eafd076291 ("x86/intel_rdt: Split resource group removal in two")
> Fixes: 4af4a88e0c92 ("x86/intel_rdt/cqm: Add mount,umount support")
> Fixes: f3cbeacaa06e ("x86/intel_rdt/cqm: Add rmdir support")
> Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data")
> Fixes: c7d9aac61311 ("x86/intel_rdt/cqm: Add mkdir support for RDT monitoring")
> Fixes: 5dc1d5c6bac2 ("x86/intel_rdt: Simplify info and base file lists")
> Fixes: 60cf5e101fd4 ("x86/intel_rdt: Add mkdir to resctrl file system")
> Fixes: 4e978d06dedb ("x86/intel_rdt: Add "info" files to resctrl file system")
> Reported-by: Willem de Bruijn <[email protected]>
> Signed-off-by: Xiaochen Shen <[email protected]>
> Reviewed-by: Reinette Chatre <[email protected]>
> ---
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 35 ++--------------------------------
> 1 file changed, 2 insertions(+), 33 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index af323e2e3100..2ab1266a5f14 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1769,7 +1769,6 @@ static int rdtgroup_mkdir_info_resdir(struct rdt_resource *r, char *name,
> if (IS_ERR(kn_subdir))
> return PTR_ERR(kn_subdir);
>
> - kernfs_get(kn_subdir);
> ret = rdtgroup_kn_set_ugid(kn_subdir);
> if (ret)
> return ret;
> @@ -1792,7 +1791,6 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
> kn_info = kernfs_create_dir(parent_kn, "info", parent_kn->mode, NULL);
> if (IS_ERR(kn_info))
> return PTR_ERR(kn_info);
> - kernfs_get(kn_info);
>
> ret = rdtgroup_add_files(kn_info, RF_TOP_INFO);
> if (ret)
> @@ -1813,12 +1811,6 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
> goto out_destroy;
> }
>
> - /*
> - * This extra ref will be put in kernfs_remove() and guarantees
> - * that @rdtgrp->kn is always accessible.
> - */
> - kernfs_get(kn_info);
> -
> ret = rdtgroup_kn_set_ugid(kn_info);
> if (ret)
> goto out_destroy;
> @@ -1847,12 +1839,6 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
> if (dest_kn)
> *dest_kn = kn;
>
> - /*
> - * This extra ref will be put in kernfs_remove() and guarantees
> - * that @rdtgrp->kn is always accessible.
> - */
> - kernfs_get(kn);
> -
> ret = rdtgroup_kn_set_ugid(kn);
> if (ret)
> goto out_destroy;
> @@ -2139,13 +2125,11 @@ static int rdt_get_tree(struct fs_context *fc)
> &kn_mongrp);
> if (ret < 0)
> goto out_info;
> - kernfs_get(kn_mongrp);
>
> ret = mkdir_mondata_all(rdtgroup_default.kn,
> &rdtgroup_default, &kn_mondata);
> if (ret < 0)
> goto out_mongrp;
> - kernfs_get(kn_mondata);
> rdtgroup_default.mon.mon_data_kn = kn_mondata;
> }
>
> @@ -2499,11 +2483,6 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn,
> if (IS_ERR(kn))
> return PTR_ERR(kn);
>
> - /*
> - * This extra ref will be put in kernfs_remove() and guarantees
> - * that kn is always accessible.
> - */
> - kernfs_get(kn);
> ret = rdtgroup_kn_set_ugid(kn);
> if (ret)
> goto out_destroy;
> @@ -2838,8 +2817,8 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
> /*
> * kernfs_remove() will drop the reference count on "kn" which
> * will free it. But we still need it to stick around for the
> - * rdtgroup_kn_unlock(kn} call below. Take one extra reference
> - * here, which will be dropped inside rdtgroup_kn_unlock().
> + * rdtgroup_kn_unlock(kn) call. Take one extra reference here,
> + * which will be dropped inside rdtgroup_kn_unlock().
> */
> kernfs_get(kn);
>
> @@ -3049,11 +3028,6 @@ static int rdtgroup_rmdir_mon(struct kernfs_node *kn, struct rdtgroup *rdtgrp,
> WARN_ON(list_empty(&prdtgrp->mon.crdtgrp_list));
> list_del(&rdtgrp->mon.crdtgrp_list);
>
> - /*
> - * one extra hold on this, will drop when we kfree(rdtgrp)
> - * in rdtgroup_kn_unlock()
> - */
> - kernfs_get(kn);
> kernfs_remove(rdtgrp->kn);
>
> return 0;
> @@ -3065,11 +3039,6 @@ static int rdtgroup_ctrl_remove(struct kernfs_node *kn,
> rdtgrp->flags = RDT_DELETED;
> list_del(&rdtgrp->rdtgroup_list);
>
> - /*
> - * one extra hold on this, will drop when we kfree(rdtgrp)
> - * in rdtgroup_kn_unlock()
> - */
> - kernfs_get(kn);
> kernfs_remove(rdtgrp->kn);
> return 0;
> }

--
Best regards,
Xiaochen

2020-11-20 16:17:52

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/resctrl: Remove superfluous kernfs_get() calls to prevent refcount leak

On Sat, Oct 31, 2020 at 03:03:58AM +0800, Xiaochen Shen wrote:
> Willem reported growing of kernfs_node_cache entries in slabtop when
> repeatedly creating and removing resctrl subdirectories as well as when
> repeatedly mounting and unmounting resctrl filesystem.
>
> On resource group (control as well as monitoring) creation via a mkdir
> an extra kernfs_node reference is obtained to ensure that the rdtgroup
> structure remains accessible for the rdtgroup_kn_unlock() calls where it
> is removed on deletion. The kernfs_node reference count is dropped by
> kernfs_put() in rdtgroup_kn_unlock().
>
> With the above explaining the need for one kernfs_get()/kernfs_put()
> pair in resctrl there are more places where a kernfs_node reference is
> obtained without a corresponding release. The excessive amount of
> reference count on kernfs nodes will never be dropped to 0 and the
> kernfs nodes will never be freed in the call paths of rmdir and umount.
> It leads to reference count leak and kernfs_node_cache memory leak.
>
> Remove the superfluous kernfs_get() calls and expand the existing
> comments surrounding the remaining kernfs_get()/kernfs_put() pair that
> remains in use.
>
> Superfluous kernfs_get() calls are removed from two areas:
>
> (1) In call paths of mount and mkdir, when kernfs nodes for "info",
> "mon_groups" and "mon_data" directories and sub-directories are
> created, the reference count of newly created kernfs node is set to 1.
> But after kernfs_create_dir() returns, superfluous kernfs_get() are
> called to take an additional reference.
>
> (2) kernfs_get() calls in rmdir call paths.
>
> Cc: [email protected]
> Fixes: 17eafd076291 ("x86/intel_rdt: Split resource group removal in two")
> Fixes: 4af4a88e0c92 ("x86/intel_rdt/cqm: Add mount,umount support")
> Fixes: f3cbeacaa06e ("x86/intel_rdt/cqm: Add rmdir support")
> Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data")
> Fixes: c7d9aac61311 ("x86/intel_rdt/cqm: Add mkdir support for RDT monitoring")
> Fixes: 5dc1d5c6bac2 ("x86/intel_rdt: Simplify info and base file lists")
> Fixes: 60cf5e101fd4 ("x86/intel_rdt: Add mkdir to resctrl file system")
> Fixes: 4e978d06dedb ("x86/intel_rdt: Add "info" files to resctrl file system")

Are those 8(!) Fixes tags supposed to list *all* commits which add those
wrong kernfs_get() calls?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-11-21 01:58:14

by Xiaochen Shen

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/resctrl: Remove superfluous kernfs_get() calls to prevent refcount leak

Hi Boris,

Thank you very much for code review. More comments are inline.

But I am sorry that I sent this thread by mistake (--in-reply-to a wrong
Message-ID). Please ignore this thread and help review from following
threads:

The link of correct version of this patch [PATCH 1/3]:
https://lkml.kernel.org/r/[email protected]

The link of the patch series with 3 patches:
https://lkml.kernel.org/r/[email protected]

I am so sorry for the inconvenience.


On 11/21/2020 0:13, Borislav Petkov wrote:
> On Sat, Oct 31, 2020 at 03:03:58AM +0800, Xiaochen Shen wrote:
>> Willem reported growing of kernfs_node_cache entries in slabtop when
>> repeatedly creating and removing resctrl subdirectories as well as when
>> repeatedly mounting and unmounting resctrl filesystem.
>>
>> On resource group (control as well as monitoring) creation via a mkdir
>> an extra kernfs_node reference is obtained to ensure that the rdtgroup
>> structure remains accessible for the rdtgroup_kn_unlock() calls where it
>> is removed on deletion. The kernfs_node reference count is dropped by
>> kernfs_put() in rdtgroup_kn_unlock().
>>
>> With the above explaining the need for one kernfs_get()/kernfs_put()
>> pair in resctrl there are more places where a kernfs_node reference is
>> obtained without a corresponding release. The excessive amount of
>> reference count on kernfs nodes will never be dropped to 0 and the
>> kernfs nodes will never be freed in the call paths of rmdir and umount.
>> It leads to reference count leak and kernfs_node_cache memory leak.
>>
>> Remove the superfluous kernfs_get() calls and expand the existing
>> comments surrounding the remaining kernfs_get()/kernfs_put() pair that
>> remains in use.
>>
>> Superfluous kernfs_get() calls are removed from two areas:
>>
>> (1) In call paths of mount and mkdir, when kernfs nodes for "info",
>> "mon_groups" and "mon_data" directories and sub-directories are
>> created, the reference count of newly created kernfs node is set to 1.
>> But after kernfs_create_dir() returns, superfluous kernfs_get() are
>> called to take an additional reference.
>>
>> (2) kernfs_get() calls in rmdir call paths.
>>
>> Cc: [email protected]
>> Fixes: 17eafd076291 ("x86/intel_rdt: Split resource group removal in two")
>> Fixes: 4af4a88e0c92 ("x86/intel_rdt/cqm: Add mount,umount support")
>> Fixes: f3cbeacaa06e ("x86/intel_rdt/cqm: Add rmdir support")
>> Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data")
>> Fixes: c7d9aac61311 ("x86/intel_rdt/cqm: Add mkdir support for RDT monitoring")
>> Fixes: 5dc1d5c6bac2 ("x86/intel_rdt: Simplify info and base file lists")
>> Fixes: 60cf5e101fd4 ("x86/intel_rdt: Add mkdir to resctrl file system")
>> Fixes: 4e978d06dedb ("x86/intel_rdt: Add "info" files to resctrl file system")
> Are those 8(!) Fixes tags supposed to list *all* commits which add those
> wrong kernfs_get() calls?

Yes. Thank you.

--
Best regards,
Xiaochen