2021-03-02 21:00:44

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm: memcontrol: fix kernel stack account

On Tue 02-03-21 15:37:33, Muchun Song wrote:
> The alloc_thread_stack_node() cannot guarantee that allocated stack pages
> are in the same node when CONFIG_VMAP_STACK. Because we do not specify
> __GFP_THISNODE to __vmalloc_node_range(). Fix it by caling
> mod_lruvec_page_state() for each page one by one.

What is the actual problem you are trying to address by this patch?
991e7673859e has deliberately dropped the per page accounting. Can you
explain why that was incorrect? There surely is some imprecision
involved but does it matter and is it even observable?

> Fixes: 991e7673859e ("mm: memcontrol: account kernel stack per node")
> Signed-off-by: Muchun Song <[email protected]>
> ---
> kernel/fork.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index d66cd1014211..6e2201feb524 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -379,14 +379,19 @@ static void account_kernel_stack(struct task_struct *tsk, int account)
> void *stack = task_stack_page(tsk);
> struct vm_struct *vm = task_stack_vm_area(tsk);
>
> + if (vm) {
> + int i;
>
> - /* All stack pages are in the same node. */
> - if (vm)
> - mod_lruvec_page_state(vm->pages[0], NR_KERNEL_STACK_KB,
> - account * (THREAD_SIZE / 1024));
> - else
> + BUG_ON(vm->nr_pages != THREAD_SIZE / PAGE_SIZE);
> +
> + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
> + mod_lruvec_page_state(vm->pages[i], NR_KERNEL_STACK_KB,
> + account * (PAGE_SIZE / 1024));
> + } else {
> + /* All stack pages are in the same node. */
> mod_lruvec_kmem_state(stack, NR_KERNEL_STACK_KB,
> account * (THREAD_SIZE / 1024));
> + }
> }
>
> static int memcg_charge_kernel_stack(struct task_struct *tsk)
> --
> 2.11.0

--
Michal Hocko
SUSE Labs


2021-03-02 21:33:28

by Muchun Song

[permalink] [raw]
Subject: Re: [External] Re: [PATCH] mm: memcontrol: fix kernel stack account

On Tue, Mar 2, 2021 at 4:44 PM Michal Hocko <[email protected]> wrote:
>
> On Tue 02-03-21 15:37:33, Muchun Song wrote:
> > The alloc_thread_stack_node() cannot guarantee that allocated stack pages
> > are in the same node when CONFIG_VMAP_STACK. Because we do not specify
> > __GFP_THISNODE to __vmalloc_node_range(). Fix it by caling
> > mod_lruvec_page_state() for each page one by one.
>
> What is the actual problem you are trying to address by this patch?
> 991e7673859e has deliberately dropped the per page accounting. Can you
> explain why that was incorrect? There surely is some imprecision
> involved but does it matter and is it even observable?

When I read the code of account_kernel_stack(), I see a comment that
says "All stack pages are in the same node". I am confused about this.
IIUC, there is no guarantee about this. Right? Yeah, imprecision may
not be a problem. But if this is what we did deliberately, I think that
it is better to add a comment there. Thanks.

>
> > Fixes: 991e7673859e ("mm: memcontrol: account kernel stack per node")
> > Signed-off-by: Muchun Song <[email protected]>
> > ---
> > kernel/fork.c | 15 ++++++++++-----
> > 1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index d66cd1014211..6e2201feb524 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -379,14 +379,19 @@ static void account_kernel_stack(struct task_struct *tsk, int account)
> > void *stack = task_stack_page(tsk);
> > struct vm_struct *vm = task_stack_vm_area(tsk);
> >
> > + if (vm) {
> > + int i;
> >
> > - /* All stack pages are in the same node. */
> > - if (vm)
> > - mod_lruvec_page_state(vm->pages[0], NR_KERNEL_STACK_KB,
> > - account * (THREAD_SIZE / 1024));
> > - else
> > + BUG_ON(vm->nr_pages != THREAD_SIZE / PAGE_SIZE);
> > +
> > + for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
> > + mod_lruvec_page_state(vm->pages[i], NR_KERNEL_STACK_KB,
> > + account * (PAGE_SIZE / 1024));
> > + } else {
> > + /* All stack pages are in the same node. */
> > mod_lruvec_kmem_state(stack, NR_KERNEL_STACK_KB,
> > account * (THREAD_SIZE / 1024));
> > + }
> > }
> >
> > static int memcg_charge_kernel_stack(struct task_struct *tsk)
> > --
> > 2.11.0
>
> --
> Michal Hocko
> SUSE Labs

2021-03-02 21:37:44

by Michal Hocko

[permalink] [raw]
Subject: Re: [External] Re: [PATCH] mm: memcontrol: fix kernel stack account

On Tue 02-03-21 17:23:42, Muchun Song wrote:
> On Tue, Mar 2, 2021 at 4:44 PM Michal Hocko <[email protected]> wrote:
> >
> > On Tue 02-03-21 15:37:33, Muchun Song wrote:
> > > The alloc_thread_stack_node() cannot guarantee that allocated stack pages
> > > are in the same node when CONFIG_VMAP_STACK. Because we do not specify
> > > __GFP_THISNODE to __vmalloc_node_range(). Fix it by caling
> > > mod_lruvec_page_state() for each page one by one.
> >
> > What is the actual problem you are trying to address by this patch?
> > 991e7673859e has deliberately dropped the per page accounting. Can you
> > explain why that was incorrect? There surely is some imprecision
> > involved but does it matter and is it even observable?
>
> When I read the code of account_kernel_stack(), I see a comment that
> says "All stack pages are in the same node". I am confused about this.
> IIUC, there is no guarantee about this. Right?

Yes there is no guarantee indeed. Please always make sure to describe
the underlying reasoning for the patch. Subject of this patch refers to
a fix without explaining the actual problem. If a change is motivated by
code reading then make it explicit. Also if you are refering to a
different commit by Fixes: tag then it would be really helpful to
explicitly mention why that commit is incorrect or cause a visible
problems.

> Yeah, imprecision may
> not be a problem. But if this is what we did deliberately, I think that
> it is better to add a comment there. Thanks.

Yes the comment is quite confusing. I suspect it meant to say
/* All stack pages are accounted to the same node */

--
Michal Hocko
SUSE Labs

2021-03-02 21:38:58

by Muchun Song

[permalink] [raw]
Subject: Re: [External] Re: [PATCH] mm: memcontrol: fix kernel stack account

On Tue, Mar 2, 2021 at 5:34 PM Michal Hocko <[email protected]> wrote:
>
> On Tue 02-03-21 17:23:42, Muchun Song wrote:
> > On Tue, Mar 2, 2021 at 4:44 PM Michal Hocko <[email protected]> wrote:
> > >
> > > On Tue 02-03-21 15:37:33, Muchun Song wrote:
> > > > The alloc_thread_stack_node() cannot guarantee that allocated stack pages
> > > > are in the same node when CONFIG_VMAP_STACK. Because we do not specify
> > > > __GFP_THISNODE to __vmalloc_node_range(). Fix it by caling
> > > > mod_lruvec_page_state() for each page one by one.
> > >
> > > What is the actual problem you are trying to address by this patch?
> > > 991e7673859e has deliberately dropped the per page accounting. Can you
> > > explain why that was incorrect? There surely is some imprecision
> > > involved but does it matter and is it even observable?
> >
> > When I read the code of account_kernel_stack(), I see a comment that
> > says "All stack pages are in the same node". I am confused about this.
> > IIUC, there is no guarantee about this. Right?
>
> Yes there is no guarantee indeed. Please always make sure to describe
> the underlying reasoning for the patch. Subject of this patch refers to
> a fix without explaining the actual problem. If a change is motivated by
> code reading then make it explicit. Also if you are refering to a
> different commit by Fixes: tag then it would be really helpful to
> explicitly mention why that commit is incorrect or cause a visible
> problems.

Got it. Thanks for your teaching.

>
> > Yeah, imprecision may
> > not be a problem. But if this is what we did deliberately, I think that
> > it is better to add a comment there. Thanks.
>
> Yes the comment is quite confusing. I suspect it meant to say
> /* All stack pages are accounted to the same node */
>
> --
> Michal Hocko
> SUSE Labs

2021-03-04 06:30:54

by Shakeel Butt

[permalink] [raw]
Subject: Re: [External] Re: [PATCH] mm: memcontrol: fix kernel stack account

On Tue, Mar 2, 2021 at 1:34 AM Michal Hocko <[email protected]> wrote:
>
[snip]
> > Yeah, imprecision may
> > not be a problem. But if this is what we did deliberately, I think that
> > it is better to add a comment there. Thanks.
>
> Yes the comment is quite confusing. I suspect it meant to say
> /* All stack pages are accounted to the same node */
>

Most probably I meant to say "For simplicity let's just assume all
stack pages are on the same node." but we can easily make this more
accurate.