2017-11-06 17:09:56

by Yang Shi

Subject: Re: [PATCH] mm: do not rely on preempt_count in print_vma_addr



On 11/6/17 5:40 AM, Michal Hocko wrote:
> On Mon 06-11-17 13:12:22, Michal Hocko wrote:
>> On Mon 06-11-17 13:00:25, Peter Zijlstra wrote:
>>> On Mon, Nov 06, 2017 at 11:43:54AM +0100, Michal Hocko wrote:
>>>>> Yes the comment is very much accurate.
>>>>
>>>> Which suggests that print_vma_addr might be problematic, right?
>>>> Shouldn't we do trylock on mmap_sem instead?
>>>
>>> Yes that's complete rubbish. trylock will get spurious failures to print
>>> when the lock is contended.
>>
>> Yes, but I guess that it is acceptable to not print the state under
>> that condition.
>
> So what do you think about this? I think this is more robust than
> playing tricks with the explicit preempt count checks and less tedious
> than checking to make it conditional on the context. This is on top of
> Linus tree and if accepted it should replace the patch discussed here.
> ---
> From 0de6d57cbc54ee2686d1f1e4ffcc4ed490ded8aa Mon Sep 17 00:00:00 2001
> From: Michal Hocko <[email protected]>
> Date: Mon, 6 Nov 2017 14:31:20 +0100
> Subject: [PATCH] mm: do not rely on preempt_count in print_vma_addr
>
> The preempt count check on print_vma_addr has been added by e8bff74afbdb
> ("x86: fix "BUG: sleeping function called from invalid context" in
> print_vma_addr()") and it relied on the elevated preempt count from
> preempt_conditional_sti because preempt_count check doesn't work on
> non preemptive kernels by default. The code has evolved though and
> d99e1bd175f4 ("x86/entry/traps: Refactor preemption and interrupt flag
> handling") has replaced preempt_conditional_sti by an explicit
> preempt_disable which is noop on !PREEMPT so the check in print_vma_addr
> is broken.
>
> Fix the issue by using trylock on mmap_sem rather than chacking the

s/chacking/checking

> preempt count. The allocation we are relying on has to be GFP_NOWAIT
> as well. There is a chance that we won't dump the vma state if the lock
> is contended or memory is short, but this is an acceptable outcome and much
> less fragile than the non-working preemption check or tricks around it.
>
> Fixes: d99e1bd175f4 ("x86/entry/traps: Refactor preemption and interrupt flag handling")
> Signed-off-by: Michal Hocko <[email protected]>

Acked-by: Yang Shi <[email protected]>

Regards,
Yang

> ---
> mm/memory.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index a728bed16c20..1e308ac8ca0a 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4457,17 +4457,15 @@ void print_vma_addr(char *prefix, unsigned long ip)
>  	struct vm_area_struct *vma;
>
>  	/*
> -	 * Do not print if we are in atomic
> -	 * contexts (in exception stacks, etc.):
> +	 * we might be running from an atomic context so we cannot sleep
>  	 */
> -	if (preempt_count())
> +	if (!down_read_trylock(&mm->mmap_sem))
>  		return;
>
> -	down_read(&mm->mmap_sem);
>  	vma = find_vma(mm, ip);
>  	if (vma && vma->vm_file) {
>  		struct file *f = vma->vm_file;
> -		char *buf = (char *)__get_free_page(GFP_KERNEL);
> +		char *buf = (char *)__get_free_page(GFP_NOWAIT);
>  		if (buf) {
>  			char *p;
>
>



2017-11-06 16:30:29

by Michal Hocko

Subject: Re: [PATCH] mm: do not rely on preempt_count in print_vma_addr

On Tue 07-11-17 00:16:58, Yang Shi wrote:
>
>
> On 11/6/17 5:40 AM, Michal Hocko wrote:
> > On Mon 06-11-17 13:12:22, Michal Hocko wrote:
> > > On Mon 06-11-17 13:00:25, Peter Zijlstra wrote:
> > > > On Mon, Nov 06, 2017 at 11:43:54AM +0100, Michal Hocko wrote:
> > > > > > Yes the comment is very much accurate.
> > > > >
> > > > > Which suggests that print_vma_addr might be problematic, right?
> > > > > Shouldn't we do trylock on mmap_sem instead?
> > > >
> > > > Yes that's complete rubbish. trylock will get spurious failures to print
> > > > when the lock is contended.
> > >
> > > Yes, but I guess that it is acceptable to not print the state under
> > > that condition.
> >
> > So what do you think about this? I think this is more robust than
> > playing tricks with the explicit preempt count checks and less tedious
> > than checking to make it conditional on the context. This is on top of
> > Linus tree and if accepted it should replace the patch discussed here.
> > ---
> > From 0de6d57cbc54ee2686d1f1e4ffcc4ed490ded8aa Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <[email protected]>
> > Date: Mon, 6 Nov 2017 14:31:20 +0100
> > Subject: [PATCH] mm: do not rely on preempt_count in print_vma_addr
> >
> > The preempt count check on print_vma_addr has been added by e8bff74afbdb
> > ("x86: fix "BUG: sleeping function called from invalid context" in
> > print_vma_addr()") and it relied on the elevated preempt count from
> > preempt_conditional_sti because preempt_count check doesn't work on
> > non preemptive kernels by default. The code has evolved though and
> > d99e1bd175f4 ("x86/entry/traps: Refactor preemption and interrupt flag
> > handling") has replaced preempt_conditional_sti by an explicit
> > preempt_disable which is noop on !PREEMPT so the check in print_vma_addr
> > is broken.
> >
> > Fix the issue by using trylock on mmap_sem rather than chacking the
>
> s/chacking/checking

ups, fixed

> > preempt count. The allocation we are relying on has to be GFP_NOWAIT
> > as well. There is a chance that we won't dump the vma state if the lock
> > is contended or memory is short, but this is an acceptable outcome and much
> > less fragile than the non-working preemption check or tricks around it.
> >
> > Fixes: d99e1bd175f4 ("x86/entry/traps: Refactor preemption and interrupt flag handling")
> > Signed-off-by: Michal Hocko <[email protected]>
>
> Acked-by: Yang Shi <[email protected]>

Thanks!
--
Michal Hocko
SUSE Labs
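
The failure mode described in the commit message, a count check that sees nothing because the disable call compiles to a no-op, can be illustrated with a small userspace model. Everything here is hypothetical: SIM_PREEMPT_COUNT, sim_count, the sim_preempt_* macros and check_sees_atomic() are illustration-only names, not kernel interfaces.

```c
/*
 * Toy model only. Build with -DSIM_PREEMPT_COUNT to model a kernel that
 * tracks the count; build without it to model the case from the commit
 * message where the disable call is a no-op.
 */
static int sim_count;

#ifdef SIM_PREEMPT_COUNT
#define sim_preempt_disable()	((void)sim_count++)
#define sim_preempt_enable()	((void)sim_count--)
#else
#define sim_preempt_disable()	((void)0)
#define sim_preempt_enable()	((void)0)
#endif

/*
 * Returns 1 if a count-based check notices the "atomic" section,
 * 0 if the section is invisible to it.
 */
static int check_sees_atomic(void)
{
	int seen;

	sim_preempt_disable();
	seen = (sim_count != 0);	/* models: if (preempt_count()) return; */
	sim_preempt_enable();

	return seen;
}
```

Built without SIM_PREEMPT_COUNT, check_sees_atomic() returns 0, so a check of this shape cannot protect a sleeping call. A trylock does not depend on the count at all.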
