2022-04-22 14:41:07

by Nico Pache

[permalink] [raw]
Subject: [RFC 3/3] exit: Check for MMF_OOM_SKIP in exit_mmap

The MMF_OOM_SKIP bit is used to indicate weather a mm_struct can not be
invalided or has already been invalided. exit_mmap currently calls
__oom_reap_task_mm unconditionally despite the fact that the oom reaper
may have already called this.

Add a check for the MMF_OOM_SKIP bit being set in exit_mmap to avoid
unnessary calls to the invalidate code.

A slight race can occur on the MMF_OOM_SKIP bit that will still allow
this to run twice. My testing has shown an ~66% decrease in double calls
to _oom_reap_task_mm.

Fixes: 27ae357fa82b ("mm, oom: fix concurrent munlock and oom reaper unmap, v3")
Cc: David Rientjes <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Signed-off-by: Nico Pache <[email protected]>
---
mm/mmap.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index a2968669fd4e..b867f408dacd 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3113,7 +3113,8 @@ void exit_mmap(struct mm_struct *mm)
/* mm's last user has gone, and its about to be pulled down */
mmu_notifier_release(mm);

- if (unlikely(mm_is_oom_victim(mm))) {
+ if (unlikely(mm_is_oom_victim(mm)) &&
+ !test_bit(MMF_OOM_SKIP, &mm->flags)) {
/*
* Manually reap the mm to free as much memory as possible.
* Then, as the oom reaper does, set MMF_OOM_SKIP to disregard
--
2.35.1


2022-04-22 18:35:10

by Michal Hocko

[permalink] [raw]
Subject: Re: [RFC 3/3] exit: Check for MMF_OOM_SKIP in exit_mmap

On Thu 21-04-22 15:05:33, Nico Pache wrote:
> The MMF_OOM_SKIP bit is used to indicate weather a mm_struct can not be
> invalided or has already been invalided. exit_mmap currently calls
> __oom_reap_task_mm unconditionally despite the fact that the oom reaper
> may have already called this.
>
> Add a check for the MMF_OOM_SKIP bit being set in exit_mmap to avoid
> unnessary calls to the invalidate code.

Why do we care about this?

> A slight race can occur on the MMF_OOM_SKIP bit that will still allow
> this to run twice. My testing has shown an ~66% decrease in double calls
> to _oom_reap_task_mm.
>
> Fixes: 27ae357fa82b ("mm, oom: fix concurrent munlock and oom reaper unmap, v3")

I do not see this would be fixing anything.

> Cc: David Rientjes <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Andrea Arcangeli <[email protected]>
> Signed-off-by: Nico Pache <[email protected]>
> ---
> mm/mmap.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index a2968669fd4e..b867f408dacd 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -3113,7 +3113,8 @@ void exit_mmap(struct mm_struct *mm)
> /* mm's last user has gone, and its about to be pulled down */
> mmu_notifier_release(mm);
>
> - if (unlikely(mm_is_oom_victim(mm))) {
> + if (unlikely(mm_is_oom_victim(mm)) &&
> + !test_bit(MMF_OOM_SKIP, &mm->flags)) {
> /*
> * Manually reap the mm to free as much memory as possible.
> * Then, as the oom reaper does, set MMF_OOM_SKIP to disregard
> --
> 2.35.1

--
Michal Hocko
SUSE Labs

2022-04-26 08:43:51

by Michal Hocko

[permalink] [raw]
Subject: Re: [RFC 3/3] exit: Check for MMF_OOM_SKIP in exit_mmap

On Mon 25-04-22 15:00:24, Nico Pache wrote:
>
>
> On 4/22/22 11:38, Michal Hocko wrote:
> > On Thu 21-04-22 15:05:33, Nico Pache wrote:
> >> The MMF_OOM_SKIP bit is used to indicate weather a mm_struct can not be
> >> invalided or has already been invalided. exit_mmap currently calls
> >> __oom_reap_task_mm unconditionally despite the fact that the oom reaper
> >> may have already called this.
> >>
> >> Add a check for the MMF_OOM_SKIP bit being set in exit_mmap to avoid
> >> unnessary calls to the invalidate code.
> >
> > Why do we care about this?
> Is there no cost to the MMU/TLB invalidation? The MMU notifier contains a lock
> too so perhaps we can also avoids some unnecessary MMU notifier lock contention.

I am pretty sure that this area can be micro optimized but I do not
really see a strong reason for that. OOM victims are/should be really
rare so I do not think that any performance optimization would be really
visible.

If you want to improve the code then I think a much better plan would be
to get rid of the whole oom special case altogether. This might be much
closer than ever after Hugh's recent m{un}lock changes. I didn't have
time to think that through though. I believe Suren Baghdasaryan has been
looking into that as well.
--
Michal Hocko
SUSE Labs

2022-04-26 08:53:06

by Nico Pache

[permalink] [raw]
Subject: Re: [RFC 3/3] exit: Check for MMF_OOM_SKIP in exit_mmap



On 4/22/22 11:38, Michal Hocko wrote:
> On Thu 21-04-22 15:05:33, Nico Pache wrote:
>> The MMF_OOM_SKIP bit is used to indicate weather a mm_struct can not be
>> invalided or has already been invalided. exit_mmap currently calls
>> __oom_reap_task_mm unconditionally despite the fact that the oom reaper
>> may have already called this.
>>
>> Add a check for the MMF_OOM_SKIP bit being set in exit_mmap to avoid
>> unnessary calls to the invalidate code.
>
> Why do we care about this?
Is there no cost to the MMU/TLB invalidation? The MMU notifier contains a lock
too so perhaps we can also avoids some unnecessary MMU notifier lock contention.
>
>> A slight race can occur on the MMF_OOM_SKIP bit that will still allow
>> this to run twice. My testing has shown an ~66% decrease in double calls
>> to _oom_reap_task_mm.
>>
>> Fixes: 27ae357fa82b ("mm, oom: fix concurrent munlock and oom reaper unmap, v3")
>
> I do not see this would be fixing anything.
Ok im just trying to make sure we are keeping an eye on what introduced this
double call. Davids commit above is what introduced the oom_reap_task_mm in the
exit_mmap code. It goes along with some other changes that I dont fully
understand without more studying, so that's why I was hoping he could provide
some input around that CVE (the main thing im concerned about re-introducing).
>
>> Cc: David Rientjes <[email protected]>
>> Cc: Michal Hocko <[email protected]>
>> Cc: Andrea Arcangeli <[email protected]>
>> Signed-off-by: Nico Pache <[email protected]>
>> ---
>> mm/mmap.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index a2968669fd4e..b867f408dacd 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -3113,7 +3113,8 @@ void exit_mmap(struct mm_struct *mm)
>> /* mm's last user has gone, and its about to be pulled down */
>> mmu_notifier_release(mm);
>>
>> - if (unlikely(mm_is_oom_victim(mm))) {
>> + if (unlikely(mm_is_oom_victim(mm)) &&
>> + !test_bit(MMF_OOM_SKIP, &mm->flags)) {
>> /*
>> * Manually reap the mm to free as much memory as possible.
>> * Then, as the oom reaper does, set MMF_OOM_SKIP to disregard
>> --
>> 2.35.1
>