2018-05-25 02:44:49

by David Rientjes

Subject: [rfc patch] mm, oom: fix unnecessary killing of additional processes

The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
it cannot reap an mm. This can happen for a variety of reasons,
including:

- the inability to grab mm->mmap_sem in a sufficient amount of time,

- when the mm has blockable mmu notifiers that could cause the oom reaper
to stall indefinitely,

but we can also add a third when the oom reaper can "reap" an mm but doing
so is unlikely to free any amount of memory:

- when the mm's memory is fully mlocked.

When all memory is mlocked, the oom reaper will not be able to free any
substantial amount of memory. It sets MMF_OOM_SKIP before the victim can
unmap and free its memory in exit_mmap() and subsequent oom victims are
chosen unnecessarily. This is trivial to reproduce if all eligible
processes on the system have mlocked their memory: the oom killer calls
panic() even though forward progress can be made.

This is the same issue where the exit path sets MMF_OOM_SKIP before
unmapping memory and additional processes can be chosen unnecessarily
because the oom killer is racing with exit_mmap().

We can't simply defer setting MMF_OOM_SKIP, however, because if there is
a true oom livelock in progress, it never gets set and no additional
killing is possible.

To fix this, this patch introduces a per-mm reaping timeout, initially set
at 10s. It requires that the oom reaper's list becomes a properly linked
list so that other mm's may be reaped while waiting for an mm's timeout to
expire.

The exit path will now set MMF_OOM_SKIP only after all memory has been
freed, so additional oom killing is justified, and rely on MMF_UNSTABLE to
determine when it can race with the oom reaper.

The oom reaper will now set MMF_OOM_SKIP only after the reap timeout has
lapsed because it can no longer guarantee forward progress.

The reaping timeout is intentionally set for a substantial amount of time
since oom livelock is a very rare occurrence and it's better to optimize
for preventing additional (unnecessary) oom killing than a scenario that
is much more unlikely.

Signed-off-by: David Rientjes <[email protected]>
---
include/linux/mm_types.h | 4 ++
include/linux/sched.h | 2 +-
mm/mmap.c | 12 +++---
mm/oom_kill.c | 85 ++++++++++++++++++++++++++--------------
4 files changed, 66 insertions(+), 37 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -462,6 +462,10 @@ struct mm_struct {
#ifdef CONFIG_MMU_NOTIFIER
struct mmu_notifier_mm *mmu_notifier_mm;
#endif
+#ifdef CONFIG_MMU
+ /* When to give up on oom reaping this mm */
+ unsigned long reap_timeout;
+#endif
#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
pgtable_t pmd_huge_pte; /* protected by page_table_lock */
#endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1151,7 +1151,7 @@ struct task_struct {
#endif
int pagefault_disabled;
#ifdef CONFIG_MMU
- struct task_struct *oom_reaper_list;
+ struct list_head oom_reap_list;
#endif
#ifdef CONFIG_VMAP_STACK
struct vm_struct *stack_vm_area;
diff --git a/mm/mmap.c b/mm/mmap.c
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3059,11 +3059,10 @@ void exit_mmap(struct mm_struct *mm)
if (unlikely(mm_is_oom_victim(mm))) {
/*
* Manually reap the mm to free as much memory as possible.
- * Then, as the oom reaper does, set MMF_OOM_SKIP to disregard
- * this mm from further consideration. Taking mm->mmap_sem for
- * write after setting MMF_OOM_SKIP will guarantee that the oom
- * reaper will not run on this mm again after mmap_sem is
- * dropped.
+ * Then, set MMF_UNSTABLE to avoid racing with the oom reaper.
+ * Taking mm->mmap_sem for write after setting MMF_UNSTABLE will
+ * guarantee that the oom reaper will not run on this mm again
+ * after mmap_sem is dropped.
*
* Nothing can be holding mm->mmap_sem here and the above call
* to mmu_notifier_release(mm) ensures mmu notifier callbacks in
@@ -3077,7 +3076,7 @@ void exit_mmap(struct mm_struct *mm)
__oom_reap_task_mm(mm);
mutex_unlock(&oom_lock);

- set_bit(MMF_OOM_SKIP, &mm->flags);
+ set_bit(MMF_UNSTABLE, &mm->flags);
down_write(&mm->mmap_sem);
up_write(&mm->mmap_sem);
}
@@ -3105,6 +3104,7 @@ void exit_mmap(struct mm_struct *mm)
unmap_vmas(&tlb, vma, 0, -1);
free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
tlb_finish_mmu(&tlb, 0, -1);
+ set_bit(MMF_OOM_SKIP, &mm->flags);

/*
* Walk the list again, actually closing and freeing it,
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -476,7 +476,7 @@ bool process_shares_mm(struct task_struct *p, struct mm_struct *mm)
*/
static struct task_struct *oom_reaper_th;
static DECLARE_WAIT_QUEUE_HEAD(oom_reaper_wait);
-static struct task_struct *oom_reaper_list;
+static LIST_HEAD(oom_reaper_list);
static DEFINE_SPINLOCK(oom_reaper_lock);

void __oom_reap_task_mm(struct mm_struct *mm)
@@ -558,12 +558,12 @@ static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
}

/*
- * MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
- * work on the mm anymore. The check for MMF_OOM_SKIP must run
+ * MMF_UNSTABLE is set by exit_mmap when the OOM reaper can't
+ * work on the mm anymore. The check for MMF_UNSTABLE must run
* under mmap_sem for reading because it serializes against the
* down_write();up_write() cycle in exit_mmap().
*/
- if (test_bit(MMF_OOM_SKIP, &mm->flags)) {
+ if (test_bit(MMF_UNSTABLE, &mm->flags)) {
up_read(&mm->mmap_sem);
trace_skip_task_reaping(tsk->pid);
goto unlock_oom;
@@ -589,31 +589,49 @@ static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
#define MAX_OOM_REAP_RETRIES 10
static void oom_reap_task(struct task_struct *tsk)
{
- int attempts = 0;
struct mm_struct *mm = tsk->signal->oom_mm;
+ bool ret = true;

- /* Retry the down_read_trylock(mmap_sem) a few times */
- while (attempts++ < MAX_OOM_REAP_RETRIES && !oom_reap_task_mm(tsk, mm))
- schedule_timeout_idle(HZ/10);
+ /*
+ * If this mm has either been fully unmapped, or the oom reaper has
+ * given up on it, nothing left to do except drop the refcount.
+ */
+ if (test_bit(MMF_OOM_SKIP, &mm->flags))
+ goto drop;

- if (attempts <= MAX_OOM_REAP_RETRIES ||
- test_bit(MMF_OOM_SKIP, &mm->flags))
- goto done;
+ /*
+ * If this mm has already been reaped, doing so again will not likely
+ * free additional memory.
+ */
+ if (!test_bit(MMF_UNSTABLE, &mm->flags))
+ ret = oom_reap_task_mm(tsk, mm);
+
+ if (time_after(jiffies, mm->reap_timeout)) {
+ if (!test_bit(MMF_OOM_SKIP, &mm->flags)) {
+ pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
+ task_pid_nr(tsk), tsk->comm);
+ debug_show_all_locks();

- pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
- task_pid_nr(tsk), tsk->comm);
- debug_show_all_locks();
+ /*
+ * Reaping has failed for the timeout period, so give up
+ * and allow additional processes to be oom killed.
+ */
+ set_bit(MMF_OOM_SKIP, &mm->flags);
+ }
+ goto drop;
+ }

-done:
- tsk->oom_reaper_list = NULL;
+ if (!ret)
+ schedule_timeout_idle(HZ/10);

- /*
- * Hide this mm from OOM killer because it has been either reaped or
- * somebody can't call up_write(mmap_sem).
- */
- set_bit(MMF_OOM_SKIP, &mm->flags);
+ /* Enqueue to be reaped again */
+ spin_lock(&oom_reaper_lock);
+ list_add(&tsk->oom_reap_list, &oom_reaper_list);
+ spin_unlock(&oom_reaper_lock);
+ return;

- /* Drop a reference taken by wake_oom_reaper */
+drop:
+ /* Drop the reference taken by wake_oom_reaper() */
put_task_struct(tsk);
}

@@ -622,11 +640,13 @@ static int oom_reaper(void *unused)
while (true) {
struct task_struct *tsk = NULL;

- wait_event_freezable(oom_reaper_wait, oom_reaper_list != NULL);
+ wait_event_freezable(oom_reaper_wait,
+ !list_empty(&oom_reaper_list));
spin_lock(&oom_reaper_lock);
- if (oom_reaper_list != NULL) {
- tsk = oom_reaper_list;
- oom_reaper_list = tsk->oom_reaper_list;
+ if (!list_empty(&oom_reaper_list)) {
+ tsk = list_entry(&oom_reaper_list, struct task_struct,
+ oom_reap_list);
+ list_del(&tsk->oom_reap_list);
}
spin_unlock(&oom_reaper_lock);

@@ -637,17 +657,22 @@ static int oom_reaper(void *unused)
return 0;
}

+/* How long to wait to oom reap an mm before selecting another process */
+#define OOM_REAP_TIMEOUT_MSECS (10 * 1000)
static void wake_oom_reaper(struct task_struct *tsk)
{
- /* tsk is already queued? */
- if (tsk == oom_reaper_list || tsk->oom_reaper_list)
+ /*
+ * Set the reap timeout; if it's already set, the mm is enqueued and
+ * this tsk can be ignored.
+ */
+ if (cmpxchg(&tsk->signal->oom_mm->reap_timeout, 0UL,
+ jiffies + msecs_to_jiffies(OOM_REAP_TIMEOUT_MSECS)))
return;

get_task_struct(tsk);

spin_lock(&oom_reaper_lock);
- tsk->oom_reaper_list = oom_reaper_list;
- oom_reaper_list = tsk;
+ list_add(&tsk->oom_reap_list, &oom_reaper_list);
spin_unlock(&oom_reaper_lock);
trace_wake_reaper(tsk->pid);
wake_up(&oom_reaper_wait);


2018-05-25 02:49:16

by Tetsuo Handa

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

David Rientjes wrote:
> The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
> it cannot reap an mm. This can happen for a variety of reasons,
> including:
>
> - the inability to grab mm->mmap_sem in a sufficient amount of time,
>
> - when the mm has blockable mmu notifiers that could cause the oom reaper
> to stall indefinitely,
>
> but we can also add a third when the oom reaper can "reap" an mm but doing
> so is unlikely to free any amount of memory:
>
> - when the mm's memory is fully mlocked.

- when the mm's memory is fully mlocked (needs privilege) or
fully shared (does not need privilege)

>
> When all memory is mlocked, the oom reaper will not be able to free any
> substantial amount of memory. It sets MMF_OOM_SKIP before the victim can
> unmap and free its memory in exit_mmap() and subsequent oom victims are
> chosen unnecessarily. This is trivial to reproduce if all eligible
> processes on the system have mlocked their memory: the oom killer calls
> panic() even though forward progress can be made.

s/mlocked/mlocked or shared/g

>
> This is the same issue where the exit path sets MMF_OOM_SKIP before
> unmapping memory and additional processes can be chosen unnecessarily
> because the oom killer is racing with exit_mmap().
>
> We can't simply defer setting MMF_OOM_SKIP, however, because if there is
> a true oom livelock in progress, it never gets set and no additional
> killing is possible.
>
> To fix this, this patch introduces a per-mm reaping timeout, initially set
> at 10s. It requires that the oom reaper's list becomes a properly linked
> list so that other mm's may be reaped while waiting for an mm's timeout to
> expire.

I already proposed a simpler one at https://patchwork.kernel.org/patch/9877991/ .

>
> The exit path will now set MMF_OOM_SKIP only after all memory has been
> freed, so additional oom killing is justified, and rely on MMF_UNSTABLE to
> determine when it can race with the oom reaper.
>
> The oom reaper will now set MMF_OOM_SKIP only after the reap timeout has
> lapsed because it can no longer guarantee forward progress.
>
> The reaping timeout is intentionally set for a substantial amount of time
> since oom livelock is a very rare occurrence and it's better to optimize
> for preventing additional (unnecessary) oom killing than a scenario that
> is much more unlikely.

But before thinking about your proposal, please think about how to guarantee
that the OOM reaper and the exit path can run, as discussed at
http://lkml.kernel.org/r/[email protected] .

2018-05-25 07:27:05

by Michal Hocko

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On Thu 24-05-18 14:22:53, David Rientjes wrote:
> The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
> it cannot reap an mm. This can happen for a variety of reasons,
> including:
>
> - the inability to grab mm->mmap_sem in a sufficient amount of time,
>
> - when the mm has blockable mmu notifiers that could cause the oom reaper
> to stall indefinitely,
>
> but we can also add a third when the oom reaper can "reap" an mm but doing
> so is unlikely to free any amount of memory:
>
> - when the mm's memory is fully mlocked.
>
> When all memory is mlocked, the oom reaper will not be able to free any
> substantial amount of memory. It sets MMF_OOM_SKIP before the victim can
> unmap and free its memory in exit_mmap() and subsequent oom victims are
> chosen unnecessarily. This is trivial to reproduce if all eligible
> processes on the system have mlocked their memory: the oom killer calls
> panic() even though forward progress can be made.
>
> This is the same issue where the exit path sets MMF_OOM_SKIP before
> unmapping memory and additional processes can be chosen unnecessarily
> because the oom killer is racing with exit_mmap().
>
> We can't simply defer setting MMF_OOM_SKIP, however, because if there is
> a true oom livelock in progress, it never gets set and no additional
> killing is possible.
>
> To fix this, this patch introduces a per-mm reaping timeout, initially set
> at 10s. It requires that the oom reaper's list becomes a properly linked
> list so that other mm's may be reaped while waiting for an mm's timeout to
> expire.

No timeouts please! The proper way to handle this problem is to simply
teach the oom reaper to handle mlocked areas.
--
Michal Hocko
SUSE Labs

2018-05-25 19:37:06

by David Rientjes

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On Fri, 25 May 2018, Michal Hocko wrote:

> > The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
> > it cannot reap an mm. This can happen for a variety of reasons,
> > including:
> >
> > - the inability to grab mm->mmap_sem in a sufficient amount of time,
> >
> > - when the mm has blockable mmu notifiers that could cause the oom reaper
> > to stall indefinitely,
> >
> > but we can also add a third when the oom reaper can "reap" an mm but doing
> > so is unlikely to free any amount of memory:
> >
> > - when the mm's memory is fully mlocked.
> >
> > When all memory is mlocked, the oom reaper will not be able to free any
> > substantial amount of memory. It sets MMF_OOM_SKIP before the victim can
> > unmap and free its memory in exit_mmap() and subsequent oom victims are
> > chosen unnecessarily. This is trivial to reproduce if all eligible
> > processes on the system have mlocked their memory: the oom killer calls
> > panic() even though forward progress can be made.
> >
> > This is the same issue where the exit path sets MMF_OOM_SKIP before
> > unmapping memory and additional processes can be chosen unnecessarily
> > because the oom killer is racing with exit_mmap().
> >
> > We can't simply defer setting MMF_OOM_SKIP, however, because if there is
> > a true oom livelock in progress, it never gets set and no additional
> > killing is possible.
> >
> > To fix this, this patch introduces a per-mm reaping timeout, initially set
> > at 10s. It requires that the oom reaper's list becomes a properly linked
> > list so that other mm's may be reaped while waiting for an mm's timeout to
> > expire.
>
> No timeouts please! The proper way to handle this problem is to simply
> teach the oom reaper to handle mlocked areas.

That's not sufficient since the oom reaper is also not able to oom reap if
the mm has blockable mmu notifiers or all memory is shared file-backed
memory, so it immediately sets MMF_OOM_SKIP and additional processes are
oom killed.

The current implementation that relies on MAX_OOM_REAP_RETRIES is acting
as a timeout already for mm->mmap_sem, but it's doing so without
attempting to oom reap other victims that may actually allow it to grab
mm->mmap_sem if the allocator is waiting on a lock.

The solution, as proposed, is to allow the oom reaper to iterate over all
victims and try to free memory rather than working on each victim one by
one and giving up.

But also note that even if oom reaping is possible, in the presence of an
antagonist that continues to allocate memory, it is possible to oom kill
additional victims unnecessarily if we aren't able to complete
free_pgtables() in exit_mmap() of the original victim.

So this patch is solving all three issues: it allows a process to *fully*
exit (including free_pgtables()) before setting MMF_OOM_SKIP, allows the
oom reaper to act on parallel victims that may allow a victim to be
reaped, and prevents additional processes from being killed unnecessarily
when oom reaping isn't able to free memory (mlock, blockable mmu
invalidates, all VM_SHARED file backed, small rss, etc).

The vast majority of the time, oom reaping can occur with this change or
the process can reach exit_mmap() itself; oom livelock appears to be very
rare with this patch even for mem cgroup constrained oom kills with very
tight limits, and thus it makes sense to wait for a prolonged period of
time before killing additional processes unnecessarily.

2018-05-25 19:45:31

by David Rientjes

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On Fri, 25 May 2018, Tetsuo Handa wrote:

> > The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
> > it cannot reap an mm. This can happen for a variety of reasons,
> > including:
> >
> > - the inability to grab mm->mmap_sem in a sufficient amount of time,
> >
> > - when the mm has blockable mmu notifiers that could cause the oom reaper
> > to stall indefinitely,
> >
> > but we can also add a third when the oom reaper can "reap" an mm but doing
> > so is unlikely to free any amount of memory:
> >
> > - when the mm's memory is fully mlocked.
>
> - when the mm's memory is fully mlocked (needs privilege) or
> fully shared (does not need privilege)
>

Good point, that is another way that unnecessary oom killing can occur
because the oom reaper sets MMF_OOM_SKIP far too early. I can make the
change to the commit message.

Also, I noticed in my patch that oom_reap_task() should be doing
list_add_tail() rather than list_add() to enqueue the mm for reaping
again.
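
For reference, the re-enqueue in oom_reap_task() with that fix applied would
look roughly like this (a sketch against the patch above, not a tested hunk):

	/* Requeue at the tail so other queued victims get reaped before this
	 * mm is retried; list_add() would keep retrying the same mm first. */
	spin_lock(&oom_reaper_lock);
	list_add_tail(&tsk->oom_reap_list, &oom_reaper_list);
	spin_unlock(&oom_reaper_lock);
	return;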

> > This is the same issue where the exit path sets MMF_OOM_SKIP before
> > unmapping memory and additional processes can be chosen unnecessarily
> > because the oom killer is racing with exit_mmap().
> >
> > We can't simply defer setting MMF_OOM_SKIP, however, because if there is
> > a true oom livelock in progress, it never gets set and no additional
> > killing is possible.
> >
> > To fix this, this patch introduces a per-mm reaping timeout, initially set
> > at 10s. It requires that the oom reaper's list becomes a properly linked
> > list so that other mm's may be reaped while waiting for an mm's timeout to
> > expire.
>
> I already proposed more simpler one at https://patchwork.kernel.org/patch/9877991/ .
>

It's a similar idea, and I'm glad that we agree that some kind of per-mm
delay is required to avoid this problem. I think yours is simpler, but
consider the other two changes in my patch:

- in the normal exit path, absent any timeout for the mm, we only set
MMF_OOM_SKIP after free_pgtables(), when it is known we will not free
any additional memory; the oom killer racing with free_pgtables() can
otherwise also cause unnecessary oom killing, and

- the oom reaper now operates over all concurrent victims instead of
repeatedly trying to take mm->mmap_sem of the first victim, sleeping
many times, retrying, giving up, and moving on to the next victim.
Allowing the oom reaper to iterate through all victims can allow
memory freeing such that an allocator may be able to drop mm->mmap_sem
(see the sketch below).
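
A minimal sketch of the dequeue side of that loop, assuming the list-based
queue introduced by the patch (note it fetches the first queued task with
list_first_entry(); the list_entry() call on the list head in the posted diff
would need the same treatment):

	struct task_struct *tsk = NULL;

	wait_event_freezable(oom_reaper_wait, !list_empty(&oom_reaper_list));
	spin_lock(&oom_reaper_lock);
	if (!list_empty(&oom_reaper_list)) {
		tsk = list_first_entry(&oom_reaper_list, struct task_struct,
				       oom_reap_list);
		list_del(&tsk->oom_reap_list);
	}
	spin_unlock(&oom_reaper_lock);

	if (tsk)
		oom_reap_task(tsk);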

In fact, with my patch, I don't know of any condition where we kill
additional processes unnecessarily *unless* the victim cannot be oom
reaped or complete memory freeing in the exit path within 10 seconds.
Given how rare oom livelock appears in practice, I think the 10 seconds is
justified because right now it is _trivial_ to oom kill many victims
completely unnecessarily.

2018-05-28 16:05:47

by Michal Hocko

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On Fri 25-05-18 12:36:08, David Rientjes wrote:
> On Fri, 25 May 2018, Michal Hocko wrote:
>
> > > The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
> > > it cannot reap an mm. This can happen for a variety of reasons,
> > > including:
> > >
> > > - the inability to grab mm->mmap_sem in a sufficient amount of time,
> > >
> > > - when the mm has blockable mmu notifiers that could cause the oom reaper
> > > to stall indefinitely,
> > >
> > > but we can also add a third when the oom reaper can "reap" an mm but doing
> > > so is unlikely to free any amount of memory:
> > >
> > > - when the mm's memory is fully mlocked.
> > >
> > > When all memory is mlocked, the oom reaper will not be able to free any
> > > substantial amount of memory. It sets MMF_OOM_SKIP before the victim can
> > > unmap and free its memory in exit_mmap() and subsequent oom victims are
> > > chosen unnecessarily. This is trivial to reproduce if all eligible
> > > processes on the system have mlocked their memory: the oom killer calls
> > > panic() even though forward progress can be made.
> > >
> > > This is the same issue where the exit path sets MMF_OOM_SKIP before
> > > unmapping memory and additional processes can be chosen unnecessarily
> > > because the oom killer is racing with exit_mmap().
> > >
> > > We can't simply defer setting MMF_OOM_SKIP, however, because if there is
> > > a true oom livelock in progress, it never gets set and no additional
> > > killing is possible.
> > >
> > > To fix this, this patch introduces a per-mm reaping timeout, initially set
> > > at 10s. It requires that the oom reaper's list becomes a properly linked
> > > list so that other mm's may be reaped while waiting for an mm's timeout to
> > > expire.
> >
> > No timeouts please! The proper way to handle this problem is to simply
> > teach the oom reaper to handle mlocked areas.
>
> That's not sufficient since the oom reaper is also not able to oom reap if
> the mm has blockable mmu notifiers or all memory is shared filebacked
> memory, so it immediately sets MMF_OOM_SKIP and additional processes are
> oom killed.

Could you be more specific with a real world example where that is the
case? I mean a full address space of non-reclaimable file-backed
memory where waiting some more would help? Blockable mmu notifiers are
a PITA for sure. I wish we could have a better way to deal with them.
Maybe we can tell them we are in the non-blockable context and have them
release as much as possible. Still, that is something a random timeout
wouldn't help with, I am afraid.

> The current implementation that relies on MAX_OOM_REAP_RETRIES is acting
> as a timeout already for mm->mmap_sem, but it's doing so without
> attempting to oom reap other victims that may actually allow it to grab
> mm->mmap_sem if the allocator is waiting on a lock.

Trying to reap a different oom victim when the current one is not making
progress during the lock contention is certainly something that makes
sense. It has been proposed in the past and we gave it up because
it was more complex. Do you have any specific example where this would
help, to justify the additional complexity?

> The solution, as proposed, is to allow the oom reaper to iterate over all
> victims and try to free memory rather than working on each victim one by
> one and giving up.
>
> But also note that even if oom reaping is possible, in the presence of an
> antagonist that continues to allocate memory, that it is possible to oom
> kill additional victims unnecessarily if we aren't able to complete
> free_pgtables() in exit_mmap() of the original victim.

If there is an unbounded source of allocations then we are screwed no matter
what. We just hope that the allocator will get noticed by the oom killer
and it will be stopped.

That being said, I do not object to justified improvements in the oom
reaping. But I absolutely detest random timeouts and will nack
implementations based on them until it is absolutely clear there is no
other way around it.
--
Michal Hocko
SUSE Labs

2018-05-30 21:07:52

by David Rientjes

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On Mon, 28 May 2018, Michal Hocko wrote:

> > That's not sufficient since the oom reaper is also not able to oom reap if
> > the mm has blockable mmu notifiers or all memory is shared filebacked
> > memory, so it immediately sets MMF_OOM_SKIP and additional processes are
> > oom killed.
>
> Could you be more specific with a real world example where that is the
> case? I mean the full address space of non-reclaimable file backed
> memory where waiting some more would help? Blockable mmu notifiers are
> a PITA for sure. I wish we could have a better way to deal with them.
> Maybe we can tell them we are in the non-blockable context and have them
> release as much as possible. Still something that a random timeout
> wouldn't help I am afraid.
>

It's not a random timeout; it's sufficiently long that we don't oom
kill several processes needlessly in the very rare case where oom livelock
would actually prevent the original victim from exiting. The oom reaper
processing an mm, finding everything to be mlocked, and immediately
setting MMF_OOM_SKIP is inappropriate. This is rather trivial to reproduce
with a large memory hogging process that mlocks all of its memory; we
consistently see spurious and unnecessary oom kills simply because the oom
reaper has set MMF_OOM_SKIP very early.
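
For illustration, the reproducer being described is roughly the following (a
hypothetical userspace sketch, not a test case from this thread): a hog that
mlocks everything it touches and keeps faulting memory in until the oom
killer fires.

#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	/* Lock all current and future mappings so the oom reaper cannot free
	 * anything from this mm (needs CAP_IPC_LOCK or a big RLIMIT_MEMLOCK). */
	if (mlockall(MCL_CURRENT | MCL_FUTURE))
		return 1;

	for (;;) {
		char *p = malloc(1 << 20);	/* 1MB at a time */
		if (!p)
			return 1;
		memset(p, 1, 1 << 20);		/* fault (and lock) the pages in */
	}
}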

This patch introduces a "give up" period such that the oom reaper is still
allowed to do its good work, but it only gives up after a substantial period
of time in the hope that the victim can make forward progress on its own. I
would understand the objection if oom livelock, where the victim cannot
make forward progress, were commonplace, but in the interest of not killing
several processes needlessly every time a large mlocked process is
targeted, I think it compels a waiting period.

> Trying to reap a different oom victim when the current one is not making
> progress during the lock contention is certainly something that make
> sense. It has been proposed in the past and we just gave it up because
> it was more complex. Do you have any specific example when this would
> help to justify the additional complexity?
>

I'm not sure how you're defining complexity; the patch adds ~30 lines of
code, prevents processes from needlessly being oom killed when oom
reaping is largely unsuccessful and before the victim finishes
free_pgtables(), and also allows the oom reaper to operate on multiple
mm's instead of processing one at a time. Obviously, if there is a delay
before MMF_OOM_SKIP is set, the oom reaper must be able to
process other mm's, otherwise we stall needlessly for 10s. Operating on
multiple mm's in a linked list while waiting for victims to exit during the
timeout period is thus very much needed; it wouldn't make sense without
it.

> > But also note that even if oom reaping is possible, in the presence of an
> > antagonist that continues to allocate memory, that it is possible to oom
> > kill additional victims unnecessarily if we aren't able to complete
> > free_pgtables() in exit_mmap() of the original victim.
>
> If there is unbound source of allocations then we are screwed no matter
> what. We just hope that the allocator will get noticed by the oom killer
> and it will be stopped.
>

It's not unbounded; it's just an allocator that acts as an antagonist. At
the risk of being overly verbose, for system or memcg oom conditions: a
large mlocked process is oom killed, other processes continue to
allocate/charge, the oom reaper almost immediately sets MMF_OOM_SKIP
without being able to free any memory, and other important processes
are needlessly oom killed before the original victim can reach
exit_mmap(). This happens a _lot_.

I'm open to hearing any other suggestions that you have other than waiting
some time period before MMF_OOM_SKIP gets set to solve this problem.

2018-05-31 06:33:21

by Michal Hocko

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On Wed 30-05-18 14:06:51, David Rientjes wrote:
> On Mon, 28 May 2018, Michal Hocko wrote:
>
> > > That's not sufficient since the oom reaper is also not able to oom reap if
> > > the mm has blockable mmu notifiers or all memory is shared filebacked
> > > memory, so it immediately sets MMF_OOM_SKIP and additional processes are
> > > oom killed.
> >
> > Could you be more specific with a real world example where that is the
> > case? I mean the full address space of non-reclaimable file backed
> > memory where waiting some more would help? Blockable mmu notifiers are
> > a PITA for sure. I wish we could have a better way to deal with them.
> > Maybe we can tell them we are in the non-blockable context and have them
> > release as much as possible. Still something that a random timeout
> > wouldn't help I am afraid.
> >
>
> It's not a random timeout, it's sufficiently long such that we don't oom
> kill several processes needlessly in the very rare case where oom livelock
> would actually prevent the original victim from exiting. The oom reaper
> processing an mm, finding everything to be mlocked, and immediately
> MMF_OOM_SKIP is inappropriate. This is rather trivial to reproduce for a
> large memory hogging process that mlocks all of its memory; we
> consistently see spurious and unnecessary oom kills simply because the oom
> reaper has set MMF_OOM_SKIP very early.

It takes quite a few additional steps for an admin to allow a large amount
of mlocked memory, and such an application should be really careful not
to consume too much memory. So how come this is something you see so
consistently? Is this some sort of bug or an unfortunate workload side
effect? I am asking because I really want to see how relevant this
really is.

> This patch introduces a "give up" period such that the oom reaper is still
> allowed to do its good work but only gives up in the hope the victim can
> make forward progress at some substantial period of time in the future. I
> would understand the objection if oom livelock where the victim cannot
> make forward progress were commonplace, but in the interest of not killing
> several processes needlessly every time a large mlocked process is
> targeted, I think it compels a waiting period.

But waiting periods just turn out to be a really poor design. There
is no good timeout that fits everybody. We can do better, and as
long as that is the case, a timeout based solution should really be
rejected. It is a shortcut that doesn't solve the underlying
problem.

> > Trying to reap a different oom victim when the current one is not making
> > progress during the lock contention is certainly something that make
> > sense. It has been proposed in the past and we just gave it up because
> > it was more complex. Do you have any specific example when this would
> > help to justify the additional complexity?
> >
>
> I'm not sure how you're defining complexity, the patch adds ~30 lines of
> code and prevents processes from needlessly being oom killed when oom
> reaping is largely unsuccessful and before the victim finishes
> free_pgtables() and then also allows the oom reaper to operate on multiple
> mm's instead of processing one at a time. Obviously if there is a delay
> before MMF_OOM_SKIP is set it requires that the oom reaper be able to
> process other mm's, otherwise we stall needlessly for 10s. Operating on
> multiple mm's in a linked list while waiting for victims to exit during a
> timeout period is thus very much needed, it wouldn't make sense without
> it.

It needs to keep track of the current retry state of the reaped victim,
and that is additional complexity, isn't it? And I am asking how
often we have to handle that. Please note that the primary objective
here is to unclutter a locked-up situation. The oom reaper doesn't prevent
the victim from going away on its own while we keep retrying. So slow
progress on the reaper side is not an issue IMHO.

> > > But also note that even if oom reaping is possible, in the presence of an
> > > antagonist that continues to allocate memory, that it is possible to oom
> > > kill additional victims unnecessarily if we aren't able to complete
> > > free_pgtables() in exit_mmap() of the original victim.
> >
> > If there is unbound source of allocations then we are screwed no matter
> > what. We just hope that the allocator will get noticed by the oom killer
> > and it will be stopped.
> >
>
> It's not unbounded, it's just an allocator that acts as an antagonist. At
> the risk of being overly verbose, for system or memcg oom conditions: a
> large mlocked process is oom killed, other processes continue to
> allocate/charge, the oom reaper almost immediately grants MMF_OOM_SKIP
> without being able to free any memory, and the other important processes
> are needlessly oom killed before the original victim can reach
> exit_mmap(). This happens a _lot_.
>
> I'm open to hearing any other suggestions that you have other than waiting
> some time period before MMF_OOM_SKIP gets set to solve this problem.

I've already offered one: make mlocked pages reapable. This is something
that has been on the todo list for quite some time; I just didn't have
time to work on it. The priority was not at the top because most sane
workloads simply do not mlock a large portion of their memory. But if you
see that happening regularly then this should be the first thing to
try. The main obstacle back then was the page lock currently
taken in the munlock path. I've discussed that with Hugh and he said
that it is mainly there for accounting purposes and mostly a relic from
the past IIRC, so this should be fixable and a general improvement as
well.

--
Michal Hocko
SUSE Labs

2018-05-31 21:17:49

by David Rientjes

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On Thu, 31 May 2018, Michal Hocko wrote:

> > It's not a random timeout, it's sufficiently long such that we don't oom
> > kill several processes needlessly in the very rare case where oom livelock
> > would actually prevent the original victim from exiting. The oom reaper
> > processing an mm, finding everything to be mlocked, and immediately
> > MMF_OOM_SKIP is inappropriate. This is rather trivial to reproduce for a
> > large memory hogging process that mlocks all of its memory; we
> > consistently see spurious and unnecessary oom kills simply because the oom
> > reaper has set MMF_OOM_SKIP very early.
>
> It takes quite some additional steps for admin to allow a large amount
> of mlocked memory and such an application should be really careful to
> not consume too much memory. So how come this is something you see that
> consistently? Is this some sort of bug or an unfortunate workload side
> effect? I am asking this because I really want to see how relevant this
> really is.
>

The bug is that the oom reaper sets MMF_OOM_SKIP almost immediately after
the victim has been chosen for oom kill and we get follow-up oom kills,
not that the process is able to mlock a large amount of memory. Mlock
here is only being discussed as a single example. Tetsuo has brought up
the example of all shared file-backed memory. We've discussed the mm
having a single blockable mmu notifier. Regardless of how we arrive at
the point where the oom reaper can't free memory, which could be any of
those three cases, this occurs whenever (1) the original victim is
sufficiently large that, once its memory is freed, follow-up oom kills
would be unnecessary and (2) other threads allocate/charge before the oom
victim reaches exit_mmap().

We have examples of cases where oom reaping was reported as successful, but
the rss numbers in the kernel log are very similar to those at oom kill time
and the process is known not to mlock; the reason is that the oom reaper
could free very little memory due to blockable mmu notifiers.

> But the waiting periods just turn out to be a really poor design. There
> will be no good timeout to fit for everybody. We can do better and as
> long as this is the case the timeout based solution should be really
> rejected. It is a shortcut that doesn't really solve the underlying
> problem.
>

The current implementation is a timeout based solution for mmap_sem; it
just has the oom reaper spinning trying to grab the sem and eventually
giving up. This patch allows it to concurrently work on other mm's and
detects the timeout in a different way, with jiffies instead of an
iterator.

I'd love a solution where we can reliably detect an oom livelock and oom
kill additional processes, but only after the original victim has had a
chance to do exit_mmap(), without a timeout; I just don't see one being
offered. Given that Tetsuo has seen issues with this in the past and
suggested a similar proposal, we are clearly not the only ones feeling
pain from this.

> > I'm open to hearing any other suggestions that you have other than waiting
> > some time period before MMF_OOM_SKIP gets set to solve this problem.
>
> I've already offered one. Make mlocked pages reapable.

Making mlocked pages reapable would only solve the most trivial reproducer
of this. Unless the oom reaper can guarantee that it will never block and
can free all memory that exit_mmap() can free, we need to ensure that a
victim has a chance to reach the exit path on its own before killing every
other process on the system.

I'll fix the issue I identified with doing list_add_tail() rather than
list_add(), fix up the commit message per Tetsuo to identify the other
possible ways this can occur other than mlock, remove the rfc tag, and
repost.

2018-06-01 07:47:55

by Michal Hocko

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On Thu 31-05-18 14:16:34, David Rientjes wrote:
> On Thu, 31 May 2018, Michal Hocko wrote:
>
> > > It's not a random timeout, it's sufficiently long such that we don't oom
> > > kill several processes needlessly in the very rare case where oom livelock
> > > would actually prevent the original victim from exiting. The oom reaper
> > > processing an mm, finding everything to be mlocked, and immediately
> > > MMF_OOM_SKIP is inappropriate. This is rather trivial to reproduce for a
> > > large memory hogging process that mlocks all of its memory; we
> > > consistently see spurious and unnecessary oom kills simply because the oom
> > > reaper has set MMF_OOM_SKIP very early.
> >
> > It takes quite some additional steps for admin to allow a large amount
> > of mlocked memory and such an application should be really careful to
> > not consume too much memory. So how come this is something you see that
> > consistently? Is this some sort of bug or an unfortunate workload side
> > effect? I am asking this because I really want to see how relevant this
> > really is.
> >
>
> The bug is that the oom reaper sets MMF_OOM_SKIP almost immediately after
> the victim has been chosen for oom kill and we get follow-up oom kills,
> not that the process is able to mlock a large amount of memory. Mlock
> here is only being discussed as a single example. Tetsuo has brought up
> the example of all shared file-backed memory.

How is such a case even possible? File-backed memory is reclaimable and
as such should be gone by the time we hit the OOM killer. If that is not
the case then I fail to see how waiting slightly longer helps anything.

> We've discussed the mm
> having a single blockable mmu notifier. Regardless of how we arrive at
> the point where the oom reaper can't free memory, which could be any of
> those three cases, if (1) the original victim is sufficiently large that
> follow-up oom kills would become unnecessary and (2) other threads
> allocate/charge before the oom victim reaches exit_mmap(), this occurs.
>
> We have examples of cases where oom reaping was successful, but the rss
> numbers in the kernel log are very similar to when it was oom killed and
> the process is known not to mlock, the reason is because the oom reaper
> could free very little memory due to blockable mmu notifiers.

Please be more specific. Which notifiers were these? Blockable notifiers
are a PITA and we should be addressing them. That requires identifying
them first.

> > But the waiting periods just turn out to be a really poor design. There
> > will be no good timeout to fit for everybody. We can do better and as
> > long as this is the case the timeout based solution should be really
> > rejected. It is a shortcut that doesn't really solve the underlying
> > problem.
> >
>
> The current implementation is a timeout based solution for mmap_sem, it
> just has the oom reaper spinning trying to grab the sem and eventually
> gives up. This patch allows it to currently work on other mm's and
> detects the timeout in a different way, with jiffies instead of an
> iterator.

And I argue that anything timeout based is just broken by design. Trying
n times will at least give you consistent behavior. Retrying on the mmap
sem makes sense because the lock might be taken for a short time.
Retrying on memory the oom reaper doesn't reclaim is just pointlessly
waiting for somebody else to do the work. See the difference?

> I'd love a solution where we can reliably detect an oom livelock and oom
> kill additional processes but only after the original victim has had a
> chance to do exit_mmap() without a timeout, but I don't see one being
> offered. Given Tetsuo has seen issues with this in the past and suggested
> a similar proposal means we are not the only ones feeling pain from this.

Tetsuo is doing an artificial stress test which doesn't resemble any
reasonable workload. That is good for catching different corner cases but
nothing even close to something to base a design on. I will definitely nack
any attempt to add a timeout based solution based on such non-realistic
tests. If we have realistic workloads then let's try to address them and
resort to timeouts or other hacks only as the last option.

> > > I'm open to hearing any other suggestions that you have other than waiting
> > > some time period before MMF_OOM_SKIP gets set to solve this problem.
> >
> > I've already offered one. Make mlocked pages reapable.
>
> Making mlocked pages reapable would only solve the most trivial reproducer
> of this. Unless the oom reaper can guarantee that it will never block and
> can free all memory that exit_mmap() can free, we need to ensure that a
> victim has a chance to reach the exit path on its own before killing every
> other process on the system.
>
> I'll fix the issue I identified with doing list_add_tail() rather than
> list_add(), fix up the commit message per Tetsuo to identify the other
> possible ways this can occur other than mlock, remove the rfc tag, and
> repost.

As I've already said, I will nack any timeout based solution until we
have addressed all the particular problems and still see more to come. Here
we have a clear goal: address mlocked pages and identify mmu notifier
offenders.
--
Michal Hocko
SUSE Labs

2018-06-04 05:52:35

by kernel test robot

Subject: [lkp-robot] [mm, oom] 2d251ff6e6: BUG:unable_to_handle_kernel


FYI, we noticed the following commit (built with gcc-7):

commit: 2d251ff6e66d7978b3e7a9c69e99b7150de26926 ("mm, oom: fix unnecessary killing of additional processes")
url: https://github.com/0day-ci/linux/commits/David-Rientjes/mm-oom-fix-unnecessary-killing-of-additional-processes/20180527-033815
base: git://git.cmpxchg.org/linux-mmotm.git master

in testcase: boot

on test machine: qemu-system-i386 -enable-kvm -cpu Haswell,+smep,+smap -m 360M

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+------------------------------------------------------------------+------------+------------+
|                                                                  | 0b018d19da | 2d251ff6e6 |
+------------------------------------------------------------------+------------+------------+
| boot_successes                                                   | 4          | 0          |
| boot_failures                                                    | 4          | 4          |
| invoked_oom-killer:gfp_mask=0x                                   | 4          | 2          |
| Mem-Info                                                         | 4          | 4          |
| Out_of_memory:Kill_process                                       | 4          | 4          |
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 4          |            |
| BUG:unable_to_handle_kernel                                      | 0          | 4          |
| Oops:#[##]                                                       | 0          | 4          |
| EIP:oom_reaper                                                   | 0          | 4          |
| Kernel_panic-not_syncing:Fatal_exception                         | 0          | 4          |
+------------------------------------------------------------------+------------+------------+



[ 11.063029] BUG: unable to handle kernel NULL pointer dereference at 00000204
[ 11.064104] *pde = 00000000
[ 11.064548] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 11.065328] CPU: 0 PID: 21 Comm: oom_reaper Not tainted 4.17.0-rc5-mm1-00218-g2d251ff #1
[ 11.066537] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 11.067786] EIP: oom_reaper+0x115/0x296
[ 11.068374] Code: 01 00 00 bb 54 92 65 7e c7 05 ac a8 65 7e 00 02 00 00 b8 80 a8 65 7e e8 9c a2 63 00 85 db 0f 84 fa fe ff ff 8b 83 ac 04 00 00 <8b> b0 04 02 00 00 8b 86 c8 02 00 00 0f ba e0 15 0f 82 55 01 00 00
[ 11.071210] EAX: 00000000 EBX: 7e659254 ECX: 00000001 EDX: 00000000
[ 11.072142] ESI: 78090960 EDI: 7816b500 EBP: 78167f8c ESP: 78167f68
[ 11.073073] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010202
[ 11.074075] CR0: 80050033 CR2: 00000204 CR3: 033d7000 CR4: 00040690
[ 11.074994] Call Trace:
[ 11.075380] ? wait_woken+0x75/0x75
[ 11.075911] ? kthread+0xef/0xf4
[ 11.076413] ? __oom_reap_task_mm+0x6f/0x6f
[ 11.077036] ? kthread_create_on_node+0x1a/0x1a
[ 11.077718] ? ret_from_fork+0x19/0x24
[ 11.078286] Modules linked in:
[ 11.078747] CR2: 0000000000000204
[ 11.079253] ---[ end trace 881b7ebfce401a98 ]---


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
Xiaolong


Attachments:
config-4.17.0-rc5-mm1-00218-g2d251ff (118.58 kB)
job-script (4.03 kB)
dmesg.xz (14.74 kB)

2018-06-05 04:26:22

by David Rientjes

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On Fri, 1 Jun 2018, Michal Hocko wrote:

> > We've discussed the mm
> > having a single blockable mmu notifier. Regardless of how we arrive at
> > the point where the oom reaper can't free memory, which could be any of
> > those three cases, if (1) the original victim is sufficiently large that
> > follow-up oom kills would become unnecessary and (2) other threads
> > allocate/charge before the oom victim reaches exit_mmap(), this occurs.
> >
> > We have examples of cases where oom reaping was successful, but the rss
> > numbers in the kernel log are very similar to when it was oom killed and
> > the process is known not to mlock, the reason is because the oom reaper
> > could free very little memory due to blockable mmu notifiers.
>
> Please be more specific. Which notifiers these were. Blockable notifiers
> are a PITA and we should be addressing them. That requiers identifying
> them first.
>

The most common offender seems to be ib_umem_notifier, and I have also
heard of possible occurrences with mv_invl_range_start() for xen, though
that would need more investigation. The rather new invalidate_range
callback for hmm mirroring could also be problematic. Any mmu_notifier
without MMU_INVALIDATE_DOES_NOT_BLOCK causes the mm to immediately be
disregarded. For this reason, we often see testing harnesses oom killed
immediately after running a unittest that stresses reclaim or compaction
by inducing a system-wide oom condition. The harness spawns the unittest,
which spawns an antagonist memory hog that is intended to be oom killed.
When memory is mlocked or there are a large number of threads faulting
memory for the antagonist, the unittest and the harness itself get oom
killed because the oom reaper sets MMF_OOM_SKIP; this ends up happening a
lot on powerpc. The memory hog has mm->mmap_sem readers queued ahead of a
writer that is doing mmap() so the oom reaper can't grab the sem quickly
enough.
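
For context, the early bail-out being described amounts to roughly the
following check in the reaper (a paraphrase rather than a hunk from the posted
diff; mm_has_blockable_invalidate_notifiers() is the helper name assumed here,
keyed off MMU_INVALIDATE_DOES_NOT_BLOCK):

	/* A single registered notifier that might block makes the whole mm
	 * unreapable, so the reaper backs out without freeing anything. */
	if (mm_has_blockable_invalidate_notifiers(mm)) {
		up_read(&mm->mmap_sem);
		goto unlock_oom;	/* the mm soon gets MMF_OOM_SKIP anyway */
	}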

I agree that blockable mmu notifiers are a pain, but until such time as
all can implicitly be MMU_INVALIDATE_DOES_NOT_BLOCK, the oom reaper can
free all mlocked memory, and the oom reaper waits long enough to grab
mm->mmap_sem for stalled mm->mmap_sem readers, we need a solution that
won't oom kill everything running on the system. I have doubts we'll ever
reach a point where the oom reaper can do the equivalent of exit_mmap(),
but it's possible to help solve the immediate issue of all oom kills
killing many innocent processes while working in a direction to make oom
reaping more successful at freeing memory.

> > The current implementation is a timeout based solution for mmap_sem, it
> > just has the oom reaper spinning trying to grab the sem and eventually
> > gives up. This patch allows it to currently work on other mm's and
> > detects the timeout in a different way, with jiffies instead of an
> > iterator.
>
> And I argue that anything timeout based is just broken by design. Trying
> n times will at least give you a consistent behavior.

It's not consistent; we see wildly inconsistent results, especially on
power, because it depends on the number of readers of mm->mmap_sem queued
ahead of a writer before the thread doing mmap() can grab it, drop it, and
allow the oom reaper to grab it for read. It's so inconsistent that we can
see the oom reaper successfully grab the sem for an oom killed memory hog
with 128 faulting threads, and see it fail with 4 faulting threads.

> Retrying on mmap
> sem makes sense because the lock might be taken for a short time.

It isn't a function of how long mmap_sem is held for write; it's a
function of how many readers are ahead of the queued writer. We don't run
with thp defrag set to "always" under standard configurations, but users
of MADV_HUGEPAGE or configs where defrag is set to "always" can
consistently cause any number of additional processes to be oom killed
unnecessarily because the readers are performing compaction and the writer
is queued behind them.

> > I'd love a solution where we can reliably detect an oom livelock and oom
> > kill additional processes but only after the original victim has had a
> > chance to do exit_mmap() without a timeout, but I don't see one being
> > offered. Given Tetsuo has seen issues with this in the past and suggested
> > a similar proposal means we are not the only ones feeling pain from this.
>
> Tetsuo is doing an artificial stress test which doesn't resemble any
> reasonable workload.

Tetsuo's test cases caught the CVE on powerpc that could trivially
panic the system if it was configured to panic on any oops, and it required
a security fix because it was easy to trigger for any user doing a large
mlock. His test case here is trivial to reproduce on powerpc and causes
several additional processes to be oom killed. It's not artificial: I see
many test harnesses killed *nightly* because a memory hog is faulting with
a large number of threads while two or three other threads are doing
mmap(). No mlock.

> > Making mlocked pages reapable would only solve the most trivial reproducer
> > of this. Unless the oom reaper can guarantee that it will never block and
> > can free all memory that exit_mmap() can free, we need to ensure that a
> > victim has a chance to reach the exit path on its own before killing every
> > other process on the system.
> >
> > I'll fix the issue I identified with doing list_add_tail() rather than
> > list_add(), fix up the commit message per Tetsuo to identify the other
> > possible ways this can occur other than mlock, remove the rfc tag, and
> > repost.
>
> As I've already said. I will nack any timeout based solution until we
> address all particular problems and still see more to come. Here we have
> a clear goal. Address mlocked pages and identify mmu notifier offenders.

I cannot fix all mmu notifiers to not block; I can't fix configurations
that allow direct compaction for thp allocations with a large number of
concurrent faulters; and I cannot fix userspace mlocking a lot of memory.
It's worthwhile to work in that direction, but these cases will never be
100% avoidable. We must have a solution that prevents innocent
processes from consistently being oom killed completely unnecessarily.

2018-06-05 08:58:47

by Michal Hocko

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On Mon 04-06-18 21:25:39, David Rientjes wrote:
> On Fri, 1 Jun 2018, Michal Hocko wrote:
>
> > > We've discussed the mm
> > > having a single blockable mmu notifier. Regardless of how we arrive at
> > > the point where the oom reaper can't free memory, which could be any of
> > > those three cases, if (1) the original victim is sufficiently large that
> > > follow-up oom kills would become unnecessary and (2) other threads
> > > allocate/charge before the oom victim reaches exit_mmap(), this occurs.
> > >
> > > We have examples of cases where oom reaping was successful, but the rss
> > > numbers in the kernel log are very similar to when it was oom killed and
> > > the process is known not to mlock, the reason is because the oom reaper
> > > could free very little memory due to blockable mmu notifiers.
> >
> > Please be more specific. Which notifiers these were. Blockable notifiers
> > are a PITA and we should be addressing them. That requiers identifying
> > them first.
> >
>
> The most common offender seems to be ib_umem_notifier, but I have also
> heard of possible occurrences for mv_invl_range_start() for xen, but that
> would need more investigation. The rather new invalidate_range callback
> for hmm mirroring could also be problematic. Any mmu_notifier without
> MMU_INVALIDATE_DOES_NOT_BLOCK causes the mm to immediately be disregarded.

Yes, this is unfortunate, and it was meant as a stop-gap quick fix with a
long term vision of being fixed properly. I am pretty sure that we can do
much better here: teach mmu_notifier_invalidate_range_start to take a
non-block flag and back out on ranges that would block. I am pretty sure
that notifiers can be targeted a lot and so we can still process at least
some vmas.
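
A hypothetical sketch of the interface change being suggested here
(illustrative only; neither the flag nor this function is an existing API in
this thread's timeframe):

	/*
	 * Hypothetical: the caller declares it cannot sleep, and each notifier
	 * either completes without blocking or refuses the range.  A nonzero
	 * return (e.g. -EAGAIN) would let the oom reaper skip just that range
	 * rather than giving up on the whole mm.
	 */
	int mmu_notifier_invalidate_range_start_nonblock(struct mm_struct *mm,
							 unsigned long start,
							 unsigned long end);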

> For this reason, we see testing harnesses often oom killed immediately
> after running a unittest that stresses reclaim or compaction by inducing a
> system-wide oom condition. The harness spawns the unittest which spawns
> an antagonist memory hog that is intended to be oom killed. When memory
> is mlocked or there are a large number of threads faulting memory for the
> antagonist, the unittest and the harness itself get oom killed because the
> oom reaper sets MMF_OOM_SKIP; this ends up happening a lot on powerpc.
> The memory hog has mm->mmap_sem readers queued ahead of a writer that is
> doing mmap() so the oom reaper can't grab the sem quickly enough.

How come the writer doesn't back off? mmap paths should be taking the
exclusive mmap sem in a killable sleep, so it should back off. Or is the
holder of the lock deep inside the mmap path doing something else and not
backing out with the exclusive lock held?
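
The back-off being referred to is the killable write-lock pattern on the
mmap() path, roughly (an illustrative sketch of the pattern, not a quote of a
specific call site):

	if (down_write_killable(&mm->mmap_sem))
		return -EINTR;	/* a pending SIGKILL aborts the wait */
	/* ... set up the new vma ... */
	up_write(&mm->mmap_sem);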

[...]

> > As I've already said. I will nack any timeout based solution until we
> > address all particular problems and still see more to come. Here we have
> > a clear goal. Address mlocked pages and identify mmu notifier offenders.
>
> I cannot fix all mmu notifiers to not block, I can't fix the configuration
> to allow direct compaction for thp allocations and a large number of
> concurrent faulters, and I cannot fix userspace mlocking a lot of memory.
> It's worthwhile to work in that direction, but it will never be 100%
> possible to avoid. We must have a solution that prevents innocent
> processes from consistently being oom killed completely unnecessarily.

None of the above has been attempted and shown to be not worth doing. An
oom event should be a rare thing, so I absolutely do not see any reason to
rush a misdesigned fix in right now.

--
Michal Hocko
SUSE Labs

2018-06-13 13:22:16

by Tetsuo Handa

Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On 2018/06/05 17:57, Michal Hocko wrote:
>> For this reason, we see testing harnesses often oom killed immediately
>> after running a unittest that stresses reclaim or compaction by inducing a
>> system-wide oom condition. The harness spawns the unittest which spawns
>> an antagonist memory hog that is intended to be oom killed. When memory
>> is mlocked or there are a large number of threads faulting memory for the
>> antagonist, the unittest and the harness itself get oom killed because the
>> oom reaper sets MMF_OOM_SKIP; this ends up happening a lot on powerpc.
>> The memory hog has mm->mmap_sem readers queued ahead of a writer that is
>> doing mmap() so the oom reaper can't grab the sem quickly enough.
>
> How come the writer doesn't back off. mmap paths should be taking an
> exclusive mmap sem in killable sleep so it should back off. Or is the
> holder of the lock deep inside mmap path doing something else and not
> backing out with the exclusive lock held?
>

Here is an example where the writer doesn't back off.

http://lkml.kernel.org/r/[email protected]

down_write_killable(&mm->mmap_sem) does nothing but increase the
possibility of successfully backing off. There is no guarantee that the
owner of that exclusive mmap_sem will not be blocked by other,
unkillable waits.
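
For reference, the back-off being discussed is the killable write lock
taken on the mmap() entry paths, roughly (an illustrative sketch, not a
verbatim quote of the kernel source):

	if (down_write_killable(&mm->mmap_sem))
		return -EINTR;	/* fatal signal pending: back off */

	/*
	 * This only helps tasks still *waiting* for the lock. The current
	 * holder of mmap_sem for write can itself be stuck in an
	 * unkillable wait, in which case nobody backs off.
	 */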

2018-06-13 13:31:29

by Michal Hocko

[permalink] [raw]
Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

On Wed 13-06-18 22:20:49, Tetsuo Handa wrote:
> On 2018/06/05 17:57, Michal Hocko wrote:
> >> For this reason, we see testing harnesses often oom killed immediately
> >> after running a unittest that stresses reclaim or compaction by inducing a
> >> system-wide oom condition. The harness spawns the unittest which spawns
> >> an antagonist memory hog that is intended to be oom killed. When memory
> >> is mlocked or there are a large number of threads faulting memory for the
> >> antagonist, the unittest and the harness itself get oom killed because the
> >> oom reaper sets MMF_OOM_SKIP; this ends up happening a lot on powerpc.
> >> The memory hog has mm->mmap_sem readers queued ahead of a writer that is
> >> doing mmap() so the oom reaper can't grab the sem quickly enough.
> >
> > How come the writer doesn't back off. mmap paths should be taking an
> > exclusive mmap sem in killable sleep so it should back off. Or is the
> > holder of the lock deep inside mmap path doing something else and not
> > backing out with the exclusive lock held?
> >
>
> Here is an example where the writer doesn't back off.
>
> http://lkml.kernel.org/r/[email protected]
>
> down_write_killable(&mm->mmap_sem) is nothing but increasing the possibility of
> successfully back off. There is no guarantee that the owner of that exclusive
> mmap sem will not be blocked by other unkillable waits.

But we are talking about the mmap() path here. Sure, there are other
paths which might need to back off while the lock is held, and those
should be addressed if possible, but that is not really related to what
David wrote above and what I was trying to understand.

--
Michal Hocko
SUSE Labs

2018-06-14 20:43:52

by David Rientjes

[permalink] [raw]
Subject: [patch] mm, oom: fix unnecessary killing of additional processes

The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
it cannot reap an mm. This can happen for a variety of reasons,
including:

- the inability to grab mm->mmap_sem in a sufficient amount of time,

- when the mm has blockable mmu notifiers that could cause the oom reaper
to stall indefinitely,

but we can also add a third when the oom reaper can "reap" an mm but doing
so is unlikely to free any amount of memory:

- when the mm's memory is fully mlocked.

When all memory is mlocked, the oom reaper will not be able to free any
substantial amount of memory. It sets MMF_OOM_SKIP before the victim can
unmap and free its memory in exit_mmap() and subsequent oom victims are
chosen unnecessarily. This is trivial to reproduce if all eligible
processes on the system have mlocked their memory: the oom killer calls
panic() even though forward progress can be made.

This is the same issue where the exit path sets MMF_OOM_SKIP before
unmapping memory and additional processes can be chosen unnecessarily
because the oom killer is racing with exit_mmap().

We can't simply defer setting MMF_OOM_SKIP, however, because if there is
a true oom livelock in progress, it never gets set and no additional
killing is possible.

To fix this, this patch introduces a per-mm reaping timeout, initially set
at 10s. It requires that the oom reaper's list becomes a properly linked
list so that other mm's may be reaped while waiting for an mm's timeout to
expire.

This replaces the current timeouts in the oom reaper: (1) trying to
grab mm->mmap_sem 10 times in a row with HZ/10 sleeps in between and (2)
a HZ sleep if there are blockable mmu notifiers. It extends them with a
timeout that allows an oom victim to reach exit_mmap() before additional
processes are chosen unnecessarily.

The exit path will now set MMF_OOM_SKIP only after all memory has been
freed, so additional oom killing is justified, and rely on MMF_UNSTABLE to
determine when it can race with the oom reaper.

The oom reaper will now set MMF_OOM_SKIP only after the reap timeout has
lapsed because it can no longer guarantee forward progress.

The reaping timeout is intentionally set for a substantial amount of time
since oom livelock is a very rare occurrence and it's better to optimize
for preventing additional (unnecessary) oom killing than a scenario that
is much more unlikely.

Signed-off-by: David Rientjes <[email protected]>
---
Note: I understand there is an objection based on timeout-based delays.
This is currently the only possible way to avoid oom killing important
processes completely unnecessarily. If the oom reaper can someday free
all memory, including mlocked memory and those mm's with blockable mmu
notifiers, and is guaranteed to always be able to grab mm->mmap_sem,
this can be removed. I do not believe any such guarantee is possible
and consider the massive, unnecessary killing of additional processes
to be a regression introduced by the oom reaper and its very quick
setting of MMF_OOM_SKIP, which allows additional processes to be oom
killed.

include/linux/mm_types.h | 4 ++
include/linux/sched.h | 2 +-
kernel/fork.c | 4 ++
mm/mmap.c | 12 ++---
mm/oom_kill.c | 112 ++++++++++++++++++++++-----------------
5 files changed, 79 insertions(+), 55 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -449,6 +449,10 @@ struct mm_struct {
#ifdef CONFIG_MMU_NOTIFIER
struct mmu_notifier_mm *mmu_notifier_mm;
#endif
+#ifdef CONFIG_MMU
+ /* When to give up on oom reaping this mm */
+ unsigned long reap_timeout;
+#endif
#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
pgtable_t pmd_huge_pte; /* protected by page_table_lock */
#endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1163,7 +1163,7 @@ struct task_struct {
#endif
int pagefault_disabled;
#ifdef CONFIG_MMU
- struct task_struct *oom_reaper_list;
+ struct list_head oom_reap_list;
#endif
#ifdef CONFIG_VMAP_STACK
struct vm_struct *stack_vm_area;
diff --git a/kernel/fork.c b/kernel/fork.c
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -835,6 +835,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
tsk->fail_nth = 0;
#endif

+#ifdef CONFIG_MMU
+ INIT_LIST_HEAD(&tsk->oom_reap_list);
+#endif
+
return tsk;

free_stack:
diff --git a/mm/mmap.c b/mm/mmap.c
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3059,11 +3059,10 @@ void exit_mmap(struct mm_struct *mm)
if (unlikely(mm_is_oom_victim(mm))) {
/*
* Manually reap the mm to free as much memory as possible.
- * Then, as the oom reaper does, set MMF_OOM_SKIP to disregard
- * this mm from further consideration. Taking mm->mmap_sem for
- * write after setting MMF_OOM_SKIP will guarantee that the oom
- * reaper will not run on this mm again after mmap_sem is
- * dropped.
+ * Then, set MMF_UNSTABLE to avoid racing with the oom reaper.
+ * Taking mm->mmap_sem for write after setting MMF_UNSTABLE will
+ * guarantee that the oom reaper will not run on this mm again
+ * after mmap_sem is dropped.
*
* Nothing can be holding mm->mmap_sem here and the above call
* to mmu_notifier_release(mm) ensures mmu notifier callbacks in
@@ -3077,7 +3076,7 @@ void exit_mmap(struct mm_struct *mm)
__oom_reap_task_mm(mm);
mutex_unlock(&oom_lock);

- set_bit(MMF_OOM_SKIP, &mm->flags);
+ set_bit(MMF_UNSTABLE, &mm->flags);
down_write(&mm->mmap_sem);
up_write(&mm->mmap_sem);
}
@@ -3105,6 +3104,7 @@ void exit_mmap(struct mm_struct *mm)
unmap_vmas(&tlb, vma, 0, -1);
free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
tlb_finish_mmu(&tlb, 0, -1);
+ set_bit(MMF_OOM_SKIP, &mm->flags);

/*
* Walk the list again, actually closing and freeing it,
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -476,7 +476,7 @@ bool process_shares_mm(struct task_struct *p, struct mm_struct *mm)
*/
static struct task_struct *oom_reaper_th;
static DECLARE_WAIT_QUEUE_HEAD(oom_reaper_wait);
-static struct task_struct *oom_reaper_list;
+static LIST_HEAD(oom_reaper_list);
static DEFINE_SPINLOCK(oom_reaper_lock);

void __oom_reap_task_mm(struct mm_struct *mm)
@@ -519,10 +519,8 @@ void __oom_reap_task_mm(struct mm_struct *mm)
}
}

-static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
+static void oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
{
- bool ret = true;
-
/*
* We have to make sure to not race with the victim exit path
* and cause premature new oom victim selection:
@@ -540,9 +538,8 @@ static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
mutex_lock(&oom_lock);

if (!down_read_trylock(&mm->mmap_sem)) {
- ret = false;
trace_skip_task_reaping(tsk->pid);
- goto unlock_oom;
+ goto out_oom;
}

/*
@@ -551,69 +548,81 @@ static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
* TODO: we really want to get rid of this ugly hack and make sure that
* notifiers cannot block for unbounded amount of time
*/
- if (mm_has_blockable_invalidate_notifiers(mm)) {
- up_read(&mm->mmap_sem);
- schedule_timeout_idle(HZ);
- goto unlock_oom;
- }
+ if (mm_has_blockable_invalidate_notifiers(mm))
+ goto out_mm;

/*
- * MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
- * work on the mm anymore. The check for MMF_OOM_SKIP must run
+ * MMF_UNSTABLE is set by exit_mmap when the OOM reaper can't
+ * work on the mm anymore. The check for MMF_UNSTABLE must run
* under mmap_sem for reading because it serializes against the
* down_write();up_write() cycle in exit_mmap().
*/
- if (test_bit(MMF_OOM_SKIP, &mm->flags)) {
- up_read(&mm->mmap_sem);
+ if (test_bit(MMF_UNSTABLE, &mm->flags)) {
trace_skip_task_reaping(tsk->pid);
- goto unlock_oom;
+ goto out_mm;
}

trace_start_task_reaping(tsk->pid);
-
__oom_reap_task_mm(mm);
+ trace_finish_task_reaping(tsk->pid);

pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
task_pid_nr(tsk), tsk->comm,
K(get_mm_counter(mm, MM_ANONPAGES)),
K(get_mm_counter(mm, MM_FILEPAGES)),
K(get_mm_counter(mm, MM_SHMEMPAGES)));
+out_mm:
up_read(&mm->mmap_sem);
-
- trace_finish_task_reaping(tsk->pid);
-unlock_oom:
+out_oom:
mutex_unlock(&oom_lock);
- return ret;
}

-#define MAX_OOM_REAP_RETRIES 10
static void oom_reap_task(struct task_struct *tsk)
{
- int attempts = 0;
struct mm_struct *mm = tsk->signal->oom_mm;

- /* Retry the down_read_trylock(mmap_sem) a few times */
- while (attempts++ < MAX_OOM_REAP_RETRIES && !oom_reap_task_mm(tsk, mm))
- schedule_timeout_idle(HZ/10);
+ /*
+ * If this mm has either been fully unmapped, or the oom reaper has
+ * given up on it, nothing left to do except drop the refcount.
+ */
+ if (test_bit(MMF_OOM_SKIP, &mm->flags))
+ goto drop;

- if (attempts <= MAX_OOM_REAP_RETRIES ||
- test_bit(MMF_OOM_SKIP, &mm->flags))
- goto done;
+ /*
+ * If this mm has already been reaped, doing so again will not likely
+ * free additional memory.
+ */
+ if (!test_bit(MMF_UNSTABLE, &mm->flags))
+ oom_reap_task_mm(tsk, mm);

- pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
- task_pid_nr(tsk), tsk->comm);
- debug_show_all_locks();
+ if (time_after_eq(jiffies, mm->reap_timeout)) {
+ if (!test_bit(MMF_OOM_SKIP, &mm->flags)) {
+ pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
+ task_pid_nr(tsk), tsk->comm);
+ debug_show_all_locks();

-done:
- tsk->oom_reaper_list = NULL;
+ /*
+ * Reaping has failed for the timeout period, so give up
+ * and allow additional processes to be oom killed.
+ */
+ set_bit(MMF_OOM_SKIP, &mm->flags);
+ }
+ goto drop;
+ }

- /*
- * Hide this mm from OOM killer because it has been either reaped or
- * somebody can't call up_write(mmap_sem).
- */
- set_bit(MMF_OOM_SKIP, &mm->flags);
+ if (test_bit(MMF_OOM_SKIP, &mm->flags))
+ goto drop;

- /* Drop a reference taken by wake_oom_reaper */
+ /* Enqueue to be reaped again */
+ spin_lock(&oom_reaper_lock);
+ list_add_tail(&tsk->oom_reap_list, &oom_reaper_list);
+ spin_unlock(&oom_reaper_lock);
+
+ schedule_timeout_idle(HZ/10);
+ return;
+
+drop:
+ /* Drop the reference taken by wake_oom_reaper */
put_task_struct(tsk);
}

@@ -622,11 +631,13 @@ static int oom_reaper(void *unused)
while (true) {
struct task_struct *tsk = NULL;

- wait_event_freezable(oom_reaper_wait, oom_reaper_list != NULL);
+ wait_event_freezable(oom_reaper_wait,
+ !list_empty(&oom_reaper_list));
spin_lock(&oom_reaper_lock);
- if (oom_reaper_list != NULL) {
- tsk = oom_reaper_list;
- oom_reaper_list = tsk->oom_reaper_list;
+ if (!list_empty(&oom_reaper_list)) {
+ tsk = list_entry(oom_reaper_list.next,
+ struct task_struct, oom_reap_list);
+ list_del(&tsk->oom_reap_list);
}
spin_unlock(&oom_reaper_lock);

@@ -637,17 +648,22 @@ static int oom_reaper(void *unused)
return 0;
}

+/* How long to wait to oom reap an mm before selecting another process */
+#define OOM_REAP_TIMEOUT_MSECS (10 * 1000)
static void wake_oom_reaper(struct task_struct *tsk)
{
- /* tsk is already queued? */
- if (tsk == oom_reaper_list || tsk->oom_reaper_list)
+ /*
+ * Set the reap timeout; if it's already set, the mm is enqueued and
+ * this tsk can be ignored.
+ */
+ if (cmpxchg(&tsk->signal->oom_mm->reap_timeout, 0UL,
+ jiffies + msecs_to_jiffies(OOM_REAP_TIMEOUT_MSECS)))
return;

get_task_struct(tsk);

spin_lock(&oom_reaper_lock);
- tsk->oom_reaper_list = oom_reaper_list;
- oom_reaper_list = tsk;
+ list_add(&tsk->oom_reap_list, &oom_reaper_list);
spin_unlock(&oom_reaper_lock);
trace_wake_reaper(tsk->pid);
wake_up(&oom_reaper_wait);

2018-06-15 06:56:25

by Michal Hocko

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Thu 14-06-18 13:42:59, David Rientjes wrote:
> The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
> it cannot reap an mm. This can happen for a variety of reasons,
> including:
>
> - the inability to grab mm->mmap_sem in a sufficient amount of time,
>
> - when the mm has blockable mmu notifiers that could cause the oom reaper
> to stall indefinitely,
>
> but we can also add a third when the oom reaper can "reap" an mm but doing
> so is unlikely to free any amount of memory:
>
> - when the mm's memory is fully mlocked.
>
> When all memory is mlocked, the oom reaper will not be able to free any
> substantial amount of memory. It sets MMF_OOM_SKIP before the victim can
> unmap and free its memory in exit_mmap() and subsequent oom victims are
> chosen unnecessarily. This is trivial to reproduce if all eligible
> processes on the system have mlocked their memory: the oom killer calls
> panic() even though forward progress can be made.
>
> This is the same issue where the exit path sets MMF_OOM_SKIP before
> unmapping memory and additional processes can be chosen unnecessarily
> because the oom killer is racing with exit_mmap().
>
> We can't simply defer setting MMF_OOM_SKIP, however, because if there is
> a true oom livelock in progress, it never gets set and no additional
> killing is possible.
>
> To fix this, this patch introduces a per-mm reaping timeout, initially set
> at 10s. It requires that the oom reaper's list becomes a properly linked
> list so that other mm's may be reaped while waiting for an mm's timeout to
> expire.
>
> This replaces the current timeouts in the oom reaper: (1) when trying to
> grab mm->mmap_sem 10 times in a row with HZ/10 sleeps in between and (2)
> a HZ sleep if there are blockable mmu notifiers. It extends it with
> timeout to allow an oom victim to reach exit_mmap() before choosing
> additional processes unnecessarily.
>
> The exit path will now set MMF_OOM_SKIP only after all memory has been
> freed, so additional oom killing is justified, and rely on MMF_UNSTABLE to
> determine when it can race with the oom reaper.
>
> The oom reaper will now set MMF_OOM_SKIP only after the reap timeout has
> lapsed because it can no longer guarantee forward progress.
>
> The reaping timeout is intentionally set for a substantial amount of time
> since oom livelock is a very rare occurrence and it's better to optimize
> for preventing additional (unnecessary) oom killing than a scenario that
> is much more unlikely.
>
> Signed-off-by: David Rientjes <[email protected]>

Nacked-by: Michal Hocko <[email protected]>
as already explained elsewhere in this email thread.

> ---
> Note: I understand there is an objection based on timeout based delays.
> This is currently the only possible way to avoid oom killing important
> processes completely unnecessarily. If the oom reaper can someday free
> all memory, including mlocked memory and those mm's with blockable mmu
> notifiers, and is guaranteed to always be able to grab mm->mmap_sem,
> this can be removed. I do not believe any such guarantee is possible
> and consider the massive killing of additional processes unnecessarily
> to be a regression introduced by the oom reaper and its very quick
> setting of MMF_OOM_SKIP to allow additional processes to be oom killed.

If you find the oom reaper more harmful than useful, I would be willing
to ack a command line option to disable it, especially since you keep
claiming that the lockups are not really happening in your environment.

Other than that, I've already pointed to a more robust solution. If you
are reluctant to try it out, I will, but introducing a timeout is just
papering over the real problem. Maybe we will never reach the state
where _all_ the memory is reapable, but we definitely should try to make
as much of it reapable as possible, and I do not see any fundamental
problems in that direction.
--
Michal Hocko
SUSE Labs

2018-06-15 23:16:24

by David Rientjes

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Fri, 15 Jun 2018, Michal Hocko wrote:

> > Signed-off-by: David Rientjes <[email protected]>
>
> Nacked-by: Michal Hocko <[email protected]>
> as already explained elsewhere in this email thread.
>

I don't find this to be surprising, but I'm not sure that it actually
matters if you won't fix a regression that you introduced. Tetsuo
initially found this issue and presented a similar solution, so I think
his feedback on this is more important since it would fix a problem for
him as well.

> > ---
> > Note: I understand there is an objection based on timeout based delays.
> > This is currently the only possible way to avoid oom killing important
> > processes completely unnecessarily. If the oom reaper can someday free
> > all memory, including mlocked memory and those mm's with blockable mmu
> > notifiers, and is guaranteed to always be able to grab mm->mmap_sem,
> > this can be removed. I do not believe any such guarantee is possible
> > and consider the massive killing of additional processes unnecessarily
> > to be a regression introduced by the oom reaper and its very quick
> > setting of MMF_OOM_SKIP to allow additional processes to be oom killed.
>
> If you find oom reaper more harmful than useful I would be willing to
> ack a comman line option to disable it. Especially when you keep
> claiming that the lockups are not really happening in your environment.
>

There's no need to disable it; we simply need to ensure that it doesn't
set MMF_OOM_SKIP too early, which my patch does. We also need to avoid
setting MMF_OOM_SKIP in exit_mmap() until after all memory has been
freed, i.e. after free_pgtables().

I'd be happy to make this timeout configurable, however, and default it
to perhaps one second, as the blockable mmu notifier timeout in your own
code does. I find it somewhat sad that we'd need a sysctl for this, but
if that will appease you and help move this into -mm, then we can do
that.

> Other than that I've already pointed to a more robust solution. If you
> are reluctant to try it out I will do, but introducing a timeout is just
> papering over the real problem. Maybe we will not reach the state that
> _all_ the memory is reapable but we definitely should try to make as
> much as possible to be reapable and I do not see any fundamental
> problems in that direction.

You introduced the timeout already; I'm sure you realized yourself that
the oom reaper sets MMF_OOM_SKIP much too early. Trying to grab
mm->mmap_sem 10 times in a row with HZ/10 sleeps in between is a
timeout. If there are blockable mmu notifiers, your code puts the oom
reaper to sleep for HZ before setting MMF_OOM_SKIP, which is a timeout.
This patch moves the timeout to reaching exit_mmap(), where we actually
free all the memory possible, while still allowing additional oom
killing if there is a very rare oom livelock.

You haven't provided any data that suggests oom livelocking isn't a
very rare event, or that we need to respond immediately by randomly
killing more and more processes rather than waiting a bounded period of
time to allow forward progress to be made. I have consistently provided
data showing that oom livelock in our fleet is extremely rare, less than
0.04% of the time. Yet your solution is to kill many processes so that
this 0.04% is handled fast.

The reproducer on powerpc is very simple. Do an mmap() and mlock() the
length. Fork one 120MB process that does that and two 60MB processes that
do that in a 128MB memcg.

[ 402.064375] Killed process 17024 (a.out) total-vm:134080kB, anon-rss:122032kB, file-rss:1600kB
[ 402.107521] Killed process 17026 (a.out) total-vm:64448kB, anon-rss:44736kB, file-rss:1600kB

Completely reproducible and completely unnecessary. Killing two processes
pointlessly when the first oom kill would have been successful.
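
For concreteness, an antagonist along those lines can be as small as the
sketch below (the size is passed on the command line and the 128MB memcg
is set up from the shell; both are illustrative assumptions rather than
the exact test we run):

	#include <stdlib.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int main(int argc, char **argv)
	{
		/* default to the 120MB case described above */
		size_t len = (size_t)(argc > 1 ? atoi(argv[1]) : 120) << 20;
		char *p;

		p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED || mlock(p, len))
			return 1;
		memset(p, 1, len);	/* fault everything in while mlocked */
		pause();		/* sit here until oom killed */
		return 0;
	}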

Killing processes is important; optimizing for the 0.04% of cases of
true oom livelock by insisting everybody tolerate excessive oom killing
is not. If you have data to suggest the 0.04% is higher, please present
it. I'd be interested in any data you have that suggests it's higher,
even at 1/1,000,000th of the oom occurrence rate that I have shown.

It's inappropriate to merge code that oom kills many processes
unnecessarily when one happens to be mlocked or have blockable mmu
notifiers, or when mm->mmap_sem can't be grabbed fast enough but forward
progress is actually being made. It's a regression, and it impacts real
users. Insisting that we fix the problem you introduced by making all
mmu notifiers unblockable, ensuring mlocked memory can always be reaped,
and guaranteeing mm->mmap_sem can always be grabbed within a second is
irresponsible.

2018-06-19 00:29:01

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Thu, 14 Jun 2018 13:42:59 -0700 (PDT) David Rientjes <[email protected]> wrote:

> The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
> it cannot reap an mm. This can happen for a variety of reasons,
> including:
>
> - the inability to grab mm->mmap_sem in a sufficient amount of time,
>
> - when the mm has blockable mmu notifiers that could cause the oom reaper
> to stall indefinitely,

Maybe we should have more than one oom reaper thread? I assume the
probability of the oom reaper thread blocking on an mmu notifier is
small, so perhaps just dive in and hope for the best. If the oom
reaper gets stuck then there's another thread ready to take over. And
revisit the decision to use a kernel thread instead of workqueues.

> but we can also add a third when the oom reaper can "reap" an mm but doing
> so is unlikely to free any amount of memory:
>
> - when the mm's memory is fully mlocked.
>
> When all memory is mlocked, the oom reaper will not be able to free any
> substantial amount of memory. It sets MMF_OOM_SKIP before the victim can
> unmap and free its memory in exit_mmap() and subsequent oom victims are
> chosen unnecessarily. This is trivial to reproduce if all eligible
> processes on the system have mlocked their memory: the oom killer calls
> panic() even though forward progress can be made.
>
> This is the same issue where the exit path sets MMF_OOM_SKIP before
> unmapping memory and additional processes can be chosen unnecessarily
> because the oom killer is racing with exit_mmap().

So what's actually happening here? A process has a large amount of
mlocked memory, it has been oom-killed, and it is in the process of
releasing its memory and exiting, yes?

If so, why does this task set MMF_OOM_SKIP on itself? Why aren't we
just patiently waiting for its attempt to release memory?

> We can't simply defer setting MMF_OOM_SKIP, however, because if there is
> a true oom livelock in progress, it never gets set and no additional
> killing is possible.

I guess that's my answer. What causes this livelock? Process looping
in alloc_pages while holding a lock the oom victim wants?

> To fix this, this patch introduces a per-mm reaping timeout, initially set
> at 10s. It requires that the oom reaper's list becomes a properly linked
> list so that other mm's may be reaped while waiting for an mm's timeout to
> expire.
>
> This replaces the current timeouts in the oom reaper: (1) when trying to
> grab mm->mmap_sem 10 times in a row with HZ/10 sleeps in between and (2)
> a HZ sleep if there are blockable mmu notifiers. It extends it with
> timeout to allow an oom victim to reach exit_mmap() before choosing
> additional processes unnecessarily.
>
> The exit path will now set MMF_OOM_SKIP only after all memory has been
> freed, so additional oom killing is justified,

That seems sensible, but why set MMF_OOM_SKIP at all?

> and rely on MMF_UNSTABLE to
> determine when it can race with the oom reaper.
>
> The oom reaper will now set MMF_OOM_SKIP only after the reap timeout has
> lapsed because it can no longer guarantee forward progress.
>
> The reaping timeout is intentionally set for a substantial amount of time
> since oom livelock is a very rare occurrence and it's better to optimize
> for preventing additional (unnecessary) oom killing than a scenario that
> is much more unlikely.

What happened to the old idea of permitting the task which is blocking
the oom victim to access additional reserves?

Come to that, what happened to the really, really old Andrea idea of
not looping in the page allocator anyway? Return NULL instead...

I dunno, I'm thrashing around here. We seem to be piling mess on top
of mess and then being surprised that the result is a mess.

> +#ifdef CONFIG_MMU
> + /* When to give up on oom reaping this mm */
> + unsigned long reap_timeout;

"timeout" implies "interval". To me, anyway. This is an absolute
time, so something like reap_time would be clearer. Along with a
comment explaining that the units are in jiffies.

> +#endif
> #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
> pgtable_t pmd_huge_pte; /* protected by page_table_lock */
> #endif
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1163,7 +1163,7 @@ struct task_struct {
> #endif
> int pagefault_disabled;
> #ifdef CONFIG_MMU
> - struct task_struct *oom_reaper_list;
> + struct list_head oom_reap_list;

Can we have a comment explaining its locking.

> #endif
> #ifdef CONFIG_VMAP_STACK
> struct vm_struct *stack_vm_area;
>
> ...
>
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -3059,11 +3059,10 @@ void exit_mmap(struct mm_struct *mm)
> if (unlikely(mm_is_oom_victim(mm))) {
> /*
> * Manually reap the mm to free as much memory as possible.
> - * Then, as the oom reaper does, set MMF_OOM_SKIP to disregard
> - * this mm from further consideration. Taking mm->mmap_sem for
> - * write after setting MMF_OOM_SKIP will guarantee that the oom
> - * reaper will not run on this mm again after mmap_sem is
> - * dropped.
> + * Then, set MMF_UNSTABLE to avoid racing with the oom reaper.
> + * Taking mm->mmap_sem for write after setting MMF_UNSTABLE will
> + * guarantee that the oom reaper will not run on this mm again
> + * after mmap_sem is dropped.

Comment should explain *why* we don't want the reaper to run on this mm
again.

>
> ...
>

2018-06-19 08:35:08

by Michal Hocko

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Fri 15-06-18 16:15:39, David Rientjes wrote:
[...]
> I'd be happy to make the this timeout configurable, however, and default
> it to perhaps one second as the blockable mmu notifier timeout in your own
> code does. I find it somewhat sad that we'd need a sysctl for this, but
> if that will appease you and it will help to move this into -mm then we
> can do that.

No. This has been nacked in the past and I do not see anything
different from back then.

> > Other than that I've already pointed to a more robust solution. If you
> > are reluctant to try it out I will do, but introducing a timeout is just
> > papering over the real problem. Maybe we will not reach the state that
> > _all_ the memory is reapable but we definitely should try to make as
> > much as possible to be reapable and I do not see any fundamental
> > problems in that direction.
>
> You introduced the timeout already, I'm sure you realized yourself that
> the oom reaper sets MMF_OOM_SKIP much too early. Trying to grab
> mm->mmap_sem 10 times in a row with HZ/10 sleeps in between is a timeout.

Yes, it is. And it is a timeout based on some feedback: the lock is
held, so let's retry later, but do not retry forever. We can do the same
with blockable mmu notifiers; we are currently giving up right away. I
was proposing to add a can_sleep parameter to
mmu_notifier_invalidate_range_start() and have it return EAGAIN if it
would block. This would allow us to simply retry on EAGAIN like we do
for the mmap_sem.
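
To give an idea of the caller side (a hypothetical sketch only, assuming
a non-blocking helper along the lines described above; the names here
are not from any posted patch): the oom reaper would treat a blocking
notifier the same way it treats a failed down_read_trylock() and simply
retry the mm later.

	static bool oom_reap_vma(struct mm_struct *mm, struct vm_area_struct *vma)
	{
		unsigned long start = vma->vm_start, end = vma->vm_end;
		struct mmu_gather tlb;

		/* returns -EAGAIN instead of sleeping in a notifier */
		if (mmu_notifier_invalidate_range_start_nonblock(mm, start, end))
			return false;	/* caller re-queues the mm and retries */

		tlb_gather_mmu(&tlb, mm, start, end);
		unmap_page_range(&tlb, vma, start, end, NULL);
		mmu_notifier_invalidate_range_end(mm, start, end);
		tlb_finish_mmu(&tlb, start, end);
		return true;
	}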

[...]

> The reproducer on powerpc is very simple. Do an mmap() and mlock() the
> length. Fork one 120MB process that does that and two 60MB processes that
> do that in a 128MB memcg.

And again, to solve this we just need to teach oom_reaper to handle
mlocked memory. There shouldn't be any fundamental reason why this would
be impossible AFAICS. Timeout is not a solution!

[...]

> It's inappropriate to merge code that oom kills many processes
> unnecessarily when one happens to be mlocked or have blockable mmu
> notifiers or when mm->mmap_sem can't be grabbed fast enough but forward
> progress is actually being made. It's a regression, and it impacts real
> users. Insisting that we fix the problem you introduced by making all mmu
> notifiers unblockable and mlocked memory can always be reaped and
> mm->mmap_sem can always be grabbed within a second is irresponsible.

Well, a lack of real-world bug reports doesn't really back your story
here. I have asked about non-artificial workloads suffering and your
responses were quite nonspecific, to say the least.

And I do insist on coming up with a reasonable solution rather than
random hacks. Jeez, the oom killer was full of these.

As I've said, if you are not willing to work on a proper solution, I
will, but my nack holds for this patch until we see that there is no
other way around existing, real-world problems.
--
Michal Hocko
SUSE Labs

2018-06-19 08:51:30

by Michal Hocko

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Mon 18-06-18 17:27:33, Andrew Morton wrote:
> On Thu, 14 Jun 2018 13:42:59 -0700 (PDT) David Rientjes <[email protected]> wrote:
>
> > The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
> > it cannot reap an mm. This can happen for a variety of reasons,
> > including:
> >
> > - the inability to grab mm->mmap_sem in a sufficient amount of time,
> >
> > - when the mm has blockable mmu notifiers that could cause the oom reaper
> > to stall indefinitely,
>
> Maybe we should have more than one oom reaper thread? I assume the
> probability of the oom reaper thread blocking on an mmu notifier is
> small, so perhaps just dive in and hope for the best. If the oom
> reaper gets stuck then there's another thread ready to take over. And
> revisit the decision to use a kernel thread instead of workqueues.

Well, I think that having more threads would be wasteful for a rare
event like oom. Creating one on demand could be tricky because we are
under strong memory pressure at the time and a new thread costs some
resources.

> > but we can also add a third when the oom reaper can "reap" an mm but doing
> > so is unlikely to free any amount of memory:
> >
> > - when the mm's memory is fully mlocked.
> >
> > When all memory is mlocked, the oom reaper will not be able to free any
> > substantial amount of memory. It sets MMF_OOM_SKIP before the victim can
> > unmap and free its memory in exit_mmap() and subsequent oom victims are
> > chosen unnecessarily. This is trivial to reproduce if all eligible
> > processes on the system have mlocked their memory: the oom killer calls
> > panic() even though forward progress can be made.
> >
> > This is the same issue where the exit path sets MMF_OOM_SKIP before
> > unmapping memory and additional processes can be chosen unnecessarily
> > because the oom killer is racing with exit_mmap().
>
> So what's actually happening here. A process has a large amount of
> mlocked memory, it has been oom-killed and it is in the process of
> releasing its memory and exiting, yes?
>
> If so, why does this task set MMF_OOM_SKIP on itself? Why aren't we
> just patiently waiting for its attempt to release meory?

Because the oom victim is not guaranteed to proceed to exit and release
its own memory. The OOM reaper jumps in and skips over mlocked ranges
because they require the page lock, and that again cannot be taken from
the oom reaper path (the lock might be held by someone doing an
allocation). This in turn means that the oom_reaper doesn't free mlocked
memory before it sets MMF_OOM_SKIP, which allows a new oom victim to be
selected. At the time we merged the oom reaper this wasn't seen as a
major issue because tasks usually do not consume a lot of mlocked memory
and there is always some other memory to tear down to help relieve the
memory pressure. mlockall() oom victims were deemed unlikely because
they need a large rlimit and as such should be trusted, and therefore
quite safe from runaways. But there was definitely a plan to make
mlocked memory reapable. So it is time to finally do it.
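
For reference, the skip mentioned above is the vma filter used by
__oom_reap_task_mm(); roughly (paraphrased from memory, not a verbatim
quote of the current source):

	static inline bool can_madv_dontneed_vma(struct vm_area_struct *vma)
	{
		/* mlocked (and hugetlb/pfnmap) ranges are left alone */
		return !(vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP));
	}

	/* ... and in __oom_reap_task_mm(): */
	for (vma = mm->mmap; vma; vma = vma->vm_next) {
		if (!can_madv_dontneed_vma(vma))
			continue;	/* a fully mlocked mm frees nothing */
		/* MADV_DONTNEED-style unmap of the remaining ranges */
	}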

> > We can't simply defer setting MMF_OOM_SKIP, however, because if there is
> > a true oom livelock in progress, it never gets set and no additional
> > killing is possible.
>
> I guess that's my answer. What causes this livelock? Process looping
> in alloc_pages while holding a lock the oom victim wants?

Yes.

> > To fix this, this patch introduces a per-mm reaping timeout, initially set
> > at 10s. It requires that the oom reaper's list becomes a properly linked
> > list so that other mm's may be reaped while waiting for an mm's timeout to
> > expire.
> >
> > This replaces the current timeouts in the oom reaper: (1) when trying to
> > grab mm->mmap_sem 10 times in a row with HZ/10 sleeps in between and (2)
> > a HZ sleep if there are blockable mmu notifiers. It extends it with
> > timeout to allow an oom victim to reach exit_mmap() before choosing
> > additional processes unnecessarily.
> >
> > The exit path will now set MMF_OOM_SKIP only after all memory has been
> > freed, so additional oom killing is justified,
>
> That seems sensible, but why set MMF_OOM_SKIP at all?

MMF_OOM_SKIP is a way to say that the task should be skipped during OOM
victim evaluation.

> > and rely on MMF_UNSTABLE to
> > determine when it can race with the oom reaper.
> >
> > The oom reaper will now set MMF_OOM_SKIP only after the reap timeout has
> > lapsed because it can no longer guarantee forward progress.
> >
> > The reaping timeout is intentionally set for a substantial amount of time
> > since oom livelock is a very rare occurrence and it's better to optimize
> > for preventing additional (unnecessary) oom killing than a scenario that
> > is much more unlikely.
>
> What happened to the old idea of permitting the task which is blocking
> the oom victim to access additional reserves?

How do you find such a task?

> Come to that, what happened to the really really old Andreaidea of not
> looping in the page allocator anyway? Return NULL instead...

Nacked by Linus because too-small-to-fail is a long term semantic that
cannot change easily. We do not have any way to audit syscall paths to
not return ENOMEM when inappropriate.

> I dunno, I'm thrashing around here. We seem to be piling mess on top
> of mess and then being surprised that the result is a mess.

Are we? The current oom_reaper certainly has some shortcomings that are
addressable. We started simple, to cover most cases, and will move on to
more complex heuristics based on real-life bug reports. But we _do_ have
a quite straightforward feedback-based algorithm to reclaim oom victims.
This is solid ground for future development, something we never had
before. So I am really wondering what all the mess is about.

--
Michal Hocko
SUSE Labs

2018-06-19 20:35:38

by David Rientjes

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Mon, 18 Jun 2018, Andrew Morton wrote:

> > The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
> > it cannot reap an mm. This can happen for a variety of reasons,
> > including:
> >
> > - the inability to grab mm->mmap_sem in a sufficient amount of time,
> >
> > - when the mm has blockable mmu notifiers that could cause the oom reaper
> > to stall indefinitely,
>
> Maybe we should have more than one oom reaper thread? I assume the
> probability of the oom reaper thread blocking on an mmu notifier is
> small, so perhaps just dive in and hope for the best. If the oom
> reaper gets stuck then there's another thread ready to take over. And
> revisit the decision to use a kernel thread instead of workqueues.
>

I'm not sure that we need more than one thread, per se, but we need the
ability to operate on more than one oom victim while deciding whether one
victim can be reaped or not. The current implementation only processes
one victim at a time: it tries to grab mm->mmap_sem, it sleeps, retries,
sleeps, etc. We need to try other oom victims (we do parallel memcg oom
stress testing, and the oom reaper can uncharge memory to a hierarchy that
prevents livelock as well), which my patch does.

> So what's actually happening here. A process has a large amount of
> mlocked memory, it has been oom-killed and it is in the process of
> releasing its memory and exiting, yes?
>

That's one failure mode, yes, and it happens in three possible ways:

- the oom reaper immediately sets MMF_OOM_SKIP because it tried to free
memory and completely failed, so it actually declares this a success
and sets MMF_OOM_SKIP assuming memory was freed, when it wasn't,

- to avoid CVE-2018-1000200, exit_mmap() must set MMF_OOM_SKIP before
doing munlock_vma_pages_all(), which the oom reaper uses to determine
whether it can safely operate on a vma, so the exit path also sets
MMF_OOM_SKIP before any possible memory freeing, and

- the previous iteration of the oom reaper, which set MMF_OOM_SKIP
between unmap_vmas() and free_pgtables(), suffered from the same problem
for large amounts of virtual memory, where a subsequent oom kill could
have been prevented if free_pgtables() had been allowed to complete.

My patch fixes all of these issues because MMF_OOM_SKIP only gets set
after free_pgtables(), i.e. when no additional memory freeing is
possible through exit_mmap(), or after the oom reaper has watched the
process fail to exit for 10s. I will update the patch to make the
timeout configurable. I use the existing MMF_UNSTABLE to determine
whether the oom reaper can safely operate on the vmas of the mm.

> If so, why does this task set MMF_OOM_SKIP on itself? Why aren't we
> just patiently waiting for its attempt to release meory?
>

That's what my patch does, yes: it waits, to ensure forward progress is
not being made, before setting MMF_OOM_SKIP and allowing all other
processes on the system to be oom killed. Taken to an extreme, imagine a
single large mlocked process, or one with a blockable mmu notifier,
taking up almost all memory on a machine. If there is a memory leak, it
will be oom killed the same as it always has been. The difference now is
that the machine panic()s, because MMF_OOM_SKIP is set without any
memory having been freed and the oom killer finds no more eligible
processes, so its only alternative is panicking.

> > We can't simply defer setting MMF_OOM_SKIP, however, because if there is
> > a true oom livelock in progress, it never gets set and no additional
> > killing is possible.
>
> I guess that's my answer. What causes this livelock? Process looping
> in alloc_pages while holding a lock the oom victim wants?
>

That's one way, yes; the other is charging memory in the mem cgroup
path while holding a mutex the victim wants. If additional kmem starts
being charged to mem cgroup hierarchies and the oom killer is called
synchronously in the charge path (there is no fault path to unwind to),
which has been discussed, this problem will become much more prevalent.

> > The exit path will now set MMF_OOM_SKIP only after all memory has been
> > freed, so additional oom killing is justified,
>
> That seems sensible, but why set MMF_OOM_SKIP at all?
>

The oom reaper will eventually need to set it if the system is actually
livelocked, which happens extremely rarely in practice, because the oom
reaper was unable to free enough memory for an allocator holding our
mutex to successfully allocate. Today it sets it immediately for mlocked
processes (it doesn't realize it didn't free a single page), it retries
10 times to grab mm->mmap_sem and sets it after one second if that
fails, and if there is a blockable mmu notifier it sleeps for a second
and then sets it. I'm replacing all of these with a per-mm timeout and
volunteering to make it configurable, so that it can be disabled or set
to 10s as we would prefer, because we are tired of every process getting
oom killed pointlessly. I'll suggest a default of 1s to match the
timeouts currently implemented in the oom reaper and generalize them to
be per-mm.
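
To be concrete, the knob could be as simple as the sketch below (the
sysctl name, placement and default are assumptions; nothing like this
has been posted yet):

	/* hypothetical /proc/sys/vm/oom_reap_timeout_ms, default 1s */
	static unsigned int sysctl_oom_reap_timeout_ms = 1000;

	static struct ctl_table oom_reap_table[] = {
		{
			.procname	= "oom_reap_timeout_ms",
			.data		= &sysctl_oom_reap_timeout_ms,
			.maxlen		= sizeof(sysctl_oom_reap_timeout_ms),
			.mode		= 0644,
			.proc_handler	= proc_douintvec,
		},
		{ }
	};

	/*
	 * wake_oom_reaper() would then compute
	 * jiffies + msecs_to_jiffies(sysctl_oom_reap_timeout_ms)
	 * instead of using the hardcoded OOM_REAP_TIMEOUT_MSECS.
	 */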

> > and rely on MMF_UNSTABLE to
> > determine when it can race with the oom reaper.
> >
> > The oom reaper will now set MMF_OOM_SKIP only after the reap timeout has
> > lapsed because it can no longer guarantee forward progress.
> >
> > The reaping timeout is intentionally set for a substantial amount of time
> > since oom livelock is a very rare occurrence and it's better to optimize
> > for preventing additional (unnecessary) oom killing than a scenario that
> > is much more unlikely.
>
> What happened to the old idea of permitting the task which is blocking
> the oom victim to access additional reserves?
>

That is an alternative to the oom reaper and it worked quite
successfully for us. We'd detect when a process was looping endlessly
waiting for the same victim to exit and then grant it access to
additional reserves, specifically to break oom livelock scenarios. The
oom reaper should theoretically make this extremely rare, since it can
normally free *some* memory, so we aren't oom anymore and allocators
holding mutexes can succeed.

> > +#ifdef CONFIG_MMU
> > + /* When to give up on oom reaping this mm */
> > + unsigned long reap_timeout;
>
> "timeout" implies "interval". To me, anyway. This is an absolute
> time, so something like reap_time would be clearer. Along with a
> comment explaining that the units are in jiffies.
>

Ack.

> > +#endif
> > #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
> > pgtable_t pmd_huge_pte; /* protected by page_table_lock */
> > #endif
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1163,7 +1163,7 @@ struct task_struct {
> > #endif
> > int pagefault_disabled;
> > #ifdef CONFIG_MMU
> > - struct task_struct *oom_reaper_list;
> > + struct list_head oom_reap_list;
>
> Can we have a comment explaining its locking.
>

Ok.

> > #endif
> > #ifdef CONFIG_VMAP_STACK
> > struct vm_struct *stack_vm_area;
> >
> > ...
> >
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -3059,11 +3059,10 @@ void exit_mmap(struct mm_struct *mm)
> > if (unlikely(mm_is_oom_victim(mm))) {
> > /*
> > * Manually reap the mm to free as much memory as possible.
> > - * Then, as the oom reaper does, set MMF_OOM_SKIP to disregard
> > - * this mm from further consideration. Taking mm->mmap_sem for
> > - * write after setting MMF_OOM_SKIP will guarantee that the oom
> > - * reaper will not run on this mm again after mmap_sem is
> > - * dropped.
> > + * Then, set MMF_UNSTABLE to avoid racing with the oom reaper.
> > + * Taking mm->mmap_sem for write after setting MMF_UNSTABLE will
> > + * guarantee that the oom reaper will not run on this mm again
> > + * after mmap_sem is dropped.
>
> Comment should explain *why* we don't want the reaper to run on this mm
> again.
>

Sounds good.

2018-06-20 13:05:05

by Michal Hocko

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Tue 19-06-18 10:33:16, Michal Hocko wrote:
[...]
> As I've said, if you are not willing to work on a proper solution, I
> will, but my nack holds for this patch until we see no other way around
> existing and real world problems.

OK, so I gave it a quick try and it doesn't look all that bad to me.
This is only for blockable mmu notifiers. I didn't really try to
address all the problems down the road - I mean, some of the blocking
notifiers can check the range in their interval tree without taking
blocking locks. It is quite probable that only a few ranges will be of
interest, right?

So this is only to give an idea about the change. It probably doesn't
even compile. Does that sound sane?
---
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6bcecc325e7e..ac08f5d711be 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7203,8 +7203,9 @@ static void vcpu_load_eoi_exitmap(struct kvm_vcpu *vcpu)
kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
}

-void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
- unsigned long start, unsigned long end)
+int kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
+ unsigned long start, unsigned long end,
+ bool blockable)
{
unsigned long apic_address;

@@ -7215,6 +7216,8 @@ void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
apic_address = gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
if (start <= apic_address && apic_address < end)
kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
+
+ return 0;
}

void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 83e344fbb50a..d138a526feff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -136,12 +136,18 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn)
*
* Take the rmn read side lock.
*/
-static void amdgpu_mn_read_lock(struct amdgpu_mn *rmn)
+static int amdgpu_mn_read_lock(struct amdgpu_mn *rmn, bool blockable)
{
- mutex_lock(&rmn->read_lock);
+ if (blockable)
+ mutex_lock(&rmn->read_lock);
+ else if (!mutex_trylock(&rmn->read_lock))
+ return -EAGAIN;
+
if (atomic_inc_return(&rmn->recursion) == 1)
down_read_non_owner(&rmn->lock);
mutex_unlock(&rmn->read_lock);
+
+ return 0;
}

/**
@@ -197,10 +203,11 @@ static void amdgpu_mn_invalidate_node(struct amdgpu_mn_node *node,
* We block for all BOs between start and end to be idle and
* unmap them by move them into system domain again.
*/
-static void amdgpu_mn_invalidate_range_start_gfx(struct mmu_notifier *mn,
+static int amdgpu_mn_invalidate_range_start_gfx(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start,
- unsigned long end)
+ unsigned long end,
+ bool blockable)
{
struct amdgpu_mn *rmn = container_of(mn, struct amdgpu_mn, mn);
struct interval_tree_node *it;
@@ -208,7 +215,11 @@ static void amdgpu_mn_invalidate_range_start_gfx(struct mmu_notifier *mn,
/* notification is exclusive, but interval is inclusive */
end -= 1;

- amdgpu_mn_read_lock(rmn);
+ /* TODO we should be able to split locking for interval tree and
+ * amdgpu_mn_invalidate_node
+ */
+ if (amdgpu_mn_read_lock(rmn, blockable))
+ return -EAGAIN;

it = interval_tree_iter_first(&rmn->objects, start, end);
while (it) {
@@ -219,6 +230,8 @@ static void amdgpu_mn_invalidate_range_start_gfx(struct mmu_notifier *mn,

amdgpu_mn_invalidate_node(node, start, end);
}
+
+ return 0;
}

/**
@@ -233,10 +246,11 @@ static void amdgpu_mn_invalidate_range_start_gfx(struct mmu_notifier *mn,
* necessitates evicting all user-mode queues of the process. The BOs
* are restorted in amdgpu_mn_invalidate_range_end_hsa.
*/
-static void amdgpu_mn_invalidate_range_start_hsa(struct mmu_notifier *mn,
+static int amdgpu_mn_invalidate_range_start_hsa(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start,
- unsigned long end)
+ unsigned long end,
+ bool blockable)
{
struct amdgpu_mn *rmn = container_of(mn, struct amdgpu_mn, mn);
struct interval_tree_node *it;
@@ -244,7 +258,8 @@ static void amdgpu_mn_invalidate_range_start_hsa(struct mmu_notifier *mn,
/* notification is exclusive, but interval is inclusive */
end -= 1;

- amdgpu_mn_read_lock(rmn);
+ if (amdgpu_mn_read_lock(rmn, blockable))
+ return -EAGAIN;

it = interval_tree_iter_first(&rmn->objects, start, end);
while (it) {
@@ -262,6 +277,8 @@ static void amdgpu_mn_invalidate_range_start_hsa(struct mmu_notifier *mn,
amdgpu_amdkfd_evict_userptr(mem, mm);
}
}
+
+ return 0;
}

/**
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 854bd51b9478..5285df9331fa 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -112,10 +112,11 @@ static void del_object(struct i915_mmu_object *mo)
mo->attached = false;
}

-static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
+static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
struct mm_struct *mm,
unsigned long start,
- unsigned long end)
+ unsigned long end,
+ bool blockable)
{
struct i915_mmu_notifier *mn =
container_of(_mn, struct i915_mmu_notifier, mn);
@@ -124,7 +125,7 @@ static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
LIST_HEAD(cancelled);

if (RB_EMPTY_ROOT(&mn->objects.rb_root))
- return;
+ return 0;

/* interval ranges are inclusive, but invalidate range is exclusive */
end--;
@@ -152,7 +153,8 @@ static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
del_object(mo);
spin_unlock(&mn->lock);

- if (!list_empty(&cancelled))
+ /* TODO: can we skip waiting here? */
+ if (!list_empty(&cancelled) && blockable)
flush_workqueue(mn->wq);
}

diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index abd24975c9b1..b47e828b725d 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -118,10 +118,11 @@ static void radeon_mn_release(struct mmu_notifier *mn,
* We block for all BOs between start and end to be idle and
* unmap them by move them into system domain again.
*/
-static void radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
+static int radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start,
- unsigned long end)
+ unsigned long end,
+ bool blockable)
{
struct radeon_mn *rmn = container_of(mn, struct radeon_mn, mn);
struct ttm_operation_ctx ctx = { false, false };
@@ -130,7 +131,13 @@ static void radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
/* notification is exclusive, but interval is inclusive */
end -= 1;

- mutex_lock(&rmn->lock);
+ /* TODO we should be able to split locking for interval tree and
+ * the tear down.
+ */
+ if (blockable)
+ mutex_lock(&rmn->lock);
+ else if (!mutex_trylock(&rmn->lock))
+ return -EAGAIN;

it = interval_tree_iter_first(&rmn->objects, start, end);
while (it) {
@@ -167,6 +174,8 @@ static void radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
}

mutex_unlock(&rmn->lock);
+
+ return 0;
}

static const struct mmu_notifier_ops radeon_mn_ops = {
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 182436b92ba9..f65f6a29daae 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -207,22 +207,29 @@ static int invalidate_range_start_trampoline(struct ib_umem *item, u64 start,
return 0;
}

-static void ib_umem_notifier_invalidate_range_start(struct mmu_notifier *mn,
+static int ib_umem_notifier_invalidate_range_start(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start,
- unsigned long end)
+ unsigned long end,
+ bool blockable)
{
struct ib_ucontext *context = container_of(mn, struct ib_ucontext, mn);

if (!context->invalidate_range)
- return;
+ return 0;
+
+ if (blockable)
+ down_read(&context->umem_rwsem);
+ else if (!down_read_trylock(&context->umem_rwsem))
+ return -EAGAIN;

ib_ucontext_notifier_start_account(context);
- down_read(&context->umem_rwsem);
rbt_ib_umem_for_each_in_range(&context->umem_tree, start,
end,
invalidate_range_start_trampoline, NULL);
up_read(&context->umem_rwsem);
+
+ return 0;
}

static int invalidate_range_end_trampoline(struct ib_umem *item, u64 start,
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index 70aceefe14d5..8780560d1623 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -284,10 +284,11 @@ void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
handler->ops->remove(handler->ops_arg, node);
}

-static void mmu_notifier_range_start(struct mmu_notifier *mn,
+static int mmu_notifier_range_start(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start,
- unsigned long end)
+ unsigned long end,
+ bool blockable)
{
struct mmu_rb_handler *handler =
container_of(mn, struct mmu_rb_handler, mn);
@@ -313,6 +314,8 @@ static void mmu_notifier_range_start(struct mmu_notifier *mn,

if (added)
queue_work(handler->wq, &handler->del_work);
+
+ return 0;
}

/*
diff --git a/drivers/misc/mic/scif/scif_dma.c b/drivers/misc/mic/scif/scif_dma.c
index 63d6246d6dff..d940568bed87 100644
--- a/drivers/misc/mic/scif/scif_dma.c
+++ b/drivers/misc/mic/scif/scif_dma.c
@@ -200,15 +200,18 @@ static void scif_mmu_notifier_release(struct mmu_notifier *mn,
schedule_work(&scif_info.misc_work);
}

-static void scif_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
+static int scif_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start,
- unsigned long end)
+ unsigned long end,
+ bool blockable)
{
struct scif_mmu_notif *mmn;

mmn = container_of(mn, struct scif_mmu_notif, ep_mmu_notifier);
scif_rma_destroy_tcw(mmn, start, end - start);
+
+ return 0;
}

static void scif_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
diff --git a/drivers/misc/sgi-gru/grutlbpurge.c b/drivers/misc/sgi-gru/grutlbpurge.c
index a3454eb56fbf..be28f05bfafa 100644
--- a/drivers/misc/sgi-gru/grutlbpurge.c
+++ b/drivers/misc/sgi-gru/grutlbpurge.c
@@ -219,9 +219,10 @@ void gru_flush_all_tlb(struct gru_state *gru)
/*
* MMUOPS notifier callout functions
*/
-static void gru_invalidate_range_start(struct mmu_notifier *mn,
+static int gru_invalidate_range_start(struct mmu_notifier *mn,
struct mm_struct *mm,
- unsigned long start, unsigned long end)
+ unsigned long start, unsigned long end,
+ bool blockable)
{
struct gru_mm_struct *gms = container_of(mn, struct gru_mm_struct,
ms_notifier);
@@ -231,6 +232,8 @@ static void gru_invalidate_range_start(struct mmu_notifier *mn,
gru_dbg(grudev, "gms %p, start 0x%lx, end 0x%lx, act %d\n", gms,
start, end, atomic_read(&gms->ms_range_active));
gru_flush_tlb_range(gms, start, end - start);
+
+ return 0;
}

static void gru_invalidate_range_end(struct mmu_notifier *mn,
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index bd56653b9bbc..50724d09fe5c 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -465,14 +465,20 @@ static void unmap_if_in_range(struct grant_map *map,
WARN_ON(err);
}

-static void mn_invl_range_start(struct mmu_notifier *mn,
+static int mn_invl_range_start(struct mmu_notifier *mn,
struct mm_struct *mm,
- unsigned long start, unsigned long end)
+ unsigned long start, unsigned long end,
+ bool blockable)
{
struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn);
struct grant_map *map;

- mutex_lock(&priv->lock);
+ /* TODO do we really need a mutex here? */
+ if (blockable)
+ mutex_lock(&priv->lock);
+ else if (!mutex_trylock(&priv->lock))
+ return -EAGAIN;
+
list_for_each_entry(map, &priv->maps, next) {
unmap_if_in_range(map, start, end);
}
@@ -480,6 +486,8 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
unmap_if_in_range(map, start, end);
}
mutex_unlock(&priv->lock);
+
+ return 0;
}

static void mn_release(struct mmu_notifier *mn,
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 4ee7bc548a83..e4181063e755 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1275,7 +1275,7 @@ static inline long kvm_arch_vcpu_async_ioctl(struct file *filp,
}
#endif /* CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL */

-void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
- unsigned long start, unsigned long end);
+int kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
+ unsigned long start, unsigned long end, bool blockable);

#ifdef CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 392e6af82701..369867501bed 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -230,7 +230,8 @@ extern int __mmu_notifier_test_young(struct mm_struct *mm,
extern void __mmu_notifier_change_pte(struct mm_struct *mm,
unsigned long address, pte_t pte);
extern void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
- unsigned long start, unsigned long end);
+ unsigned long start, unsigned long end,
+ bool blockable);
extern void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
unsigned long start, unsigned long end,
bool only_end);
@@ -281,7 +282,17 @@ static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
if (mm_has_notifiers(mm))
- __mmu_notifier_invalidate_range_start(mm, start, end);
+ __mmu_notifier_invalidate_range_start(mm, start, end, true);
+}
+
+static inline int mmu_notifier_invalidate_range_start_nonblock(struct mm_struct *mm,
+ unsigned long start, unsigned long end)
+{
+ int ret = 0;
+ if (mm_has_notifiers(mm))
+ ret = __mmu_notifier_invalidate_range_start(mm, start, end, false);
+
+ return ret;
}

static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm,
diff --git a/mm/hmm.c b/mm/hmm.c
index de7b6bf77201..81fd57bd2634 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -177,16 +177,19 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
up_write(&hmm->mirrors_sem);
}

-static void hmm_invalidate_range_start(struct mmu_notifier *mn,
+static int hmm_invalidate_range_start(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start,
- unsigned long end)
+ unsigned long end,
+ bool blockable)
{
struct hmm *hmm = mm->hmm;

VM_BUG_ON(!hmm);

atomic_inc(&hmm->sequence);
+
+ return 0;
}

static void hmm_invalidate_range_end(struct mmu_notifier *mn,
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index eff6b88a993f..30cc43121da9 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -174,18 +174,25 @@ void __mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address,
srcu_read_unlock(&srcu, id);
}

-void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
- unsigned long start, unsigned long end)
+int __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
+ unsigned long start, unsigned long end,
+ bool blockable)
{
struct mmu_notifier *mn;
+ int ret = 0;
int id;

id = srcu_read_lock(&srcu);
hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
- if (mn->ops->invalidate_range_start)
- mn->ops->invalidate_range_start(mn, mm, start, end);
+ if (mn->ops->invalidate_range_start) {
+ int _ret = mn->ops->invalidate_range_start(mn, mm, start, end, blockable);
+ if (_ret)
+ ret = _ret;
+ }
}
srcu_read_unlock(&srcu, id);
+
+ return ret;
}
EXPORT_SYMBOL_GPL(__mmu_notifier_invalidate_range_start);

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 84081e77bc51..7e0c6e78ae5c 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -479,9 +479,10 @@ static DECLARE_WAIT_QUEUE_HEAD(oom_reaper_wait);
static struct task_struct *oom_reaper_list;
static DEFINE_SPINLOCK(oom_reaper_lock);

-void __oom_reap_task_mm(struct mm_struct *mm)
+bool __oom_reap_task_mm(struct mm_struct *mm)
{
struct vm_area_struct *vma;
+ bool ret = true;

/*
* Tell all users of get_user/copy_from_user etc... that the content
@@ -511,12 +512,17 @@ void __oom_reap_task_mm(struct mm_struct *mm)
struct mmu_gather tlb;

tlb_gather_mmu(&tlb, mm, start, end);
- mmu_notifier_invalidate_range_start(mm, start, end);
+ if (mmu_notifier_invalidate_range_start_nonblock(mm, start, end)) {
+ ret = false;
+ continue;
+ }
unmap_page_range(&tlb, vma, start, end, NULL);
mmu_notifier_invalidate_range_end(mm, start, end);
tlb_finish_mmu(&tlb, start, end);
}
}
+
+ return ret;
}

static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
@@ -545,18 +551,6 @@ static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
goto unlock_oom;
}

- /*
- * If the mm has invalidate_{start,end}() notifiers that could block,
- * sleep to give the oom victim some more time.
- * TODO: we really want to get rid of this ugly hack and make sure that
- * notifiers cannot block for unbounded amount of time
- */
- if (mm_has_blockable_invalidate_notifiers(mm)) {
- up_read(&mm->mmap_sem);
- schedule_timeout_idle(HZ);
- goto unlock_oom;
- }
-
/*
* MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
* work on the mm anymore. The check for MMF_OOM_SKIP must run
@@ -571,7 +565,12 @@ static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)

trace_start_task_reaping(tsk->pid);

- __oom_reap_task_mm(mm);
+ /* failed to reap part of the address space. Try again later */
+ if (!__oom_reap_task_mm(mm)) {
+ up_read(&mm->mmap_sem);
+ ret = false;
+ goto unlock_oom;
+ }

pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
task_pid_nr(tsk), tsk->comm,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ada21f47f22b..6f7e709d2944 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -135,7 +135,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm);
static unsigned long long kvm_createvm_count;
static unsigned long long kvm_active_vms;

-__weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
- unsigned long start, unsigned long end)
+__weak int kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
+ unsigned long start, unsigned long end, bool blockable)
{
+ return 0;
}
@@ -354,13 +354,15 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn,
srcu_read_unlock(&kvm->srcu, idx);
}

-static void kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
+static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start,
- unsigned long end)
+ unsigned long end,
+ bool blockable)
{
struct kvm *kvm = mmu_notifier_to_kvm(mn);
int need_tlb_flush = 0, idx;
+ int ret;

idx = srcu_read_lock(&kvm->srcu);
spin_lock(&kvm->mmu_lock);
@@ -378,9 +380,11 @@ static void kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,

spin_unlock(&kvm->mmu_lock);

- kvm_arch_mmu_notifier_invalidate_range(kvm, start, end);
+ ret = kvm_arch_mmu_notifier_invalidate_range(kvm, start, end, blockable);

srcu_read_unlock(&kvm->srcu, idx);
+
+ return ret;
}

static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
--
Michal Hocko
SUSE Labs

2018-06-20 20:35:57

by David Rientjes

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Wed, 20 Jun 2018, Michal Hocko wrote:

> On Tue 19-06-18 10:33:16, Michal Hocko wrote:
> [...]
> > As I've said, if you are not willing to work on a proper solution, I
> > will, but my nack holds for this patch until we see no other way around
> > existing and real world problems.
>
> OK, so I gave it a quick try and it doesn't look all that bad to me.
> This is only for blockable mmu notifiers. I didn't really try to
> address all the problems down the road - I mean some of the blocking
> notifiers can check the range in their interval tree without blocking
> locks. It is quite probable that only few ranges will be of interest,
> right?
>
> So this is only to give an idea about the change. It probably even
> doesn't compile. Does that sound sane?

It depends on how invasive we want to make this; it should result in more
memory being freeable if the invalidate callbacks can guarantee that they
won't block. I think it's much more invasive than the proposed patch,
however.

For the same reason as the mm->mmap_sem backoff, however, this should
retry for a longer period of time than HZ. If we can't grab mm->mmap_sem
in the first five trylock attempts because of writer queueing, for
example, then we only have five attempts for each blockable mmu notifier
invalidate callback, and for any of the numerous locks it can take, to
declare that it will not block.

Note that this doesn't solve the issue with setting MMF_OOM_SKIP too early
on processes with mm->mmap_sem contention or, now, invalidate callbacks
that will block; the decision that the mm cannot be reaped should come
much later.
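
To make the comparison concrete, here is a minimal sketch (illustration
only, not meant to be applied; oom_free_expire here stands for a
hypothetical per-mm reap deadline, whatever we end up calling it):

	/* Today: give up after a fixed number of trylock attempts. */
	while (attempts++ < MAX_OOM_REAP_RETRIES && !oom_reap_task_mm(tsk, mm))
		schedule_timeout_idle(HZ/10);

	/*
	 * What I am arguing for: keep retrying until a (much longer) per-mm
	 * deadline expires, and only set MMF_OOM_SKIP once that deadline has
	 * passed, so another victim is chosen only when this mm has truly
	 * failed to free memory.
	 */
	while (!oom_reap_task_mm(tsk, mm)) {
		if (time_after_eq(jiffies, mm->oom_free_expire)) {
			set_bit(MMF_OOM_SKIP, &mm->flags);
			break;
		}
		schedule_timeout_idle(HZ/10);
	}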

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 6bcecc325e7e..ac08f5d711be 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7203,8 +7203,9 @@ static void vcpu_load_eoi_exitmap(struct kvm_vcpu *vcpu)
> kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
> }
>
> -void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> - unsigned long start, unsigned long end)
> +int kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> + unsigned long start, unsigned long end,
> + bool blockable)
> {
> unsigned long apic_address;
>
> @@ -7215,6 +7216,8 @@ void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> apic_address = gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
> if (start <= apic_address && apic_address < end)
> kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
> +
> + return 0;
> }
>
> void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)

Auditing the first change in the patch, this is incorrect because
kvm_make_all_cpus_request() for KVM_REQ_APIC_PAGE_RELOAD can block in
kvm_kick_many_cpus() and that is after kvm_make_request() has been done.

2018-06-20 22:01:22

by David Rientjes

[permalink] [raw]
Subject: [patch v2] mm, oom: fix unnecessary killing of additional processes

The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
it cannot reap an mm. This can happen for a variety of reasons,
including:

- the inability to grab mm->mmap_sem in a sufficient amount of time,

- when the mm has blockable mmu notifiers that could cause the oom reaper
to stall indefinitely,

but we can also add a third when the oom reaper can "reap" an mm but doing
so is unlikely to free any amount of memory:

- when the mm's memory is mostly mlocked.

When all memory is mlocked, the oom reaper will not be able to free any
substantial amount of memory. It sets MMF_OOM_SKIP before the victim can
unmap and free its memory in exit_mmap() and subsequent oom victims are
chosen unnecessarily. This is trivial to reproduce if all eligible
processes on the system have mlocked their memory: the oom killer calls
panic() even though forward progress can be made.

This is the same issue where the exit path sets MMF_OOM_SKIP before
unmapping memory and additional processes can be chosen unnecessarily
because the oom killer is racing with exit_mmap() and is separate from
the oom reaper setting MMF_OOM_SKIP prematurely.

We can't simply defer setting MMF_OOM_SKIP, however, because if there is
a true oom livelock in progress, it never gets set and no additional
killing is possible.

To fix this, this patch introduces a per-mm reaping period, which is
configurable through the new oom_free_timeout_ms file in debugfs and
defaults to one second to match the current heuristics. This support
requires that the oom reaper's list becomes a proper linked list so that
other mm's may be reaped while waiting for an mm's timeout to expire.

This replaces the current timeouts in the oom reaper: (1) the 10 attempts
in a row to grab mm->mmap_sem with HZ/10 sleeps in between, and (2) the HZ
sleep when the mm has blockable mmu notifiers. It extends them with a
configurable timeout to allow an oom victim to reach exit_mmap() before
additional processes are chosen unnecessarily.

The exit path will now set MMF_OOM_SKIP only after all memory has been
freed, so additional oom killing is justified, and rely on MMF_UNSTABLE to
determine when it can race with the oom reaper.

The oom reaper will now set MMF_OOM_SKIP only after the reap timeout has
lapsed because it can no longer guarantee forward progress. Since the
default oom_free_timeout_ms is one second, the same as current heuristics,
there should be no functional change with this patch for users who do not
tune it to be longer, other than MMF_OOM_SKIP now being set by exit_mmap()
after free_pgtables(), which is the preferred behavior.

The reaping timeout can intentionally be set for a substantial amount of
time, such as 10s, since oom livelock is a very rare occurrence and it's
better to optimize for preventing additional (unnecessary) oom killing
than a scenario that is much more unlikely.

Signed-off-by: David Rientjes <[email protected]>
---
v2:
- configurable timeout period through debugfs
- change mm->reap_timeout to mm->oom_free_expire and add more
descriptive comment per akpm
- add comment to describe task->oom_reap_list locking based on
oom_reaper_lock per akpm
- rework the exit_mmap() comment and split into two parts to be more
descriptive about the locking and the issue with the oom reaper
racing with munlock_vma_pages_all() per akpm
---
include/linux/mm_types.h | 7 ++
include/linux/sched.h | 3 +-
kernel/fork.c | 3 +
mm/mmap.c | 26 +++++---
mm/oom_kill.c | 140 +++++++++++++++++++++++++--------------
5 files changed, 119 insertions(+), 60 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -449,6 +449,13 @@ struct mm_struct {
#ifdef CONFIG_MMU_NOTIFIER
struct mmu_notifier_mm *mmu_notifier_mm;
#endif
+#ifdef CONFIG_MMU
+ /*
+ * When to give up on memory freeing from this mm after its
+ * threads have been oom killed, in jiffies.
+ */
+ unsigned long oom_free_expire;
+#endif
#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
pgtable_t pmd_huge_pte; /* protected by page_table_lock */
#endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1163,7 +1163,8 @@ struct task_struct {
#endif
int pagefault_disabled;
#ifdef CONFIG_MMU
- struct task_struct *oom_reaper_list;
+ /* OOM victim queue for oom reaper, protected by oom_reaper_lock */
+ struct list_head oom_reap_list;
#endif
#ifdef CONFIG_VMAP_STACK
struct vm_struct *stack_vm_area;
diff --git a/kernel/fork.c b/kernel/fork.c
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -842,6 +842,9 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
#ifdef CONFIG_FAULT_INJECTION
tsk->fail_nth = 0;
#endif
+#ifdef CONFIG_MMU
+ INIT_LIST_HEAD(&tsk->oom_reap_list);
+#endif

return tsk;

diff --git a/mm/mmap.c b/mm/mmap.c
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3059,25 +3059,28 @@ void exit_mmap(struct mm_struct *mm)
if (unlikely(mm_is_oom_victim(mm))) {
/*
* Manually reap the mm to free as much memory as possible.
- * Then, as the oom reaper does, set MMF_OOM_SKIP to disregard
- * this mm from further consideration. Taking mm->mmap_sem for
- * write after setting MMF_OOM_SKIP will guarantee that the oom
- * reaper will not run on this mm again after mmap_sem is
- * dropped.
- *
* Nothing can be holding mm->mmap_sem here and the above call
* to mmu_notifier_release(mm) ensures mmu notifier callbacks in
* __oom_reap_task_mm() will not block.
- *
- * This needs to be done before calling munlock_vma_pages_all(),
- * which clears VM_LOCKED, otherwise the oom reaper cannot
- * reliably test it.
*/
mutex_lock(&oom_lock);
__oom_reap_task_mm(mm);
mutex_unlock(&oom_lock);

- set_bit(MMF_OOM_SKIP, &mm->flags);
+ /*
+ * Now, set MMF_UNSTABLE to avoid racing with the oom reaper.
+ * This needs to be done before calling munlock_vma_pages_all(),
+ * which clears VM_LOCKED, otherwise the oom reaper cannot
+ * reliably test for it. If the oom reaper races with
+ * munlock_vma_pages_all(), this can result in a kernel oops if
+ * a pmd is zapped, for example, after follow_page_mask() has
+ * checked pmd_none().
+ *
+ * Taking mm->mmap_sem for write after setting MMF_UNSTABLE will
+ * guarantee that the oom reaper will not run on this mm again
+ * after mmap_sem is dropped.
+ */
+ set_bit(MMF_UNSTABLE, &mm->flags);
down_write(&mm->mmap_sem);
up_write(&mm->mmap_sem);
}
@@ -3105,6 +3108,7 @@ void exit_mmap(struct mm_struct *mm)
unmap_vmas(&tlb, vma, 0, -1);
free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
tlb_finish_mmu(&tlb, 0, -1);
+ set_bit(MMF_OOM_SKIP, &mm->flags);

/*
* Walk the list again, actually closing and freeing it,
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -41,6 +41,7 @@
#include <linux/kthread.h>
#include <linux/init.h>
#include <linux/mmu_notifier.h>
+#include <linux/debugfs.h>

#include <asm/tlb.h>
#include "internal.h"
@@ -476,7 +477,7 @@ bool process_shares_mm(struct task_struct *p, struct mm_struct *mm)
*/
static struct task_struct *oom_reaper_th;
static DECLARE_WAIT_QUEUE_HEAD(oom_reaper_wait);
-static struct task_struct *oom_reaper_list;
+static LIST_HEAD(oom_reaper_list);
static DEFINE_SPINLOCK(oom_reaper_lock);

void __oom_reap_task_mm(struct mm_struct *mm)
@@ -519,10 +520,8 @@ void __oom_reap_task_mm(struct mm_struct *mm)
}
}

-static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
+static void oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
{
- bool ret = true;
-
/*
* We have to make sure to not race with the victim exit path
* and cause premature new oom victim selection:
@@ -540,9 +539,8 @@ static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
mutex_lock(&oom_lock);

if (!down_read_trylock(&mm->mmap_sem)) {
- ret = false;
trace_skip_task_reaping(tsk->pid);
- goto unlock_oom;
+ goto out_oom;
}

/*
@@ -551,69 +549,81 @@ static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
* TODO: we really want to get rid of this ugly hack and make sure that
* notifiers cannot block for unbounded amount of time
*/
- if (mm_has_blockable_invalidate_notifiers(mm)) {
- up_read(&mm->mmap_sem);
- schedule_timeout_idle(HZ);
- goto unlock_oom;
- }
+ if (mm_has_blockable_invalidate_notifiers(mm))
+ goto out_mm;

/*
- * MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
- * work on the mm anymore. The check for MMF_OOM_SKIP must run
+ * MMF_UNSTABLE is set by exit_mmap when the OOM reaper can't
+ * work on the mm anymore. The check for MMF_UNSTABLE must run
* under mmap_sem for reading because it serializes against the
* down_write();up_write() cycle in exit_mmap().
*/
- if (test_bit(MMF_OOM_SKIP, &mm->flags)) {
- up_read(&mm->mmap_sem);
+ if (test_bit(MMF_UNSTABLE, &mm->flags)) {
trace_skip_task_reaping(tsk->pid);
- goto unlock_oom;
+ goto out_mm;
}

trace_start_task_reaping(tsk->pid);
-
__oom_reap_task_mm(mm);
+ trace_finish_task_reaping(tsk->pid);

pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
task_pid_nr(tsk), tsk->comm,
K(get_mm_counter(mm, MM_ANONPAGES)),
K(get_mm_counter(mm, MM_FILEPAGES)),
K(get_mm_counter(mm, MM_SHMEMPAGES)));
+out_mm:
up_read(&mm->mmap_sem);
-
- trace_finish_task_reaping(tsk->pid);
-unlock_oom:
+out_oom:
mutex_unlock(&oom_lock);
- return ret;
}

-#define MAX_OOM_REAP_RETRIES 10
static void oom_reap_task(struct task_struct *tsk)
{
- int attempts = 0;
struct mm_struct *mm = tsk->signal->oom_mm;

- /* Retry the down_read_trylock(mmap_sem) a few times */
- while (attempts++ < MAX_OOM_REAP_RETRIES && !oom_reap_task_mm(tsk, mm))
- schedule_timeout_idle(HZ/10);
+ /*
+ * If this mm has either been fully unmapped, or the oom reaper has
+ * given up on it, nothing left to do except drop the refcount.
+ */
+ if (test_bit(MMF_OOM_SKIP, &mm->flags))
+ goto drop;

- if (attempts <= MAX_OOM_REAP_RETRIES ||
- test_bit(MMF_OOM_SKIP, &mm->flags))
- goto done;
+ /*
+ * If this mm has already been reaped, doing so again will not likely
+ * free additional memory.
+ */
+ if (!test_bit(MMF_UNSTABLE, &mm->flags))
+ oom_reap_task_mm(tsk, mm);

- pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
- task_pid_nr(tsk), tsk->comm);
- debug_show_all_locks();
+ if (time_after_eq(jiffies, mm->oom_free_expire)) {
+ if (!test_bit(MMF_OOM_SKIP, &mm->flags)) {
+ pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
+ task_pid_nr(tsk), tsk->comm);
+ debug_show_all_locks();

-done:
- tsk->oom_reaper_list = NULL;
+ /*
+ * Reaping has failed for the timeout period, so give up
+ * and allow additional processes to be oom killed.
+ */
+ set_bit(MMF_OOM_SKIP, &mm->flags);
+ }
+ goto drop;
+ }

- /*
- * Hide this mm from OOM killer because it has been either reaped or
- * somebody can't call up_write(mmap_sem).
- */
- set_bit(MMF_OOM_SKIP, &mm->flags);
+ if (test_bit(MMF_OOM_SKIP, &mm->flags))
+ goto drop;
+
+ /* Enqueue to be reaped again */
+ spin_lock(&oom_reaper_lock);
+ list_add_tail(&tsk->oom_reap_list, &oom_reaper_list);
+ spin_unlock(&oom_reaper_lock);

- /* Drop a reference taken by wake_oom_reaper */
+ schedule_timeout_idle(HZ/10);
+ return;
+
+drop:
+ /* Drop the reference taken by wake_oom_reaper */
put_task_struct(tsk);
}

@@ -622,11 +632,13 @@ static int oom_reaper(void *unused)
while (true) {
struct task_struct *tsk = NULL;

- wait_event_freezable(oom_reaper_wait, oom_reaper_list != NULL);
+ wait_event_freezable(oom_reaper_wait,
+ !list_empty(&oom_reaper_list));
spin_lock(&oom_reaper_lock);
- if (oom_reaper_list != NULL) {
- tsk = oom_reaper_list;
- oom_reaper_list = tsk->oom_reaper_list;
+ if (!list_empty(&oom_reaper_list)) {
+ tsk = list_entry(oom_reaper_list.next,
+ struct task_struct, oom_reap_list);
+ list_del(&tsk->oom_reap_list);
}
spin_unlock(&oom_reaper_lock);

@@ -637,25 +649,57 @@ static int oom_reaper(void *unused)
return 0;
}

+/*
+ * Millisecs to wait for an oom mm to free memory before selecting another
+ * victim.
+ */
+u64 oom_free_timeout_ms = 1000;
static void wake_oom_reaper(struct task_struct *tsk)
{
- /* tsk is already queued? */
- if (tsk == oom_reaper_list || tsk->oom_reaper_list)
+ /*
+ * Set the reap timeout; if it's already set, the mm is enqueued and
+ * this tsk can be ignored.
+ */
+ if (cmpxchg(&tsk->signal->oom_mm->oom_free_expire, 0UL,
+ jiffies + msecs_to_jiffies(oom_free_timeout_ms)))
return;

get_task_struct(tsk);

spin_lock(&oom_reaper_lock);
- tsk->oom_reaper_list = oom_reaper_list;
- oom_reaper_list = tsk;
+ list_add(&tsk->oom_reap_list, &oom_reaper_list);
spin_unlock(&oom_reaper_lock);
trace_wake_reaper(tsk->pid);
wake_up(&oom_reaper_wait);
}

+#ifdef CONFIG_DEBUG_FS
+static int oom_free_timeout_ms_read(void *data, u64 *val)
+{
+ *val = oom_free_timeout_ms;
+ return 0;
+}
+
+static int oom_free_timeout_ms_write(void *data, u64 val)
+{
+ if (val > 60 * 1000)
+ return -EINVAL;
+
+ oom_free_timeout_ms = val;
+ return 0;
+}
+DEFINE_SIMPLE_ATTRIBUTE(oom_free_timeout_ms_fops, oom_free_timeout_ms_read,
+ oom_free_timeout_ms_write, "%llu\n");
+#endif /* CONFIG_DEBUG_FS */
+
static int __init oom_init(void)
{
oom_reaper_th = kthread_run(oom_reaper, NULL, "oom_reaper");
+#ifdef CONFIG_DEBUG_FS
+ if (!IS_ERR(oom_reaper_th))
+ debugfs_create_file("oom_free_timeout_ms", 0200, NULL, NULL,
+ &oom_free_timeout_ms_fops);
+#endif
return 0;
}
subsys_initcall(oom_init)

2018-06-21 07:46:37

by Michal Hocko

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Wed 20-06-18 13:34:52, David Rientjes wrote:
> On Wed, 20 Jun 2018, Michal Hocko wrote:
>
> > On Tue 19-06-18 10:33:16, Michal Hocko wrote:
> > [...]
> > > As I've said, if you are not willing to work on a proper solution, I
> > > will, but my nack holds for this patch until we see no other way around
> > > existing and real world problems.
> >
> > OK, so I gave it a quick try and it doesn't look all that bad to me.
> > This is only for blockable mmu notifiers. I didn't really try to
> > address all the problems down the road - I mean some of the blocking
> > notifiers can check the range in their interval tree without blocking
> > locks. It is quite probable that only few ranges will be of interest,
> > right?
> >
> > So this is only to give an idea about the change. It probably even
> > doesn't compile. Does that sound sane?
>
> It depends on how invasive we want to make this; it should result in more
> memory being freeable if the invalidate callbacks can guarantee that they
> won't block. I think it's much more invasive than the proposed patch,
> however.

It is a larger patch for sure but it heads towards a more deterministic
behavior because we know _why_ we are trying. It is a specific and
rarely taken lock that we need. If we get one step further and examine
the range without blocking then we are almost lockless from the oom
reaper POV for most notifiers.
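
Roughly something like this, as a sketch only (the foo_* names are
placeholders for whatever structure a particular driver uses to track its
registered ranges, nothing real):

static int foo_invalidate_range_start(struct mmu_notifier *mn,
				      struct mm_struct *mm,
				      unsigned long start, unsigned long end,
				      bool blockable)
{
	struct foo_ctx *ctx = container_of(mn, struct foo_ctx, mn);

	/*
	 * Cheap check first: if nothing we track overlaps [start, end),
	 * there is nothing to invalidate and no blocking lock is needed.
	 */
	if (!foo_range_is_tracked(ctx, start, end))
		return 0;

	if (blockable)
		down_read(&ctx->lock);
	else if (!down_read_trylock(&ctx->lock))
		return -EAGAIN;

	foo_invalidate(ctx, start, end);
	up_read(&ctx->lock);

	return 0;
}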

> For the same reason as the mm->mmap_sem backoff, however, this should
> retry for a longer period of time than HZ. If we can't grab mm->mmap_sem
> in the first five trylock attempts because of writer queueing, for
> example, then we only have five attempts for each blockable mmu notifier
> invalidate callback, and for any of the numerous locks it can take, to
> declare that it will not block.
>
> Note that this doesn't solve the issue with setting MMF_OOM_SKIP too early
> on processes with mm->mmap_sem contention or, now, invalidate callbacks
> that will block; the decision that the mm cannot be reaped should come
> much later.

I do not mind tuning the number of retries or the sleep duration. All
that based on real life examples.

I have asked about a specific mmap_sem contention case several times but
didn't get any answer yet.

> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 6bcecc325e7e..ac08f5d711be 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -7203,8 +7203,9 @@ static void vcpu_load_eoi_exitmap(struct kvm_vcpu *vcpu)
> > kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
> > }
> >
> > -void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > - unsigned long start, unsigned long end)
> > +int kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > + unsigned long start, unsigned long end,
> > + bool blockable)
> > {
> > unsigned long apic_address;
> >
> > @@ -7215,6 +7216,8 @@ void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > apic_address = gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
> > if (start <= apic_address && apic_address < end)
> > kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
> > +
> > + return 0;
> > }
> >
> > void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
>
> Auditing the first change in the patch, this is incorrect because
> kvm_make_all_cpus_request() for KVM_REQ_APIC_PAGE_RELOAD can block in
> kvm_kick_many_cpus() and that is after kvm_make_request() has been done.

I would have to check the code closer. But doesn't
kvm_make_all_cpus_request call get_cpu which is preempt_disable? I
definitely plan to talk to respective maintainers about these changes of
course.

--
Michal Hocko
SUSE Labs

2018-06-21 07:55:54

by Michal Hocko

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Thu 21-06-18 09:45:37, Michal Hocko wrote:
> On Wed 20-06-18 13:34:52, David Rientjes wrote:
> > On Wed, 20 Jun 2018, Michal Hocko wrote:
[...]
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 6bcecc325e7e..ac08f5d711be 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -7203,8 +7203,9 @@ static void vcpu_load_eoi_exitmap(struct kvm_vcpu *vcpu)
> > > kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
> > > }
> > >
> > > -void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > - unsigned long start, unsigned long end)
> > > +int kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > + unsigned long start, unsigned long end,
> > > + bool blockable)
> > > {
> > > unsigned long apic_address;
> > >
> > > @@ -7215,6 +7216,8 @@ void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > apic_address = gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
> > > if (start <= apic_address && apic_address < end)
> > > kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
> > > +
> > > + return 0;
> > > }
> > >
> > > void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
> >
> > Auditing the first change in the patch, this is incorrect because
> > kvm_make_all_cpus_request() for KVM_REQ_APIC_PAGE_RELOAD can block in
> > kvm_kick_many_cpus() and that is after kvm_make_request() has been done.
>
> I would have to check the code closer. But doesn't
> kvm_make_all_cpus_request call get_cpu which is preempt_disable?

Sorry I meant kvm_make_vcpus_request_mask. kvm_make_all_cpus_request
only does a GFP_ATOMIC allocation on top.
--
Michal Hocko
SUSE Labs

2018-06-21 11:00:27

by Fengguang Wu

[permalink] [raw]
Subject: [RFC PATCH] mm, oom: oom_free_timeout_ms can be static


Fixes: 45c6e373dd94 ("mm, oom: fix unnecessary killing of additional processes")
Signed-off-by: kbuild test robot <[email protected]>
---
oom_kill.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 8a775c4..6b776b9 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -653,7 +653,7 @@ static int oom_reaper(void *unused)
* Millisecs to wait for an oom mm to free memory before selecting another
* victim.
*/
-u64 oom_free_timeout_ms = 1000;
+static u64 oom_free_timeout_ms = 1000;
static void wake_oom_reaper(struct task_struct *tsk)
{
/*

2018-06-21 11:00:43

by kernel test robot

[permalink] [raw]
Subject: Re: [patch v2] mm, oom: fix unnecessary killing of additional processes

Hi David,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.18-rc1 next-20180621]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/David-Rientjes/mm-oom-fix-unnecessary-killing-of-additional-processes/20180621-060118
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

include/linux/nodemask.h:265:16: sparse: expression using sizeof(void)
include/linux/nodemask.h:271:16: sparse: expression using sizeof(void)
include/linux/nodemask.h:265:16: sparse: expression using sizeof(void)
include/linux/nodemask.h:271:16: sparse: expression using sizeof(void)
>> mm/oom_kill.c:656:5: sparse: symbol 'oom_free_timeout_ms' was not declared. Should it be static?
include/linux/rcupdate.h:683:9: sparse: context imbalance in 'find_lock_task_mm' - wrong count at exit
include/linux/sched/mm.h:141:37: sparse: dereference of noderef expression
mm/oom_kill.c:218:28: sparse: context imbalance in 'oom_badness' - unexpected unlock
mm/oom_kill.c:398:9: sparse: context imbalance in 'dump_tasks' - different lock contexts for basic block
include/linux/rcupdate.h:683:9: sparse: context imbalance in 'oom_kill_process' - unexpected unlock

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation

2018-06-21 20:51:53

by David Rientjes

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Thu, 21 Jun 2018, Michal Hocko wrote:

> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 6bcecc325e7e..ac08f5d711be 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -7203,8 +7203,9 @@ static void vcpu_load_eoi_exitmap(struct kvm_vcpu *vcpu)
> > > kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
> > > }
> > >
> > > -void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > - unsigned long start, unsigned long end)
> > > +int kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > + unsigned long start, unsigned long end,
> > > + bool blockable)
> > > {
> > > unsigned long apic_address;
> > >
> > > @@ -7215,6 +7216,8 @@ void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > apic_address = gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
> > > if (start <= apic_address && apic_address < end)
> > > kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
> > > +
> > > + return 0;
> > > }
> > >
> > > void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
> >
> > Auditing the first change in the patch, this is incorrect because
> > kvm_make_all_cpus_request() for KVM_REQ_APIC_PAGE_RELOAD can block in
> > kvm_kick_many_cpus() and that is after kvm_make_request() has been done.
>
> I would have to check the code closer. But doesn't
> kvm_make_all_cpus_request call get_cpu which is preempt_disable? I
> definitely plan to talk to respective maintainers about these changes of
> course.
>

preempt_disable() is required because it calls kvm_kick_many_cpus() with
wait == true because KVM_REQ_APIC_PAGE_RELOAD sets KVM_REQUEST_WAIT and
thus the smp_call_function_many() is going to block until all cpus can run
ack_flush().

2018-06-22 07:44:06

by Michal Hocko

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Thu 21-06-18 13:50:53, David Rientjes wrote:
> On Thu, 21 Jun 2018, Michal Hocko wrote:
>
> > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > index 6bcecc325e7e..ac08f5d711be 100644
> > > > --- a/arch/x86/kvm/x86.c
> > > > +++ b/arch/x86/kvm/x86.c
> > > > @@ -7203,8 +7203,9 @@ static void vcpu_load_eoi_exitmap(struct kvm_vcpu *vcpu)
> > > > kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
> > > > }
> > > >
> > > > -void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > > - unsigned long start, unsigned long end)
> > > > +int kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > > + unsigned long start, unsigned long end,
> > > > + bool blockable)
> > > > {
> > > > unsigned long apic_address;
> > > >
> > > > @@ -7215,6 +7216,8 @@ void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > > apic_address = gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
> > > > if (start <= apic_address && apic_address < end)
> > > > kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
> > > > +
> > > > + return 0;
> > > > }
> > > >
> > > > void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
> > >
> > > Auditing the first change in the patch, this is incorrect because
> > > kvm_make_all_cpus_request() for KVM_REQ_APIC_PAGE_RELOAD can block in
> > > kvm_kick_many_cpus() and that is after kvm_make_request() has been done.
> >
> > I would have to check the code closer. But doesn't
> > kvm_make_all_cpus_request call get_cpu which is preempt_disable? I
> > definitely plan to talk to respective maintainers about these changes of
> > course.
> >
>
> preempt_disable() is required because it calls kvm_kick_many_cpus() with
> wait == true because KVM_REQ_APIC_PAGE_RELOAD sets KVM_REQUEST_WAIT and
> thus the smp_call_function_many() is going to block until all cpus can run
> ack_flush().

I will make sure to talk to the maintainer of the respective code to
do the nonblock case correctly.

--
Michal Hocko
SUSE Labs

2018-06-22 14:31:05

by Michal Hocko

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Fri 22-06-18 09:42:57, Michal Hocko wrote:
> On Thu 21-06-18 13:50:53, David Rientjes wrote:
> > On Thu, 21 Jun 2018, Michal Hocko wrote:
> >
> > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > index 6bcecc325e7e..ac08f5d711be 100644
> > > > > --- a/arch/x86/kvm/x86.c
> > > > > +++ b/arch/x86/kvm/x86.c
> > > > > @@ -7203,8 +7203,9 @@ static void vcpu_load_eoi_exitmap(struct kvm_vcpu *vcpu)
> > > > > kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
> > > > > }
> > > > >
> > > > > -void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > > > - unsigned long start, unsigned long end)
> > > > > +int kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > > > + unsigned long start, unsigned long end,
> > > > > + bool blockable)
> > > > > {
> > > > > unsigned long apic_address;
> > > > >
> > > > > @@ -7215,6 +7216,8 @@ void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> > > > > apic_address = gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
> > > > > if (start <= apic_address && apic_address < end)
> > > > > kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
> > > > > +
> > > > > + return 0;
> > > > > }
> > > > >
> > > > > void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
> > > >
> > > > Auditing the first change in the patch, this is incorrect because
> > > > kvm_make_all_cpus_request() for KVM_REQ_APIC_PAGE_RELOAD can block in
> > > > kvm_kick_many_cpus() and that is after kvm_make_request() has been done.
> > >
> > > I would have to check the code closer. But doesn't
> > > kvm_make_all_cpus_request call get_cpu which is preempt_disable? I
> > > definitely plan to talk to respective maintainers about these changes of
> > > course.
> > >
> >
> > preempt_disable() is required because it calls kvm_kick_many_cpus() with
> > wait == true because KVM_REQ_APIC_PAGE_RELOAD sets KVM_REQUEST_WAIT and
> > thus the smp_call_function_many() is going to block until all cpus can run
> > ack_flush().
>
> I will make sure to talk to the maintainer of the respective code to
> do the nonblock case correctly.

I've just double checked this particular code and the wait path and this
one is not a sleep. It is a busy wait for IPI to get handled. So this
one should be OK AFAICS. Anyway I will send an RFC and involve
respective maintainers to make sure I am not making any incorrect
assumptions.
--
Michal Hocko
SUSE Labs

2018-06-22 18:50:08

by David Rientjes

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Fri, 22 Jun 2018, Michal Hocko wrote:

> > > preempt_disable() is required because it calls kvm_kick_many_cpus() with
> > > wait == true because KVM_REQ_APIC_PAGE_RELOAD sets KVM_REQUEST_WAIT and
> > > thus the smp_call_function_many() is going to block until all cpus can run
> > > ack_flush().
> >
> > I will make sure to talk to the maintainer of the respective code to
> > do the nonblock case correctly.
>
> I've just double checked this particular code and the wait path and this
> one is not a sleep. It is a busy wait for IPI to get handled. So this
> one should be OK AFAICS. Anyway I will send an RFC and involve
> respective maintainers to make sure I am not making any incorrect
> assumptions.

Do you believe that having the only potential source of memory freeing
busy wait for all other cpus on the system to run ack_flush() is
particularly dangerous, given that they may be allocating themselves?

2018-06-24 02:39:06

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On 2018/06/15 5:42, David Rientjes wrote:
> Note: I understand there is an objection based on timeout based delays.
> This is currently the only possible way to avoid oom killing important
> processes completely unnecessarily. If the oom reaper can someday free
> all memory, including mlocked memory and those mm's with blockable mmu
> notifiers, and is guaranteed to always be able to grab mm->mmap_sem,
> this can be removed. I do not believe any such guarantee is possible
> and consider the massive killing of additional processes unnecessarily
> to be a regression introduced by the oom reaper and its very quick
> setting of MMF_OOM_SKIP to allow additional processes to be oom killed.
>

Here is my version of your proposal, including my anti-lockup series.
My version uses the OOM badness score as feedback for deciding when to
give up.

---
drivers/tty/sysrq.c | 2 -
include/linux/memcontrol.h | 9 +-
include/linux/oom.h | 7 +-
include/linux/sched.h | 7 +-
include/linux/sched/coredump.h | 1 -
kernel/fork.c | 2 +
mm/memcontrol.c | 24 +--
mm/mmap.c | 17 +-
mm/oom_kill.c | 383 +++++++++++++++++------------------------
mm/page_alloc.c | 73 +++-----
10 files changed, 202 insertions(+), 323 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 6364890..c8b66b9 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -376,10 +376,8 @@ static void moom_callback(struct work_struct *ignored)
.order = -1,
};

- mutex_lock(&oom_lock);
if (!out_of_memory(&oc))
pr_info("OOM request ignored. No task eligible\n");
- mutex_unlock(&oom_lock);
}

static DECLARE_WORK(moom_work, moom_callback);
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 6c6fb11..a82360a 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -382,8 +382,8 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *,
struct mem_cgroup *,
struct mem_cgroup_reclaim_cookie *);
void mem_cgroup_iter_break(struct mem_cgroup *, struct mem_cgroup *);
-int mem_cgroup_scan_tasks(struct mem_cgroup *,
- int (*)(struct task_struct *, void *), void *);
+void mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
+ void (*fn)(struct task_struct *, void *), void *arg);

static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
{
@@ -850,10 +850,9 @@ static inline void mem_cgroup_iter_break(struct mem_cgroup *root,
{
}

-static inline int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
- int (*fn)(struct task_struct *, void *), void *arg)
+static inline void mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
+ void (*fn)(struct task_struct *, void *), void *arg)
{
- return 0;
}

static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
diff --git a/include/linux/oom.h b/include/linux/oom.h
index 6adac11..09cfa8e 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -44,8 +44,6 @@ struct oom_control {
unsigned long chosen_points;
};

-extern struct mutex oom_lock;
-
static inline void set_current_oom_origin(void)
{
current->signal->oom_flag_origin = true;
@@ -68,7 +66,7 @@ static inline bool tsk_is_oom_victim(struct task_struct * tsk)

/*
* Use this helper if tsk->mm != mm and the victim mm needs a special
- * handling. This is guaranteed to stay true after once set.
+ * handling.
*/
static inline bool mm_is_oom_victim(struct mm_struct *mm)
{
@@ -95,7 +93,8 @@ static inline int check_stable_address_space(struct mm_struct *mm)
return 0;
}

-void __oom_reap_task_mm(struct mm_struct *mm);
+extern void oom_reap_mm(struct mm_struct *mm);
+extern bool try_oom_notifier(void);

extern unsigned long oom_badness(struct task_struct *p,
struct mem_cgroup *memcg, const nodemask_t *nodemask,
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 87bf02d..e23fc7f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1162,9 +1162,10 @@ struct task_struct {
unsigned long task_state_change;
#endif
int pagefault_disabled;
-#ifdef CONFIG_MMU
- struct task_struct *oom_reaper_list;
-#endif
+ struct list_head oom_victim_list;
+ unsigned long last_oom_compared;
+ unsigned long last_oom_score;
+ unsigned char oom_reap_stall_count;
#ifdef CONFIG_VMAP_STACK
struct vm_struct *stack_vm_area;
#endif
diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
index ec912d0..d30615e 100644
--- a/include/linux/sched/coredump.h
+++ b/include/linux/sched/coredump.h
@@ -66,7 +66,6 @@ static inline int get_dumpable(struct mm_struct *mm)

#define MMF_HAS_UPROBES 19 /* has uprobes */
#define MMF_RECALC_UPROBES 20 /* MMF_HAS_UPROBES can be wrong */
-#define MMF_OOM_SKIP 21 /* mm is of no interest for the OOM killer */
#define MMF_UNSTABLE 22 /* mm is unstable for copy_from_user */
#define MMF_HUGE_ZERO_PAGE 23 /* mm has ever used the global huge zero page */
#define MMF_DISABLE_THP 24 /* disable THP for all VMAs */
diff --git a/kernel/fork.c b/kernel/fork.c
index 9440d61..5ad2b19 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -977,6 +977,8 @@ static inline void __mmput(struct mm_struct *mm)
}
if (mm->binfmt)
module_put(mm->binfmt->module);
+ if (unlikely(mm_is_oom_victim(mm)))
+ clear_bit(MMF_OOM_VICTIM, &mm->flags);
mmdrop(mm);
}

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e6f0d5e..35c33bf 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -884,17 +884,14 @@ static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
* @arg: argument passed to @fn
*
* This function iterates over tasks attached to @memcg or to any of its
- * descendants and calls @fn for each task. If @fn returns a non-zero
- * value, the function breaks the iteration loop and returns the value.
- * Otherwise, it will iterate over all tasks and return 0.
+ * descendants and calls @fn for each task.
*
* This function must not be called for the root memory cgroup.
*/
-int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
- int (*fn)(struct task_struct *, void *), void *arg)
+void mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
+ void (*fn)(struct task_struct *, void *), void *arg)
{
struct mem_cgroup *iter;
- int ret = 0;

BUG_ON(memcg == root_mem_cgroup);

@@ -903,15 +900,10 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
struct task_struct *task;

css_task_iter_start(&iter->css, 0, &it);
- while (!ret && (task = css_task_iter_next(&it)))
- ret = fn(task, arg);
+ while ((task = css_task_iter_next(&it)))
+ fn(task, arg);
css_task_iter_end(&it);
- if (ret) {
- mem_cgroup_iter_break(memcg, iter);
- break;
- }
}
- return ret;
}

/**
@@ -1206,12 +1198,8 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
.gfp_mask = gfp_mask,
.order = order,
};
- bool ret;

- mutex_lock(&oom_lock);
- ret = out_of_memory(&oc);
- mutex_unlock(&oom_lock);
- return ret;
+ return out_of_memory(&oc);
}

#if MAX_NUMNODES > 1
diff --git a/mm/mmap.c b/mm/mmap.c
index d1eb87e..2b422dd 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3059,25 +3059,18 @@ void exit_mmap(struct mm_struct *mm)
if (unlikely(mm_is_oom_victim(mm))) {
/*
* Manually reap the mm to free as much memory as possible.
- * Then, as the oom reaper does, set MMF_OOM_SKIP to disregard
- * this mm from further consideration. Taking mm->mmap_sem for
- * write after setting MMF_OOM_SKIP will guarantee that the oom
- * reaper will not run on this mm again after mmap_sem is
- * dropped.
+ * Then, tell oom_has_pending_victims() no longer try to call
+ * oom_reap_mm() by taking mm->mmap_sem for write.
*
* Nothing can be holding mm->mmap_sem here and the above call
* to mmu_notifier_release(mm) ensures mmu notifier callbacks in
- * __oom_reap_task_mm() will not block.
+ * oom_reap_mm() will not block.
*
* This needs to be done before calling munlock_vma_pages_all(),
- * which clears VM_LOCKED, otherwise the oom reaper cannot
+ * which clears VM_LOCKED, otherwise oom_reap_mm() cannot
* reliably test it.
*/
- mutex_lock(&oom_lock);
- __oom_reap_task_mm(mm);
- mutex_unlock(&oom_lock);
-
- set_bit(MMF_OOM_SKIP, &mm->flags);
+ oom_reap_mm(mm);
down_write(&mm->mmap_sem);
up_write(&mm->mmap_sem);
}
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 84081e7..36bc02f 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -38,7 +38,6 @@
#include <linux/freezer.h>
#include <linux/ftrace.h>
#include <linux/ratelimit.h>
-#include <linux/kthread.h>
#include <linux/init.h>
#include <linux/mmu_notifier.h>

@@ -49,11 +48,17 @@
#define CREATE_TRACE_POINTS
#include <trace/events/oom.h>

+static inline unsigned long oom_victim_mm_score(struct mm_struct *mm)
+{
+ return get_mm_rss(mm) + get_mm_counter(mm, MM_SWAPENTS) +
+ mm_pgtables_bytes(mm) / PAGE_SIZE;
+}
+
int sysctl_panic_on_oom;
int sysctl_oom_kill_allocating_task;
int sysctl_oom_dump_tasks = 1;

-DEFINE_MUTEX(oom_lock);
+static DEFINE_MUTEX(oom_lock);

#ifdef CONFIG_NUMA
/**
@@ -201,19 +206,19 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
if (oom_unkillable_task(p, memcg, nodemask))
return 0;

+ if (tsk_is_oom_victim(p))
+ return 0;
+
p = find_lock_task_mm(p);
if (!p)
return 0;

/*
* Do not even consider tasks which are explicitly marked oom
- * unkillable or have been already oom reaped or the are in
- * the middle of vfork
+ * unkillable or they are in the middle of vfork
*/
adj = (long)p->signal->oom_score_adj;
- if (adj == OOM_SCORE_ADJ_MIN ||
- test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
- in_vfork(p)) {
+ if (adj == OOM_SCORE_ADJ_MIN || in_vfork(p)) {
task_unlock(p);
return 0;
}
@@ -222,8 +227,7 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
* The baseline for the badness score is the proportion of RAM that each
* task's rss, pagetable and swap space use.
*/
- points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
- mm_pgtables_bytes(p->mm) / PAGE_SIZE;
+ points = oom_victim_mm_score(p->mm);
task_unlock(p);

/* Normalize to oom_score_adj units */
@@ -304,25 +308,13 @@ static enum oom_constraint constrained_alloc(struct oom_control *oc)
return CONSTRAINT_NONE;
}

-static int oom_evaluate_task(struct task_struct *task, void *arg)
+static void oom_evaluate_task(struct task_struct *task, void *arg)
{
struct oom_control *oc = arg;
unsigned long points;

if (oom_unkillable_task(task, NULL, oc->nodemask))
- goto next;
-
- /*
- * This task already has access to memory reserves and is being killed.
- * Don't allow any other task to have access to the reserves unless
- * the task has MMF_OOM_SKIP because chances that it would release
- * any memory is quite low.
- */
- if (!is_sysrq_oom(oc) && tsk_is_oom_victim(task)) {
- if (test_bit(MMF_OOM_SKIP, &task->signal->oom_mm->flags))
- goto next;
- goto abort;
- }
+ return;

/*
* If task is allocating a lot of memory and has been marked to be
@@ -335,29 +327,22 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)

points = oom_badness(task, NULL, oc->nodemask, oc->totalpages);
if (!points || points < oc->chosen_points)
- goto next;
+ return;

/* Prefer thread group leaders for display purposes */
if (points == oc->chosen_points && thread_group_leader(oc->chosen))
- goto next;
+ return;
select:
if (oc->chosen)
put_task_struct(oc->chosen);
get_task_struct(task);
oc->chosen = task;
oc->chosen_points = points;
-next:
- return 0;
-abort:
- if (oc->chosen)
- put_task_struct(oc->chosen);
- oc->chosen = (void *)-1UL;
- return 1;
}

/*
* Simple selection loop. We choose the process with the highest number of
- * 'points'. In case scan was aborted, oc->chosen is set to -1.
+ * 'points'.
*/
static void select_bad_process(struct oom_control *oc)
{
@@ -368,8 +353,7 @@ static void select_bad_process(struct oom_control *oc)

rcu_read_lock();
for_each_process(p)
- if (oom_evaluate_task(p, oc))
- break;
+ oom_evaluate_task(p, oc);
rcu_read_unlock();
}

@@ -451,6 +435,29 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)

#define K(x) ((x) << (PAGE_SHIFT-10))

+static bool victim_mm_stalling(struct task_struct *p, struct mm_struct *mm)
+{
+ unsigned long score;
+
+ if (time_before(jiffies, p->last_oom_compared + HZ / 10))
+ return false;
+ score = oom_victim_mm_score(mm);
+ if (score < p->last_oom_score)
+ p->oom_reap_stall_count = 0;
+ else
+ p->oom_reap_stall_count++;
+ p->last_oom_score = oom_victim_mm_score(mm);
+ p->last_oom_compared = jiffies;
+ if (p->oom_reap_stall_count < 30)
+ return false;
+ pr_info("Gave up waiting for process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
+ task_pid_nr(p), p->comm, K(mm->total_vm),
+ K(get_mm_counter(mm, MM_ANONPAGES)),
+ K(get_mm_counter(mm, MM_FILEPAGES)),
+ K(get_mm_counter(mm, MM_SHMEMPAGES)));
+ return true;
+}
+
/*
* task->mm can be NULL if the task is the exited group leader. So to
* determine whether the task is using a particular mm, we examine all the
@@ -469,17 +476,10 @@ bool process_shares_mm(struct task_struct *p, struct mm_struct *mm)
return false;
}

-#ifdef CONFIG_MMU
-/*
- * OOM Reaper kernel thread which tries to reap the memory used by the OOM
- * victim (if that is possible) to help the OOM killer to move on.
- */
-static struct task_struct *oom_reaper_th;
-static DECLARE_WAIT_QUEUE_HEAD(oom_reaper_wait);
-static struct task_struct *oom_reaper_list;
-static DEFINE_SPINLOCK(oom_reaper_lock);
+static LIST_HEAD(oom_victim_list);

-void __oom_reap_task_mm(struct mm_struct *mm)
+#ifdef CONFIG_MMU
+void oom_reap_mm(struct mm_struct *mm)
{
struct vm_area_struct *vma;

@@ -518,152 +518,20 @@ void __oom_reap_task_mm(struct mm_struct *mm)
}
}
}
-
-static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
-{
- bool ret = true;
-
- /*
- * We have to make sure to not race with the victim exit path
- * and cause premature new oom victim selection:
- * oom_reap_task_mm exit_mm
- * mmget_not_zero
- * mmput
- * atomic_dec_and_test
- * exit_oom_victim
- * [...]
- * out_of_memory
- * select_bad_process
- * # no TIF_MEMDIE task selects new victim
- * unmap_page_range # frees some memory
- */
- mutex_lock(&oom_lock);
-
- if (!down_read_trylock(&mm->mmap_sem)) {
- ret = false;
- trace_skip_task_reaping(tsk->pid);
- goto unlock_oom;
- }
-
- /*
- * If the mm has invalidate_{start,end}() notifiers that could block,
- * sleep to give the oom victim some more time.
- * TODO: we really want to get rid of this ugly hack and make sure that
- * notifiers cannot block for unbounded amount of time
- */
- if (mm_has_blockable_invalidate_notifiers(mm)) {
- up_read(&mm->mmap_sem);
- schedule_timeout_idle(HZ);
- goto unlock_oom;
- }
-
- /*
- * MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
- * work on the mm anymore. The check for MMF_OOM_SKIP must run
- * under mmap_sem for reading because it serializes against the
- * down_write();up_write() cycle in exit_mmap().
- */
- if (test_bit(MMF_OOM_SKIP, &mm->flags)) {
- up_read(&mm->mmap_sem);
- trace_skip_task_reaping(tsk->pid);
- goto unlock_oom;
- }
-
- trace_start_task_reaping(tsk->pid);
-
- __oom_reap_task_mm(mm);
-
- pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
- task_pid_nr(tsk), tsk->comm,
- K(get_mm_counter(mm, MM_ANONPAGES)),
- K(get_mm_counter(mm, MM_FILEPAGES)),
- K(get_mm_counter(mm, MM_SHMEMPAGES)));
- up_read(&mm->mmap_sem);
-
- trace_finish_task_reaping(tsk->pid);
-unlock_oom:
- mutex_unlock(&oom_lock);
- return ret;
-}
-
-#define MAX_OOM_REAP_RETRIES 10
-static void oom_reap_task(struct task_struct *tsk)
-{
- int attempts = 0;
- struct mm_struct *mm = tsk->signal->oom_mm;
-
- /* Retry the down_read_trylock(mmap_sem) a few times */
- while (attempts++ < MAX_OOM_REAP_RETRIES && !oom_reap_task_mm(tsk, mm))
- schedule_timeout_idle(HZ/10);
-
- if (attempts <= MAX_OOM_REAP_RETRIES ||
- test_bit(MMF_OOM_SKIP, &mm->flags))
- goto done;
-
- pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
- task_pid_nr(tsk), tsk->comm);
- debug_show_all_locks();
-
-done:
- tsk->oom_reaper_list = NULL;
-
- /*
- * Hide this mm from OOM killer because it has been either reaped or
- * somebody can't call up_write(mmap_sem).
- */
- set_bit(MMF_OOM_SKIP, &mm->flags);
-
- /* Drop a reference taken by wake_oom_reaper */
- put_task_struct(tsk);
-}
-
-static int oom_reaper(void *unused)
-{
- while (true) {
- struct task_struct *tsk = NULL;
-
- wait_event_freezable(oom_reaper_wait, oom_reaper_list != NULL);
- spin_lock(&oom_reaper_lock);
- if (oom_reaper_list != NULL) {
- tsk = oom_reaper_list;
- oom_reaper_list = tsk->oom_reaper_list;
- }
- spin_unlock(&oom_reaper_lock);
-
- if (tsk)
- oom_reap_task(tsk);
- }
-
- return 0;
-}
+#endif

static void wake_oom_reaper(struct task_struct *tsk)
{
- /* tsk is already queued? */
- if (tsk == oom_reaper_list || tsk->oom_reaper_list)
+ if (tsk->oom_victim_list.next)
return;

get_task_struct(tsk);
-
- spin_lock(&oom_reaper_lock);
- tsk->oom_reaper_list = oom_reaper_list;
- oom_reaper_list = tsk;
- spin_unlock(&oom_reaper_lock);
- trace_wake_reaper(tsk->pid);
- wake_up(&oom_reaper_wait);
-}
-
-static int __init oom_init(void)
-{
- oom_reaper_th = kthread_run(oom_reaper, NULL, "oom_reaper");
- return 0;
-}
-subsys_initcall(oom_init)
-#else
-static inline void wake_oom_reaper(struct task_struct *tsk)
-{
+ tsk->oom_reap_stall_count = 0;
+ tsk->last_oom_compared = jiffies;
+ tsk->last_oom_score = oom_victim_mm_score(tsk->signal->oom_mm);
+ lockdep_assert_held(&oom_lock);
+ list_add_tail(&tsk->oom_victim_list, &oom_victim_list);
}
-#endif /* CONFIG_MMU */

/**
* mark_oom_victim - mark the given task as OOM victim
@@ -806,10 +674,11 @@ static bool task_will_free_mem(struct task_struct *task)
return false;

/*
- * This task has already been drained by the oom reaper so there are
- * only small chances it will free some more
+	 * If the memory reserves granted to this task were not sufficient, allow
+ * killing more processes after oom_has_pending_victims() completed
+ * reaping this mm.
*/
- if (test_bit(MMF_OOM_SKIP, &mm->flags))
+ if (tsk_is_oom_victim(task))
return false;

if (atomic_read(&mm->mm_users) <= 1)
@@ -946,7 +815,6 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
continue;
if (is_global_init(p)) {
can_oom_reap = false;
- set_bit(MMF_OOM_SKIP, &mm->flags);
pr_info("oom killer %d (%s) has mm pinned by %d (%s)\n",
task_pid_nr(victim), victim->comm,
task_pid_nr(p), p->comm);
@@ -1009,6 +877,72 @@ int unregister_oom_notifier(struct notifier_block *nb)
}
EXPORT_SYMBOL_GPL(unregister_oom_notifier);

+bool try_oom_notifier(void)
+{
+ static DEFINE_MUTEX(lock);
+ unsigned long freed = 0;
+
+ /*
+ * In order to protect OOM notifiers which are not thread safe and to
+ * avoid excessively releasing memory from OOM notifiers which release
+ * memory every time, this lock serializes/excludes concurrent calls to
+ * OOM notifiers.
+ */
+ if (!mutex_trylock(&lock))
+ return true;
+ blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
+ mutex_unlock(&lock);
+ return freed > 0;
+}
+
+/*
+ * Currently a reference to "struct task_struct" taken by wake_oom_reaper()
+ * will remain on the oom_victim_list until somebody finds that this mm has
+ * already completed __mmput() or has failed to complete it for too long.
+ */
+static bool oom_has_pending_victims(struct oom_control *oc)
+{
+ struct task_struct *p, *tmp;
+ bool ret = false;
+ bool gaveup = false;
+
+ lockdep_assert_held(&oom_lock);
+ list_for_each_entry_safe(p, tmp, &oom_victim_list, oom_victim_list) {
+ struct mm_struct *mm = p->signal->oom_mm;
+
+ /* Forget about mm which already completed __mmput(). */
+ if (!test_bit(MMF_OOM_VICTIM, &mm->flags))
+ goto remove;
+ /* Skip OOM victims which current thread cannot select. */
+ if (oom_unkillable_task(p, oc->memcg, oc->nodemask))
+ continue;
+ ret = true;
+#ifdef CONFIG_MMU
+ /*
+ * We need to hold mmap_sem for read, in order to safely test
+ * MMF_UNSTABLE flag and blockable invalidate notifiers.
+ */
+ if (down_read_trylock(&mm->mmap_sem)) {
+ if (!test_bit(MMF_UNSTABLE, &mm->flags) &&
+ !mm_has_blockable_invalidate_notifiers(mm))
+ oom_reap_mm(mm);
+ up_read(&mm->mmap_sem);
+ }
+#endif
+ /* Forget if this mm didn't complete __mmput() for too long. */
+ if (!victim_mm_stalling(p, mm))
+ continue;
+ gaveup = true;
+remove:
+ list_del(&p->oom_victim_list);
+ put_task_struct(p);
+ }
+ if (gaveup)
+ debug_show_all_locks();
+
+ return ret && !is_sysrq_oom(oc);
+}
+
/**
* out_of_memory - kill the "best" process when we run out of memory
* @oc: pointer to struct oom_control
@@ -1020,18 +954,8 @@ int unregister_oom_notifier(struct notifier_block *nb)
*/
bool out_of_memory(struct oom_control *oc)
{
- unsigned long freed = 0;
enum oom_constraint constraint = CONSTRAINT_NONE;
-
- if (oom_killer_disabled)
- return false;
-
- if (!is_memcg_oom(oc)) {
- blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
- if (freed > 0)
- /* Got some memory back in the last second. */
- return true;
- }
+ const char *prompt;

/*
* If current has a pending SIGKILL or is exiting, then automatically
@@ -1045,15 +969,6 @@ bool out_of_memory(struct oom_control *oc)
}

/*
- * The OOM killer does not compensate for IO-less reclaim.
- * pagefault_out_of_memory lost its gfp context so we have to
- * make sure exclude 0 mask - all other users should have at least
- * ___GFP_DIRECT_RECLAIM to get here.
- */
- if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS))
- return true;
-
- /*
* Check if there were limitations on the allocation (only relevant for
* NUMA and memcg) that may require different handling.
*/
@@ -1067,32 +982,46 @@ bool out_of_memory(struct oom_control *oc)
current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) {
get_task_struct(current);
oc->chosen = current;
- oom_kill_process(oc, "Out of memory (oom_kill_allocating_task)");
- return true;
+ prompt = "Out of memory (oom_kill_allocating_task)";
+ } else {
+ select_bad_process(oc);
+ prompt = !is_memcg_oom(oc) ? "Out of memory" :
+ "Memory cgroup out of memory";
}
-
- select_bad_process(oc);
/* Found nothing?!?! Either we hang forever, or we panic. */
- if (!oc->chosen && !is_sysrq_oom(oc) && !is_memcg_oom(oc)) {
+ if (!oc->chosen) {
+ if (is_sysrq_oom(oc) || is_memcg_oom(oc))
+ return false;
dump_header(oc, NULL);
panic("Out of memory and no killable processes...\n");
}
- if (oc->chosen && oc->chosen != (void *)-1UL) {
- oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
- "Memory cgroup out of memory");
- /*
- * Give the killed process a good chance to exit before trying
- * to allocate memory again.
- */
- schedule_timeout_killable(1);
- }
- return !!oc->chosen;
+ mutex_lock(&oom_lock);
+ /*
+ * If there are OOM victims which current thread can select,
+ * wait for them to reach __mmput().
+ *
+ * If oom_killer_disable() is in progress, we can't select new OOM
+ * victims.
+ *
+ * The OOM killer does not compensate for IO-less reclaim.
+ * pagefault_out_of_memory lost its gfp context so we have to
+ * make sure exclude 0 mask - all other users should have at least
+ * ___GFP_DIRECT_RECLAIM to get here.
+ *
+ * Otherwise, invoke the OOM-killer.
+ */
+ if (oom_has_pending_victims(oc) || oom_killer_disabled ||
+ (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS)))
+ put_task_struct(oc->chosen);
+ else
+ oom_kill_process(oc, prompt);
+ mutex_unlock(&oom_lock);
+ return !oom_killer_disabled;
}

/*
* The pagefault handler calls here because it is out of memory, so kill a
- * memory-hogging task. If oom_lock is held by somebody else, a parallel oom
- * killing is already in progress so do nothing.
+ * memory-hogging task.
*/
void pagefault_out_of_memory(void)
{
@@ -1107,8 +1036,6 @@ void pagefault_out_of_memory(void)
if (mem_cgroup_oom_synchronize(true))
return;

- if (!mutex_trylock(&oom_lock))
- return;
out_of_memory(&oc);
- mutex_unlock(&oom_lock);
+ schedule_timeout_killable(1);
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1521100..cd7f9db 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3460,29 +3460,16 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
};
struct page *page;

- *did_some_progress = 0;
-
- /*
- * Acquire the oom lock. If that fails, somebody else is
- * making progress for us.
- */
- if (!mutex_trylock(&oom_lock)) {
- *did_some_progress = 1;
- schedule_timeout_uninterruptible(1);
- return NULL;
- }
+ *did_some_progress = try_oom_notifier();

/*
* Go through the zonelist yet one more time, keep very high watermark
* here, this is only to catch a parallel oom killing, we must fail if
- * we're still under heavy pressure. But make sure that this reclaim
- * attempt shall not depend on __GFP_DIRECT_RECLAIM && !__GFP_NORETRY
- * allocation which will never fail due to oom_lock already held.
+ * we're still under heavy pressure.
*/
- page = get_page_from_freelist((gfp_mask | __GFP_HARDWALL) &
- ~__GFP_DIRECT_RECLAIM, order,
+ page = get_page_from_freelist((gfp_mask | __GFP_HARDWALL), order,
ALLOC_WMARK_HIGH|ALLOC_CPUSET, ac);
- if (page)
+ if (page || *did_some_progress)
goto out;

/* Coredumps can quickly deplete all memory reserves */
@@ -3531,7 +3518,6 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
ALLOC_NO_WATERMARKS, ac);
}
out:
- mutex_unlock(&oom_lock);
return page;
}

@@ -3863,21 +3849,6 @@ static void wake_all_kswapds(unsigned int order, gfp_t gfp_mask,
return alloc_flags;
}

-static bool oom_reserves_allowed(struct task_struct *tsk)
-{
- if (!tsk_is_oom_victim(tsk))
- return false;
-
- /*
- * !MMU doesn't have oom reaper so give access to memory reserves
- * only to the thread with TIF_MEMDIE set
- */
- if (!IS_ENABLED(CONFIG_MMU) && !test_thread_flag(TIF_MEMDIE))
- return false;
-
- return true;
-}
-
/*
* Distinguish requests which really need access to full memory
* reserves from oom victims which can live with a portion of it
@@ -3893,7 +3864,7 @@ static inline int __gfp_pfmemalloc_flags(gfp_t gfp_mask)
if (!in_interrupt()) {
if (current->flags & PF_MEMALLOC)
return ALLOC_NO_WATERMARKS;
- else if (oom_reserves_allowed(current))
+ else if (tsk_is_oom_victim(current))
return ALLOC_OOM;
}

@@ -3922,6 +3893,7 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
{
struct zone *zone;
struct zoneref *z;
+ bool ret = false;

/*
* Costly allocations might have made a progress but this doesn't mean
@@ -3985,25 +3957,26 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
}
}

- /*
- * Memory allocation/reclaim might be called from a WQ
- * context and the current implementation of the WQ
- * concurrency control doesn't recognize that
- * a particular WQ is congested if the worker thread is
- * looping without ever sleeping. Therefore we have to
- * do a short sleep here rather than calling
- * cond_resched().
- */
- if (current->flags & PF_WQ_WORKER)
- schedule_timeout_uninterruptible(1);
- else
- cond_resched();
-
- return true;
+ ret = true;
+ goto out;
}
}

- return false;
+out:
+ /*
+ * Memory allocation/reclaim might be called from a WQ
+ * context and the current implementation of the WQ
+ * concurrency control doesn't recognize that
+ * a particular WQ is congested if the worker thread is
+ * looping without ever sleeping. Therefore we have to
+ * do a short sleep here rather than calling
+ * cond_resched().
+ */
+ if (current->flags & PF_WQ_WORKER)
+ schedule_timeout_uninterruptible(1);
+ else
+ cond_resched();
+ return ret;
}

static inline bool
--
1.8.3.1



2018-06-25 09:06:19

by Michal Hocko

[permalink] [raw]
Subject: Re: [patch] mm, oom: fix unnecessary killing of additional processes

On Fri 22-06-18 11:49:14, David Rientjes wrote:
> On Fri, 22 Jun 2018, Michal Hocko wrote:
>
> > > > preempt_disable() is required because it calls kvm_kick_many_cpus() with
> > > > wait == true: KVM_REQ_APIC_PAGE_RELOAD sets KVM_REQUEST_WAIT, and thus
> > > > smp_call_function_many() is going to block until all cpus can run
> > > > ack_flush().
> > >
> > > I will make sure to talk to the maintainer of the respective code to
> > > do the nonblock case correctly.
> >
> > I've just double checked this particular code and the wait path, and this
> > one is not a sleep. It is a busy wait for the IPI to get handled. So this
> > one should be OK AFAICS. Anyway, I will send an RFC and involve the
> > respective maintainers to make sure I am not making any incorrect
> > assumptions.
>
> Do you believe that having the only potential source of memory freeing
> busy-waiting for all other cpus on the system to run ack_flush() is
> particularly dangerous, given the fact that they may be allocating
> themselves?

These are IPIs. How could they depend on a memory allocation? In other
words, we do rely on the very same mechanism for TLB flushing, so this is
not any different.
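
To illustrate my point, the path in question boils down to something like the
following (a minimal sketch rather than the actual kvm code; kick_cpus_sync()
and this ack_flush() stub are only stand-ins for illustration):

#include <linux/preempt.h>
#include <linux/smp.h>

/* IPI handler: runs in interrupt context and does not allocate memory. */
static void ack_flush(void *info)
{
}

/*
 * With wait == true, smp_call_function_many() busy-waits (it spins, it does
 * not sleep) until every cpu in @cpus has executed ack_flush().  Nothing in
 * this path depends on a memory allocation, which is the same guarantee the
 * remote TLB flush path relies on.
 */
static void kick_cpus_sync(const struct cpumask *cpus)
{
	/* The caller must keep preemption disabled across the kick. */
	preempt_disable();
	smp_call_function_many(cpus, ack_flush, NULL, true);
	preempt_enable();
}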

Maybe I am missing something here though.

--
Michal Hocko
SUSE Labs