Before commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to
the charge path"), all memcg oom killers were delayed to the page
fault path, and an explicit wakeup was used in this case:
thread A:
        ...
        if (locked) { // complete oom-kill, hold the lock
                mem_cgroup_oom_unlock(memcg);
                ...
        }
        ...

thread B:
        ...
        if (locked && !memcg->oom_kill_disable) {
                ...
        } else {
                schedule(); // can't acquire the lock
                ...
        }
        ...
The reason is that thread A kicks off the OOM-killer, which leads to
wakeups from the uncharges of the exiting task. But thread B is not
guaranteed to see them if it enters the OOM path after the OOM kills
but before thread A releases the lock.
Now only the oom_kill_disable case is handled from the #PF path. In
that case it is userspace, not the #PF path itself, that triggers the
wakeup. All potential paths that free charges are responsible for
calling memcg_oom_recover(), so the explicit wakeup is not needed in
the mem_cgroup_oom_synchronize() path, which doesn't release any
memory itself.
Signed-off-by: Haifeng Xu <[email protected]>
Suggested-by: Michal Hocko <[email protected]>
---
v2: split original into two and improve patch description
---
mm/memcontrol.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index fbf4d2bb1003..710ce3e7824f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2003,15 +2003,8 @@ bool mem_cgroup_oom_synchronize(bool handle)
mem_cgroup_unmark_under_oom(memcg);
finish_wait(&memcg_oom_waitq, &owait.wait);
- if (locked) {
+ if (locked)
mem_cgroup_oom_unlock(memcg);
- /*
- * There is no guarantee that an OOM-lock contender
- * sees the wakeups triggered by the OOM kill
- * uncharges. Wake any sleepers explicitly.
- */
- memcg_oom_recover(memcg);
- }
cleanup:
current->memcg_in_oom = NULL;
css_put(&memcg->css);
--
2.25.1
On Wed 19-04-23 03:07:39, Haifeng Xu wrote:
> Before commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to
> the charge path"), all memcg oom killers were delayed to the page
> fault path, and an explicit wakeup was used in this case:
>
> thread A:
>         ...
>         if (locked) { // complete oom-kill, hold the lock
>                 mem_cgroup_oom_unlock(memcg);
>                 ...
>         }
>         ...
>
> thread B:
>         ...
>         if (locked && !memcg->oom_kill_disable) {
>                 ...
>         } else {
>                 schedule(); // can't acquire the lock
>                 ...
>         }
>         ...
>
> The reason is that thread A kicks off the OOM-killer, which leads to
> wakeups from the uncharges of the exiting task. But thread B is not
> guaranteed to see them if it enters the OOM path after the OOM kills
> but before thread A releases the lock.
>
> Now only the oom_kill_disable case is handled from the #PF path. In
> that case it is userspace, not the #PF path itself, that triggers the
> wakeup. All potential paths that free charges are responsible for
> calling memcg_oom_recover(), so the explicit wakeup is not needed in
> the mem_cgroup_oom_synchronize() path, which doesn't release any
> memory itself.
>
> Signed-off-by: Haifeng Xu <[email protected]>
> Suggested-by: Michal Hocko <[email protected]>
I hope I haven't missed anything but this looks good to me. One way to
test this would be a parallel OOM triggering workload which uses page
faults and an automatic user space driven oom killer (detect under_oom
and send SIGKILL to a random task after a random timeout). Objective is
that no task gets stuck in mem_cgroup_oom_synchronize. I am pretty sure
this could be easily turned into a selftest.
Acked-by: Michal Hocko <[email protected]>
--
Michal Hocko
SUSE Labs