2019-11-02 15:18:57

by Konstantin Khlebnikov

Subject: [PATCH] mm/memcontrol: update documentation about invoking oom killer

Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
charge path") memcg invokes oom killer not only for user page-faults.
This means 0-order allocation will either succeed or task get killed.

Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
Signed-off-by: Konstantin Khlebnikov <[email protected]>
---
Documentation/admin-guide/cgroup-v2.rst | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 5361ebec3361..eb47815e137b 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.

Failed allocation in its turn could be returned into
userspace as -ENOMEM or silently ignored in cases like
- disk readahead. For now OOM in memory cgroup kills
- tasks iff shortage has happened inside page fault.
+ disk readahead.
+
+ Before 4.19 OOM in memory cgroup killed tasks iff
+ shortage has happened inside page fault, random
+ syscall may fail with ENOMEM or EFAULT. Since 4.19
+ failed memory cgroup allocation invokes oom killer and
+ keeps retrying until it succeeds.

This event is not raised if the OOM killer is not
considered as an option, e.g. for failed high-order
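
The paragraph being patched documents the "oom" entry of the cgroup v2 memory.events file, which sits next to the "oom_kill" entry. As a rough illustration of how those counters can be inspected from userspace, here is a minimal sketch. It assumes the flat "key value" line format used by memory.events; the helper names, the sample values and the cgroup path in the usage note are hypothetical and not part of the patch.

```python
from pathlib import Path


def parse_memory_events(text):
    """Parse flat "key value" lines, as found in a cgroup v2 memory.events file."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <integer count>".
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events


def read_memory_events(cgroup_dir):
    """Read and parse memory.events from a cgroup directory (path is caller-supplied)."""
    return parse_memory_events((Path(cgroup_dir) / "memory.events").read_text())


if __name__ == "__main__":
    # Invented sample content, for illustration only.
    sample = "low 0\nhigh 12\nmax 340\noom 3\noom_kill 1\n"
    counters = parse_memory_events(sample)
    print("oom events:", counters["oom"])
    print("oom kills: ", counters["oom_kill"])
```

On a real system one might call read_memory_events("/sys/fs/cgroup/mygroup"), where "mygroup" is a placeholder cgroup name. Comparing "oom" against "oom_kill" over time is one way to observe whether hitting the limit ended in a kill or in a failed allocation.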


2019-11-02 16:06:16

by damian

Subject: Re: [PATCH] mm/memcontrol: update documentation about invoking oom killer

On Sat, 02. Nov 18:16, Konstantin Khlebnikov wrote:
> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
> charge path") memcg invokes oom killer not only for user page-faults.
> This means 0-order allocation will either succeed or task get killed.
>
> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
> Signed-off-by: Konstantin Khlebnikov <[email protected]>
> ---
> Documentation/admin-guide/cgroup-v2.rst | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 5361ebec3361..eb47815e137b 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
>
> Failed allocation in its turn could be returned into
> userspace as -ENOMEM or silently ignored in cases like
> - disk readahead. For now OOM in memory cgroup kills
> - tasks iff shortage has happened inside page fault.
> + disk readahead.
> +
> + Before 4.19 OOM in memory cgroup killed tasks iff

Hello Konstantin,

iff --> if :-)

Best regards
Damian


> + shortage has happened inside page fault, random
> + syscall may fail with ENOMEM or EFAULT. Since 4.19
> + failed memory cgroup allocation invokes oom killer and
> + keeps retrying until it succeeds.
>
> This event is not raised if the OOM killer is not
> considered as an option, e.g. for failed high-order
>

2019-11-02 16:19:00

by Konstantin Khlebnikov

Subject: Re: [PATCH] mm/memcontrol: update documentation about invoking oom killer



On 02/11/2019 19.02, Damian Tometzki wrote:
> On Sat, 02. Nov 18:16, Konstantin Khlebnikov wrote:
>> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
>> charge path") memcg invokes oom killer not only for user page-faults.
>> This means 0-order allocation will either succeed or task get killed.
>>
>> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
>> Signed-off-by: Konstantin Khlebnikov <[email protected]>
>> ---
>> Documentation/admin-guide/cgroup-v2.rst | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
>> index 5361ebec3361..eb47815e137b 100644
>> --- a/Documentation/admin-guide/cgroup-v2.rst
>> +++ b/Documentation/admin-guide/cgroup-v2.rst
>> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
>>
>> Failed allocation in its turn could be returned into
>> userspace as -ENOMEM or silently ignored in cases like
>> - disk readahead. For now OOM in memory cgroup kills
>> - tasks iff shortage has happened inside page fault.
>> + disk readahead.
>> +
>> + Before 4.19 OOM in memory cgroup killed tasks iff
> Hello Konstantin,
>
> iff --> if :-)
>

This "iff" is short for "if and only if".
https://en.wikipedia.org/wiki/If_and_only_if

> Best regards
> Damian
>
>
>> + shortage has happened inside page fault, random
>> + syscall may fail with ENOMEM or EFAULT. Since 4.19
>> + failed memory cgroup allocation invokes oom killer and
>> + keeps retrying until it succeeds.
>>
>> This event is not raised if the OOM killer is not
>> considered as an option, e.g. for failed high-order
>>

2019-11-02 16:31:23

by damian

Subject: Re: [PATCH] mm/memcontrol: update documentation about invoking oom killer

On Sat, 02. Nov 19:14, Konstantin Khlebnikov wrote:
>
>
> On 02/11/2019 19.02, Damian Tometzki wrote:
> > On Sat, 02. Nov 18:16, Konstantin Khlebnikov wrote:
> >> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
> >> charge path") memcg invokes oom killer not only for user page-faults.
> >> This means 0-order allocation will either succeed or task get killed.
> >>
> >> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
> >> Signed-off-by: Konstantin Khlebnikov <[email protected]>
> >> ---
> >> Documentation/admin-guide/cgroup-v2.rst | 9 +++++++--
> >> 1 file changed, 7 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> >> index 5361ebec3361..eb47815e137b 100644
> >> --- a/Documentation/admin-guide/cgroup-v2.rst
> >> +++ b/Documentation/admin-guide/cgroup-v2.rst
> >> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
> >>
> >> Failed allocation in its turn could be returned into
> >> userspace as -ENOMEM or silently ignored in cases like
> >> - disk readahead. For now OOM in memory cgroup kills
> >> - tasks iff shortage has happened inside page fault.
> >> + disk readahead.
> >> +
> >> + Before 4.19 OOM in memory cgroup killed tasks iff
> > Hello Konstantin,
> >
> > iff --> if :-)
> >
>
> This "iff" is short for "if and only if".
> https://en.wikipedia.org/wiki/If_and_only_if

good to know :-)

>
> > Best regards
> > Damian
> >
> >
> >> + shortage has happened inside page fault, random
> >> + syscall may fail with ENOMEM or EFAULT. Since 4.19
> >> + failed memory cgroup allocation invokes oom killer and
> >> + keeps retrying until it succeeds.
> >>
> >> This event is not raised if the OOM killer is not
> >> considered as an option, e.g. for failed high-order
> >>

2019-11-02 23:57:39

by David Rientjes

Subject: Re: [PATCH] mm/memcontrol: update documentation about invoking oom killer

On Sat, 2 Nov 2019, Konstantin Khlebnikov wrote:

> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
> charge path") memcg invokes oom killer not only for user page-faults.
> This means 0-order allocation will either succeed or task get killed.
>
> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
> Signed-off-by: Konstantin Khlebnikov <[email protected]>
> ---
> Documentation/admin-guide/cgroup-v2.rst | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 5361ebec3361..eb47815e137b 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
>
> Failed allocation in its turn could be returned into
> userspace as -ENOMEM or silently ignored in cases like
> - disk readahead. For now OOM in memory cgroup kills
> - tasks iff shortage has happened inside page fault.
> + disk readahead.
> +
> + Before 4.19 OOM in memory cgroup killed tasks iff
> + shortage has happened inside page fault, random
> + syscall may fail with ENOMEM or EFAULT. Since 4.19
> + failed memory cgroup allocation invokes oom killer and
> + keeps retrying until it succeeds.
>
> This event is not raised if the OOM killer is not
> considered as an option, e.g. for failed high-order

The previous text is obviously incorrect for today's kernels, but I'm
curious if we should be conflating the documentation here by describing
the pre-4.19 behavior. OOM killing no longer happens only on page fault
so maybe better to document the exact behavior today and not attempt to
describe differences with previous versions?

2019-11-03 10:50:25

by Konstantin Khlebnikov

Subject: Re: [PATCH] mm/memcontrol: update documentation about invoking oom killer

On 03/11/2019 02.55, David Rientjes wrote:
> On Sat, 2 Nov 2019, Konstantin Khlebnikov wrote:
>
>> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
>> charge path") memcg invokes oom killer not only for user page-faults.
>> This means 0-order allocation will either succeed or task get killed.
>>
>> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
>> Signed-off-by: Konstantin Khlebnikov <[email protected]>
>> ---
>> Documentation/admin-guide/cgroup-v2.rst | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
>> index 5361ebec3361..eb47815e137b 100644
>> --- a/Documentation/admin-guide/cgroup-v2.rst
>> +++ b/Documentation/admin-guide/cgroup-v2.rst
>> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
>>
>> Failed allocation in its turn could be returned into
>> userspace as -ENOMEM or silently ignored in cases like
>> - disk readahead. For now OOM in memory cgroup kills
>> - tasks iff shortage has happened inside page fault.
>> + disk readahead.
>> +
>> + Before 4.19 OOM in memory cgroup killed tasks iff
>> + shortage has happened inside page fault, random
>> + syscall may fail with ENOMEM or EFAULT. Since 4.19
>> + failed memory cgroup allocation invokes oom killer and
>> + keeps retrying until it succeeds.
>>
>> This event is not raised if the OOM killer is not
>> considered as an option, e.g. for failed high-order
>
> The previous text is obviously incorrect for today's kernels, but I'm
> curious if we should be conflating the documentation here by describing
> the pre-4.19 behavior. OOM killing no longer happens only on page fault
> so maybe better to document the exact behavior today and not attempt to
> describe differences with previous versions?
>

The previous behaviour was here for ages and 4.19 is not so old.
According to https://www.kernel.org/category/releases.html pre-4.19 kernels
will be maintained for a couple of years at least. Let's keep this tombstone.

I've seen a lot of strange side effects of the old behaviour.
The most obscure was a hang inside libc fork() when clone(CLONE_CHILD_SETTID)
silently failed to set the child pid =)
https://lore.kernel.org/lkml/20150206162301.18031.32251.stgit@buzz/

2019-11-05 06:12:01

by Michal Hocko

Subject: Re: [PATCH] mm/memcontrol: update documentation about invoking oom killer

On Sat 02-11-19 18:16:33, Konstantin Khlebnikov wrote:
> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
> charge path") memcg invokes oom killer not only for user page-faults.
> This means 0-order allocation will either succeed or task get killed.
>
> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")

Is this really appropriate? 8e675f7af507 was correct at the time. It was
29ef680ae7c2 that didn't update the documentation. I would just drop
the Fixes tag.

> Signed-off-by: Konstantin Khlebnikov <[email protected]>
> ---
> Documentation/admin-guide/cgroup-v2.rst | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 5361ebec3361..eb47815e137b 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
>
> Failed allocation in its turn could be returned into
> userspace as -ENOMEM or silently ignored in cases like
> - disk readahead. For now OOM in memory cgroup kills
> - tasks iff shortage has happened inside page fault.
> + disk readahead.
> +
> + Before 4.19 OOM in memory cgroup killed tasks iff

I would go with "Kernels between 3.12 and 4.19 invoked the oom killer
only if the shortage happened inside a page fault."

> + shortage has happened inside page fault, random
> + syscall may fail with ENOMEM or EFAULT. Since 4.19
> + failed memory cgroup allocation invokes oom killer and
> + keeps retrying until it succeeds.
>
> This event is not raised if the OOM killer is not
> considered as an option, e.g. for failed high-order

--
Michal Hocko
SUSE Labs