2022-07-26 08:26:56

by Michal Hocko

[permalink] [raw]
Subject: Re: + mm-memcontrol-fix-potential-oom_lock-recursion-deadlock.patch added to mm-unstable branch

As we have concluded there are two issues possible here which would be
great to have reflected in the changelog.

On Mon 25-07-22 15:00:32, Andrew Morton wrote:
> From: Tetsuo Handa <[email protected]>
> Subject: mm: memcontrol: fix potential oom_lock recursion deadlock
> Date: Fri, 22 Jul 2022 19:45:39 +0900
>
> syzbot is reporting GFP_KERNEL allocation with oom_lock held when
> reporting memcg OOM [1]. Such allocation request might deadlock the
> system, for __alloc_pages_may_oom() cannot invoke global OOM killer due to
> oom_lock being already held by the caller.

I would phrase it like this:
syzbot is reporting GFP_KERNEL allocation with oom_lock held when
reporting memcg OOM [1]. This is problematic because this creates a
dependency between GFP_NOFS and GFP_KERNEL over oom_lock which could
dead lock the system.

There is another problem here not reflected by the report though. If
memcg oom path happens during the global OOM situation then the system
might livelock as well because the GFP_KERNEL allocation from the
oom_lock context cannot trigger the global OOM killer because that
requires the oom_lock as well.

> Fix this problem by removing the allocation from memory_stat_format()

s@this problem@both issues@

> completely, and pass static buffer when calling from memcg OOM path.
>
> Link: https://syzkaller.appspot.com/bug?extid=2d2aeadc6ce1e1f11d45 [1]
> Link: https://lkml.kernel.org/r/[email protected]
> Fixes: c8713d0b23123759 ("mm: memcontrol: dump memory.stat during cgroup OOM")
> Signed-off-by: Tetsuo Handa <[email protected]>
> Reported-by: syzbot <[email protected]>
> Suggested-by: Michal Hocko <[email protected]>
> Acked-by: Michal Hocko <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Roman Gushchin <[email protected]>
> Cc: Shakeel Butt <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> ---
>
> mm/memcontrol.c | 22 +++++++++-------------
> 1 file changed, 9 insertions(+), 13 deletions(-)
>
> --- a/mm/memcontrol.c~mm-memcontrol-fix-potential-oom_lock-recursion-deadlock
> +++ a/mm/memcontrol.c
> @@ -1490,14 +1490,12 @@ static const unsigned int memcg_vm_event
> #endif
> };
>
> -static char *memory_stat_format(struct mem_cgroup *memcg)
> +static void memory_stat_format(struct mem_cgroup *memcg, char *buf, int bufsize)
> {
> struct seq_buf s;
> int i;
>
> - seq_buf_init(&s, kmalloc(PAGE_SIZE, GFP_KERNEL), PAGE_SIZE);
> - if (!s.buffer)
> - return NULL;
> + seq_buf_init(&s, buf, bufsize);
>
> /*
> * Provide statistics on the state of the memory subsystem as
> @@ -1539,8 +1537,6 @@ static char *memory_stat_format(struct m
>
> /* The above should easily fit into one page */
> WARN_ON_ONCE(seq_buf_has_overflowed(&s));
> -
> - return s.buffer;
> }
>
> #define K(x) ((x) << (PAGE_SHIFT-10))
> @@ -1576,7 +1572,10 @@ void mem_cgroup_print_oom_context(struct
> */
> void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
> {
> - char *buf;
> + /* Use static buffer, for the caller is holding oom_lock. */
> + static char buf[PAGE_SIZE];
> +
> + lockdep_assert_held(&oom_lock);
>
> pr_info("memory: usage %llukB, limit %llukB, failcnt %lu\n",
> K((u64)page_counter_read(&memcg->memory)),
> @@ -1597,11 +1596,8 @@ void mem_cgroup_print_oom_meminfo(struct
> pr_info("Memory cgroup stats for ");
> pr_cont_cgroup_path(memcg->css.cgroup);
> pr_cont(":");
> - buf = memory_stat_format(memcg);
> - if (!buf)
> - return;
> + memory_stat_format(memcg, buf, sizeof(buf));
> pr_info("%s", buf);
> - kfree(buf);
> }
>
> /*
> @@ -6405,11 +6401,11 @@ static int memory_events_local_show(stru
> static int memory_stat_show(struct seq_file *m, void *v)
> {
> struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
> - char *buf;
> + char *buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
>
> - buf = memory_stat_format(memcg);
> if (!buf)
> return -ENOMEM;
> + memory_stat_format(memcg, buf, PAGE_SIZE);
> seq_puts(m, buf);
> kfree(buf);
> return 0;
> _
>
> Patches currently in -mm which might be from [email protected] are
>
> mm-shrinkers-fix-double-kfree-on-shrinker-name.patch
> mm-memcontrol-fix-potential-oom_lock-recursion-deadlock.patch

--
Michal Hocko
SUSE Labs


2022-07-26 11:49:22

by Tetsuo Handa

[permalink] [raw]
Subject: Re: + mm-memcontrol-fix-potential-oom_lock-recursion-deadlock.patch added to mm-unstable branch

On 2022/07/26 17:14, Michal Hocko wrote:
> As we have concluded there are two issues possible here which would be
> great to have reflected in the changelog.
>
> On Mon 25-07-22 15:00:32, Andrew Morton wrote:
>> From: Tetsuo Handa <[email protected]>
>> Subject: mm: memcontrol: fix potential oom_lock recursion deadlock
>> Date: Fri, 22 Jul 2022 19:45:39 +0900
>>
>> syzbot is reporting GFP_KERNEL allocation with oom_lock held when
>> reporting memcg OOM [1]. Such allocation request might deadlock the
>> system, for __alloc_pages_may_oom() cannot invoke global OOM killer due to
>> oom_lock being already held by the caller.
>
> I would phrase it like this:

This report is difficult to explain correctly.

> syzbot is reporting GFP_KERNEL allocation with oom_lock held when
> reporting memcg OOM [1].

Correct. But

> This is problematic because this creates a
> dependency between GFP_NOFS and GFP_KERNEL over oom_lock which could
> dead lock the system.

oom_lock is irrelevant when trying GFP_KERNEL allocation from GFP_NOFS
context. Therefore, something like:

----------
syzbot is reporting GFP_KERNEL allocation with oom_lock held when
reporting memcg OOM [1]. If this allocation triggers the global OOM
situation then the system can livelock because the GFP_KERNEL allocation
with oom_lock held cannot trigger the global OOM killer because
__alloc_pages_may_oom() fails to hold oom_lock.

Fix this problem by removing the allocation from memory_stat_format()
completely, and pass static buffer when calling from memcg OOM path.

Note that the caller holding filesystem lock was the trigger for syzbot
to report this locking dependency. Doing GFP_KERNEL allocation with
filesystem lock held can deadlock the system even without involving OOM
situation.
----------

2022-07-26 12:21:57

by Michal Hocko

[permalink] [raw]
Subject: Re: + mm-memcontrol-fix-potential-oom_lock-recursion-deadlock.patch added to mm-unstable branch

On Tue 26-07-22 20:31:17, Tetsuo Handa wrote:
> On 2022/07/26 17:14, Michal Hocko wrote:
> > As we have concluded there are two issues possible here which would be
> > great to have reflected in the changelog.
> >
> > On Mon 25-07-22 15:00:32, Andrew Morton wrote:
> >> From: Tetsuo Handa <[email protected]>
> >> Subject: mm: memcontrol: fix potential oom_lock recursion deadlock
> >> Date: Fri, 22 Jul 2022 19:45:39 +0900
> >>
> >> syzbot is reporting GFP_KERNEL allocation with oom_lock held when
> >> reporting memcg OOM [1]. Such allocation request might deadlock the
> >> system, for __alloc_pages_may_oom() cannot invoke global OOM killer due to
> >> oom_lock being already held by the caller.
> >
> > I would phrase it like this:
>
> This report is difficult to explain correctly.
>
> > syzbot is reporting GFP_KERNEL allocation with oom_lock held when
> > reporting memcg OOM [1].
>
> Correct. But
>
> > This is problematic because this creates a
> > dependency between GFP_NOFS and GFP_KERNEL over oom_lock which could
> > dead lock the system.
>
> oom_lock is irrelevant when trying GFP_KERNEL allocation from GFP_NOFS
> context. Therefore, something like:

I meant to say there is a dependency chain
potential_fs_lock
GFP_NOFS
oom_lock
GFP_KERNEL
potentiaL_lock
oom_lock

> ----------
> syzbot is reporting GFP_KERNEL allocation with oom_lock held when
> reporting memcg OOM [1]. If this allocation triggers the global OOM
> situation then the system can livelock because the GFP_KERNEL allocation
> with oom_lock held cannot trigger the global OOM killer because
> __alloc_pages_may_oom() fails to hold oom_lock.
>
> Fix this problem by removing the allocation from memory_stat_format()
> completely, and pass static buffer when calling from memcg OOM path.
>
> Note that the caller holding filesystem lock was the trigger for syzbot
> to report this locking dependency. Doing GFP_KERNEL allocation with
> filesystem lock held can deadlock the system even without involving OOM
> situation.
> ----------

But this sounds good as well.

Thanks!

--
Michal Hocko
SUSE Labs

2022-07-26 19:13:28

by Andrew Morton

[permalink] [raw]
Subject: Re: + mm-memcontrol-fix-potential-oom_lock-recursion-deadlock.patch added to mm-unstable branch

On Tue, 26 Jul 2022 20:31:17 +0900 Tetsuo Handa <[email protected]> wrote:

> syzbot is reporting GFP_KERNEL allocation with oom_lock held when
> reporting memcg OOM [1]. If this allocation triggers the global OOM
> situation then the system can livelock because the GFP_KERNEL allocation
> with oom_lock held cannot trigger the global OOM killer because
> __alloc_pages_may_oom() fails to hold oom_lock.
>
> Fix this problem by removing the allocation from memory_stat_format()
> completely, and pass static buffer when calling from memcg OOM path.
>
> Note that the caller holding filesystem lock was the trigger for syzbot
> to report this locking dependency. Doing GFP_KERNEL allocation with
> filesystem lock held can deadlock the system even without involving OOM
> situation.

I used the above as the new changelog text.