2020-08-13 06:48:14

by Stephen Rothwell

[permalink] [raw]
Subject: linux-next: runtime warning in Linus' tree

Hi all,

Testing Linus' tree today, my qemu runs (PowerPC
powerpc_pseries_le_defconfig) produce the following WARNING:

[ 0.021401][ T0] Mount-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
[ 0.021529][ T0] Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
[ 0.053969][ T0] ------------[ cut here ]------------
[ 0.055220][ T0] WARNING: CPU: 0 PID: 0 at mm/memcontrol.c:5220 mem_cgroup_css_alloc+0x350/0x904
[ 0.055355][ T0] Modules linked in:
[ 0.055812][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.8.0 #5
[ 0.055976][ T0] NIP: c000000000410010 LR: c00000000040fd68 CTR: 0000000000000000
[ 0.056097][ T0] REGS: c0000000011e7ab0 TRAP: 0700 Not tainted (5.8.0)
[ 0.056162][ T0] MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 24000888 XER: 00000000
[ 0.056449][ T0] CFAR: c00000000040fd80 IRQMASK: 0
[ 0.056449][ T0] GPR00: c00000000040fd68 c0000000011e7d40 c0000000011e8300 0000000000000001
[ 0.056449][ T0] GPR04: 0000000000000228 0000000000000000 0000000000000001 ffffffffffffffff
[ 0.056449][ T0] GPR08: c00000007d003208 0000000000000000 0000000000000000 c00000007d002fe8
[ 0.056449][ T0] GPR12: 0000000000000001 c0000000013d0000 0000000000000000 00000000011dd528
[ 0.056449][ T0] GPR16: 00000000011dd840 00000000011dd690 0000000000000018 0000000000000003
[ 0.056449][ T0] GPR20: 0000000000000001 c0000000010cbcf8 0000000000000003 c0000000010cd540
[ 0.056449][ T0] GPR24: c0000000010e8778 c0000000010e9080 c0000000010cbcd8 0000000000000000
[ 0.056449][ T0] GPR28: 0000000000000000 c00000007e2a1000 c0000000010cbcc8 c00000000118ea00
[ 0.057109][ T0] NIP [c000000000410010] mem_cgroup_css_alloc+0x350/0x904
[ 0.057177][ T0] LR [c00000000040fd68] mem_cgroup_css_alloc+0xa8/0x904
[ 0.057394][ T0] Call Trace:
[ 0.057534][ T0] [c0000000011e7d40] [c00000000040fd68] mem_cgroup_css_alloc+0xa8/0x904 (unreliable)
[ 0.057814][ T0] [c0000000011e7dc0] [c000000000f5b13c] cgroup_init_subsys+0xbc/0x210
[ 0.057903][ T0] [c0000000011e7e10] [c000000000f5b690] cgroup_init+0x220/0x598
[ 0.057973][ T0] [c0000000011e7ee0] [c000000000f34354] start_kernel+0x67c/0x6ec
[ 0.058047][ T0] [c0000000011e7f90] [c00000000000cb88] start_here_common+0x1c/0x614
[ 0.058241][ T0] Instruction dump:
[ 0.058420][ T0] eac10030 eae10038 eb410050 eb610058 4bffff60 60000000 60000000 60000000
[ 0.058550][ T0] 3be00100 4bfffdfc 60000000 60000000 <0fe00000> 4bfffd70 60000000 60000000
[ 0.059381][ T0] ---[ end trace cb2d79b4994ef1fe ]---
[ 0.059810][ T0] ------------[ cut here ]------------
[ 0.059872][ T0] WARNING: CPU: 0 PID: 0 at mm/memcontrol.c:5135 mem_cgroup_css_alloc+0x750/0x904
[ 0.059930][ T0] Modules linked in:
[ 0.060053][ T0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 5.8.0 #5
[ 0.060113][ T0] NIP: c000000000410410 LR: c00000000040ff2c CTR: 0000000000000000
[ 0.060171][ T0] REGS: c0000000011e7ab0 TRAP: 0700 Tainted: G W (5.8.0)
[ 0.060229][ T0] MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 24000880 XER: 00000000
[ 0.060332][ T0] CFAR: c00000000040fe48 IRQMASK: 0
[ 0.060332][ T0] GPR00: c00000000040ff2c c0000000011e7d40 c0000000011e8300 c00000007e234c00
[ 0.060332][ T0] GPR04: 0000000000000000 0000000000000000 c00000007e235000 0000000000000013
[ 0.060332][ T0] GPR08: 000000007ec00000 0000000000000000 0000000000000000 0000000000000001
[ 0.060332][ T0] GPR12: 0000000000000000 c0000000013d0000 0000000000000000 00000000011dd528
[ 0.060332][ T0] GPR16: 00000000011dd840 00000000011dd690 0000000000000018 0000000000000003
[ 0.060332][ T0] GPR20: c000000001223300 c000000000e95900 c00000000118ea00 c0000000012232c0
[ 0.060332][ T0] GPR24: c0000000010e8778 c0000000010e9080 0000000000400cc0 0000000000000000
[ 0.060332][ T0] GPR28: 0000000000000000 c00000007e2a1000 c00000007e234c00 0000000000000000
[ 0.060855][ T0] NIP [c000000000410410] mem_cgroup_css_alloc+0x750/0x904
[ 0.060911][ T0] LR [c00000000040ff2c] mem_cgroup_css_alloc+0x26c/0x904
[ 0.060958][ T0] Call Trace:
[ 0.061003][ T0] [c0000000011e7d40] [c00000000040ff2c] mem_cgroup_css_alloc+0x26c/0x904 (unreliable)
[ 0.061081][ T0] [c0000000011e7dc0] [c000000000f5b13c] cgroup_init_subsys+0xbc/0x210
[ 0.061165][ T0] [c0000000011e7e10] [c000000000f5b690] cgroup_init+0x220/0x598
[ 0.061233][ T0] [c0000000011e7ee0] [c000000000f34354] start_kernel+0x67c/0x6ec
[ 0.061303][ T0] [c0000000011e7f90] [c00000000000cb88] start_here_common+0x1c/0x614
[ 0.061364][ T0] Instruction dump:
[ 0.061408][ T0] ebe1fff8 7c0803a6 4e800020 60000000 60000000 3d220004 e929d230 7c3c4800
[ 0.061508][ T0] 41820190 e93c03d2 4bfffc80 60000000 <0fe00000> 4bfffa38 60000000 60000000
[ 0.061630][ T0] ---[ end trace cb2d79b4994ef1ff ]---
[ 0.096387][ T1] EEH: pSeries platform initialized
[ 0.097232][ T1] POWER8 performance monitor hardware support registered

[The line numbers in the final linux next are 5226 and 5141 due to
later patches.]

Introduced (or exposed) by commit

3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup")

This commit actually adds the WARN_ON, so it either adds the bug that
sets it off, or the bug already existed.

Unfotunately, the version of this patch in linux-next up tuntil today
is different. :-(

I have left this as I have no idea how to fix it :-)

--
Cheers,
Stephen Rothwell


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2020-08-13 15:24:35

by Johannes Weiner

[permalink] [raw]
Subject: Re: linux-next: runtime warning in Linus' tree

On Thu, Aug 13, 2020 at 04:46:54PM +1000, Stephen Rothwell wrote:
> [ 0.055220][ T0] WARNING: CPU: 0 PID: 0 at mm/memcontrol.c:5220 mem_cgroup_css_alloc+0x350/0x904

> [The line numbers in the final linux next are 5226 and 5141 due to
> later patches.]
>
> Introduced (or exposed) by commit
>
> 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup")
>
> This commit actually adds the WARN_ON, so it either adds the bug that
> sets it off, or the bug already existed.
>
> Unfotunately, the version of this patch in linux-next up tuntil today
> is different. :-(

Sorry, I made a last-minute request to include these checks in that
patch to make the code a bit more robust, but they trigger a false
positive here. Let's remove them.

---

From de8ea7c96c056c3cbe7b93995029986a158fb9cd Mon Sep 17 00:00:00 2001
From: Johannes Weiner <[email protected]>
Date: Thu, 13 Aug 2020 10:40:54 -0400
Subject: [PATCH] mm: memcontrol: fix warning when allocating the root cgroup

Commit 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the
parent cgroup") adds memory tracking to the memcg kernel structures
themselves to make cgroups liable for the memory they are consuming
through the allocation of child groups (which can be significant).

This code is a bit awkward as it's spread out through several
functions: The outermost function does memalloc_use_memcg(parent) to
set up current->active_memcg, which designates which cgroup to charge,
and the inner functions pass GFP_ACCOUNT to request charging for
specific allocations. To make sure this dependency is satisfied at all
times - to make sure we don't randomly charge whoever is calling the
functions - the inner functions warn on !current->active_memcg.

However, this triggers a false warning when the root memcg itself is
allocated. No parent exists in this case, and so current->active_memcg
is rightfully NULL. It's a false positive, not indicative of a bug.

Delete the warnings for now, we can revisit this later.

Fixes: 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup")
Signed-off-by: Johannes Weiner <[email protected]>
---
mm/memcontrol.c | 6 ------
1 file changed, 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d59fd9af6e63..9d87082e64aa 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5137,9 +5137,6 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
if (!pn)
return 1;

- /* We charge the parent cgroup, never the current task */
- WARN_ON_ONCE(!current->active_memcg);
-
pn->lruvec_stat_local = alloc_percpu_gfp(struct lruvec_stat,
GFP_KERNEL_ACCOUNT);
if (!pn->lruvec_stat_local) {
@@ -5222,9 +5219,6 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
goto fail;
}

- /* We charge the parent cgroup, never the current task */
- WARN_ON_ONCE(!current->active_memcg);
-
memcg->vmstats_local = alloc_percpu_gfp(struct memcg_vmstats_percpu,
GFP_KERNEL_ACCOUNT);
if (!memcg->vmstats_local)
--
2.28.0

2020-08-13 16:00:17

by Roman Gushchin

[permalink] [raw]
Subject: Re: linux-next: runtime warning in Linus' tree

On Thu, Aug 13, 2020 at 11:20:33AM -0400, Johannes Weiner wrote:
> On Thu, Aug 13, 2020 at 04:46:54PM +1000, Stephen Rothwell wrote:
> > [ 0.055220][ T0] WARNING: CPU: 0 PID: 0 at mm/memcontrol.c:5220 mem_cgroup_css_alloc+0x350/0x904
>
> > [The line numbers in the final linux next are 5226 and 5141 due to
> > later patches.]
> >
> > Introduced (or exposed) by commit
> >
> > 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup")
> >
> > This commit actually adds the WARN_ON, so it either adds the bug that
> > sets it off, or the bug already existed.
> >
> > Unfotunately, the version of this patch in linux-next up tuntil today
> > is different. :-(
>
> Sorry, I made a last-minute request to include these checks in that
> patch to make the code a bit more robust, but they trigger a false
> positive here. Let's remove them.
>
> ---
>
> From de8ea7c96c056c3cbe7b93995029986a158fb9cd Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <[email protected]>
> Date: Thu, 13 Aug 2020 10:40:54 -0400
> Subject: [PATCH] mm: memcontrol: fix warning when allocating the root cgroup
>
> Commit 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the
> parent cgroup") adds memory tracking to the memcg kernel structures
> themselves to make cgroups liable for the memory they are consuming
> through the allocation of child groups (which can be significant).
>
> This code is a bit awkward as it's spread out through several
> functions: The outermost function does memalloc_use_memcg(parent) to
> set up current->active_memcg, which designates which cgroup to charge,
> and the inner functions pass GFP_ACCOUNT to request charging for
> specific allocations. To make sure this dependency is satisfied at all
> times - to make sure we don't randomly charge whoever is calling the
> functions - the inner functions warn on !current->active_memcg.
>
> However, this triggers a false warning when the root memcg itself is
> allocated. No parent exists in this case, and so current->active_memcg
> is rightfully NULL. It's a false positive, not indicative of a bug.
>
> Delete the warnings for now, we can revisit this later.
>
> Fixes: 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup")
> Signed-off-by: Johannes Weiner <[email protected]>

Acked-by: Roman Gushchin <[email protected]>

Thanks!


> ---
> mm/memcontrol.c | 6 ------
> 1 file changed, 6 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d59fd9af6e63..9d87082e64aa 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5137,9 +5137,6 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
> if (!pn)
> return 1;
>
> - /* We charge the parent cgroup, never the current task */
> - WARN_ON_ONCE(!current->active_memcg);
> -
> pn->lruvec_stat_local = alloc_percpu_gfp(struct lruvec_stat,
> GFP_KERNEL_ACCOUNT);
> if (!pn->lruvec_stat_local) {
> @@ -5222,9 +5219,6 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
> goto fail;
> }
>
> - /* We charge the parent cgroup, never the current task */
> - WARN_ON_ONCE(!current->active_memcg);
> -
> memcg->vmstats_local = alloc_percpu_gfp(struct memcg_vmstats_percpu,
> GFP_KERNEL_ACCOUNT);
> if (!memcg->vmstats_local)
> --
> 2.28.0
>