Received: by 2002:a05:7412:bbc7:b0:fc:a2b0:25d7 with SMTP id kh7csp440539rdb; Thu, 1 Feb 2024 13:02:31 -0800 (PST) X-Google-Smtp-Source: AGHT+IHaYoBALAHne3aS7F6wE7J6GfiioszVRpSeGOuJxWfJKFBTB6tE7l+s8wAvDVS1xdDP5bdl X-Received: by 2002:a05:620a:16bc:b0:784:b12:ba1c with SMTP id s28-20020a05620a16bc00b007840b12ba1cmr286632qkj.78.1706821351387; Thu, 01 Feb 2024 13:02:31 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1706821351; cv=pass; d=google.com; s=arc-20160816; b=RV7i9/CYTf/TJ8yq4d7ltLtY2HkNx2jkRh9ph9KnzHQGig+zRz3e8+q1hamu5AFDoL 1Jy7NHzdxKBciyIjpDpNEQ08sYHDJT1EsD3c6g9F9K/7OLyQ9fWECRX6Vq9qHf5vt0Mv hL8i6uUlICNyf9Czj5Y2MvJ9V9f0AMeOZJLW3ywNEu5iO9JY/ImcxPBIpKssGqZQbBO7 0PrveUo+606MFFBmrwnSJiLZHv+SMZlLjS6AodesEzbh2kNlUOgPOuIjFN1cCsTs/RoW Z4XF97/iZQ7fpgzhNyP5xD3PBl3QI7jbyBsP3m7nYXfZ8Om4BV6FBzCnQTtUMl+2aGsS SUKA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:list-unsubscribe:list-subscribe :list-id:precedence:dkim-signature; bh=jOZ2TseNRwk51MKJZKHvasS/mSGbsWbhCUKsCqe4rtM=; fh=3axg0V9dcSLKg01EC6Bnkjh+meJX/dkuy4dkXWTu+oU=; b=CLa1Dxob1xLmZrDMR603/KX6JbAZ4L/QkYnxkxFzM7ZQg7VU8sPVPRA/C4Vvwqfvy0 wkf5rYv4fsubTl7YiqkxmcdLXCsr6gAZ9zD0YbOqorS7Gu8hC3g77W5iX8sR62zcuZCL SxcdYupVnL8eJ7RTNQlEWy3ElZdMOH5JLsgWnXQTiLa3ZYabHdIjrYUmdA09146BOCZK vVjMUL31Q2oIqaMuJ4OqnsOg3nmzdERY9LeJ1h9Otlxi8XT6/NSfi9qGPCZO6dm91i+v jj7Ase3FfnunvDWVVzAgG1to4nGK2kpBHkjy/8ycVQIVLABZS15UkFoGVe/moSjUUHaI Gvgw==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=f9+n7d1p; arc=pass (i=1 spf=pass spfdomain=google.com dkim=pass dkdomain=google.com dmarc=pass fromdomain=google.com); spf=pass (google.com: domain of linux-kernel+bounces-48839-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-48839-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com X-Forwarded-Encrypted: i=1; AJvYcCXfe33yBeRIsYRZPrralOtdJjq1xus/4Fkwu1WpMMP7UC4ss1nuDQssxYWbjlt8xC4EAkew+1ImvQH8WKgYzEUzZjD9qg5qvxSWqhwLVA== Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id v4-20020a05622a014400b0042bee3b375esi388009qtw.10.2024.02.01.13.02.31 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Feb 2024 13:02:31 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-48839-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=f9+n7d1p; arc=pass (i=1 spf=pass spfdomain=google.com dkim=pass dkdomain=google.com dmarc=pass fromdomain=google.com); spf=pass (google.com: domain of linux-kernel+bounces-48839-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-48839-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id E7D1C1C23DD2 for ; Thu, 1 Feb 2024 21:02:30 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 40CDF3E49B; Thu, 1 Feb 2024 21:02:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="f9+n7d1p" Received: from mail-ot1-f47.google.com (mail-ot1-f47.google.com [209.85.210.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A7CA33CF7A for ; Thu, 1 Feb 2024 21:02:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.47 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706821339; cv=none; b=ZfKnC00RuQ7piOtdAWm1gX8kRanfsujwbW1nsRe3XPWs/obLprDRxUwAfwKXU/0BGBWviqWrZaYnErbQGU0bOXM/pXXoZrjzhpt8/EB0ZfgwsP4at6px7Uf9ILvw8DvybsF7/Q10ZDPzLQeRYlxmS8qij3w+nrsF7SkNfmMfhcg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706821339; c=relaxed/simple; bh=0dfXP2IydAYbm2ze2LiRZyGi7712o8/4DN3sqKc38LI=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=URcZMhgK7ivd5S8NN6gXRihqlQYKv8r/IF2kiM5uujYyehWeEuxmma1ABKDgXgZeR3FzGNEXHUM7TWCwQpNmwptmltxFpt35xoKqT/lGvfuRLL6rIBI/WFiqodhhxG6N+MkXXRLtgLcVg80y1pD/8ZXIsQoimdFFZKt6xA8M6Qg= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=f9+n7d1p; arc=none smtp.client-ip=209.85.210.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=google.com Received: by mail-ot1-f47.google.com with SMTP id 46e09a7af769-6e118da997cso729890a34.3 for ; Thu, 01 Feb 2024 13:02:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1706821336; x=1707426136; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=jOZ2TseNRwk51MKJZKHvasS/mSGbsWbhCUKsCqe4rtM=; b=f9+n7d1pVe5IX5pOCgqB/gZExV6rwdURzZtcwPkThCGkfq9edDmyTdC7R7uK7j3TKI 3jZVjUfTXKZ6lFmA+gwSWcB35Sj73Utsy6XA6/oeYRRumDylBRxIKnjQS2eTsVGbc7zK gRRlP7efVebnEwoduCmYZIm3LlOG6subE9e5yJA0oU/kwGw/U4v0bU3iWncYbJVs9Mqr T4SjU++mRYGX2cGrgL/9YAc8bm9vMBNHLYrTgdPb15pKzfRBWhd8wUXzXjh3mhaDhYIK NpKL50CrpPUZckEslY7QvKlFhNFhtpORJIaQjMxS4UqhJlH3anJe2YjXYgRPVg+0mQsS dCnw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1706821336; x=1707426136; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=jOZ2TseNRwk51MKJZKHvasS/mSGbsWbhCUKsCqe4rtM=; b=v8MppPAvFOdHaHYZMK9PFiAPox1faFIfAYSGr42BUylLVUtQ/Q2jvC783R20DoZHzw OydNSYb1tKKGG0gUdlxDOsvFfsZsw6cAPYWhgKb4NjkMB8YKF+qNtmo3FC2V+mEfHdKh 8wZwHm1CYS8c8r9p6ll3b9n4sVb5sSmNHyehxkluyns4L/Bxb86fEqqx1pxP3mH7P2VF 6kgUe9FoXPfWOPIMkNJr34VCOEyHl1EQ0xGIp9Olip8GYZFw6YJwBmwKuTyqFMNFgFyd jJxhE5yT8b31X+0O5IVd35nrPTCoVRq8/VDLRBjLmny6iHt8I1dX3t3Dyu+v6AidK3ve 0RqA== X-Gm-Message-State: AOJu0YwkX7CfHwaw0vUGXnYx1XGjd5/8yh64PIKstT4dJS3HK5MewZKp 75UEB1llGohhTKbgb87tSqXRzhAy+Jlc878U5mJ1eVPnjAVu9xEzYVbwXyUS/3mlQoVMTjxY9kK ZPO/HSvrOcVSDo7//vuf7HsFdoA9lSI3dWxrT X-Received: by 2002:a05:6830:100d:b0:6e1:34d7:27b1 with SMTP id a13-20020a056830100d00b006e134d727b1mr6035791otp.8.1706821336502; Thu, 01 Feb 2024 13:02:16 -0800 (PST) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 References: <20240126211927.1171338-1-tjmercier@google.com> In-Reply-To: From: "T.J. Mercier" Date: Thu, 1 Feb 2024 13:02:04 -0800 Message-ID: Subject: Re: [PATCH v2] mm: memcg: Don't periodically flush stats when memcg is disabled To: =?UTF-8?Q?Michal_Koutn=C3=BD?= Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Shakeel Butt , Muchun Song , Andrew Morton , android-mm@google.com, Minchan Kim , cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Feb 1, 2024 at 6:26=E2=80=AFAM Michal Koutn=C3=BD wrote: > > On Fri, Jan 26, 2024 at 09:19:25PM +0000, "T.J. Mercier" wrote: > > The root memcg is onlined even when memcg is disabled. When it's online= d > > a 2 second periodic stat flush is started, but no stat flushing is > > required when memcg is disabled because there can be no child memcgs. > > Most calls to flush memcg stats are avoided when memcg is disabled as a > > result of the mem_cgroup_disabled check added in 7d7ef0a4686a > > ("mm: memcg: restore subtree stats flushing"), but the periodic flushin= g > > started in mem_cgroup_css_online is not. Skip it. > > Have you tried > --- a/kernel/cgroup/cgroup.c > +++ b/kernel/cgroup/cgroup.c > @@ -6099,6 +6099,9 @@ int __init cgroup_init(void) > cgroup_unlock(); > > for_each_subsys(ss, ssid) { > + if (!cgroup_ssid_enabled(ssid)) > + continue; > + > if (ss->early_init) { > struct cgroup_subsys_state *css =3D > init_css_set.subsys[ss->id]; > @@ -6118,9 +6121,6 @@ int __init cgroup_init(void) > * disabled flag and cftype registration needs kmalloc, > * both of which aren't available during early_init. > */ > - if (!cgroup_ssid_enabled(ssid)) > - continue; > - > if (cgroup1_ssid_disabled(ssid)) > pr_info("Disabling %s control group subsystem in = v1 mounts\n", > ss->legacy_name); > ? > I'm asking about a try because I'm not sure whether this does not blow > up due to something missing. But I think disabled controllers would not > need to be (root-)onlined at all. > > Thanks, > Michal Hi Michal, It does blow up, but not how I was expecting. There's a null pointer dereference inside find_css_set when trying to get a css pointer for the memory controller, I think because the allocation in cgroup_init_subsys is skipped: [ 9.591766] BUG: kernel NULL pointer dereference, address: 0000000000000= 000 [ 9.591909] #PF: supervisor read access in kernel mode [ 9.592023] #PF: error_code(0x0000) - not-present page [ 9.592138] PGD 0 P4D 0 [ 9.592199] Oops: 0000 [#1] PREEMPT SMP PTI [ 9.592289] CPU: 1 PID: 1 Comm: init Tainted: G E 6.7.0-mainline-maybe-dirty #1 [ 9.592490] Hardware name: emulation qemu-x86/qemu-x86, BIOS 2023.07-rc6-gb8315a3184b2-ab11347395 07/01/2023 [ 9.592731] RIP: 0010:find_css_set+0x5c3/0x7a0 [ 9.592850] Code: 20 69 cf b2 4e 8b 3c 33 48 c7 c7 1d 3b 41 b2 4c 89 fe e8 10 b0 f7 00 49 8b b4 24 a0 00 00 00 4e 8d 24 2b 49 81 c4 b0 fc ff ff <49> 8b 0f 4c 01 e9 48 c7 c7 c4 c0 46 b2 4c 89 e2 e8 e8 af f7 00 49 [ 9.593251] RSP: 0018:ffffb6218000bb90 EFLAGS: 00010087 [ 9.593370] RAX: 0000000000000021 RBX: ffff99181044a200 RCX: 4f5e789f089= a0c00 [ 9.593554] RDX: ffffb6218000ba50 RSI: ffffffffb2448284 RDI: ffffffffb2c= 91950 [ 9.593735] RBP: ffffb6218000bc28 R08: 0000000000000fff R09: ffffffffb2c= 79950 [ 9.593909] R10: 0000000000002ffd R11: 0000000000000004 R12: ffff9918104= 4a2d8 [ 9.594102] R13: 0000000000000428 R14: 0000000000000020 R15: 00000000000= 00000 [ 9.594291] FS: 00007f3d2f986fc8(0000) GS:ffff99182bd00000(0000) knlGS:0000000000000000 [ 9.594467] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9.594610] CR2: 0000000000000000 CR3: 00000001001d8001 CR4: 00000000003= 70eb0 [ 9.594818] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000= 00000 [ 9.595007] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 00000000000= 00400 [ 9.595178] Call Trace: [ 9.595237] [ 9.595297] ? __die_body+0x5e/0xb0 [ 9.595392] ? __die+0x9e/0xb0 [ 9.595481] ? page_fault_oops+0x35f/0x3d0 [ 9.595574] ? do_user_addr_fault+0x6ab/0x780 [ 9.595690] ? prb_read_valid+0x28/0x60 [ 9.595782] ? exc_page_fault+0x83/0x220 [ 9.595874] ? asm_exc_page_fault+0x2b/0x30 [ 9.595966] ? find_css_set+0x5c3/0x7a0 [ 9.596059] cgroup_migrate_prepare_dst+0x75/0x2b0 [ 9.596194] cgroup_attach_task+0x293/0x450 [ 9.596305] ? cgroup_attach_task+0xb6/0x450 [ 9.596449] __cgroup_procs_write+0xef/0x1a0 [ 9.596589] cgroup_procs_write+0x16/0x30 [ 9.596733] cgroup_file_write+0x9d/0x260 [ 9.596840] kernfs_fop_write_iter+0x145/0x1e0 [ 9.596981] vfs_write+0x276/0x2e0 [ 9.597092] ksys_write+0x73/0xe0 [ 9.597198] __x64_sys_write+0x1a/0x30 [ 9.597303] do_syscall_64+0x5a/0x100 [ 9.597430] entry_SYSCALL_64_after_hwframe+0x6e/0x76 But there are also calls to mem_cgroup_css_from_folio that I think would cause a different null pointer deref even if we had the css but no root_mem_cgroup. There seems to be an assumption that we'll have a memcg to charge against even when the controller is disabled. To me it looks like that's to simplify the possible combinations of CONFIG_MEMCG and memcg being boot-time disabled or not. Best, T.J.