From: Shakeel Butt
Date: Thu, 21 Sep 2023 10:25:11 -0700
Subject: Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
To: Michal Hocko
Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song, Greg Kroah-Hartman, stable@vger.kernel.org, patches@lists.linux.dev, Tejun Heo, Andrew Morton, linux-kernel@vger.kernel.org, regressions@lists.linux.dev, mathieu.tortuyaux@gmail.com
References: <4eb47d6a-b127-4aad-af30-896c3b9505b4@linux.microsoft.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 21, 2023 at 4:21 AM Michal Hocko wrote:
>
> On Thu 21-09-23 12:43:05, Jeremi Piotrowski wrote:
> > On 9/21/2023 9:52 AM, Michal Hocko wrote:
> > > On Wed 20-09-23 14:46:52, Shakeel Butt wrote:
> > >> On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko wrote:
> > >>>
> > >> [...]
> > >>>> have a strong opinion against it. Also just to be clear we are not
> > >>>> talking about full revert of 58056f77502f but just the returning of
> > >>>> EOPNOTSUPP, right?
> > >>>
> > >>> If we allow the limit to be set without returning a failure then we
> > >>> still have options 2 and 3 on how to deal with that. One of them is to
> > >>> enforce the limit.
> > >>>
> > >>
> > >> Option 3 is a partial revert of 58056f77502f where we keep the no
> > >> limit enforcement and remove the EOPNOTSUPP return on write. Let's go
> > >> with option 3. In addition, let's add pr_warn_once on the read of
> > >> kmem.limit_in_bytes as well.
> > >
> > > How about this?
> > > ---
> >
> > I'm OK with this approach. You're missing this in the patch below:
> >
> > // static struct cftype mem_cgroup_legacy_files[] = {
> >
> > +	{
> > +		.name = "kmem.limit_in_bytes",
> > +		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
> > +		.write = mem_cgroup_write,
> > +		.read_u64 = mem_cgroup_read_u64,
> > +	},
>
> Of course. I've lost the hunk while massaging the revert. Thanks for
> spotting. Updated version below. Btw.
> I've decided to not pr_{warn,info}
> on the read side because realistically I do not think this will help all
> that much. I am worried we will get stuck with this for ever because
> there always be somebody stuck on unpatched userspace.
> ---
> From bb6702b698efd31f3f90f4f1dd36ffe223397bec Mon Sep 17 00:00:00 2001
> From: Michal Hocko
> Date: Thu, 21 Sep 2023 09:38:29 +0200
> Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation
>
> This reverts commits 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
> and partially reverts 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") which have incrementally removed support for the
> kernel memory accounting hard limit. Unfortunately it has turned out
> that there is still userspace depending on the existence of
> memory.kmem.limit_in_bytes [1]. The underlying functionality is not
> really required but the non-existent file just confuses the userspace
> which fails in the result. The patch to fix this on the userspace side
> has been submitted but it is hard to predict how it will propagate
> through the maze of 3rd party consumers of the software.
>
> Now, reverting alone 86327e8eb94c is not an option because there is
> another set of userspace which cannot cope with ENOTSUPP returned when
> writing to the file. Therefore we have to go and revisit 58056f77502f
> as well. There are two ways to go ahead. Either we give up on the
> deprecation and fully revert 58056f77502f as well or we can keep
> kmem.limit_in_bytes but make the write a noop and warn about the fact.
> This should work for both known breaking workloads which depend on the
> existence but do not depend on the hard limit enforcement.
>
> [1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
> Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
> Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
> Signed-off-by: Michal Hocko

With one request below:

Acked-by: Shakeel Butt

> ---
>  Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++++
>  mm/memcontrol.c                                | 18 ++++++++++++++++++
>  2 files changed, 25 insertions(+)
>
> diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
> index 5f502bf68fbc..ff456871bf4b 100644
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -92,6 +92,13 @@ Brief summary of control files.
>   memory.oom_control              set/show oom controls.
>   memory.numa_stat                show the number of memory usage per numa
>                                   node
> + memory.kmem.limit_in_bytes      Deprecated knob to set and read the kernel
> +                                 memory hard limit. Kernel hard limit is not
> +                                 supported since 5.16. Writing any value to
> +                                 do file will not have any effect same as if
> +                                 nokmem kernel parameter was specified.
> +                                 Kernel memory is still charged and reported
> +                                 by memory.kmem.usage_in_bytes.
>   memory.kmem.usage_in_bytes      show current kernel memory allocation
>   memory.kmem.failcnt             show the number of kernel memory usage
>                                   hits limits
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index a4d3282493b6..0b161705ef36 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
>  static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>  				   unsigned int nr_pages)
>  {
> +	struct page_counter *counter;
>  	struct mem_cgroup *memcg;
>  	int ret;
>
> @@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>  		goto out;
>
>  	memcg_account_kmem(memcg, nr_pages);
> +
> +	/* There is no way to set up kmem hard limit so this operation cannot fail */
> +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +		WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));

WARN_ON_ONCE() please.
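
To illustrate the request, here is a minimal userspace sketch of the difference, not kernel code. The SKETCH_* macros, report() and try_charge_stub() are hypothetical stand-ins invented for this example; they only mimic the general behaviour of WARN_ON() and WARN_ON_ONCE(), under the assumption that the once-variant still evaluates its condition on every call (so the charge would still happen) and only suppresses repeated reports. On a hot path such as the charge path, a condition that trips once would likely trip many times, so the plain variant could flood the log.

```c
#include <stdbool.h>
#include <stdio.h>

/* Number of warnings emitted so far (stand-in for lines in the kernel log). */
static int reports;

static void report(const char *what)
{
	reports++;
	fprintf(stderr, "WARNING: %s\n", what);
}

/* Hypothetical stand-in for WARN_ON(): reports every time cond is true. */
#define SKETCH_WARN_ON(cond) \
	do { \
		if (cond) \
			report(#cond); \
	} while (0)

/*
 * Hypothetical stand-in for WARN_ON_ONCE(): cond is still evaluated on
 * every call, but only the first hit at this call site is reported.
 */
#define SKETCH_WARN_ON_ONCE(cond) \
	do { \
		static bool warned; \
		if ((cond) && !warned) { \
			warned = true; \
			report(#cond); \
		} \
	} while (0)

/* Hypothetical charge that unexpectedly fails on every call. */
static bool try_charge_stub(void)
{
	return false;
}

/* Returns how many warnings `calls` failing charges produce with WARN_ON. */
static int warnings_with_warn_on(int calls)
{
	int before = reports;

	for (int i = 0; i < calls; i++)
		SKETCH_WARN_ON(!try_charge_stub());
	return reports - before;
}

/* Same loop with the once-variant: at most one warning for this call site. */
static int warnings_with_warn_on_once(int calls)
{
	int before = reports;

	for (int i = 0; i < calls; i++)
		SKETCH_WARN_ON_ONCE(!try_charge_stub());
	return reports - before;
}
```

With five failing charges, the first helper reports five warnings and the second reports one; the per-call-site flag is static, so later calls to the second helper report nothing further.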