Date: Wed, 20 Sep 2023 01:11:01 -0700
From: Jeremi Piotrowski
To: Greg Kroah-Hartman
Cc: stable@vger.kernel.org, patches@lists.linux.dev, Michal Hocko,
 Shakeel Butt, Johannes Weiner, Roman Gushchin, Muchun Song, Tejun Heo,
 Andrew Morton, linux-kernel@vger.kernel.org, regressions@lists.linux.dev,
 mathieu.tortuyaux@gmail.com
Subject: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
Message-ID: <20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
References: <20230917191040.964416434@linuxfoundation.org>
 <20230917191042.204185566@linuxfoundation.org>
In-Reply-To: <20230917191042.204185566@linuxfoundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
> 6.1-stable review patch.  If anyone has any objections, please let me know.
>
> ------------------

Hi Greg/Michal,

This commit breaks userspace which makes it a bad commit for mainline and an
even worse commit for stable.

We ingested 6.1.54 into our nightly testing and found that runc fails to gather
cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
fine.

> Address this by wiping out the file completely and effectively get back to
> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.

On reads, the runc code checks for MEMCG_KMEM=n by checking
kmem.usage_in_bytes. If it is present then runc expects the other cgroup files
to be there (including kmem.limit_in_bytes). So this change is not effectively
the same.

Here's a link to the PR that would be needed to handle this change in userspace
(not merged yet and would need to be propagated through the ecosystem):
https://github.com/opencontainers/runc/pull/4018.

Jeremi

>
> From: Michal Hocko
>
> commit 86327e8eb94c52eca4f93cfece2e29d1bf52acbf upstream.
>
> kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
> deprecated since 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") merged in 5.16.  We haven't heard about any serious
> users since then but it seems that the mere presence of the file is
> causing more harm than good.  We (SUSE) have had several bug reports from
> customers where Docker based containers started to fail because a write to
> kmem.limit_in_bytes has failed.
>
> This was unexpected because runc code only expects ENOENT (kmem disabled)
> or EBUSY (tasks already running within cgroup).  So a new error code was
> unexpected and the whole container startup failed.  This has been later
> addressed by
> https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
> so current Docker runtimes do not suffer from the problem anymore.  There
> are still older versions of Docker in use and likely hard to get rid of
> completely.
>
> Address this by wiping out the file completely and effectively get back to
> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
>
> I would recommend backporting to stable trees which have picked up
> 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").
>
> [mhocko@suse.com: restore _KMEM switch case]
>   Link: https://lkml.kernel.org/r/ZKe5wxdbvPi5Cwd7@dhcp22.suse.cz
> Link: https://lkml.kernel.org/r/20230704115240.14672-1-mhocko@kernel.org
> Signed-off-by: Michal Hocko
> Acked-by: Shakeel Butt
> Acked-by: Johannes Weiner
> Acked-by: Roman Gushchin
> Cc: Muchun Song
> Cc: Tejun Heo
> Cc:
> Signed-off-by: Andrew Morton
> Signed-off-by: Greg Kroah-Hartman
> ---
>  Documentation/admin-guide/cgroup-v1/memory.rst |    2 --
>  mm/memcontrol.c                                |   10 ----------
>  2 files changed, 12 deletions(-)
>
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -91,8 +91,6 @@ Brief summary of control files.
>   memory.oom_control                  set/show oom controls.
>   memory.numa_stat                    show the number of memory usage per numa
>                                       node
> - memory.kmem.limit_in_bytes          This knob is deprecated and writing to
> -                                     it will return -ENOTSUPP.
>   memory.kmem.usage_in_bytes          show current kernel memory allocation
>   memory.kmem.failcnt                 show the number of kernel memory usage
>                                       hits limits
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3841,10 +3841,6 @@ static ssize_t mem_cgroup_write(struct k
>  	case _MEMSWAP:
>  		ret = mem_cgroup_resize_max(memcg, nr_pages, true);
>  		break;
> -	case _KMEM:
> -		/* kmem.limit_in_bytes is deprecated. */
> -		ret = -EOPNOTSUPP;
> -		break;
>  	case _TCP:
>  		ret = memcg_update_tcp_max(memcg, nr_pages);
>  		break;
> @@ -5056,12 +5052,6 @@ static struct cftype mem_cgroup_legacy_f
>  	},
>  #endif
>  	{
> -		.name = "kmem.limit_in_bytes",
> -		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
> -		.write = mem_cgroup_write,
> -		.read_u64 = mem_cgroup_read_u64,
> -	},
> -	{
>  		.name = "kmem.usage_in_bytes",
>  		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
>  		.read_u64 = mem_cgroup_read_u64,
>
>
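
For illustration only, here is a minimal C sketch of a statistics reader that
copes with the file being gone. It is not the Go code from runc and not the
change in the PR linked above. The names read_u64_file(), read_kmem_stats()
and struct kmem_stats are made up for this example; the file names are the
cgroup v1 names from the documentation hunk above. The assumption is that
userspace wants to treat a missing memory.kmem.limit_in_bytes as "no limit
reported" rather than as a fatal error, even when memory.kmem.usage_in_bytes
is present.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result type for this sketch. */
struct kmem_stats {
	bool     enabled;   /* memory.kmem.usage_in_bytes was readable */
	bool     has_limit; /* memory.kmem.limit_in_bytes was readable */
	uint64_t usage;
	uint64_t limit;
};

/*
 * Read one unsigned decimal value from dir/name.
 * Returns 0 on success or a negative errno value.
 */
static int read_u64_file(const char *dir, const char *name, uint64_t *out)
{
	char path[4096];
	int n = snprintf(path, sizeof(path), "%s/%s", dir, name);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;

	FILE *f = fopen(path, "r");
	if (!f)
		return -errno;

	int rc = (fscanf(f, "%" SCNu64, out) == 1) ? 0 : -EINVAL;
	fclose(f);
	return rc;
}

/*
 * Gather kmem statistics from a cgroup v1 memory directory.
 * A missing usage file means kmem accounting is not exposed at all.
 * A missing limit file is tolerated: has_limit simply stays false.
 * Any other error is passed back to the caller.
 */
static int read_kmem_stats(const char *cgroup_dir, struct kmem_stats *st)
{
	assert(cgroup_dir != NULL && st != NULL);
	*st = (struct kmem_stats){0};

	int rc = read_u64_file(cgroup_dir, "memory.kmem.usage_in_bytes",
			       &st->usage);
	if (rc == -ENOENT)
		return 0;
	if (rc < 0)
		return rc;
	st->enabled = true;

	rc = read_u64_file(cgroup_dir, "memory.kmem.limit_in_bytes",
			   &st->limit);
	if (rc == -ENOENT)
		return 0; /* the usage file alone no longer implies a limit file */
	if (rc < 0)
		return rc;
	st->has_limit = true;
	return 0;
}
```

Called on a cgroup v1 memory directory (the exact mount path depends on the
system), read_kmem_stats() should return 0 with has_limit == false on a kernel
that carries this patch, instead of failing the whole statistics read. A
reader that treats ENOENT on the limit file as fatal once the usage file
exists would hit the failure described in the reply.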