Date: Wed, 27 Sep 2023 14:50:10 +0200
From: Michal Hocko
To: Johannes Weiner
Cc: Frank van der Linden, Nhat Pham, akpm@linux-foundation.org,
    riel@surriel.com, roman.gushchin@linux.dev, shakeelb@google.com,
    muchun.song@linux.dev, tj@kernel.org, lizefan.x@bytedance.com,
    shuah@kernel.org, mike.kravetz@oracle.com, yosryahmed@google.com,
    linux-mm@kvack.org, kernel-team@meta.com,
    linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH 0/2] hugetlb memcg accounting
References: <20230926194949.2637078-1-nphamcs@gmail.com>
    <20230926221414.GD348484@cmpxchg.org>
In-Reply-To: <20230926221414.GD348484@cmpxchg.org>

On Tue 26-09-23 18:14:14, Johannes Weiner wrote:
[...]
> The fact that memory consumed by hugetlb is currently not considered
> inside memcg (host memory accounting and control) is inconsistent. It
> has been quite confusing to our service owners and complicating things
> for our containers team.

I do understand how that is confusing and inconsistent as well. Hugetlb
has been bringing that throughout its existence, I am afraid.

As noted in another reply, though, I am not sure the hugetlb pool can be
reasonably incorporated with sane semantics. Neither the regular
allocation nor the hugetlb reservation/actual use can fall back to the
pool of the other. This makes them two different things, each hitting
its own failure cases that require dedicated handling.

Just off the top of my head, these are cases I do not see an easy way
out of:
	- hugetlb charge failure has two failure modes - pool empty or
	  memcg limit reached. The former is not recoverable and should
	  fail without any further intervention; the latter might
	  benefit from reclaiming.
	- !hugetlb memory charge failure cannot consider any hugetlb
	  pages - they are implicit memory.min protection, so it is
	  impossible to manage reclaim protection without knowledge of
	  the hugetlb use.
	- there is no way to control the hugetlb pool distribution by
	  memcg limits. How do we distinguish reservations from actual
	  use?
	- pre-allocated pool is consuming memory without any actual
	  owner until it is actually used, and even that has two stages
	  (reserved and really used). This makes it really hard to
	  manage memory as a whole when there is a considerable amount
	  of hugetlb memory preallocated.

I am pretty sure there are many more interesting cases.
-- 
Michal Hocko
SUSE Labs
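
To make the first case in the list concrete, here is a toy userspace model of a hugetlb charge that can fail in two distinct ways. This is only a sketch and not kernel code: HugetlbPool, Memcg, ChargeResult and charge_hugetlb are names made up for illustration, and it assumes a hugetlb page would be charged to the memcg as its size in base pages.

```python
from dataclasses import dataclass
from enum import Enum


class ChargeResult(Enum):
    OK = "ok"
    POOL_EMPTY = "pool empty"          # reclaim cannot help
    LIMIT_REACHED = "memcg limit hit"  # reclaim might help


@dataclass
class HugetlbPool:
    """Hypothetical pre-allocated pool; pages here have no owner yet."""
    free_pages: int


@dataclass
class Memcg:
    """Hypothetical memcg; all counters are in base pages."""
    limit: int
    usage: int = 0
    reclaimable: int = 0  # regular (!hugetlb) pages that reclaim could free


def charge_hugetlb(pool, memcg, base_pages_per_huge, allow_reclaim=True):
    # Failure mode 1: the pool is empty. Reclaiming regular memory cannot
    # refill the pool, so fail without any further intervention.
    if pool.free_pages == 0:
        return ChargeResult.POOL_EMPTY

    # Failure mode 2: the memcg limit would be exceeded. Here reclaiming
    # regular memory charged to the same memcg may make room.
    excess = memcg.usage + base_pages_per_huge - memcg.limit
    if excess > 0:
        if not allow_reclaim or memcg.reclaimable < excess:
            return ChargeResult.LIMIT_REACHED
        memcg.reclaimable -= excess
        memcg.usage -= excess

    # Both checks passed: take one page from the pool and charge it.
    pool.free_pages -= 1
    memcg.usage += base_pages_per_huge
    return ChargeResult.OK
```

The point of the sketch is that the caller gets two different errors that need different handling: retrying after reclaim only makes sense for LIMIT_REACHED, never for POOL_EMPTY.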