Date: Tue, 16 Feb 2021 13:38:40 +0000
From: Chris Down
To: Eiichi Tsukata
Cc: corbet@lwn.net, mike.kravetz@oracle.com, mcgrof@kernel.org,
    keescook@chromium.org, yzaikin@google.com, akpm@linux-foundation.org,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    felipe.franciosi@nutanix.com
Subject: Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom
In-Reply-To: <20210216030713.79101-1-eiichi.tsukata@nutanix.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Eiichi,

I agree with Michal's points, and I think there are also some other design
questions which don't quite make sense to me. Perhaps you can clear them up?
:-)

Eiichi Tsukata writes:
>diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>index 4bdb58ab14cb..e2d57200fd00 100644
>--- a/mm/hugetlb.c
>+++ b/mm/hugetlb.c
>@@ -1726,8 +1726,8 @@ static int alloc_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
>  * balanced over allowed nodes.
>  * Called with hugetlb_lock locked.
>  */
>-static int free_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
>-							 bool acct_surplus)
>+int free_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
>+							 bool acct_surplus)
> {
> 	int nr_nodes, node;
> 	int ret = 0;

The immediate red flag to me is that we're investing further mm knowledge
into hugetlb. For the vast majority of intents and purposes, hugetlb exists
outside of the typical memory management lifecycle, and historic behaviour
has been to treat it as a separate reserve that we don't touch. We expect
that hugetlb is a reserve which is by and large explicitly managed by the
system administrator, not by us, and this seems to violate that.

Shoehorning shrink-on-OOM support into it seems a little suspicious to me,
because we already have a modernised system for huge pages that handles not
only this, but many other memory management situations: THP. THP not only
has support for this particular case, but also has many other features
which are necessary to coherently manage it as part of the mm lifecycle.
For that reason, I'm not convinced that these compose into a sensible
interface.

Some example questions which appear unresolved to me:

- If hugetlb pages are lost, what mechanisms will we provide to tell
  automation or the system administrator what to do in that scenario?
- How should the interface for resolving hugepage starvation due to
  repeated OOMs look?
- By what metrics will you decide if releasing the hugepage is worse for
  the system than selecting a victim for OOM?
- Why can't the system use the existing THP mechanisms to resolve this
  ahead of time?

Thanks,

Chris