From: "Huang, Ying"
To: Mina Almasry
Cc: Johannes Weiner, Yang Shi, Yosry Ahmed, Tim Chen, weixugc@google.com,
    shakeelb@google.com, gthelen@google.com, fvdl@google.com, Michal Hocko,
    Roman Gushchin, Muchun Song, Andrew Morton, linux-kernel@vger.kernel.org,
    cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH V1] mm: Disable demotion from proactive reclaim
References: <20221122203850.2765015-1-almasrymina@google.com>
    <874juonbmv.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Wed, 30 Nov 2022 13:39:19 +0800
In-Reply-To: (Mina Almasry's message of "Tue, 29 Nov 2022 18:14:49 -0800")
Message-ID: <87edtlatmg.fsf@yhuang6-desk2.ccr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Mina Almasry writes:

> On Wed, Nov 23, 2022 at 9:52 PM Huang, Ying wrote:
>>
>> Hi, Johannes,
>>
>> Johannes Weiner writes:
>> [...]
>> >
>> > The fallback to reclaim actually strikes me as wrong.
>> >
>> > Think of reclaim as 'demoting' the pages to the storage tier. If we
>> > have a RAM -> CXL -> storage hierarchy, we should demote from RAM to
>> > CXL and from CXL to storage.
>> > If we reclaim a page from RAM, it means
>> > we 'demote' it directly from RAM to storage, bypassing potentially a
>> > huge amount of pages colder than it in CXL. That doesn't seem right.
>> >
>> > If demotion fails, IMO it shouldn't satisfy the reclaim request by
>> > breaking the layering. Rather it should deflect that pressure to the
>> > lower layers to make room. This makes sure we maintain an aging
>> > pipeline that honors the memory tier hierarchy.
>>
>> Yes. I think that we should avoid falling back to reclaim as much as
>> possible too. Now, when we allocate memory for demotion
>> (alloc_demote_page()), __GFP_KSWAPD_RECLAIM is used. So, we will trigger
>
> I may be missing something but as far as I can tell reclaim is disabled
> for allocations from lower tier memory:
> https://elixir.bootlin.com/linux/v6.1-rc7/source/mm/vmscan.c#L1583

#define GFP_NOWAIT      (__GFP_KSWAPD_RECLAIM)

GFP_NOWAIT is set in the gfp mask as well, so kswapd reclaim is still
enabled for the demotion allocation.

> I think this is maybe a good thing when doing proactive demotion. In
> this case we probably don't want to try to reclaim from lower tier
> nodes and instead fail the proactive demotion.

Do you have some real use cases for this? If so, we can tweak the
logic.

> However I can see this being desirable when the top tier nodes are
> under real memory pressure to deflect that pressure to the lower tier
> nodes.

Yes.

Best Regards,
Huang, Ying

>> kswapd reclaim on the lower tier node to free some memory, to avoid
>> falling back to reclaim on the current (higher tier) node. This may
>> not be good enough; for example, the following patch from Hasan may
>> help by waking up kswapd earlier.
>>
>> https://lore.kernel.org/linux-mm/b45b9bf7cd3e21bca61d82dcd1eb692cd32c122c.1637778851.git.hasanalmaruf@fb.com/
>>
>> Do you know what the next step is for this patch?
>>
>> Should we do even more?
>>
>> From another point of view, I still think that we can use falling back
>> to reclaim as the last resort to avoid OOM in some special situations,
>> for example, when most pages in the lowest tier node are mlock()ed or
>> too hot to be reclaimed.
>>
>> > So I'm hesitant to design cgroup controls around the current behavior.
>
> I sent an RFC v2 patch:
> https://lore.kernel.org/linux-mm/20221130020328.1009347-1-almasrymina@google.com/T/#u
>
> Please take a look when convenient. Thanks!
>
>> >
>>
>> Best Regards,
>> Huang, Ying
>>
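
For readers following the GFP_NOWAIT exchange above, here is a minimal
user-space sketch of the mask arithmetic under discussion. It assumes the
demotion mask is built by clearing both reclaim bits from a base mask and
then OR-ing GFP_NOWAIT back in. That is one reading of the linked vmscan.c
line, not a copy of it. All SK_* names, sk_* helpers and bit values are
hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in type and bit values for illustration only; NOT the kernel's. */
typedef unsigned int gfp_t;

#define SK_GFP_DIRECT_RECLAIM 0x1u /* models __GFP_DIRECT_RECLAIM */
#define SK_GFP_KSWAPD_RECLAIM 0x2u /* models __GFP_KSWAPD_RECLAIM */
#define SK_GFP_OTHER          0x4u /* models all non-reclaim bits of the base mask */

/* Both reclaim bits together, modelling __GFP_RECLAIM. */
#define SK_GFP_RECLAIM (SK_GFP_DIRECT_RECLAIM | SK_GFP_KSWAPD_RECLAIM)

/* Mirrors the quoted definition: GFP_NOWAIT is just the kswapd bit. */
#define SK_GFP_NOWAIT (SK_GFP_KSWAPD_RECLAIM)

/* Hypothetical base mask that allows both kinds of reclaim. */
#define SK_GFP_BASE (SK_GFP_OTHER | SK_GFP_RECLAIM)

/*
 * Assumed shape of the demotion mask: strip every reclaim bit from the
 * base mask, then add GFP_NOWAIT. The net effect is that direct reclaim
 * stays off while waking kswapd is allowed again.
 */
static inline gfp_t sk_demote_gfp_mask(void)
{
    return (SK_GFP_BASE & ~SK_GFP_RECLAIM) | SK_GFP_NOWAIT;
}

/* True if an allocation with this mask may enter direct reclaim. */
static inline bool sk_may_direct_reclaim(gfp_t gfp)
{
    return (gfp & SK_GFP_DIRECT_RECLAIM) != 0;
}

/* True if an allocation with this mask may wake kswapd. */
static inline bool sk_may_wake_kswapd(gfp_t gfp)
{
    return (gfp & SK_GFP_KSWAPD_RECLAIM) != 0;
}
```

Under that assumption, sk_may_direct_reclaim(sk_demote_gfp_mask()) is false
and sk_may_wake_kswapd(sk_demote_gfp_mask()) is true, which matches both
observations in the thread: direct reclaim looks disabled at the linked
line, yet kswapd on the lower tier node can still be woken.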