From: Yu Zhao
Date: Thu, 13 Jul 2023 20:08:38 -0600
Subject: Re: [RFC PATCH] madvise: make madvise_cold_or_pageout_pte_range() support large folio
To: Yin Fengwei, Minchan Kim
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, willy@infradead.org, david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
References: <20230713150558.200545-1-fengwei.yin@intel.com>
In-Reply-To: <20230713150558.200545-1-fengwei.yin@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 13, 2023 at 9:06 AM Yin Fengwei wrote:
>
> Current madvise_cold_or_pageout_pte_range() has two problems for
> large folio support:
>   - Using folio_mapcount() with large folio prevent large folio from
>     picking up.
>   - If large folio is in the range requested, shouldn't split it
>     in madvise_cold_or_pageout_pte_range().
>
> Fix them by:
>   - Use folio_estimated_sharers() with large folio
>   - If large folio is in the range requested, don't split it. Leave
>     to page reclaim phase.
>
> For large folio cross boundaries of requested range, skip it if it's
> page cache. Try to split it if it's anonymous folio. If splitting
> fails, skip it.

For now, we may not want to change the existing semantics (heuristic).
IOW, we may want to stick to the "only owner" condition:

  - if (folio_mapcount(folio) != 1)
  + if (folio_entire_mapcount(folio) ||
  +     (any_page_within_range_has_mapcount > 1))

+Minchan Kim

Also, there is an existing bug here: the later commit 07e8c82b5eff8
("madvise: convert madvise_cold_or_pageout_pte_range() to use folios")
is incorrect for sure; the original commit 9c276cc65a58f ("mm:
introduce MADV_COLD") seems incorrect too.

+Vishal Moola (Oracle)

The "any_page_within_range_has_mapcount" test above seems to be the
only correct way to meet the condition claimed by the comment below,
before or after the folio conversion, assuming that a THP page here
means a compound page without PMD mappings (PMD-split). Otherwise the
test is always false (if it's also PMD-mapped somewhere else).

  /*
   * Creating a THP page is expensive so split it only if we
   * are sure it's worth. Split it if we are only owner.
   */

> The main reason to call folio_referenced() is to clear the young bit of
> corresponding PTEs. So in page reclaim phase, there is good chance
> the folio can be reclaimed.
>
> Signed-off-by: Yin Fengwei
> ---
> This patch is based on mlock large folio support rfc2 as it depends
> on the folio_in_range() added by that patchset
>
> Also folio_op_size() can be unified with get_folio_mlock_step().
>
> Testing done:
>   - kselftest: No new regression introduced.
>
>  mm/madvise.c | 133 ++++++++++++++++++++++++++++++++-------------------
>  1 file changed, 84 insertions(+), 49 deletions(-)

Also, the refactor looks fine to me, but it'd be better as a separate
patch.
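
To make the "only owner" condition from earlier in this reply concrete,
here is a minimal userspace sketch. It is not kernel code and not a
proposed patch: struct model_folio, its fields, and
folio_shared_in_range() are hypothetical names that model
folio_entire_mapcount() and the per-page mapcounts as plain integers.
any_page_within_range_has_mapcount is pseudocode in this thread, not an
existing helper, so the loop below is only one assumed way to express it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_PAGES 16

/* Hypothetical userspace model of a large folio; NOT the kernel's struct folio. */
struct model_folio {
	int entire_mapcount;                 /* stands in for folio_entire_mapcount() (PMD mappings) */
	int page_mapcount[MODEL_MAX_PAGES];  /* stands in for the per-page (PTE) mapcounts */
	size_t nr_pages;                     /* number of pages in the folio */
};

/*
 * Returns true when the folio should be skipped, i.e. the caller is not
 * the only owner of the pages in [start, end):
 *   - the folio is PMD-mapped somewhere, or
 *   - any page within the range is mapped more than once.
 */
static bool folio_shared_in_range(const struct model_folio *folio,
				  size_t start, size_t end)
{
	assert(start <= end && end <= folio->nr_pages);

	if (folio->entire_mapcount)
		return true;

	for (size_t i = start; i < end; i++) {
		if (folio->page_mapcount[i] > 1)
			return true;
	}

	return false;
}
```

In this model, a folio with per-page mapcounts {1, 1, 2, 1} and no PMD
mapping is exclusively owned over pages [0, 2) but shared over [0, 4),
and any nonzero entire_mapcount marks it shared regardless of the range.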