Message-ID: <4b47e6317aca3deeabf610a7f4839563ff2b25a1.camel@intel.com>
Subject: Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
From: "ying.huang@intel.com"
To: David Hildenbrand, Miaohe Lin, Oscar Salvador
Cc: akpm@linux-foundation.org, songmuchun@bytedance.com, hch@infradead.org,
 willy@infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Pavel Tatashin, John Hubbard, Linus Torvalds, Vlastimil Babka, Yu Zhao
Date: Fri, 15 Apr 2022 11:07:53 +0800
References: <20220409093500.10329-1-linmiaohe@huawei.com>
 <20220409093500.10329-3-linmiaohe@huawei.com>
 <7455b680-3d89-5d3e-ba0e-6e4358b114a2@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2022-04-13 at 09:26 +0800, ying.huang@intel.com wrote:
> On Tue, 2022-04-12 at 16:59 +0200, David Hildenbrand wrote:
> > On 12.04.22 15:42, Miaohe Lin wrote:
> > > On 2022/4/12 16:59, Oscar Salvador wrote:
> > > > On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
> > > > > We don't need to check can_split_folio() because folio_maybe_dma_pinned()
> > > > > is checked before. It will avoid the long term pinned pages to be swapped
> > > > > out.
> > > > > And we can live with short term pinned pages. Without can_split_folio
> > > > > checking we can simplify the code. Also activate_locked can be changed to
> > > > > keep_locked as it's just short term pinning.
> > > >
> > > > What do you mean by "we can live with short term pinned pages"?
> > > > Does it mean that it was not pinned when we check
> > > > folio_maybe_dma_pinned() but now it is?
> > > >
> > > > To me it looks like the pinning is fluctuating and we rely on
> > > > split_folio_to_list() to see whether we succeed or not, and if not
> > > > we give it another spin in the next round?
> > >
> > > Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be
> > > pinned for a noticeable time. So it's expected to split the folio successfully in the next
> > > round as the pinning is really fluctuating. Or am I miss something?
> > >
> >
> > Just so we're on the same page. folio_maybe_dma_pinned() only capture
> > FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now
> > via vmsplice().
>
> Per my original understanding, folio_maybe_dma_pinned() can be used to
> detect long-term pinned pages. And it seems reasonable to skip the
> long-term pinned pages and try short-term pinned pages during page
> reclaiming. But as you pointed out, vmsplice() doesn't use FOLL_PIN.
> So if vmsplice() is expected to pin pages for long time, and we have no
> way to detect it, then we should keep can_split_folio() in the original
> code.
>
> Copying more people who have worked on long-term pinning for comments.

I checked the discussion in the following thread:

https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQSj80_w@mail.gmail.com/

From a practical point of view, it seems that folio_maybe_dma_pinned()
can identify most long-term pinned pages that may block memory
hot-remove or CMA allocation. However, as David pointed out, some pages
may still be GUPed for a long time (e.g.
via vmsplice) even if !folio_maybe_dma_pinned(). On the other hand,
can_split_huge_page() is cheap, while THP swapout is expensive (it costs
swap space and disk IO, and is hard to recover from), so it may be better
to keep can_split_huge_page() in shrink_page_list().

Best Regards,
Huang, Ying

> >
> > can_split_folio() is more precise then folio_maybe_dma_pinned(), but
> > both are racy as long as the page is still mapped.
> >
> >
>
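
To make the trade-off discussed above concrete, here is a minimal
user-space sketch, not kernel code. It assumes a much simplified model in
which a FOLL_PIN-style pin is visible in a separate pin counter, while a
FOLL_GET-style reference only raises the reference count. The struct, the
model_* functions and the enum are hypothetical names made up for this
illustration. They only mimic the idea behind folio_maybe_dma_pinned() and
can_split_folio() and ignore all locking and races.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of a huge page; not a kernel structure. */
struct model_thp {
	int refcount;   /* all references, including the reclaimer's own */
	int mapcount;   /* references held by page table mappings */
	int pincount;   /* FOLL_PIN-style pins only */
	int extra_pins; /* references expected from page cache or swap cache */
};

enum model_verdict { SKIP_PINNED, SKIP_CANNOT_SPLIT, TRY_SWAPOUT };

/* Pin-counter test: sees FOLL_PIN-style pins, blind to plain references. */
bool model_maybe_dma_pinned(const struct model_thp *p)
{
	return p->pincount > 0;
}

/* Reference-count test: any unexpected reference blocks the split. */
bool model_can_split(const struct model_thp *p)
{
	return p->mapcount == p->refcount - p->extra_pins - 1;
}

/* Run both cheap checks before committing to an expensive swapout. */
enum model_verdict model_reclaim_decision(const struct model_thp *p)
{
	if (model_maybe_dma_pinned(p))
		return SKIP_PINNED;
	if (!model_can_split(p))
		return SKIP_CANNOT_SPLIT;
	return TRY_SWAPOUT;
}

/* Walk one page through the three situations from the thread. */
void model_self_check(void)
{
	/* Mapped once, plus the reclaimer's own reference. */
	struct model_thp page = { .refcount = 2, .mapcount = 1 };

	assert(model_reclaim_decision(&page) == TRY_SWAPOUT);

	/* A FOLL_GET-style reference: invisible to the pin-counter test. */
	page.refcount++;
	assert(!model_maybe_dma_pinned(&page));
	assert(model_reclaim_decision(&page) == SKIP_CANNOT_SPLIT);

	/* A FOLL_PIN-style pin on top: caught by the first check. */
	page.refcount++;
	page.pincount++;
	assert(model_reclaim_decision(&page) == SKIP_PINNED);
}
```

Calling model_self_check() from a small main() exercises all three cases.
In this model, dropping the second check would let the FOLL_GET case go
down the swapout path, which is the situation the reply argues is worth
avoiding given how cheap the check is.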