Subject: Re: [PATCH 1/2] MM: handle THP in swap_*page_fs()
From: "ying.huang@intel.com"
To: NeilBrown, Yang Shi
Cc: Andrew Morton, Geert Uytterhoeven, Christoph Hellwig, Miaohe Lin,
    linux-nfs@vger.kernel.org, Linux MM, Linux Kernel Mailing List
Date: Fri, 06 May 2022 10:56:40 +0800
In-Reply-To: <165170771676.24672.16520001373464213119@noble.neil.brown.name>
X-Mailing-List: linux-nfs@vger.kernel.org

On Thu, 2022-05-05 at 09:41 +1000, NeilBrown wrote:
> On Tue, 03 May 2022, Yang Shi wrote:
> > On Sun, May 1, 2022 at 9:23 PM NeilBrown wrote:
> > >
> > > On Sat, 30 Apr 2022, Yang Shi wrote:
> > > > On Thu, Apr 28, 2022 at 5:44 PM NeilBrown wrote:
> > > > >
> > > > > Pages passed to swap_readpage()/swap_writepage() are not necessarily all
> > > > > the same size - there may be transparent-huge-pages involved.
> > > > >
> > > > > The BIO paths of swap_*page() handle this correctly, but the SWP_FS_OPS
> > > > > path does not.
> > > > >
> > > > > So we need to use thp_size() to find the size, not just assume
> > > > > PAGE_SIZE, and we need to track the total length of the request, not
> > > > > just assume it is "page * PAGE_SIZE".
> > > >
> > > > Swap-over-nfs doesn't support THP swap IIUC. So SWP_FS_OPS should not
> > > > see THP at all. But I agree to remove the assumption about page size
> > > > in this path.
> > >
> > > Can you help me understand this please.  How would the swap code know
> > > that swap-over-NFS doesn't support THP swap?  There is no reason that
> > > NFS wouldn't be able to handle 2MB writes.  Even 1GB should work though
> > > NFS would have to split into several smaller WRITE requests.
> >
> > AFAICT, THP swap is only supported on non-rotational block devices, for
> > example, SSD, PMEM, etc. IIRC, the swap device has to support the
> > cluster in order to swap THP. The cluster is only supported by
> > non-rotational block devices.
> >
> > Looped Ying in, who is the author of THP swap.
>
> I hunted around the code and found that THP swap only happens if a
> 'cluster_info' is allocated, and that only happens if
> 	if (p->bdev && bdev_nonrot(p->bdev)) {
> in the swapon syscall.
>

And in get_swap_pages(), the cluster is only allocated for block
devices.

		if (size == SWAPFILE_CLUSTER) {
			if (si->flags & SWP_BLKDEV)
				n_ret = swap_alloc_cluster(si, swp_entries);
		} else
			n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
						    n_goal, swp_entries);

We may remove this restriction in the future if someone can show the
benefit.

Best Regards,
Huang, Ying

> I guess "nonrot" is being used as a synonym for "low latency"...
> So even if NFS was low-latency it couldn't benefit from THP swap.
>
> So as you say it is not currently possible for THP pages to be sent to
> NFS for swapout.
> It makes sense to prepare for it though I think - if
> only so that the code is more consistent and less confusing.
>
> Thanks,
> NeilBrown
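
As a rough illustration of the length accounting discussed in the patch
description (sum the real size of each page rather than assuming every
page is PAGE_SIZE), here is a small userspace sketch. It is not kernel
code: MODEL_PAGE_SIZE, struct model_page, model_page_bytes() and
model_request_len() are hypothetical stand-ins, with model_page_bytes()
playing the role that thp_size() plays in the kernel, and a 4KB base
page is assumed.

```c
#include <stddef.h>

/* Assumed base page size for this model only (4KB). */
#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct model_page {
	unsigned int order;	/* 0 = base page, 9 = 2MB huge page with 4KB base pages */
};

/* Size in bytes of one page in this model (the role thp_size() plays). */
static size_t model_page_bytes(const struct model_page *p)
{
	return MODEL_PAGE_SIZE << p->order;
}

/*
 * Total length of a request covering nr_pages pages: add up the
 * per-page sizes instead of computing nr_pages * MODEL_PAGE_SIZE.
 */
static size_t model_request_len(const struct model_page *pages, size_t nr_pages)
{
	size_t len = 0;

	for (size_t i = 0; i < nr_pages; i++)
		len += model_page_bytes(&pages[i]);
	return len;
}
```

With one huge page among base pages, the summed length differs from the
naive count-times-PAGE_SIZE figure, which is the assumption the patch
removes from the SWP_FS_OPS path.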