From: Steven Price
Date: Fri, 3 Nov 2023 13:57:49 +0000
Subject: Re: [PATCH v3 4/4] mm: swap: Swap-out small-sized THP without splitting
To: Ryan Roberts, Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, david@redhat.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, mhocko@suse.com, shy828301@gmail.com,
 wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org,
 ying.huang@intel.com, yuzhao@google.com
Message-ID: <2fe5ce7e-9c5c-4df4-b4fc-9fd3d9b2dccb@arm.com>
In-Reply-To: <6641a14b-e3fb-4e9e-bb95-b0306827294b@arm.com>
References: <73aad98e-de4c-4021-af3c-db67e06cdb70@arm.com>
 <20231102223643.7733-1-v-songbaohua@oppo.com>
 <6641a14b-e3fb-4e9e-bb95-b0306827294b@arm.com>

On 03/11/2023 11:31, Ryan Roberts wrote:
> On 02/11/2023 22:36, Barry Song wrote:
>>> But, yes, it would be nice to fix that! And if I've understood the problem
>>> correctly, it doesn't sound like it should be too hard? Is this something you
>>> are volunteering for?? :)
>>
>> Unfortunately, right now I don't have real hardware with MTE that can run
>> the latest kernel,
>> but I have written an RFC; it would be nice to get someone to test it. Let
>> me figure out if we can find someone. :-)
>
> OK, let me know if you find someone. Otherwise I can have a hunt around to see
> if I can test it.
>
>>
>> [RFC PATCH] arm64: mm: swap: save and restore mte tags for large folios
>>
>> This patch makes MTE tag saving and restoring support large folios, so
>> that we don't need to split them into base pages for swapping on ARM64
>> SoCs with MTE.
>>
>> ---
>>  arch/arm64/include/asm/pgtable.h | 21 ++++-----------------
>>  arch/arm64/mm/mteswap.c          | 20 ++++++++++++++++++++
>>  2 files changed, 24 insertions(+), 17 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index 7f7d9b1df4e5..b12783dca00a 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -45,12 +45,6 @@
>>  	__flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
>>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>>
>> -static inline bool arch_thp_swp_supported(void)
>> -{
>> -	return !system_supports_mte();
>> -}
>> -#define arch_thp_swp_supported arch_thp_swp_supported
>
> IIRC, arm64 was the only arch implementing this, so perhaps it should be
> ripped out from the core code now?
>
>> -
>>  /*
>>   * Outside of a few very special situations (e.g. hibernation), we always
>>   * use broadcast TLB invalidation instructions, therefore a spurious page
>> @@ -1028,12 +1022,8 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
>>  #ifdef CONFIG_ARM64_MTE
>>
>>  #define __HAVE_ARCH_PREPARE_TO_SWAP
>> -static inline int arch_prepare_to_swap(struct page *page)
>> -{
>> -	if (system_supports_mte())
>> -		return mte_save_tags(page);
>> -	return 0;
>> -}
>> +#define arch_prepare_to_swap arch_prepare_to_swap
>> +extern int arch_prepare_to_swap(struct page *page);
>
> I think it would be better to modify this API to take a folio explicitly.
> The caller already has the folio.
>
>>
>>  #define __HAVE_ARCH_SWAP_INVALIDATE
>>  static inline void arch_swap_invalidate_page(int type, pgoff_t offset)
>> @@ -1049,11 +1039,8 @@ static inline void arch_swap_invalidate_area(int type)
>>  }
>>
>>  #define __HAVE_ARCH_SWAP_RESTORE
>> -static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
>> -{
>> -	if (system_supports_mte())
>> -		mte_restore_tags(entry, &folio->page);
>> -}
>> +#define arch_swap_restore arch_swap_restore
>> +extern void arch_swap_restore(swp_entry_t entry, struct folio *folio);
>>
>>  #endif /* CONFIG_ARM64_MTE */
>>
>> diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
>> index a31833e3ddc5..e5637e931e4f 100644
>> --- a/arch/arm64/mm/mteswap.c
>> +++ b/arch/arm64/mm/mteswap.c
>> @@ -83,3 +83,23 @@ void mte_invalidate_tags_area(int type)
>>  	}
>>  	xa_unlock(&mte_pages);
>>  }
>> +
>> +int arch_prepare_to_swap(struct page *page)
>> +{
>> +	if (system_supports_mte()) {
>> +		struct folio *folio = page_folio(page);
>> +		long i, nr = folio_nr_pages(folio);
>> +		for (i = 0; i < nr; i++)
>> +			return mte_save_tags(folio_page(folio, i));
>
> This will return after saving the tags for only the first page of the
> folio! You will need to add each page in a loop, and if you get an error
> at any point you will need to remove the pages that you already added
> successfully, by calling arch_swap_invalidate_page() as far as I can see.
> Steven, can you confirm?

Yes, that's right. mte_save_tags() needs to allocate memory, so it can
fail; if it does fail then arch_prepare_to_swap() would need to put
things back how they were, with calls to mte_invalidate_tags() (although
I think you'd actually want to refactor that to create a function which
takes a struct page *).

Steve
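For illustration only, the loop-with-unwind described above might look
something like the sketch below. This is not the actual code;
__mte_invalidate_tags() here stands in for the hypothetical page-based
invalidation helper Steve mentions:

int arch_prepare_to_swap(struct page *page)
{
	struct folio *folio = page_folio(page);
	long i, nr = folio_nr_pages(folio);
	int err;

	if (!system_supports_mte())
		return 0;

	for (i = 0; i < nr; i++) {
		err = mte_save_tags(folio_page(folio, i));
		if (err) {
			/* Unwind: drop the tags already saved for this folio. */
			while (i--)
				__mte_invalidate_tags(folio_page(folio, i));
			return err;
		}
	}
	return 0;
}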
>> +	}
>> +	return 0;
>> +}
>> +
>> +void arch_swap_restore(swp_entry_t entry, struct folio *folio)
>> +{
>> +	if (system_supports_mte()) {
>> +		long i, nr = folio_nr_pages(folio);
>> +		for (i = 0; i < nr; i++)
>> +			mte_restore_tags(entry, folio_page(folio, i));
>
> Swap-in currently doesn't support large folios - everything is a
> single-page folio. So this isn't technically needed. But from the API
> point of view it seems reasonable to make this change - except your
> implementation is broken: you are currently setting every page in the
> folio to use the same tags as the first page. You need to increment the
> swap entry for each page.
>
> Thanks,
> Ryan
>
>> +	}
>> +}
>
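For the restore side, a minimal sketch of the fix Ryan describes,
constructing a distinct swap entry for each subpage rather than reusing
the first one (again illustrative only; swp_entry(), swp_type() and
swp_offset() are the generic helpers from <linux/swapops.h>):

void arch_swap_restore(swp_entry_t entry, struct folio *folio)
{
	long i, nr = folio_nr_pages(folio);

	if (!system_supports_mte())
		return;

	for (i = 0; i < nr; i++) {
		/* Each subpage's tags were saved under its own swap entry. */
		swp_entry_t e = swp_entry(swp_type(entry),
					  swp_offset(entry) + i);

		mte_restore_tags(e, folio_page(folio, i));
	}
}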