Date: Sun, 8 May 2022 18:08:18 +0100
From: Matthew Wilcox
To: Baolin Wang
Cc: catalin.marinas@arm.com, will@kernel.org, arnd@arndb.de,
    mike.kravetz@oracle.com, akpm@linux-foundation.org, sj@kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org
Subject: Re: [RFC PATCH 0/3] Introduce new huge_ptep_get_access_flags() interface

On Sun, May 08, 2022 at 04:58:51PM +0800, Baolin Wang wrote:
> As Mike pointed out [1], the huge_ptep_get() will only return one specific
> pte value for the CONT-PTE or CONT-PMD size hugetlb on ARM64 system, which
> will not take into account the subpages' dirty or young bits of a CONT-PTE/PMD
> size hugetlb page. That will make us miss dirty or young flags of a CONT-PTE/PMD
> size hugetlb page for those functions that want to check the dirty or
> young flags of a hugetlb page. For example, the gather_hugetlb_stats() will
> get inaccurate dirty hugetlb page statistics, and the DAMON for hugetlb monitoring
> will also get inaccurate access statistics.
>
> To fix this issue, one approach is that we can define an ARM64 specific huge_ptep_get()
> implementation, which will take into account any subpages' dirty or young bits.
> However we should add a new parameter for ARM64 specific huge_ptep_get() to check
> how many continuous PTEs or PMDs in this CONT-PTE/PMD size hugetlb, that means we
> should convert all the places using huge_ptep_get(), meanwhile most places using
> huge_ptep_get() did not care about the dirty or young flags at all.
>
> So instead of changing the prototype of huge_ptep_get(), this patch set introduces
> a new huge_ptep_get_access_flags() interface and define an ARM64 specific implementation,
> that will take into account any subpages' dirty or young bits for CONT-PTE/PMD size
> hugetlb page. And we can only change to use huge_ptep_get_access_flags() for those
> functions that care about the dirty or young flags of a hugetlb page.

I question whether this is the right approach.  I understand that
different hardware implementations have different requirements here,
but at least one that I'm aware of (AMD Zen 2/3) requires that all
PTEs that are part of a contig PTE must have identical A/D bits.  Now,
you could say that's irrelevant because it's x86 and we don't currently
support contPTE on x86, but I wouldn't be surprised to see that other
hardware has the same requirement.

So what if we make that a Linux requirement?  Setting a contPTE dirty
or accessed becomes a bit more expensive (although still one/two
cachelines, so not really much more expensive than a single write).
Then there's no need to change the "get" side of things because they're
always identical.

It does mean that we can't take advantage of hardware setting A/D bits,
unless hardware can be persuaded to behave this way.  I don't have any
ARM specs in front of me to check.

I don't have a hard objection to your approach, I just want to discuss
other possibilities.
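
To make the two options concrete, here is a minimal userspace sketch, not
kernel code.  It models a contiguous range as a plain array.  The flag
names PTE_YOUNG and PTE_DIRTY, the range size CONT_PTES (16 eight-byte
entries, i.e. two 64-byte cachelines), and all three helper functions are
assumptions made up for illustration; none of them is taken from the
patch set or from the arm64 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits and range size, chosen only for this sketch. */
#define PTE_YOUNG  (1u << 0)
#define PTE_DIRTY  (1u << 1)
#define PTE_AD     (PTE_YOUNG | PTE_DIRTY)
#define CONT_PTES  16

typedef uint64_t pte_t;   /* stand-in for the kernel type */

/*
 * Option A (gather on read): return the first entry with the young and
 * dirty bits of every entry in the range folded in.
 */
static pte_t cont_get_access_flags(const pte_t *ptep, size_t ncontig)
{
    pte_t pte = ptep[0];

    for (size_t i = 1; i < ncontig; i++)
        pte |= ptep[i] & PTE_AD;
    return pte;
}

/* Returns 1 if every entry carries the same young/dirty bits. */
static int cont_flags_uniform(const pte_t *ptep, size_t ncontig)
{
    for (size_t i = 1; i < ncontig; i++)
        if ((ptep[i] & PTE_AD) != (ptep[0] & PTE_AD))
            return 0;
    return 1;
}

/*
 * Option B (identical bits as an invariant): software sets the flags on
 * every entry, so a reader only ever needs to look at the first one.
 */
static void cont_set_flags(pte_t *ptep, size_t ncontig, pte_t flags)
{
    for (size_t i = 0; i < ncontig; i++)
        ptep[i] |= flags & PTE_AD;
    assert(cont_flags_uniform(ptep, ncontig));
}
```

With option A, a range where only one sub-entry was marked dirty reports
clean if you read ptep[0] directly, and dirty through
cont_get_access_flags().  With option B the write touches the whole range,
but ptep[0] alone is then always representative, which is why the "get"
side would not need to change.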