Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp277073pxb; Thu, 12 Nov 2020 03:41:39 -0800 (PST) X-Google-Smtp-Source: ABdhPJzJcZ7iuDpu7Bw8f282HEG/a+4oxsZnp9X1hOugdsdXYFpKvN+ly9IYxRbJGsApMc8J//6A X-Received: by 2002:aa7:d94c:: with SMTP id l12mr4690267eds.355.1605181298980; Thu, 12 Nov 2020 03:41:38 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1605181298; cv=none; d=google.com; s=arc-20160816; b=KkgSCkTvLA15GXfJWXZQsXu2sSINUD1FwTwn8exZrQJz3Zq6yY2HfdiZ/g0fPuD1zC B+T9mOe+HkPsHU5q2uSKLFHZAz3eUO4rroMlM4G18lJoyPUjWtVK/WztuaPMJEBlpnli QdgxHPrckIINpxmhUd2JSIgf0tLSoj5eL1hoJHxv9AkVmh9/MwUW+svovQLK5J9WmN9J 9vZwA+Re87yhtCfEK+cz6d+XTP6BifumkbGWyoLUJ6tRXzVg9wzfIxtxN38HMJ/ukC/v b9zLs6+xsYN2pvC7Ad8pw26iF4Hu9S1Lj/BVO1g4Y8gQrMWxpXd138uJlt6UiT/pCCeF wniQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=BN2Rcp32NuciRFFCsn+z9483Z8Bc8dD4XTOlc3WJyMc=; b=Gm08Dg8ixncSVzOhaySwjyEGyf8EwDSS5yxcfdaldgGGKTcg9Byl0jMvVril0hjaNi H+Tb01irGEtU0RFeAuOatN3rrCDcvybQ6Go2b0cLXI0sF12QOiPPYjMmCCtByFP3K6Px qHqyvJiuIpSjLcHnJ7umGRhNlSsY6aLRm/YjmPFtLRjbk/tyvE1q+/Ls/cx138ma9mtQ cs9KkGBVqvb8PSqvhHfXIaErvvgKYAvo+y1PQ6GpZnk4RD53XtsE0oNIy0/lkhAZm+1d 5j9wOGNJ1pQdhENOFqaxNnmfmk2yKUOlcoF0Njs+eJo6k/rsItagqgz0eUNy3AjsBlkt 1M+A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=merlin.20170209 header.b=JtvD8qtD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id z5si3586661edj.362.2020.11.12.03.41.13; Thu, 12 Nov 2020 03:41:38 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=merlin.20170209 header.b=JtvD8qtD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727790AbgKLLhX (ORCPT + 99 others); Thu, 12 Nov 2020 06:37:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60594 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727234AbgKLLhT (ORCPT ); Thu, 12 Nov 2020 06:37:19 -0500 Received: from merlin.infradead.org (merlin.infradead.org [IPv6:2001:8b0:10b:1231::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CB670C0613D1 for ; Thu, 12 Nov 2020 03:37:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=merlin.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=BN2Rcp32NuciRFFCsn+z9483Z8Bc8dD4XTOlc3WJyMc=; b=JtvD8qtD1ZB6xYYKUZ3comLYb/ bVp+GlaK4zpeIx0eSTKY6Y8DHFO4c19kizbYk7gmSRwtBR2l/wzLCkQu/nD5v0kaGjRTpbNn590JX stqPGuUX71w6dQWFuiPurpvaL+l+LtUH8DRBRg0dOwTmiUacDbxYeMtJP81HOexa+Tt1f7O3bWykT BXMsVmCvKQeOs8cHL0xrZo9LADtqcoHajORnzFeG5J8bZDXPagSM2lGflFDXr5AhcFlaCX5JgQauv Hm16G3QiyZY8rgtvOK5GJSXcFXQFazcoaRs3TThMW5lI+AW4E+i34nQf7CoiZUwDPS7JVSaNuinma f1vzezcg==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=noisy.programming.kicks-ass.net) by merlin.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1kdAuG-0005AB-Hd; Thu, 12 Nov 2020 11:36:48 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 97F67300455; Thu, 12 Nov 2020 12:36:45 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id 881672C71BAB3; Thu, 12 Nov 2020 12:36:45 +0100 (CET) Date: Thu, 12 Nov 2020 12:36:45 +0100 From: Peter Zijlstra To: Matthew Wilcox Cc: "Liang, Kan" , Will Deacon , Michael Ellerman , mingo@redhat.com, acme@kernel.org, linux-kernel@vger.kernel.org, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@redhat.com, eranian@google.com, ak@linux.intel.com, dave.hansen@intel.com, kirill.shutemov@linux.intel.com, benh@kernel.crashing.org, paulus@samba.org, David Miller , vbabka@suse.cz Subject: Re: [PATCH V9 1/4] perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE Message-ID: <20201112113645.GT2651@hirez.programming.kicks-ass.net> References: <20201111112246.GR2651@hirez.programming.kicks-ass.net> <20201111124357.GS2651@hirez.programming.kicks-ass.net> <20201111153022.GT17076@casper.infradead.org> <20201111155724.GE2628@hirez.programming.kicks-ass.net> <20201111163848.GU17076@casper.infradead.org> <20201111172253.GG2628@hirez.programming.kicks-ass.net> <20201111182620.GW17076@casper.infradead.org> <20201111200000.GL2628@hirez.programming.kicks-ass.net> <20201111223344.GX17076@casper.infradead.org> <20201112095358.GU2628@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20201112095358.GU2628@hirez.programming.kicks-ass.net> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Nov 12, 2020 at 10:53:58AM +0100, Peter Zijlstra wrote: > You just don't like it because you want me to be purely page-table > based. How's something like this then? I failed to untangle Power's many MMUs though :/ --- arch/arm64/include/asm/pgtable.h | 3 ++ arch/sparc/include/asm/pgtable_64.h | 14 +++++ arch/sparc/mm/hugetlbpage.c | 19 ++++--- include/linux/pgtable.h | 16 ++++++ kernel/events/core.c | 100 +++++++++++++----------------------- 5 files changed, 82 insertions(+), 70 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 4ff12a7adcfd..1cd2d986b0ca 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -503,6 +503,9 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn, PMD_TYPE_SECT) #define pmd_leaf(pmd) pmd_sect(pmd) +#define pmd_leaf_size(pmd) (pmd_cont(pmd) ? CONT_PMD_SIZE : PMD_SIZE) +#define pte_leaf_size(pte) (pte_cont(pte) ? CONT_PTE_SIZE : PAGE_SIZE) + #if defined(CONFIG_ARM64_64K_PAGES) || CONFIG_PGTABLE_LEVELS < 3 static inline bool pud_sect(pud_t pud) { return false; } static inline bool pud_table(pud_t pud) { return true; } diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index 7ef6affa105e..1723e18ba89f 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -1121,6 +1121,20 @@ extern unsigned long cmdline_memory_size; asmlinkage void do_sparc64_fault(struct pt_regs *regs); +#ifdef CONFIG_HUGETLB_PAGE + +#define pud_leaf_size pud_leaf_size +extern unsigned long pud_leaf_size(pud_t pud); + +#define pmd_leaf_size pmd_leaf_size +extern unsigned long pmd_leaf_size(pmd_t pmd); + +#define pte_leaf_size pte_leaf_size +extern unsigned long pte_leaf_size(pte_t pte); +#endif + +#endif /* CONFIG_HUGETLB_PAGE */ + #endif /* !(__ASSEMBLY__) */ #endif /* !(_SPARC64_PGTABLE_H) */ diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c index ec423b5f17dd..3e806b87ec19 100644 --- a/arch/sparc/mm/hugetlbpage.c +++ b/arch/sparc/mm/hugetlbpage.c @@ -247,14 +247,17 @@ static unsigned int sun4u_huge_tte_to_shift(pte_t entry) return shift; } -static unsigned int huge_tte_to_shift(pte_t entry) +static unsigned long __tte_to_shift(pte_t entry) { - unsigned long shift; - if (tlb_type == hypervisor) - shift = sun4v_huge_tte_to_shift(entry); - else - shift = sun4u_huge_tte_to_shift(entry); + return sun4v_huge_tte_to_shift(entry); + + return sun4u_huge_tte_to_shift(entry); +} + +static unsigned int huge_tte_to_shift(pte_t entry) +{ + unsigned long shift = __tte_to_shift(entry); if (shift == PAGE_SHIFT) WARN_ONCE(1, "tto_to_shift: invalid hugepage tte=0x%lx\n", @@ -272,6 +275,10 @@ static unsigned long huge_tte_to_size(pte_t pte) return size; } +unsigned long pud_leaf_size(pud_t pud) { return 1UL << __tte_to_shift((pte_t)pud); } +unsigned long pmd_leaf_size(pmd_t pmd) { return 1UL << __tte_to_shift((pte_t)pmd); } +unsigned long pte_leaf_size(pte_t pte) { return 1UL << __tte_to_shift((pte_t)pte); } + pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz) { diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 71125a4676c4..35b9da397e55 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1481,4 +1481,20 @@ typedef unsigned int pgtbl_mod_mask; #define pmd_leaf(x) 0 #endif +#ifndef pgd_leaf_size(x) +#define pgd_leaf_size(x) PGD_SIZE +#endif +#ifndef p4d_leaf_size(x) +#define p4d_leaf_size(x) P4D_SIZE +#endif +#ifndef pud_leaf_size(x) +#define pud_leaf_size(x) PUD_SIZE +#endif +#ifndef pmd_leaf_size(x) +#define pmd_leaf_size(x) PMD_SIZE +#endif +#ifndef pte_leaf_size(x) +#define pte_leaf_size(x) PAGE_SIZE +#endif + #endif /* _LINUX_PGTABLE_H */ diff --git a/kernel/events/core.c b/kernel/events/core.c index d2f3ca792936..fca04fcfc9ea 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7001,90 +7001,62 @@ static u64 perf_virt_to_phys(u64 virt) return phys_addr; } -#ifdef CONFIG_MMU - /* * Return the MMU page size of a given virtual address. - * - * This generic implementation handles page-table aligned huge pages, as well - * as non-page-table aligned hugetlbfs compound pages. - * - * If an architecture supports and uses non-page-table aligned pages in their - * kernel mapping it will need to provide it's own implementation of this - * function. */ -__weak u64 arch_perf_get_page_size(struct mm_struct *mm, unsigned long addr) +static u64 arch_perf_get_page_size(struct mm_struct *mm, unsigned long addr) { + u64 size = 0; +#ifdef CONFIG_HAVE_FAST_GUP struct page *page; - pgd_t *pgd; - p4d_t *p4d; - pud_t *pud; - pmd_t *pmd; - pte_t *pte; - - pgd = pgd_offset(mm, addr); - if (pgd_none(*pgd)) + pgd_t *pgdp, pgd; + p4d_t *p4dp, p4d; + pud_t *pudp, pud; + pmd_t *pmdp, pmd; + pte_t *ptep, pte; + + pgdp = pgd_offset(mm, addr); + pgd = READ_ONCE(*pgdp); + if (pgd_none(pgd)) return 0; - p4d = p4d_offset(pgd, addr); - if (!p4d_present(*p4d)) - return 0; + if (pgd_leaf(pgd)) + return pgd_leaf_size(pgd); - if (p4d_leaf(*p4d)) - return 1ULL << P4D_SHIFT; - - pud = pud_offset(p4d, addr); - if (!pud_present(*pud)) + p4dp = p4d_offset_lockless(pgdp, pgd, addr); + p4d = READ_ONCE(*p4dp); + if (!p4d_present(p4d)) return 0; - if (pud_leaf(*pud)) { -#ifdef pud_page - page = pud_page(*pud); - if (PageHuge(page)) - return page_size(compound_head(page)); -#endif - return 1ULL << PUD_SHIFT; - } + if (p4d_leaf(p4d)) + return p4d_leaf_size(p4d); - pmd = pmd_offset(pud, addr); - if (!pmd_present(*pmd)) + pudp = pud_offset_lockless(p4dp, p4d, addr); + pud = READ_ONCE(*pudp); + if (!pud_present(pud)) return 0; - if (pmd_leaf(*pmd)) { -#ifdef pmd_page - page = pmd_page(*pmd); - if (PageHuge(page)) - return page_size(compound_head(page)); -#endif - return 1ULL << PMD_SHIFT; - } + if (pud_leaf(pud)) + return pud_leaf_size(pud); - pte = pte_offset_map(pmd, addr); - if (!pte_present(*pte)) { - pte_unmap(pte); + pmdp = pmd_offset_lockless(pudp, pud, addr); + pmd = READ_ONCE(*pmdp); + if (!pmd_present(pmd)) return 0; - } - - page = pte_page(*pte); - if (PageHuge(page)) { - u64 size = page_size(compound_head(page)); - pte_unmap(pte); - return size; - } - pte_unmap(pte); - return PAGE_SIZE; -} + if (pmd_leaf(pmd)) + return pmd_leaf_size(pmd); -#else + ptep = pte_offset_map(&pmd, addr); + pte = READ_ONCE(*ptep); // gup_get_pte() + if (pte_present(pte)) + size = pte_leaf_size(pte); + pte_unmap(ptep); -static u64 arch_perf_get_page_size(struct mm_struct *mm, unsigned long addr) -{ - return 0; +#endif /* CONFIG_HAVE_FAST_GUP */ + return size; } -#endif - static u64 perf_get_page_size(unsigned long addr) { struct mm_struct *mm;