From: Steven Price
To: linux-mm@kvack.org
Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
 Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar, James Morse,
 Jérôme Glisse, Peter Zijlstra, Thomas Gleixner, Will Deacon,
 x86@kernel.org, "H. Peter Anvin", linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, Mark Rutland, "Liang, Kan", Andrew Morton
Subject: [PATCH v10 14/22] mm: pagewalk: Add 'depth' parameter to pte_hole
Date: Wed, 31 Jul 2019 16:45:55 +0100
Message-Id: <20190731154603.41797-15-steven.price@arm.com>
In-Reply-To: <20190731154603.41797-1-steven.price@arm.com>
References: <20190731154603.41797-1-steven.price@arm.com>

The pte_hole() callback is called at multiple levels of the page tables.
Code dumping the kernel page tables needs to know at what depth the
missing entry is. Add this as an extra parameter to pte_hole(). When the
depth isn't known (e.g. processing a vma) then -1 is passed.

The depth that is reported is the actual level where the entry is
missing (ignoring any folding that is in place), i.e. any levels where
PTRS_PER_P?D is set to 1 are ignored.

Note that depth starts at 0 for a PGD so that PUD/PMD/PTE retain their
natural numbers as levels 2/3/4.

Signed-off-by: Steven Price
---
 fs/proc/task_mmu.c |  4 ++--
 include/linux/mm.h |  6 ++++--
 mm/hmm.c           |  2 +-
 mm/migrate.c       |  1 +
 mm/mincore.c       |  1 +
 mm/pagewalk.c      | 31 +++++++++++++++++++++++++------
 6 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 731642e0f5a0..b2f87fde69eb 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -504,7 +504,7 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
 
 #ifdef CONFIG_SHMEM
 static int smaps_pte_hole(unsigned long addr, unsigned long end,
-		struct mm_walk *walk)
+		__always_unused int depth, struct mm_walk *walk)
 {
 	struct mem_size_stats *mss = walk->private;
 
@@ -1274,7 +1274,7 @@ static int add_to_pagemap(unsigned long addr, pagemap_entry_t *pme,
 }
 
 static int pagemap_pte_hole(unsigned long start, unsigned long end,
-				struct mm_walk *walk)
+				__always_unused int depth, struct mm_walk *walk)
 {
 	struct pagemapread *pm = walk->private;
 	unsigned long addr = start;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e2581ec5324e..6b2e6d65cb4c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1440,7 +1440,9 @@ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
  *             pmd_trans_huge() pmds. They may simply choose to
  *             split_huge_page() instead of handling it explicitly.
  * @pte_entry: if set, called for each non-empty PTE (lowest-level) entry
- * @pte_hole: if set, called for each hole at all levels
+ * @pte_hole: if set, called for each hole at all levels,
+ *            depth is -1 if not known, 0:PGD, 1:P4D, 2:PUD, 3:PMD, 4:PTE
+ *            any depths where PTRS_PER_P?D is equal to 1 are skipped
  * @hugetlb_entry: if set, called for each hugetlb entry
  * @test_walk: caller specific callback function to determine whether
  *             we walk over the current vma or not. Returning 0
@@ -1473,7 +1475,7 @@ struct mm_walk {
 	int (*pte_entry)(pte_t *pte, unsigned long addr,
 			 unsigned long next, struct mm_walk *walk);
 	int (*pte_hole)(unsigned long addr, unsigned long next,
-			struct mm_walk *walk);
+			int depth, struct mm_walk *walk);
 	int (*hugetlb_entry)(pte_t *pte, unsigned long hmask,
 			     unsigned long addr, unsigned long next,
 			     struct mm_walk *walk);
diff --git a/mm/hmm.c b/mm/hmm.c
index e1eedef129cf..413944bb99dc 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -433,7 +433,7 @@ static void hmm_range_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
 }
 
 static int hmm_vma_walk_hole(unsigned long addr, unsigned long end,
-			     struct mm_walk *walk)
+			     __always_unused int depth, struct mm_walk *walk)
 {
 	struct hmm_vma_walk *hmm_vma_walk = walk->private;
 	struct hmm_range *range = hmm_vma_walk->range;
diff --git a/mm/migrate.c b/mm/migrate.c
index 8992741f10aa..b92014ceb6dc 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2130,6 +2130,7 @@ struct migrate_vma {
 
 static int migrate_vma_collect_hole(unsigned long start,
 				    unsigned long end,
+				    __always_unused int depth,
 				    struct mm_walk *walk)
 {
 	struct migrate_vma *migrate = walk->private;
diff --git a/mm/mincore.c b/mm/mincore.c
index 4fe91d497436..8ba0fd80d449 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -112,6 +112,7 @@ static int __mincore_unmapped_range(unsigned long addr, unsigned long end,
 }
 
 static int mincore_unmapped_range(unsigned long addr, unsigned long end,
+				   __always_unused int depth,
 				   struct mm_walk *walk)
 {
 	walk->private += __mincore_unmapped_range(addr, end,
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index 6bea79b95be3..cecc91259707 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -4,6 +4,22 @@
 #include
 #include
 
+/*
+ * We want to know the real level where an entry is located ignoring any
+ * folding of levels which may be happening. For example if p4d is folded then
+ * a missing entry found at level 1 (p4d) is actually at level 0 (pgd).
+ */
+static int real_depth(int depth)
+{
+	if (depth == 3 && PTRS_PER_PMD == 1)
+		depth = 2;
+	if (depth == 2 && PTRS_PER_PUD == 1)
+		depth = 1;
+	if (depth == 1 && PTRS_PER_P4D == 1)
+		depth = 0;
+	return depth;
+}
+
 static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 			  struct mm_walk *walk)
 {
@@ -31,6 +47,7 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 	pmd_t *pmd;
 	unsigned long next;
 	int err = 0;
+	int depth = real_depth(3);
 
 	if (walk->test_pmd) {
 		err = walk->test_pmd(addr, end, pmd_offset(pud, 0UL), walk);
@@ -46,7 +63,7 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 		next = pmd_addr_end(addr, end);
 		if (pmd_none(*pmd)) {
 			if (walk->pte_hole)
-				err = walk->pte_hole(addr, next, walk);
+				err = walk->pte_hole(addr, next, depth, walk);
 			if (err)
 				break;
 			continue;
@@ -89,6 +106,7 @@ static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
 	pud_t *pud;
 	unsigned long next;
 	int err = 0;
+	int depth = real_depth(2);
 
 	if (walk->test_pud) {
 		err = walk->test_pud(addr, end, pud_offset(p4d, 0UL), walk);
@@ -104,7 +122,7 @@ static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
 		next = pud_addr_end(addr, end);
 		if (pud_none(*pud)) {
 			if (walk->pte_hole)
-				err = walk->pte_hole(addr, next, walk);
+				err = walk->pte_hole(addr, next, depth, walk);
 			if (err)
 				break;
 			continue;
@@ -139,6 +157,7 @@ static int walk_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end,
 	p4d_t *p4d;
 	unsigned long next;
 	int err = 0;
+	int depth = real_depth(1);
 
 	if (walk->test_p4d) {
 		err = walk->test_p4d(addr, end, p4d_offset(pgd, 0UL), walk);
@@ -153,7 +172,7 @@ static int walk_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end,
 		next = p4d_addr_end(addr, end);
 		if (p4d_none_or_clear_bad(p4d)) {
 			if (walk->pte_hole)
-				err = walk->pte_hole(addr, next, walk);
+				err = walk->pte_hole(addr, next, depth, walk);
 			if (err)
 				break;
 			continue;
@@ -184,7 +203,7 @@ static int walk_pgd_range(unsigned long addr, unsigned long end,
 		next = pgd_addr_end(addr, end);
 		if (pgd_none_or_clear_bad(pgd)) {
 			if (walk->pte_hole)
-				err = walk->pte_hole(addr, next, walk);
+				err = walk->pte_hole(addr, next, 0, walk);
 			if (err)
 				break;
 			continue;
@@ -230,7 +249,7 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end,
 		if (pte)
 			err = walk->hugetlb_entry(pte, hmask, addr, next, walk);
 		else if (walk->pte_hole)
-			err = walk->pte_hole(addr, next, walk);
+			err = walk->pte_hole(addr, next, -1, walk);
 
 		if (err)
 			break;
@@ -273,7 +292,7 @@ static int walk_page_test(unsigned long start, unsigned long end,
 	if (vma->vm_flags & VM_PFNMAP) {
 		int err = 1;
 		if (walk->pte_hole)
-			err = walk->pte_hole(start, end, -1, walk);
+			err = walk->pte_hole(start, end, -1, walk);
 		return err ? err : 1;
 	}
 	return 0;
-- 
2.20.1
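
For readers following the folding rule, here is a minimal userspace sketch of how the reported depth collapses when levels are folded. It is not part of the patch. The struct pt_layout type and the model_real_depth() and depth_name() helpers are hypothetical names made up for this illustration; the kernel code uses the compile-time PTRS_PER_P4D/PUD/PMD constants directly, while this model takes them as runtime values so that several configurations can be compared.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's compile-time PTRS_PER_* constants.
 * A value of 1 means the level is folded into the level above it.
 */
struct pt_layout {
	int ptrs_per_p4d;
	int ptrs_per_pud;
	int ptrs_per_pmd;
};

/*
 * Userspace model of real_depth() from the patch: the same cascade of
 * checks, with the constants read from a struct instead of macros.
 */
static int model_real_depth(int depth, const struct pt_layout *l)
{
	if (depth == 3 && l->ptrs_per_pmd == 1)
		depth = 2;
	if (depth == 2 && l->ptrs_per_pud == 1)
		depth = 1;
	if (depth == 1 && l->ptrs_per_p4d == 1)
		depth = 0;
	return depth;
}

/*
 * Hypothetical helper mapping a depth to a level name, following the
 * numbering in the mm.h comment: 0:PGD, 1:P4D, 2:PUD, 3:PMD, 4:PTE.
 * Anything else (including -1) is reported as "unknown".
 */
static const char *depth_name(int depth)
{
	static const char *const names[] = {
		"PGD", "P4D", "PUD", "PMD", "PTE"
	};

	if (depth < 0 || depth > 4)
		return "unknown";
	return names[depth];
}
```

Under these assumptions, a layout with only P4D folded reports a hole found while walking P4D entries as depth 0 (PGD) and leaves PUD and PMD holes at 2 and 3. A layout with P4D and PUD both folded reports a PUD-level hole as depth 0 as well, since the cascade falls through both checks.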