Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp207785yba; Wed, 3 Apr 2019 07:19:59 -0700 (PDT) X-Google-Smtp-Source: APXvYqzC+EcHx5sPM9fDdi/T5vKkEvAgupMpKMVElPfyLMVo8Ioe2sdYxb6VIyLESUkg63xlFRnz X-Received: by 2002:aa7:8ac8:: with SMTP id b8mr13153238pfd.234.1554301199903; Wed, 03 Apr 2019 07:19:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1554301199; cv=none; d=google.com; s=arc-20160816; b=zvEE9iUb07bjM+AjrA5YnA989SFsCSZGQbjzEXq38M+ymG9WVWVRkmWyHV2VFLh2nr u8jbi2gzzgQG/a44/QLP8JA8AadegEFkcCmDPluOiHpw3U9TZnwalXSmgoXo+BGna518 DwL4Qal7Y4GtcVS6qmYXq3EWH9GdQ1EgdeKbyxzT8jOSv+v34oPCUwANk/NrSxwLMnQ6 4z06P1m0MlM1Ae6gT1qdWch+EPYgIOIjtRWgbAc9Pibx0BxI9++GLwIsdFkGnazoFk7a nRyK/jlbGlXB8urRzEZ3nz89uo+1xMbPgevjy2KdduJUaV41Ptve2OBHoL6GCa6HljcS rFVA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=kX4dLJraJo092n4y125DLKk3n4IU7HsLifEATF3quag=; b=zuIU71ghYQzfWdMt4VYGxwfRdrixWvXlcsEQsGiHj/nM9R3qozRKeKjEOa0L6ASgLj nNM70JboNeAJHLzNwjRKRiEOAE30ZkwvvNKlGhHQxtm6ihUekX6VZzz8zoso3mouGF7Y NW0bZhB8A9YRAZE4WKccOI+fx6B+DpXG0InnjD540+ApZZ3J7iM3W0b0cfhjO7R7+RyP YoSjqs+j95MA0lM2Y7EahkemnTph/zn9aDlKr9YK9Z2QnFzQ3xtbaZretTTkmqq62T3B z6cZRU0E1cNlFVE5/A6iyYOnW7wY73w8Y8ZR7GKd86EDP4od0pzX7cRJenYeJWBWHoQR wS0w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id v5si13734827pgb.83.2019.04.03.07.19.44; Wed, 03 Apr 2019 07:19:59 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728674AbfDCOSb (ORCPT + 99 others); Wed, 3 Apr 2019 10:18:31 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:42078 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728668AbfDCOS2 (ORCPT ); Wed, 3 Apr 2019 10:18:28 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3968D169E; Wed, 3 Apr 2019 07:18:28 -0700 (PDT) Received: from e112269-lin.arm.com (e112269-lin.cambridge.arm.com [10.1.196.69]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id CF37D3F68F; Wed, 3 Apr 2019 07:18:24 -0700 (PDT) From: Steven Price To: linux-mm@kvack.org Cc: Steven Price , Andy Lutomirski , Ard Biesheuvel , Arnd Bergmann , Borislav Petkov , Catalin Marinas , Dave Hansen , Ingo Molnar , James Morse , =?UTF-8?q?J=C3=A9r=C3=B4me=20Glisse?= , Peter Zijlstra , Thomas Gleixner , Will Deacon , x86@kernel.org, "H. Peter Anvin" , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Mark Rutland , "Liang, Kan" , Andrew Morton Subject: [PATCH v8 20/20] x86: mm: Convert dump_pagetables to use walk_page_range Date: Wed, 3 Apr 2019 15:16:27 +0100 Message-Id: <20190403141627.11664-21-steven.price@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190403141627.11664-1-steven.price@arm.com> References: <20190403141627.11664-1-steven.price@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Make use of the new functionality in walk_page_range to remove the arch page walking code and use the generic code to walk the page tables. The effective permissions are passed down the chain using new fields in struct pg_state. The KASAN optimisation is implemented by including test_p?d callbacks which can decide to skip an entire tree of entries Signed-off-by: Steven Price --- arch/x86/mm/dump_pagetables.c | 280 ++++++++++++++++++---------------- 1 file changed, 146 insertions(+), 134 deletions(-) diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c index c0fbb9e5a790..f6b814aaddf7 100644 --- a/arch/x86/mm/dump_pagetables.c +++ b/arch/x86/mm/dump_pagetables.c @@ -33,6 +33,10 @@ struct pg_state { int level; pgprot_t current_prot; pgprotval_t effective_prot; + pgprotval_t effective_prot_pgd; + pgprotval_t effective_prot_p4d; + pgprotval_t effective_prot_pud; + pgprotval_t effective_prot_pmd; unsigned long start_address; unsigned long current_address; const struct addr_marker *marker; @@ -356,22 +360,21 @@ static inline pgprotval_t effective_prot(pgprotval_t prot1, pgprotval_t prot2) ((prot1 | prot2) & _PAGE_NX); } -static void walk_pte_level(struct pg_state *st, pmd_t addr, pgprotval_t eff_in, - unsigned long P) +static int ptdump_pte_entry(pte_t *pte, unsigned long addr, + unsigned long next, struct mm_walk *walk) { - int i; - pte_t *pte; - pgprotval_t prot, eff; - - for (i = 0; i < PTRS_PER_PTE; i++) { - st->current_address = normalize_addr(P + i * PTE_LEVEL_MULT); - pte = pte_offset_map(&addr, st->current_address); - prot = pte_flags(*pte); - eff = effective_prot(eff_in, prot); - note_page(st, __pgprot(prot), eff, 5); - pte_unmap(pte); - } + struct pg_state *st = walk->private; + pgprotval_t eff, prot; + + st->current_address = normalize_addr(addr); + + prot = pte_flags(*pte); + eff = effective_prot(st->effective_prot_pmd, prot); + note_page(st, __pgprot(prot), eff, 5); + + return 0; } + #ifdef CONFIG_KASAN /* @@ -400,131 +403,152 @@ static inline bool kasan_page_table(struct pg_state *st, void *pt) } #endif -#if PTRS_PER_PMD > 1 - -static void walk_pmd_level(struct pg_state *st, pud_t addr, - pgprotval_t eff_in, unsigned long P) +static int ptdump_test_pmd(unsigned long addr, unsigned long next, + pmd_t *pmd, struct mm_walk *walk) { - int i; - pmd_t *start, *pmd_start; - pgprotval_t prot, eff; - - pmd_start = start = (pmd_t *)pud_page_vaddr(addr); - for (i = 0; i < PTRS_PER_PMD; i++) { - st->current_address = normalize_addr(P + i * PMD_LEVEL_MULT); - if (!pmd_none(*start)) { - prot = pmd_flags(*start); - eff = effective_prot(eff_in, prot); - if (pmd_large(*start) || !pmd_present(*start)) { - note_page(st, __pgprot(prot), eff, 4); - } else if (!kasan_page_table(st, pmd_start)) { - walk_pte_level(st, *start, eff, - P + i * PMD_LEVEL_MULT); - } - } else - note_page(st, __pgprot(0), 0, 4); - start++; - } + struct pg_state *st = walk->private; + + st->current_address = normalize_addr(addr); + + if (kasan_page_table(st, pmd)) + return 1; + return 0; } -#else -#define walk_pmd_level(s,a,e,p) walk_pte_level(s,__pmd(pud_val(a)),e,p) -#undef pud_large -#define pud_large(a) pmd_large(__pmd(pud_val(a))) -#define pud_none(a) pmd_none(__pmd(pud_val(a))) -#endif +static int ptdump_pmd_entry(pmd_t *pmd, unsigned long addr, + unsigned long next, struct mm_walk *walk) +{ + struct pg_state *st = walk->private; + pgprotval_t eff, prot; + + prot = pmd_flags(*pmd); + eff = effective_prot(st->effective_prot_pud, prot); + + st->current_address = normalize_addr(addr); + + if (pmd_large(*pmd)) + note_page(st, __pgprot(prot), eff, 4); -#if PTRS_PER_PUD > 1 + st->effective_prot_pmd = eff; -static void walk_pud_level(struct pg_state *st, p4d_t addr, pgprotval_t eff_in, - unsigned long P) + return 0; +} + +static int ptdump_test_pud(unsigned long addr, unsigned long next, + pud_t *pud, struct mm_walk *walk) { - int i; - pud_t *start, *pud_start; - pgprotval_t prot, eff; - - pud_start = start = (pud_t *)p4d_page_vaddr(addr); - - for (i = 0; i < PTRS_PER_PUD; i++) { - st->current_address = normalize_addr(P + i * PUD_LEVEL_MULT); - if (!pud_none(*start)) { - prot = pud_flags(*start); - eff = effective_prot(eff_in, prot); - if (pud_large(*start) || !pud_present(*start)) { - note_page(st, __pgprot(prot), eff, 3); - } else if (!kasan_page_table(st, pud_start)) { - walk_pmd_level(st, *start, eff, - P + i * PUD_LEVEL_MULT); - } - } else - note_page(st, __pgprot(0), 0, 3); + struct pg_state *st = walk->private; - start++; - } + st->current_address = normalize_addr(addr); + + if (kasan_page_table(st, pud)) + return 1; + return 0; } -#else -#define walk_pud_level(s,a,e,p) walk_pmd_level(s,__pud(p4d_val(a)),e,p) -#undef p4d_large -#define p4d_large(a) pud_large(__pud(p4d_val(a))) -#define p4d_none(a) pud_none(__pud(p4d_val(a))) -#endif +static int ptdump_pud_entry(pud_t *pud, unsigned long addr, + unsigned long next, struct mm_walk *walk) +{ + struct pg_state *st = walk->private; + pgprotval_t eff, prot; + + prot = pud_flags(*pud); + eff = effective_prot(st->effective_prot_p4d, prot); + + st->current_address = normalize_addr(addr); + + if (pud_large(*pud)) + note_page(st, __pgprot(prot), eff, 3); + + st->effective_prot_pud = eff; -static void walk_p4d_level(struct pg_state *st, pgd_t addr, pgprotval_t eff_in, - unsigned long P) + return 0; +} + +static int ptdump_test_p4d(unsigned long addr, unsigned long next, + p4d_t *p4d, struct mm_walk *walk) { - int i; - p4d_t *start, *p4d_start; - pgprotval_t prot, eff; - - if (PTRS_PER_P4D == 1) - return walk_pud_level(st, __p4d(pgd_val(addr)), eff_in, P); - - p4d_start = start = (p4d_t *)pgd_page_vaddr(addr); - - for (i = 0; i < PTRS_PER_P4D; i++) { - st->current_address = normalize_addr(P + i * P4D_LEVEL_MULT); - if (!p4d_none(*start)) { - prot = p4d_flags(*start); - eff = effective_prot(eff_in, prot); - if (p4d_large(*start) || !p4d_present(*start)) { - note_page(st, __pgprot(prot), eff, 2); - } else if (!kasan_page_table(st, p4d_start)) { - walk_pud_level(st, *start, eff, - P + i * P4D_LEVEL_MULT); - } - } else - note_page(st, __pgprot(0), 0, 2); + struct pg_state *st = walk->private; - start++; - } + st->current_address = normalize_addr(addr); + + if (kasan_page_table(st, p4d)) + return 1; + return 0; } -#undef pgd_large -#define pgd_large(a) (pgtable_l5_enabled() ? pgd_large(a) : p4d_large(__p4d(pgd_val(a)))) -#define pgd_none(a) (pgtable_l5_enabled() ? pgd_none(a) : p4d_none(__p4d(pgd_val(a)))) +static int ptdump_p4d_entry(p4d_t *p4d, unsigned long addr, + unsigned long next, struct mm_walk *walk) +{ + struct pg_state *st = walk->private; + pgprotval_t eff, prot; + + prot = p4d_flags(*p4d); + eff = effective_prot(st->effective_prot_pgd, prot); + + st->current_address = normalize_addr(addr); + + if (p4d_large(*p4d)) + note_page(st, __pgprot(prot), eff, 2); + + st->effective_prot_p4d = eff; + + return 0; +} -static inline bool is_hypervisor_range(int idx) +static int ptdump_pgd_entry(pgd_t *pgd, unsigned long addr, + unsigned long next, struct mm_walk *walk) { -#ifdef CONFIG_X86_64 - /* - * A hole in the beginning of kernel address space reserved - * for a hypervisor. - */ - return (idx >= pgd_index(GUARD_HOLE_BASE_ADDR)) && - (idx < pgd_index(GUARD_HOLE_END_ADDR)); + struct pg_state *st = walk->private; + pgprotval_t eff, prot; + + prot = pgd_flags(*pgd); + +#ifdef CONFIG_X86_PAE + eff = _PAGE_USER | _PAGE_RW; #else - return false; + eff = prot; #endif + + st->current_address = normalize_addr(addr); + + if (pgd_large(*pgd)) + note_page(st, __pgprot(prot), eff, 1); + + st->effective_prot_pgd = eff; + + return 0; +} + +static int ptdump_hole(unsigned long addr, unsigned long next, + struct mm_walk *walk) +{ + struct pg_state *st = walk->private; + + st->current_address = normalize_addr(addr); + + note_page(st, __pgprot(0), 0, -1); + + return 0; } static void ptdump_walk_pgd_level_core(struct seq_file *m, struct mm_struct *mm, bool checkwx, bool dmesg) { - pgd_t *start = mm->pgd; - pgprotval_t prot, eff; - int i; struct pg_state st = {}; + struct mm_walk walk = { + .mm = mm, + .pgd_entry = ptdump_pgd_entry, + .p4d_entry = ptdump_p4d_entry, + .pud_entry = ptdump_pud_entry, + .pmd_entry = ptdump_pmd_entry, + .pte_entry = ptdump_pte_entry, + .test_p4d = ptdump_test_p4d, + .test_pud = ptdump_test_pud, + .test_pmd = ptdump_test_pmd, + .pte_hole = ptdump_hole, + .private = &st + }; st.to_dmesg = dmesg; st.check_wx = checkwx; @@ -532,27 +556,15 @@ static void ptdump_walk_pgd_level_core(struct seq_file *m, struct mm_struct *mm, if (checkwx) st.wx_pages = 0; - for (i = 0; i < PTRS_PER_PGD; i++) { - st.current_address = normalize_addr(i * PGD_LEVEL_MULT); - if (!pgd_none(*start) && !is_hypervisor_range(i)) { - prot = pgd_flags(*start); -#ifdef CONFIG_X86_PAE - eff = _PAGE_USER | _PAGE_RW; + down_read(&mm->mmap_sem); +#ifdef CONFIG_X86_64 + walk_page_range(0, PTRS_PER_PGD*PGD_LEVEL_MULT/2, &walk); + walk_page_range(normalize_addr(PTRS_PER_PGD*PGD_LEVEL_MULT/2), ~0, + &walk); #else - eff = prot; + walk_page_range(0, ~0, &walk); #endif - if (pgd_large(*start) || !pgd_present(*start)) { - note_page(&st, __pgprot(prot), eff, 1); - } else { - walk_p4d_level(&st, *start, eff, - i * PGD_LEVEL_MULT); - } - } else - note_page(&st, __pgprot(0), 0, 1); - - cond_resched(); - start++; - } + up_read(&mm->mmap_sem); /* Flush out the last page */ st.current_address = normalize_addr(PTRS_PER_PGD*PGD_LEVEL_MULT); -- 2.20.1