From: Toshi Kani
To: mhocko@suse.com, akpm@linux-foundation.org, tglx@linutronix.de,
 mingo@redhat.com, hpa@zytor.com, bp@suse.de, catalin.marinas@arm.com
Cc: guohanjun@huawei.com, will.deacon@arm.com, wxf.wang@hisilicon.com,
 willy@infradead.org, cpandya@codeaurora.org, linux-mm@kvack.org,
 x86@kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, Toshi Kani, stable@vger.kernel.org
Subject: [PATCH v2 1/2] mm/vmalloc: Add interfaces to free unmapped page table
Date: Wed, 14 Mar 2018 12:01:54 -0600
Message-Id: <20180314180155.19492-2-toshi.kani@hpe.com>
X-Mailer: git-send-email 2.14.3
In-Reply-To:
 <20180314180155.19492-1-toshi.kani@hpe.com>
References: <20180314180155.19492-1-toshi.kani@hpe.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap()
may create pud/pmd mappings.  A kernel panic was observed on arm64
systems with Cortex-A75 in the following steps, as described by
Hanjun Guo:

 1. ioremap a 4K size; a valid page table is built.
 2. iounmap it; pte0 is set to 0.
 3. ioremap the same address with 2M size; the pgd/pmd is unchanged,
    then a new value is set for the pmd.
 4. pte0 is leaked.
 5. The CPU may take an exception because the old pmd is still in
    the TLB, which leads to a kernel panic.

This panic is not reproducible on x86.  INVLPG, called from iounmap,
purges all levels of entries associated with the purged address on
x86.  x86 still has the memory leak.

The patch changes the ioremap path to free unmapped page table(s),
since doing so in the unmap path has the following issues:

 - The iounmap() path is shared with vunmap().  Since vmap() only
   supports pte mappings, making vunmap() free a pte page is an
   overhead for regular vmap users, as they do not need a pte page
   freed up.

 - Checking whether all entries in a pte page are cleared in the
   unmap path is racy, and serializing this check is expensive.

 - The unmap path calls free_vmap_area_noflush() to do lazy TLB
   purges.  Clearing a pud/pmd entry before the lazy TLB purges
   needs an extra TLB purge.

Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(),
which clear a given pud/pmd entry and free up a page for the lower
level entries.

This patch implements their stub functions on x86 and arm64, which
work as a workaround.  The stubs only report whether the entry is
already empty, so ioremap() sets up a pud/pmd mapping only when no
lower level page table is present.

Reported-by: Lei Li
Signed-off-by: Toshi Kani
Cc: Catalin Marinas
Cc: Wang Xuefeng
Cc: Will Deacon
Cc: Hanjun Guo
Cc: Michal Hocko
Cc: Andrew Morton
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Borislav Petkov
Cc: Matthew Wilcox
Cc: Chintan Pandya
Cc:
---
 arch/arm64/mm/mmu.c           | 10 ++++++++++
 arch/x86/mm/pgtable.c         | 24 ++++++++++++++++++++++++
 include/asm-generic/pgtable.h | 10 ++++++++++
 lib/ioremap.c                 |  6 ++++--
 4 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8c704f1e53c2..2dbb2c9f1ec1 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -972,3 +972,13 @@ int pmd_clear_huge(pmd_t *pmdp)
 	pmd_clear(pmdp);
 	return 1;
 }
+
+int pud_free_pmd_page(pud_t *pud)
+{
+	return pud_none(*pud);
+}
+
+int pmd_free_pte_page(pmd_t *pmd)
+{
+	return pmd_none(*pmd);
+}
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 004abf9ebf12..1eed7ed518e6 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -702,4 +702,28 @@ int pmd_clear_huge(pmd_t *pmd)
 
 	return 0;
 }
+
+/**
+ * pud_free_pmd_page - Clear pud entry and free pmd page.
+ * @pud: Pointer to a PUD.
+ *
+ * Context: The pud range has been unmapped and TLB purged.
+ * Return: 1 if clearing the entry succeeded. 0 otherwise.
+ */
+int pud_free_pmd_page(pud_t *pud)
+{
+	return pud_none(*pud);
+}
+
+/**
+ * pmd_free_pte_page - Clear pmd entry and free pte page.
+ * @pmd: Pointer to a PMD.
+ *
+ * Context: The pmd range has been unmapped and TLB purged.
+ * Return: 1 if clearing the entry succeeded. 0 otherwise.
+ */
+int pmd_free_pte_page(pmd_t *pmd)
+{
+	return pmd_none(*pmd);
+}
 #endif	/* CONFIG_HAVE_ARCH_HUGE_VMAP */
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 2cfa3075d148..2490800f7c5a 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -983,6 +983,8 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot);
 int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot);
 int pud_clear_huge(pud_t *pud);
 int pmd_clear_huge(pmd_t *pmd);
+int pud_free_pmd_page(pud_t *pud);
+int pmd_free_pte_page(pmd_t *pmd);
 #else	/* !CONFIG_HAVE_ARCH_HUGE_VMAP */
 static inline int p4d_set_huge(p4d_t *p4d, phys_addr_t addr, pgprot_t prot)
 {
@@ -1008,6 +1010,14 @@ static inline int pmd_clear_huge(pmd_t *pmd)
 {
 	return 0;
 }
+static inline int pud_free_pmd_page(pud_t *pud)
+{
+	return 0;
+}
+static inline int pmd_free_pte_page(pmd_t *pmd)
+{
+	return 0;
+}
 #endif	/* CONFIG_HAVE_ARCH_HUGE_VMAP */
 
 #ifndef __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
diff --git a/lib/ioremap.c b/lib/ioremap.c
index b808a390e4c3..54e5bbaa3200 100644
--- a/lib/ioremap.c
+++ b/lib/ioremap.c
@@ -91,7 +91,8 @@ static inline int ioremap_pmd_range(pud_t *pud, unsigned long addr,
 
 		if (ioremap_pmd_enabled() &&
 		    ((next - addr) == PMD_SIZE) &&
-		    IS_ALIGNED(phys_addr + addr, PMD_SIZE)) {
+		    IS_ALIGNED(phys_addr + addr, PMD_SIZE) &&
+		    pmd_free_pte_page(pmd)) {
 			if (pmd_set_huge(pmd, phys_addr + addr, prot))
 				continue;
 		}
@@ -117,7 +118,8 @@ static inline int ioremap_pud_range(p4d_t *p4d, unsigned long addr,
 
 		if (ioremap_pud_enabled() &&
 		    ((next - addr) == PUD_SIZE) &&
-		    IS_ALIGNED(phys_addr + addr, PUD_SIZE)) {
+		    IS_ALIGNED(phys_addr + addr, PUD_SIZE) &&
+		    pud_free_pmd_page(pud)) {
 			if (pud_set_huge(pud, phys_addr + addr, prot))
 				continue;
 		}
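
Illustration only, not part of the patch: a minimal user-space sketch of
how the stubbed check is expected to behave for the 4K map, unmap, 2M
remap sequence from the changelog.  It is a toy model under stated
assumptions: every toy_* name is hypothetical, a pmd is reduced to a
small struct, and nothing here is kernel API.  The guard mirrors the
stubs above by succeeding only when the entry is empty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy user-space model; every toy_* name is hypothetical, not kernel API. */
enum toy_kind { TOY_NONE, TOY_TABLE, TOY_HUGE };

struct toy_pmd {
	enum toy_kind kind;
	unsigned long *pte_page;	/* lower level table if kind == TOY_TABLE */
};

#define TOY_PTRS_PER_PTE 512

/* Mirrors the stubs in this patch: succeed only if the entry is empty. */
static int toy_pmd_free_pte_page(const struct toy_pmd *pmd)
{
	return pmd->kind == TOY_NONE;
}

/* 4K-style mapping: allocate the pte page on first use, then fill one pte. */
static int toy_map_pte(struct toy_pmd *pmd, size_t idx, unsigned long phys)
{
	if (pmd->kind == TOY_HUGE || idx >= TOY_PTRS_PER_PTE)
		return 0;
	if (pmd->kind == TOY_NONE) {
		pmd->pte_page = calloc(TOY_PTRS_PER_PTE, sizeof(*pmd->pte_page));
		if (!pmd->pte_page)
			return 0;
		pmd->kind = TOY_TABLE;
	}
	pmd->pte_page[idx] = phys | 1UL;	/* low bit stands in for "valid" */
	return 1;
}

/* iounmap-style: clear the pte but leave the pte page attached to the pmd. */
static void toy_unmap_pte(struct toy_pmd *pmd, size_t idx)
{
	if (pmd->kind == TOY_TABLE && idx < TOY_PTRS_PER_PTE)
		pmd->pte_page[idx] = 0;
}

/* 2M-style mapping, guarded like the patched ioremap_pmd_range() check. */
static int toy_map_huge(struct toy_pmd *pmd)
{
	if (!toy_pmd_free_pte_page(pmd))
		return 0;	/* caller would fall back to pte mappings */
	pmd->kind = TOY_HUGE;
	pmd->pte_page = NULL;
	return 1;
}

/* Walks steps 1-3 of the changelog; returns the result of the 2M attempt. */
int toy_demo(void)
{
	struct toy_pmd pmd = { TOY_NONE, NULL };
	int huge_ok;

	assert(toy_map_pte(&pmd, 0, 0x1000UL));	/* step 1: 4K map */
	toy_unmap_pte(&pmd, 0);			/* step 2: unmap */
	huge_ok = toy_map_huge(&pmd);		/* step 3: 2M remap attempt */
	assert(pmd.kind == TOY_TABLE);		/* pte page is still referenced */
	free(pmd.pte_page);
	return huge_ok;
}
```

In this model the 2M attempt is refused while the pte page is still
attached, so the pmd is never overwritten and the pte page stays
referenced.  Actually clearing the entry and freeing the page, as the
interface names suggest, is outside what the stubs in this patch do.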