Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753556AbbKLIqy (ORCPT ); Thu, 12 Nov 2015 03:46:54 -0500 Received: from mga01.intel.com ([192.55.52.88]:18331 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750832AbbKLIqx (ORCPT ); Thu, 12 Nov 2015 03:46:53 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.20,281,1444719600"; d="scan'208";a="835312733" From: "Kirill A. Shutemov" To: Ingo Molnar Cc: "Kirill A. Shutemov" , Borislav Petkov , "Kirill A. Shutemov" , hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, jgross@suse.com, konrad.wilk@oracle.com, elliott@hpe.com, boris.ostrovsky@oracle.com, Toshi Kani , Linus Torvalds In-Reply-To: <20151112080059.GA6835@gmail.com> References: <1447111090-8526-1-git-send-email-kirill.shutemov@linux.intel.com> <20151110123429.GE19187@pd.tnic> <20151110135303.GA11246@node.shutemov.name> <20151110144648.GG19187@pd.tnic> <20151110150713.GA11956@node.shutemov.name> <20151110170447.GH19187@pd.tnic> <20151111095101.GA22512@pd.tnic> <20151112074854.GA5376@gmail.com> <20151112075758.GA20702@node.shutemov.name> <20151112080059.GA6835@gmail.com> Subject: Re: [PATCH] x86/mm: fix regression with huge pages on PAE Content-Transfer-Encoding: 7bit Message-Id: <20151112084616.EABFE19B@black.fi.intel.com> Date: Thu, 12 Nov 2015 10:46:16 +0200 (EET) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9722 Lines: 267 Ingo Molnar wrote: > > * Kirill A. Shutemov wrote: > > > On Thu, Nov 12, 2015 at 08:48:54AM +0100, Ingo Molnar wrote: > > > > > > * Borislav Petkov wrote: > > > > > > > --- a/arch/x86/include/asm/pgtable_types.h > > > > +++ b/arch/x86/include/asm/pgtable_types.h > > > > @@ -279,17 +279,14 @@ static inline pmdval_t native_pmd_val(pmd_t pmd) > > > > static inline pudval_t pud_pfn_mask(pud_t pud) > > > > { > > > > if (native_pud_val(pud) & _PAGE_PSE) > > > > - return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK; > > > > + return ~((1ULL << PUD_SHIFT) - 1) & PHYSICAL_PAGE_MASK; > > > > else > > > > return PTE_PFN_MASK; > > > > } > > > > > > > static inline pmdval_t pmd_pfn_mask(pmd_t pmd) > > > > { > > > > if (native_pmd_val(pmd) & _PAGE_PSE) > > > > - return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK; > > > > + return ~((1ULL << PMD_SHIFT) - 1) & PHYSICAL_PAGE_MASK; > > > > else > > > > return PTE_PFN_MASK; > > > > } > > > > > > So instead of uglifying the code, why not fix the real bug: change the > > > PMD_PAGE_MASK/PUD_PAGE_MASK definitions to be 64-bit everywhere? > > > > *PAGE_MASK are usually applied to virtual addresses. I don't think it > > should anything but 'unsigned long'. This is odd use case really. > > So we already have PHYSICAL_PAGE_MASK, why not introduce PHYSICAL_PMD_MASK et al, > instead of uglifying the code? Okay, makes sense. Check out the patch below. > But, what problems do you expect with having a wider mask than its primary usage? > If it's used for 32-bit values it will be truncated down safely. (But I have not > tested it, so I might be missing some complication.) Yeah, I basically worry about non realized side effect. And these masks are defined via {PMD,PUD}_PAGE_SIZE. Should we change them to 'unsigned long long' too or leave alone? What about PAGE_SIZE and PAGE_MASK? Need to be converted too? >From 96362f38d47d6ef9a0ebb9a92c6b4db9337f1565 Mon Sep 17 00:00:00 2001 From: "Kirill A. Shutemov" Date: Thu, 12 Nov 2015 10:39:00 +0200 Subject: [PATCH] x86/mm: fix regression with huge pages on PAE MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Recent PAT patchset has caused issue on 32-bit PAE machines: page:eea45000 count:0 mapcount:-128 mapping: (null) index:0x0 flags: 0x40000000() page dumped because: VM_BUG_ON_PAGE(page_mapcount(page) < 0) ------------[ cut here ]------------ kernel BUG at /home/build/linux-boris/mm/huge_memory.c:1485! invalid opcode: 0000 [#1] SMP Modules linked in: ahci libahci ata_generic skge r8169 firewire_ohci mii libata qla2xxx(+) scsi_transport_fc scsi_mod radeon tpm_infineon ttm backlight wmi acpi_cpufreq tpm_tis CPU: 2 PID: 1758 Comm: modprobe Not tainted 4.3.0upstream-09269-gce5c2d2 #1 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080014 07/18/2008 task: ed84e600 ti: f6458000 task.ti: f6458000 EIP: 0060:[] EFLAGS: 00010246 CPU: 2 EIP is at zap_huge_pmd+0x240/0x260 EAX: 00000000 EBX: f6459eb0 ECX: 00000292 EDX: 00000292 ESI: f6634d98 EDI: eea45000 EBP: f6459dc8 ESP: f6459d98 ata1: SATA link down (SStatus 0 SControl 300) ata2: SATA link down (SStatus 0 SControl 300) DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 8005003b CR2: b75b21a0 CR3: 3655b880 CR4: 000006f0 Stack: ... Call Trace: unmap_single_vma ? __wake_up unmap_vmas unmap_region do_munmap vm_munmap SyS_munmap do_fast_syscall_32 ? __do_page_fault sysenter_past_esp Code: 00 e9 05 fe ff ff 90 8d 74 26 00 0f 0b eb fe ba 4c e1 7a c1 89 f8 e8 f0 91 fd ff 0f 0b eb fe ba 6c e1 7a c1 89 f8 e8 e0 91 fd ff <0f> 0b eb fe ba c4 e1 7a c1 89 f8 e8 d0 91 fd ff 0f 0b eb fe 8d EIP: [] zap_huge_pmd+0x240/0x260 SS:ESP 0068:f6459d98 ---[ end trace cba8fb1fc2e2e78a ]--- The problem is in pmd_pfn_mask() and pmd_flags_mask(). These helpers use PMD_PAGE_MASK to calculate resulting mask. PMD_PAGE_MASK is 'unsigned long', not 'unsigned long long' as phys_addr_t. As result upper bits of resulting mask is truncated. pud_pfn_mask() and pud_flags_mask() aren't problematic since we don't have PUD page table level on 32-bit systems, but it's reasonable to keep them consitent with PMD counterpart. The patch introduces PHYSICAL_PMD_PAGE_MASK and PHYSICAL_PUD_PAGE_MASK in addition to existing PHYSICAL_PAGE_MASK and rework helpers to use them. Reported-and-Tested-by: Boris Ostrovsky Signed-off-by: Kirill A. Shutemov Reviewed-by: Toshi Kani Cc: Andrew Morton Cc: elliott@hpe.com Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jürgen Gross Cc: konrad.wilk@oracle.com Cc: linux-mm Cc: Mel Gorman Cc: Thomas Gleixner Fixes: f70abb0fc3da ("x86/asm: Fix pud/pmd interfaces to handle large PAT bit") Link: http://lkml.kernel.org/r/1447111090-8526-1-git-send-email-kirill.shutemov@linux.intel.com [ Fix -Woverflow warnings from the realmode code. ] Signed-off-by: Borislav Petkov --- arch/x86/boot/boot.h | 1 - arch/x86/boot/video-mode.c | 2 ++ arch/x86/boot/video.c | 2 ++ arch/x86/include/asm/page_types.h | 16 +++++++++------- arch/x86/include/asm/pgtable_types.h | 14 ++++---------- arch/x86/include/asm/x86_init.h | 1 - 6 files changed, 17 insertions(+), 19 deletions(-) diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h index 0033e96c3f09..9011a88353de 100644 --- a/arch/x86/boot/boot.h +++ b/arch/x86/boot/boot.h @@ -23,7 +23,6 @@ #include #include #include -#include #include #include "bitops.h" #include "ctype.h" diff --git a/arch/x86/boot/video-mode.c b/arch/x86/boot/video-mode.c index aa8a96b052e3..95c7a818c0ed 100644 --- a/arch/x86/boot/video-mode.c +++ b/arch/x86/boot/video-mode.c @@ -19,6 +19,8 @@ #include "video.h" #include "vesa.h" +#include + /* * Common variables */ diff --git a/arch/x86/boot/video.c b/arch/x86/boot/video.c index 05111bb8d018..77780e386e9b 100644 --- a/arch/x86/boot/video.c +++ b/arch/x86/boot/video.c @@ -13,6 +13,8 @@ * Select video mode */ +#include + #include "boot.h" #include "video.h" #include "vesa.h" diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h index c5b7fb2774d0..cc071c6f7d4d 100644 --- a/arch/x86/include/asm/page_types.h +++ b/arch/x86/include/asm/page_types.h @@ -9,19 +9,21 @@ #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) +#define PMD_PAGE_SIZE (_AC(1, UL) << PMD_SHIFT) +#define PMD_PAGE_MASK (~(PMD_PAGE_SIZE-1)) + +#define PUD_PAGE_SIZE (_AC(1, UL) << PUD_SHIFT) +#define PUD_PAGE_MASK (~(PUD_PAGE_SIZE-1)) + #define __PHYSICAL_MASK ((phys_addr_t)((1ULL << __PHYSICAL_MASK_SHIFT) - 1)) #define __VIRTUAL_MASK ((1UL << __VIRTUAL_MASK_SHIFT) - 1) -/* Cast PAGE_MASK to a signed type so that it is sign-extended if +/* Cast *PAGE_MASK to a signed type so that it is sign-extended if virtual addresses are 32-bits but physical addresses are larger (ie, 32-bit PAE). */ #define PHYSICAL_PAGE_MASK (((signed long)PAGE_MASK) & __PHYSICAL_MASK) - -#define PMD_PAGE_SIZE (_AC(1, UL) << PMD_SHIFT) -#define PMD_PAGE_MASK (~(PMD_PAGE_SIZE-1)) - -#define PUD_PAGE_SIZE (_AC(1, UL) << PUD_SHIFT) -#define PUD_PAGE_MASK (~(PUD_PAGE_SIZE-1)) +#define PHYSICAL_PMD_PAGE_MASK (((signed long)PMD_PAGE_MASK) & __PHYSICAL_MASK) +#define PHYSICAL_PUD_PAGE_MASK (((signed long)PUD_PAGE_MASK) & __PHYSICAL_MASK) #define HPAGE_SHIFT PMD_SHIFT #define HPAGE_SIZE (_AC(1,UL) << HPAGE_SHIFT) diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index 116fc4ee586f..d1b76f88ccd1 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -277,17 +277,14 @@ static inline pmdval_t native_pmd_val(pmd_t pmd) static inline pudval_t pud_pfn_mask(pud_t pud) { if (native_pud_val(pud) & _PAGE_PSE) - return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK; + return PHYSICAL_PUD_PAGE_MASK; else return PTE_PFN_MASK; } static inline pudval_t pud_flags_mask(pud_t pud) { - if (native_pud_val(pud) & _PAGE_PSE) - return ~(PUD_PAGE_MASK & (pudval_t)PHYSICAL_PAGE_MASK); - else - return ~PTE_PFN_MASK; + return ~pud_pfn_mask(pud); } static inline pudval_t pud_flags(pud_t pud) @@ -298,17 +295,14 @@ static inline pudval_t pud_flags(pud_t pud) static inline pmdval_t pmd_pfn_mask(pmd_t pmd) { if (native_pmd_val(pmd) & _PAGE_PSE) - return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK; + return PHYSICAL_PMD_PAGE_MASK; else return PTE_PFN_MASK; } static inline pmdval_t pmd_flags_mask(pmd_t pmd) { - if (native_pmd_val(pmd) & _PAGE_PSE) - return ~(PMD_PAGE_MASK & (pmdval_t)PHYSICAL_PAGE_MASK); - else - return ~PTE_PFN_MASK; + return ~pmd_pfn_mask(pmd); } static inline pmdval_t pmd_flags(pmd_t pmd) diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h index 48d34d28f5a6..cd0fc0cc78bc 100644 --- a/arch/x86/include/asm/x86_init.h +++ b/arch/x86/include/asm/x86_init.h @@ -1,7 +1,6 @@ #ifndef _ASM_X86_PLATFORM_H #define _ASM_X86_PLATFORM_H -#include #include struct mpc_bus; -- 2.6.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/