From: Rafael David Tinoco
To: Russell King
Cc: Catalin Marinas, Will Deacon, Tony Luck, Fenghua Yu, Ralf Baechle,
    Paul Burton, James Hogan, Benjamin Herrenschmidt, Paul Mackerras,
    Michael Ellerman, Martin Schwidefsky, Heiko Carstens, Yoshinori Sato,
    Rich Felker, "David S . Miller", Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, "H . Peter Anvin", x86@kernel.org, Minchan Kim,
    Nitin Gupta, Sergey Senozhatsky, Rafael David Tinoco, Christophe Leroy,
    "Aneesh Kumar K . V", Ram Pai, Nicholas Piggin, Vasily Gorbik,
    Anthony Yznaga, Khalid Aziz, Joerg Roedel, Juergen Gross,
    "Kirill A . Shutemov", Andy Lutomirski, Jiri Kosina,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-ia64@vger.kernel.org, linux-mips@vger.kernel.org,
    linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
    linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH] mm/zsmalloc.c: Fix zsmalloc 32-bit PAE support
Date: Mon, 10 Dec 2018 12:21:05 -0200
Message-Id: <20181210142105.6750-1-rafael.tinoco@linaro.org>

On 32-bit systems, zsmalloc uses HIGHMEM and, when PAE is enabled, the
physical frame number might be so big that the zsmalloc obj encoding (to
location) will break, causing:

BUG: KASAN: null-ptr-deref in zs_map_object+0xa4/0x2bc
Read of size 4 at addr 00000000 by task mkfs.ext4/623

CPU: 2 PID: 623 Comm: mkfs.ext4 Not tainted 4.19.0-rc8-00017-g8239bc6d3307-dirty #15
Hardware name: Generic DT based system
[] (unwind_backtrace) from [] (show_stack+0x20/0x24)
[] (show_stack) from [] (dump_stack+0xbc/0xe8)
[] (dump_stack) from [] (kasan_report+0x248/0x390)
[] (kasan_report) from [] (__asan_load4+0x78/0xb4)
[] (__asan_load4) from [] (zs_map_object+0xa4/0x2bc)
[] (zs_map_object) from [] (zram_bvec_rw.constprop.2+0x324/0x8e4 [zram])
[] (zram_bvec_rw.constprop.2 [zram]) from [] (zram_make_request+0x234/0x46c [zram])
[] (zram_make_request [zram]) from [] (generic_make_request+0x304/0x63c)
[] (generic_make_request) from [] (submit_bio+0x4c/0x1c8)
[] (submit_bio) from [] (submit_bh_wbc.constprop.15+0x238/0x26c)
[] (submit_bh_wbc.constprop.15) from [] (__block_write_full_page+0x524/0x76c)
[] (__block_write_full_page) from [] (block_write_full_page+0x1bc/0x1d4)
[] (block_write_full_page) from [] (blkdev_writepage+0x24/0x28)
[] (blkdev_writepage) from [] (__writepage+0x44/0x78)
[] (__writepage) from [] (write_cache_pages+0x3b8/0x800)
[] (write_cache_pages) from [] (generic_writepages+0x74/0xa0)
[] (generic_writepages) from [] (blkdev_writepages+0x18/0x1c)
[] (blkdev_writepages) from [] (do_writepages+0x68/0x134)
[] (do_writepages) from [] (__filemap_fdatawrite_range+0xb0/0xf4)
[] (__filemap_fdatawrite_range) from [] (file_write_and_wait_range+0x64/0xd0)
[] (file_write_and_wait_range) from [] (blkdev_fsync+0x54/0x84)
[] (blkdev_fsync) from [] (vfs_fsync_range+0x70/0xd4)
[] (vfs_fsync_range) from [] (do_fsync+0x4c/0x80)
[] (do_fsync) from [] (sys_fsync+0x1c/0x20)
[] (sys_fsync) from [] (ret_fast_syscall+0x0/0x2c)

when trying to decode (the pfn) and map the object.

That happens because an architecture might not re-define
MAX_PHYSMEM_BITS, as in this example of 32-bit ARM with LPAE enabled.
On 32-bit systems, if not re-defined, MAX_POSSIBLE_PHYSMEM_BITS defaults
to BITS_PER_LONG (32) in most cases and, with PAE enabled, _PFN_BITS
might be wrong, which may cause the obj variable to overflow if the
frame number is in HIGHMEM and references a page above the 4GB
watermark.

commit 6e00ec00b1a7 ("staging: zsmalloc: calculate MAX_PHYSMEM_BITS if
not defined") realized MAX_PHYSMEM_BITS depended on SPARSEMEM headers
and "fixed" it by calculating it using BITS_PER_LONG if SPARSEMEM wasn't
used, like in the example given above.

Systems with potential for PAE have existed for a long time, and
assuming BITS_PER_LONG seems inadequate. Defining MAX_PHYSMEM_BITS looks
better; however, it is NOT a constant anymore on x86.

So, instead, MAX_POSSIBLE_PHYSMEM_BITS should be defined by every
architecture using zsmalloc, together with a sanity check for
MAX_POSSIBLE_PHYSMEM_BITS being too big on 32-bit systems.

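
For illustration only, the overflow described above can be reproduced in
user space. This is a minimal sketch, not kernel code: it assumes 4 KiB
pages (PAGE_SHIFT of 12), uses uint32_t as a stand-in for a 32-bit
unsigned long, and the helper names obj_index_bits(), encode_obj() and
decode_pfn() are hypothetical. The packing order follows the
<PFN, obj_idx> plus tag bit layout that mm/zsmalloc.c documents.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT   12	/* assumption: 4 KiB pages */
#define OBJ_TAG_BITS 1	/* same value as in mm/zsmalloc.c */

/* Bits left for the object index in a 32-bit encoded value. */
static unsigned int obj_index_bits(unsigned int physmem_bits)
{
	unsigned int pfn_bits = physmem_bits - PAGE_SHIFT;	/* _PFN_BITS */

	assert(pfn_bits + OBJ_TAG_BITS < 32);
	return 32 - pfn_bits - OBJ_TAG_BITS;
}

/* Pack <pfn, idx> into 32 bits: PFN on top, index below, tag bit last. */
static uint32_t encode_obj(uint64_t pfn, uint32_t idx, unsigned int physmem_bits)
{
	unsigned int index_bits = obj_index_bits(physmem_bits);
	/* PFN bits that do not fit in 32 bits are silently dropped here. */
	uint32_t obj = (uint32_t)(pfn << index_bits);

	obj |= idx & ((1u << index_bits) - 1);
	return obj << OBJ_TAG_BITS;
}

/* Recover the PFN from an encoded value. */
static uint64_t decode_pfn(uint32_t obj, unsigned int physmem_bits)
{
	return (obj >> OBJ_TAG_BITS) >> obj_index_bits(physmem_bits);
}
```

With these helpers, PFN 0x100000 (the first 4 KiB page at the 4GB mark)
encoded with 32 physical address bits decodes back as PFN 0, while the
same PFN encoded with 36 bits survives the round trip.
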
Link: https://bugs.linaro.org/show_bug.cgi?id=3765#c17
Signed-off-by: Rafael David Tinoco
---
 arch/arm/include/asm/pgtable-2level-types.h |  2 ++
 arch/arm/include/asm/pgtable-3level-types.h |  2 ++
 arch/arm64/include/asm/pgtable-types.h      |  2 ++
 arch/ia64/include/asm/page.h                |  2 ++
 arch/mips/include/asm/page.h                |  2 ++
 arch/powerpc/include/asm/mmu.h              |  2 ++
 arch/s390/include/asm/page.h                |  2 ++
 arch/sh/include/asm/page.h                  |  2 ++
 arch/sparc/include/asm/page_32.h            |  2 ++
 arch/sparc/include/asm/page_64.h            |  2 ++
 arch/x86/include/asm/pgtable-2level_types.h |  2 ++
 arch/x86/include/asm/pgtable-3level_types.h |  3 +-
 arch/x86/include/asm/pgtable_64_types.h     |  4 +--
 mm/zsmalloc.c                               | 35 +++++++++++----------
 14 files changed, 45 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level-types.h b/arch/arm/include/asm/pgtable-2level-types.h
index 66cb5b0e89c5..552dba411324 100644
--- a/arch/arm/include/asm/pgtable-2level-types.h
+++ b/arch/arm/include/asm/pgtable-2level-types.h
@@ -64,4 +64,6 @@ typedef pteval_t pgprot_t;
 
 #endif /* STRICT_MM_TYPECHECKS */
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 32
+
 #endif /* _ASM_PGTABLE_2LEVEL_TYPES_H */
diff --git a/arch/arm/include/asm/pgtable-3level-types.h b/arch/arm/include/asm/pgtable-3level-types.h
index 921aa30259c4..664c39e6517c 100644
--- a/arch/arm/include/asm/pgtable-3level-types.h
+++ b/arch/arm/include/asm/pgtable-3level-types.h
@@ -67,4 +67,6 @@ typedef pteval_t pgprot_t;
 
 #endif /* STRICT_MM_TYPECHECKS */
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 36
+
 #endif /* _ASM_PGTABLE_3LEVEL_TYPES_H */
diff --git a/arch/arm64/include/asm/pgtable-types.h b/arch/arm64/include/asm/pgtable-types.h
index 345a072b5856..45c3834eb4c8 100644
--- a/arch/arm64/include/asm/pgtable-types.h
+++ b/arch/arm64/include/asm/pgtable-types.h
@@ -64,4 +64,6 @@ typedef struct { pteval_t pgprot; } pgprot_t;
 #include
 #endif
 
+#define MAX_POSSIBLE_PHYSMEM_BITS CONFIG_ARM64_PA_BITS
+
 #endif /* __ASM_PGTABLE_TYPES_H */
diff --git a/arch/ia64/include/asm/page.h b/arch/ia64/include/asm/page.h
index 5798bd2b462c..a3e055979e46 100644
--- a/arch/ia64/include/asm/page.h
+++ b/arch/ia64/include/asm/page.h
@@ -235,4 +235,6 @@ get_order (unsigned long size)
 
 #define __HAVE_ARCH_GATE_AREA 1
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 50
+
 #endif /* _ASM_IA64_PAGE_H */
diff --git a/arch/mips/include/asm/page.h b/arch/mips/include/asm/page.h
index e8cc328fce2d..f6a5dea1a66c 100644
--- a/arch/mips/include/asm/page.h
+++ b/arch/mips/include/asm/page.h
@@ -263,4 +263,6 @@ extern int __virt_addr_valid(const volatile void *kaddr);
 #include
 #include
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 48
+
 #endif /* _ASM_PAGE_H */
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index eb20eb3b8fb0..2ebc1d2d9a5c 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -324,6 +324,8 @@ static inline u16 get_mm_addr_key(struct mm_struct *mm, unsigned long address)
 #define MAX_PHYSMEM_BITS 46
 #endif
 
+#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS
+
 #ifdef CONFIG_PPC_BOOK3S_64
 #include
 #else /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
index a4d38092530a..8abec1461bf7 100644
--- a/arch/s390/include/asm/page.h
+++ b/arch/s390/include/asm/page.h
@@ -180,4 +180,6 @@ static inline int devmem_is_allowed(unsigned long pfn)
 #include
 #include
 
+#define MAX_POSSIBLE_PHYSMEM_BITS CONFIG_MAX_PHYSMEM_BITS
+
 #endif /* _S390_PAGE_H */
diff --git a/arch/sh/include/asm/page.h b/arch/sh/include/asm/page.h
index 5eef8be3e59f..40c7e12cf09e 100644
--- a/arch/sh/include/asm/page.h
+++ b/arch/sh/include/asm/page.h
@@ -205,4 +205,6 @@ typedef struct page *pgtable_t;
 #define ARCH_SLAB_MINALIGN 8
 #endif
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 32
+
 #endif /* __ASM_SH_PAGE_H */
diff --git a/arch/sparc/include/asm/page_32.h b/arch/sparc/include/asm/page_32.h
index b76d59edec8c..14e9ca4659d7 100644
--- a/arch/sparc/include/asm/page_32.h
+++ b/arch/sparc/include/asm/page_32.h
@@ -139,4 +139,6 @@ extern unsigned long pfn_base;
 #include
 #include
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 32
+
 #endif /* _SPARC_PAGE_H */
diff --git a/arch/sparc/include/asm/page_64.h b/arch/sparc/include/asm/page_64.h
index e80f2d5bf62f..6d6f3654ead1 100644
--- a/arch/sparc/include/asm/page_64.h
+++ b/arch/sparc/include/asm/page_64.h
@@ -163,4 +163,6 @@ extern unsigned long PAGE_OFFSET;
 
 #include
 
+#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYS_ADDRESS_BITS
+
 #endif /* _SPARC64_PAGE_H */
diff --git a/arch/x86/include/asm/pgtable-2level_types.h b/arch/x86/include/asm/pgtable-2level_types.h
index 6deb6cd236e3..c2eae59e6505 100644
--- a/arch/x86/include/asm/pgtable-2level_types.h
+++ b/arch/x86/include/asm/pgtable-2level_types.h
@@ -38,4 +38,6 @@ typedef union {
 /* This covers all VMSPLIT_* and VMSPLIT_*_OPT variants */
 #define PGD_KERNEL_START (CONFIG_PAGE_OFFSET >> PGDIR_SHIFT)
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 32
+
 #endif /* _ASM_X86_PGTABLE_2LEVEL_DEFS_H */
diff --git a/arch/x86/include/asm/pgtable-3level_types.h b/arch/x86/include/asm/pgtable-3level_types.h
index 33845d36897c..5fce514a49a0 100644
--- a/arch/x86/include/asm/pgtable-3level_types.h
+++ b/arch/x86/include/asm/pgtable-3level_types.h
@@ -45,7 +45,8 @@ typedef union {
  */
 #define PTRS_PER_PTE 512
 
-#define MAX_POSSIBLE_PHYSMEM_BITS 36
 #define PGD_KERNEL_START (CONFIG_PAGE_OFFSET >> PGDIR_SHIFT)
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 36
+
 #endif /* _ASM_X86_PGTABLE_3LEVEL_DEFS_H */
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 84bd9bdc1987..d808cfde3d19 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -64,8 +64,6 @@ extern unsigned int ptrs_per_p4d;
 #define P4D_SIZE (_AC(1, UL) << P4D_SHIFT)
 #define P4D_MASK (~(P4D_SIZE - 1))
 
-#define MAX_POSSIBLE_PHYSMEM_BITS 52
-
 #else /* CONFIG_X86_5LEVEL */
 
 /*
@@ -154,4 +152,6 @@ extern unsigned int ptrs_per_p4d;
 
 #define PGD_KERNEL_START ((PAGE_SIZE / 2) / sizeof(pgd_t))
 
+#define MAX_POSSIBLE_PHYSMEM_BITS (pgtable_l5_enabled() ? 52 : 46)
+
 #endif /* _ASM_X86_PGTABLE_64_DEFS_H */
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 0787d33b80d8..132c20b6fd4f 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -80,23 +80,7 @@
  * as single (unsigned long) handle value.
  *
  * Note that object index starts from 0.
- *
- * This is made more complicated by various memory models and PAE.
- */
-
-#ifndef MAX_POSSIBLE_PHYSMEM_BITS
-#ifdef MAX_PHYSMEM_BITS
-#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS
-#else
-/*
- * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just
- * be PAGE_SHIFT
  */
-#define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG
-#endif
-#endif
-
-#define _PFN_BITS (MAX_POSSIBLE_PHYSMEM_BITS - PAGE_SHIFT)
 
 /*
  * Memory for allocating for handle keeps object position by
@@ -116,6 +100,25 @@
  */
 #define OBJ_ALLOCATED_TAG 1
 #define OBJ_TAG_BITS 1
+
+/*
+ * MAX_POSSIBLE_PHYSMEM_BITS should be defined by all archs using zsmalloc:
+ * Trying to guess it from MAX_PHYSMEM_BITS, or considering it BITS_PER_LONG,
+ * proved to be wrong by not considering PAE capabilities, or using SPARSEMEM
+ * only headers, leading to bad object encoding due to object index overflow.
+ */
+#ifndef MAX_POSSIBLE_PHYSMEM_BITS
+ #define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG
+ #error "MAX_POSSIBLE_PHYSMEM_BITS HAS to be defined by arch using zsmalloc";
+#else
+ #ifndef CONFIG_64BIT
+ #if (MAX_POSSIBLE_PHYSMEM_BITS >= (BITS_PER_LONG + PAGE_SHIFT - OBJ_TAG_BITS))
+ #error "MAX_POSSIBLE_PHYSMEM_BITS is wrong for this arch";
+ #endif
+ #endif
+#endif
+
+#define _PFN_BITS (MAX_POSSIBLE_PHYSMEM_BITS - PAGE_SHIFT)
 #define OBJ_INDEX_BITS (BITS_PER_LONG - _PFN_BITS - OBJ_TAG_BITS)
 #define OBJ_INDEX_MASK ((_AC(1, UL) << OBJ_INDEX_BITS) - 1)
 
-- 
2.20.0.rc1
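
As a reading aid, the sanity check added to mm/zsmalloc.c can be restated
as plain arithmetic. Since OBJ_INDEX_BITS is BITS_PER_LONG - _PFN_BITS -
OBJ_TAG_BITS and _PFN_BITS is MAX_POSSIBLE_PHYSMEM_BITS - PAGE_SHIFT,
the #if condition appears to be equivalent to "OBJ_INDEX_BITS would be
zero or negative". The sketch below is user-space C with hypothetical
names (zs_index_bits(), zs_physmem_bits_too_big(), ZS_TAG_BITS); it
takes the kernel macros as plain parameters and is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

enum { ZS_TAG_BITS = 1 };	/* stands in for OBJ_TAG_BITS */

/* OBJ_INDEX_BITS for a given configuration; the result may be <= 0. */
static int zs_index_bits(int bits_per_long, int page_shift, int physmem_bits)
{
	int pfn_bits;

	assert(physmem_bits >= page_shift);
	pfn_bits = physmem_bits - page_shift;	/* _PFN_BITS */
	return bits_per_long - pfn_bits - ZS_TAG_BITS;
}

/* Same comparison as the new #if: true where the build would hit #error. */
static bool zs_physmem_bits_too_big(int bits_per_long, int page_shift,
				    int physmem_bits)
{
	return physmem_bits >= bits_per_long + page_shift - ZS_TAG_BITS;
}
```

Assuming 4 KiB pages on a 32-bit kernel, 32 physical address bits leave
11 index bits, 36 leave 7, and the check only trips at 43 bits or more,
which is exactly where no bit would remain for the object index.
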