Subject: Re: [PATCHv2 8/8] x86/mm: Allow to have userspace mappings above 47-bits
To: "Kirill A. Shutemov"
References: <20170406232137.uk7y2knbkcsru4pi@black.fi.intel.com> <20170406232442.9822-1-kirill.shutemov@linux.intel.com>
CC: Linus Torvalds, Andrew Morton, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", Andi Kleen, Dave Hansen, Andy Lutomirski
From: Dmitry Safonov
Message-ID: <4c8cd9a9-2013-2a74-6bea-d7dc7207abb1@virtuozzo.com>
Date: Fri, 7 Apr 2017 14:32:31 +0300
In-Reply-To: <20170406232442.9822-1-kirill.shutemov@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/07/2017 02:24 AM, Kirill A. Shutemov wrote:
> On x86, 5-level paging enables 56-bit userspace virtual address space.
> Not all user space is ready to handle wide addresses. It's known that
> at least some JIT compilers use higher bits in pointers to encode their
> information.
> It collides with valid pointers with 5-level paging and
> leads to crashes.
>
> To mitigate this, we are not going to allocate virtual address space
> above 47-bit by default.
>
> But userspace can ask for allocation from full address space by
> specifying hint address (with or without MAP_FIXED) above 47-bits.
>
> If hint address set above 47-bit, but MAP_FIXED is not specified, we try
> to look for unmapped area by specified address. If it's already
> occupied, we look for unmapped area in *full* address space, rather than
> from 47-bit window.
>
> This approach helps to easily make application's memory allocator aware
> about large address space without manually tracking allocated virtual
> address space.
>
> One important case we need to handle here is interaction with MPX.
> MPX (without MAWA extension) cannot handle addresses above 47-bit, so we
> need to make sure that MPX cannot be enabled if we already have VMA above
> the boundary and forbid creating such VMAs once MPX is enabled.
>
> Signed-off-by: Kirill A. Shutemov
> Cc: Dmitry Safonov
> ---
>  arch/x86/include/asm/elf.h       |  2 +-
>  arch/x86/include/asm/mpx.h       |  9 +++++++++
>  arch/x86/include/asm/processor.h | 10 +++++++---
>  arch/x86/kernel/sys_x86_64.c     | 28 +++++++++++++++++++++++++++-
>  arch/x86/mm/hugetlbpage.c        | 27 ++++++++++++++++++++++++---
>  arch/x86/mm/mmap.c               |  2 +-
>  arch/x86/mm/mpx.c                | 33 ++++++++++++++++++++++++++++++++-
>  7 files changed, 101 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
> index d4d3ed456cb7..67260dbe1688 100644
> --- a/arch/x86/include/asm/elf.h
> +++ b/arch/x86/include/asm/elf.h
> @@ -250,7 +250,7 @@ extern int force_personality32;
>     the loader. We need to make sure that it is out of the way of the program
>     that it will "exec", and that there is sufficient room for the brk.
>     */
>
> -#define ELF_ET_DYN_BASE		(TASK_SIZE / 3 * 2)
> +#define ELF_ET_DYN_BASE		(DEFAULT_MAP_WINDOW / 3 * 2)
>
>  /* This yields a mask that user programs can use to figure out what
>     instruction set this CPU supports. This could be done in user space,
> diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
> index a0d662be4c5b..7d7404756bb4 100644
> --- a/arch/x86/include/asm/mpx.h
> +++ b/arch/x86/include/asm/mpx.h
> @@ -73,6 +73,9 @@ static inline void mpx_mm_init(struct mm_struct *mm)
>  }
>  void mpx_notify_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
>  		      unsigned long start, unsigned long end);
> +
> +unsigned long mpx_unmapped_area_check(unsigned long addr, unsigned long len,
> +		unsigned long flags);
>  #else
>  static inline siginfo_t *mpx_generate_siginfo(struct pt_regs *regs)
>  {
> @@ -94,6 +97,12 @@ static inline void mpx_notify_unmap(struct mm_struct *mm,
>  				    unsigned long start, unsigned long end)
>  {
>  }
> +
> +static inline unsigned long mpx_unmapped_area_check(unsigned long addr,
> +		unsigned long len, unsigned long flags)
> +{
> +	return addr;
> +}
>  #endif /* CONFIG_X86_INTEL_MPX */
>
>  #endif /* _ASM_X86_MPX_H */
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 3cada998a402..a98395e89ac6 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -795,6 +795,7 @@ static inline void spin_lock_prefetch(const void *x)
>  #define IA32_PAGE_OFFSET	PAGE_OFFSET
>  #define TASK_SIZE		PAGE_OFFSET
>  #define TASK_SIZE_MAX		TASK_SIZE
> +#define DEFAULT_MAP_WINDOW	TASK_SIZE
>  #define STACK_TOP		TASK_SIZE
>  #define STACK_TOP_MAX		STACK_TOP
>
> @@ -834,7 +835,10 @@ static inline void spin_lock_prefetch(const void *x)
>   * particular problem by preventing anything from being mapped
>   * at the maximum canonical address.
>   */
> -#define TASK_SIZE_MAX	((1UL << 47) - PAGE_SIZE)
> +#define TASK_SIZE_MAX	((1UL << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
> +
> +#define DEFAULT_MAP_WINDOW	(test_thread_flag(TIF_ADDR32) ? \
> +				IA32_PAGE_OFFSET : ((1UL << 47) - PAGE_SIZE))

That fixes 32-bit, but AFAICS we still need to adjust some places;
I'll point them out below.

>
>  /* This decides where the kernel will search for a free chunk of vm
>   * space during mmap's.
> @@ -847,7 +851,7 @@ static inline void spin_lock_prefetch(const void *x)
>  #define TASK_SIZE_OF(child)	((test_tsk_thread_flag(child, TIF_ADDR32)) ? \
>  					IA32_PAGE_OFFSET : TASK_SIZE_MAX)
>
> -#define STACK_TOP		TASK_SIZE
> +#define STACK_TOP		DEFAULT_MAP_WINDOW
>  #define STACK_TOP_MAX		TASK_SIZE_MAX
>
>  #define INIT_THREAD { \
> @@ -870,7 +874,7 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
>   * space during mmap's.
>   */
>  #define __TASK_UNMAPPED_BASE(task_size)	(PAGE_ALIGN(task_size / 3))
> -#define TASK_UNMAPPED_BASE		__TASK_UNMAPPED_BASE(TASK_SIZE)
> +#define TASK_UNMAPPED_BASE		__TASK_UNMAPPED_BASE(DEFAULT_MAP_WINDOW)
>
>  #define KSTK_EIP(task)		(task_pt_regs(task)->ip)
>
> diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
> index 207b8f2582c7..593a31e93812 100644
> --- a/arch/x86/kernel/sys_x86_64.c
> +++ b/arch/x86/kernel/sys_x86_64.c
> @@ -21,6 +21,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  /*
>   * Align a virtual address to avoid aliasing in the I$ on AMD F15h.
> @@ -132,6 +133,10 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
>  	struct vm_unmapped_area_info info;
>  	unsigned long begin, end;
>
> +	addr = mpx_unmapped_area_check(addr, len, flags);
> +	if (IS_ERR_VALUE(addr))
> +		return addr;
> +
>  	if (flags & MAP_FIXED)
>  		return addr;
>
> @@ -151,7 +156,16 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
>  	info.flags = 0;
>  	info.length = len;
>  	info.low_limit = begin;
> -	info.high_limit = end;
> +
> +	/*
> +	 * If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
> +	 * in the full address space.
> +	 */
> +	if (addr > DEFAULT_MAP_WINDOW)
> +		info.high_limit = min(end, TASK_SIZE);
> +	else
> +		info.high_limit = min(end, DEFAULT_MAP_WINDOW);

This doesn't look like it will work. `end' is chosen between
tasksize_32bit() and tasksize_64bit(), which is either ~4GB or 47-bit.
So info.high_limit will never go above DEFAULT_MAP_WINDOW with this min().

Can we move this logic into find_start_end()?

Maybe something like:

	if (in_compat_syscall())
		*end = tasksize_32bit();
	else if (addr > task_size_64bit())
		*end = TASK_SIZE_MAX;
	else
		*end = tasksize_64bit();

In my view, it could be even simpler if we add a parameter to
task_size_64bit():

#define TASK_SIZE_47BIT		((1UL << 47) - PAGE_SIZE)

unsigned long task_size_64bit(int full_addr_space)
{
	return (full_addr_space) ?
		TASK_SIZE_MAX : TASK_SIZE_47BIT;
}

> +
>  	info.align_mask = 0;
>  	info.align_offset = pgoff << PAGE_SHIFT;
>  	if (filp) {
> @@ -171,6 +185,10 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
>  	unsigned long addr = addr0;
>  	struct vm_unmapped_area_info info;
>
> +	addr = mpx_unmapped_area_check(addr, len, flags);
> +	if (IS_ERR_VALUE(addr))
> +		return addr;
> +
>  	/* requested length too big for entire address space */
>  	if (len > TASK_SIZE)
>  		return -ENOMEM;
> @@ -195,6 +213,14 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
>  	info.length = len;
>  	info.low_limit = PAGE_SIZE;
>  	info.high_limit = get_mmap_base(0);
> +
> +	/*
> +	 * If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
> +	 * in the full address space.
> +	 */
> +	if (addr > DEFAULT_MAP_WINDOW && !in_compat_syscall())
> +		info.high_limit += TASK_SIZE - DEFAULT_MAP_WINDOW;

Hmm, it looks like we do need in_compat_syscall() here, as you did,
because the x32 mmap() syscall has an 8-byte parameter.
Maybe worth a comment.

Anyway, maybe something like this:

	if (addr > tasksize_64bit() && !in_compat_syscall())
		info.high_limit += TASK_SIZE_MAX - tasksize_64bit();

This way it's more readable and clear, because we don't need to keep
the TIF_ADDR32 flag in mind while reading.

> +
>  	info.align_mask = 0;
>  	info.align_offset = pgoff << PAGE_SHIFT;
>  	if (filp) {
> diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
> index 302f43fd9c28..9a0b89252c52 100644
> --- a/arch/x86/mm/hugetlbpage.c
> +++ b/arch/x86/mm/hugetlbpage.c
> @@ -18,6 +18,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #if 0	/* This is just for testing */
>  struct page *
> @@ -87,23 +88,38 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
>  	info.low_limit = get_mmap_base(1);
>  	info.high_limit = in_compat_syscall() ?
>  		tasksize_32bit() : tasksize_64bit();
> +
> +	/*
> +	 * If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
> +	 * in the full address space.
> +	 */
> +	if (addr > DEFAULT_MAP_WINDOW)
> +		info.high_limit = TASK_SIZE;
> +
>  	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
>  	info.align_offset = 0;
>  	return vm_unmapped_area(&info);
>  }
>
>  static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
> -		unsigned long addr0, unsigned long len,
> +		unsigned long addr, unsigned long len,
>  		unsigned long pgoff, unsigned long flags)
>  {
>  	struct hstate *h = hstate_file(file);
>  	struct vm_unmapped_area_info info;
> -	unsigned long addr;
>
>  	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
>  	info.length = len;
>  	info.low_limit = PAGE_SIZE;
>  	info.high_limit = get_mmap_base(0);
> +
> +	/*
> +	 * If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
> +	 * in the full address space.
> +	 */
> +	if (addr > DEFAULT_MAP_WINDOW && !in_compat_syscall())
> +		info.high_limit += TASK_SIZE - DEFAULT_MAP_WINDOW;
> +
>  	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
>  	info.align_offset = 0;
>  	addr = vm_unmapped_area(&info);
> @@ -118,7 +134,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
>  		VM_BUG_ON(addr != -ENOMEM);
>  		info.flags = 0;
>  		info.low_limit = TASK_UNMAPPED_BASE;
> -		info.high_limit = TASK_SIZE;
> +		info.high_limit = DEFAULT_MAP_WINDOW;
>  		addr = vm_unmapped_area(&info);
>  	}
>
> @@ -135,6 +151,11 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>
>  	if (len & ~huge_page_mask(h))
>  		return -EINVAL;
> +
> +	addr = mpx_unmapped_area_check(addr, len, flags);
> +	if (IS_ERR_VALUE(addr))
> +		return addr;
> +
>  	if (len > TASK_SIZE)
>  		return -ENOMEM;
>
> diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
> index 19ad095b41df..d63232a31945 100644
> --- a/arch/x86/mm/mmap.c
> +++ b/arch/x86/mm/mmap.c
> @@ -44,7 +44,7 @@ unsigned long tasksize_32bit(void)
>
>  unsigned long tasksize_64bit(void)
>  {
> -	return TASK_SIZE_MAX;
> +	return DEFAULT_MAP_WINDOW;

My suggestion about the new parameter is above, but at the very least
we need to stop depending on TIF_ADDR32 here and return the 64-bit size
regardless of the flag's value:

#define TASK_SIZE_47BIT		((1UL << 47) - PAGE_SIZE)

unsigned long task_size_64bit(void)
{
	return TASK_SIZE_47BIT;
}

Because in your version it would always be 4GB for 32-bit ELFs,
while 32-bit ELFs can make 64-bit syscalls.

>  }
>
>  static unsigned long stack_maxrandom_size(unsigned long task_size)
> diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
> index cd44ae727df7..a26a1b373fd0 100644
> --- a/arch/x86/mm/mpx.c
> +++ b/arch/x86/mm/mpx.c
> @@ -355,10 +355,19 @@ int mpx_enable_management(void)
>  	 */
>  	bd_base = mpx_get_bounds_dir();
>  	down_write(&mm->mmap_sem);
> +
> +	/* MPX doesn't support addresses above 47-bits yet. */
> +	if (find_vma(mm, DEFAULT_MAP_WINDOW)) {
> +		pr_warn_once("%s (%d): MPX cannot handle addresses "
> +				"above 47-bits. Disabling.",
> +				current->comm, current->pid);
> +		ret = -ENXIO;
> +		goto out;
> +	}
>  	mm->context.bd_addr = bd_base;
>  	if (mm->context.bd_addr == MPX_INVALID_BOUNDS_DIR)
>  		ret = -ENXIO;
> -
> +out:
>  	up_write(&mm->mmap_sem);
>  	return ret;
>  }
> @@ -1038,3 +1047,25 @@ void mpx_notify_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
>  	if (ret)
>  		force_sig(SIGSEGV, current);
>  }
> +
> +/* MPX cannot handle addresses above 47-bits yet. */
> +unsigned long mpx_unmapped_area_check(unsigned long addr, unsigned long len,
> +		unsigned long flags)
> +{
> +	if (!kernel_managing_mpx_tables(current->mm))
> +		return addr;
> +	if (addr + len <= DEFAULT_MAP_WINDOW)
> +		return addr;
> +	if (flags & MAP_FIXED)
> +		return -ENOMEM;
> +
> +	/*
> +	 * Requested len is larger than whole area we're allowed to map in.
> +	 * Resetting hinting address wouldn't do much good -- fail early.
> +	 */
> +	if (len > DEFAULT_MAP_WINDOW)
> +		return -ENOMEM;
> +
> +	/* Look for unmap area within DEFAULT_MAP_WINDOW */
> +	return 0;
> +}
>

-- 
         Dmitry
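
P.S. To make the min() problem in arch_get_unmapped_area() concrete,
here is a minimal userspace model of the limit selection. It is only a
sketch, not kernel code. It assumes a native 64-bit task (so TASK_SIZE
is TASK_SIZE_MAX and DEFAULT_MAP_WINDOW is the 47-bit value), 4K pages
and a 56-bit TASK_SIZE_MAX. The constant values and the helper names
end_as_posted(), high_limit_as_posted() and high_limit_proposed() are
made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model constants: assumptions for illustration, not kernel headers. */
#define PAGE_SIZE	UINT64_C(4096)
#define TASK_SIZE_32BIT	UINT64_C(0xFFFFE000)			/* ~4GB compat limit */
#define TASK_SIZE_47BIT	((UINT64_C(1) << 47) - PAGE_SIZE)	/* default window */
#define TASK_SIZE_MAX	((UINT64_C(1) << 56) - PAGE_SIZE)	/* 5-level paging */

uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* `end' as find_start_end() would yield it with the posted patch. */
uint64_t end_as_posted(bool compat)
{
	return compat ? TASK_SIZE_32BIT : TASK_SIZE_47BIT;
}

/* High limit as computed by the posted arch_get_unmapped_area() hunk. */
uint64_t high_limit_as_posted(uint64_t addr, bool compat)
{
	uint64_t end = end_as_posted(compat);

	if (addr > TASK_SIZE_47BIT)
		return min_u64(end, TASK_SIZE_MAX);	/* end is still <= 47-bit */
	return min_u64(end, TASK_SIZE_47BIT);
}

/* High limit with the selection moved into the `end' computation. */
uint64_t high_limit_proposed(uint64_t addr, bool compat)
{
	if (compat)
		return TASK_SIZE_32BIT;
	if (addr > TASK_SIZE_47BIT)
		return TASK_SIZE_MAX;
	return TASK_SIZE_47BIT;
}
```

With a hint of 1UL << 48 from a non-compat syscall, the first variant
stays at the 47-bit limit, while the second one opens the full address
space.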