Message-ID: <1d293d2fb0df99fdb0048825b4e39640840bfabb.camel@intel.com>
Subject: Re: [PATCH v5 07/27] mm/mmap: Create a guard area between VMAs
From: Yu-cheng Yu
To: Andy Lutomirski, Jann Horn
Cc: X86 ML, "H. Peter Anvin", Thomas Gleixner, Ingo Molnar, LKML,
 linux-doc@vger.kernel.org, Linux-MM, linux-arch, Linux API,
 Arnd Bergmann, Balbir Singh, Cyrill Gorcunov, Dave Hansen,
 Eugene Syromiatnikov, Florian Weimer, "H. J. Lu", Jonathan Corbet,
 Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov, Pavel Machek,
 Peter Zijlstra, Randy Dunlap, "Ravi V. Shankar",
 "Shanbhogue, Vedvyas", Daniel Micay
Date: Fri, 12 Oct 2018 14:49:28 -0700
References: <20181011151523.27101-1-yu-cheng.yu@intel.com>
 <20181011151523.27101-8-yu-cheng.yu@intel.com>
List-ID: <linux-kernel.vger.kernel.org>

On Thu, 2018-10-11 at 13:55 -0700, Andy Lutomirski wrote:
> On Thu, Oct 11, 2018 at 1:39 PM Jann Horn wrote:
> > On Thu, Oct 11, 2018 at 5:20 PM Yu-cheng Yu wrote:
> > > Create a guard area between VMAs to detect memory corruption.
> > [...]
> > > +config VM_AREA_GUARD
> > > +       bool "VM area guard"
> > > +       default n
> > > +       help
> > > +         Create a guard area between VM areas so that access beyond
> > > +         limit can be detected.
> > > +
> > >  endmenu
> > 
> > Sorry to bring this up so late, but Daniel Micay pointed out to me
> > that, given that VMA guards will raise the number of VMAs by
> > inhibiting vma_merge(), people are more likely to run into
> > /proc/sys/vm/max_map_count (which limits the number of VMAs to ~65k
> > by default, and can't easily be raised without risking an overflow
> > of page->_mapcount on systems with over ~800GiB of RAM, see
> > https://lore.kernel.org/lkml/20180208021112.GB14918@bombadil.infradead.org/
> > and replies) with this change.
> > 
> > Playing with glibc's memory allocator, it looks like glibc will use
> > mmap() for 128KB allocations; so at 65530*128KB=8GB of memory usage
> > in 128KB chunks, an application could run out of VMAs.
> 
> Ugh.
> 
> Do we have a free VM flag so we could do VM_GUARD to force a guard
> page?  (And to make sure that, when a new VMA is allocated, it won't
> be directly adjacent to a VM_GUARD VMA.)

Maybe something like the following?

These vm_start_gap()/vm_end_gap() are used in many architectures.  Do
we want to put them in a different series?

Comments?
Yu-cheng


diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0416a7204be3..92b580542411 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -224,11 +224,13 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HIGH_ARCH_BIT_2	34	/* bit only usable on 64-bit architectures */
 #define VM_HIGH_ARCH_BIT_3	35	/* bit only usable on 64-bit architectures */
 #define VM_HIGH_ARCH_BIT_4	36	/* bit only usable on 64-bit architectures */
+#define VM_HIGH_ARCH_BIT_5	37	/* bit only usable on 64-bit architectures */
 #define VM_HIGH_ARCH_0	BIT(VM_HIGH_ARCH_BIT_0)
 #define VM_HIGH_ARCH_1	BIT(VM_HIGH_ARCH_BIT_1)
 #define VM_HIGH_ARCH_2	BIT(VM_HIGH_ARCH_BIT_2)
 #define VM_HIGH_ARCH_3	BIT(VM_HIGH_ARCH_BIT_3)
 #define VM_HIGH_ARCH_4	BIT(VM_HIGH_ARCH_BIT_4)
+#define VM_HIGH_ARCH_5	BIT(VM_HIGH_ARCH_BIT_5)
 #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
 
 #ifdef CONFIG_ARCH_HAS_PKEYS
@@ -266,6 +268,12 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MPX		VM_NONE
 #endif
 
+#ifdef CONFIG_ARCH_USES_HIGH_VMA_FLAGS
+#define VM_GUARD	VM_HIGH_ARCH_5
+#else
+#define VM_GUARD	VM_NONE
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUP	VM_NONE
 #endif
@@ -2417,24 +2425,34 @@ static inline struct vm_area_struct * find_vma_intersection(struct mm_struct * m
-static inline unsigned long vm_start_gap(struct vm_area_struct *vma)
+static inline unsigned long vm_start_gap(struct vm_area_struct *vma, vm_flags_t flags)
 {
 	unsigned long vm_start = vma->vm_start;
+	unsigned long gap = 0;
+
+	if (vma->vm_flags & VM_GROWSDOWN)
+		gap = stack_guard_gap;
+	else if ((vma->vm_flags & VM_GUARD) || (flags & VM_GUARD))
+		gap = PAGE_SIZE;
+
+	vm_start -= gap;
+	if (vm_start > vma->vm_start)
+		vm_start = 0;
-	if (vma->vm_flags & VM_GROWSDOWN) {
-		vm_start -= stack_guard_gap;
-		if (vm_start > vma->vm_start)
-			vm_start = 0;
-	}
 	return vm_start;
 }
 
-static inline unsigned long vm_end_gap(struct vm_area_struct *vma)
+static inline unsigned long vm_end_gap(struct vm_area_struct *vma, vm_flags_t flags)
 {
 	unsigned long vm_end = vma->vm_end;
+	unsigned long gap = 0;
+
+	if (vma->vm_flags & VM_GROWSUP)
+		gap = stack_guard_gap;
+	else if ((vma->vm_flags & VM_GUARD) || (flags & VM_GUARD))
+		gap = PAGE_SIZE;
+
+	vm_end += gap;
+	if (vm_end < vma->vm_end)
+		vm_end = -PAGE_SIZE;
-	if (vma->vm_flags & VM_GROWSUP) {
-		vm_end += stack_guard_gap;
-		if (vm_end < vma->vm_end)
-			vm_end = -PAGE_SIZE;
-	}
 	return vm_end;
 }