Subject: Re: [PATCH] mm: larger stack guard gap, between vmas
From: Ben Hutchings
To: Michal Hocko
Cc: Linus Torvalds, Willy Tarreau, Hugh Dickins, Oleg Nesterov, "Jason A. Donenfeld", Rik van Riel, Larry Woodman, "Kirill A. Shutemov", Tony Luck, "James E.J. Bottomley", Helge Diller, James Hogan, Laura Abbott, Greg KH, "security@kernel.org", Qualys Security Advisory, LKML, Ximin Luo
Date: Wed, 05 Jul 2017 16:25:00 +0100
Message-ID: <1499268300.2707.41.camel@decadent.org.uk>
In-Reply-To: <20170705142354.GB21220@dhcp22.suse.cz>

On Wed, 2017-07-05 at 16:23 +0200, Michal Hocko wrote:
> On Wed 05-07-17 13:19:40, Ben Hutchings wrote:
> > On Tue, 2017-07-04 at 16:31 -0700,
> > Linus Torvalds wrote:
> > > On Tue, Jul 4, 2017 at 4:01 PM, Ben Hutchings wrote:
> > > >
> > > > We have:
> > > >
> > > > bottom = 0xff803fff
> > > > sp =     0xffffb178
> > > >
> > > > The relevant mappings are:
> > > >
> > > > ff7fc000-ff7fd000 rwxp 00000000 00:00 0
> > > > fffdd000-ffffe000 rw-p 00000000 00:00 0                                  [stack]
> > >
> > > Ugh. So that stack is actually 8MB in size, but the alloca() is about
> > > to use up almost all of it, and there's only about 28kB left between
> > > "bottom" and that 'rwx' mapping.
> > >
> > > Still, that rwx mapping is interesting: it is a single page, and it
> > > really is almost exactly 8MB below the stack.
> > >
> > > In fact, the top of stack (at 0xffffe000) is *exactly* 8MB+4kB from
> > > the top of that odd one-page allocation (0xff7fd000).
> > >
> > > Can you find out where that is allocated? Perhaps a breakpoint on
> > > mmap, with a condition to catch that particular one?
> >
> > [...]
> >
> > Found it, and it's now clear why only i386 is affected:
> > http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/tip/src/os/linux/vm/os_linux.cpp#l4852
> > http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/tip/src/os_cpu/linux_x86/vm/os_linux_x86.cpp#l881
>
> This is really worrying. This doesn't look like a gap at all. It is a
> mapping which actually contains a code and so we should absolutely not
> allow to scribble over it. So I am afraid the only way forward is to
> allow per process stack gap and run this particular program to have a
> smaller gap. We basically have two ways. Either /proc//$file or
> a prctl inherited on exec. The later is a smaller code. What do you
> think?

Distributions can do that, but what about all the other apps out there
using JNI and private copies of the JRE?

Something I noticed is that Java doesn't immediately use MAP_FIXED.
Look at os::pd_attempt_reserve_memory_at(). If the first, hinted,
mmap() doesn't return the hinted address it then attempts to allocate
huge areas (I'm not sure how intentional this is) and unmaps the
unwanted parts. Then os::workaround_expand_exec_shield_cs_limit()
re-mmap()s the wanted part with MAP_FIXED. If this fails at any point
it is not a fatal error.

So if we change vm_start_gap() to take the stack limit into account
(when it's finite) that should neutralise
os::workaround_expand_exec_shield_cs_limit(). I'll try this.

Ben.

-- 
Ben Hutchings
Anthony's Law of Force: Don't force it, get a larger hammer.
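
For reference, the address arithmetic in the quoted messages, and the
idea of measuring the guard gap from the stack limit rather than from
the current stack mapping, can be checked with a short Python sketch.
The addresses are the ones quoted in this thread. Everything else is an
assumption made for illustration: the 8 MB figure is treated as the
stack rlimit, the guard gap is taken to be 256 pages of 4 kB, and
gap_start() and overlaps_gap() are hypothetical helpers that model the
proposal. They are not the kernel's vm_start_gap().

```python
# Addresses quoted in the thread (32-bit process).
PAGE = 0x1000
MB = 1 << 20

STACK_VMA_START = 0xfffdd000  # current start of the [stack] mapping
STACK_TOP = 0xffffe000        # end of the [stack] mapping
RWX_START = 0xff7fc000        # the one-page rwx mapping
RWX_END = 0xff7fd000
BOTTOM = 0xff803fff           # "bottom" as reported in the thread

# Assumptions, not stated as such in the thread: the 8 MB figure is the
# stack rlimit, and the guard gap is 256 pages (1 MB with 4 kB pages).
STACK_LIMIT = 8 * MB
GUARD_GAP = 256 * PAGE


def gap_start(vma_start, stack_top=None, stack_limit=None):
    """Hypothetical model of where the protected region below a stack begins.

    Without a limit, the gap is measured from the current start of the
    stack mapping. With a finite limit, it is measured from the lowest
    address the stack could ever grow down to.
    """
    lowest = vma_start
    if stack_top is not None and stack_limit is not None:
        lowest = min(lowest, stack_top - stack_limit)
    return max(lowest - GUARD_GAP, 0)


def overlaps_gap(start, end, vma_start, **limit_args):
    """True if [start, end) intersects [gap_start(...), vma_start)."""
    return start < vma_start and end > gap_start(vma_start, **limit_args)


if __name__ == "__main__":
    print("stack top - rwx top :", hex(STACK_TOP - RWX_END))
    print("bottom + 1 - rwx top:", hex(BOTTOM + 1 - RWX_END))
    print("in gap, limit ignored:",
          overlaps_gap(RWX_START, RWX_END, STACK_VMA_START))
    print("in gap, limit applied:",
          overlaps_gap(RWX_START, RWX_END, STACK_VMA_START,
                       stack_top=STACK_TOP, stack_limit=STACK_LIMIT))
```

Under these assumptions the one-page rwx mapping lies outside the gap
when it is measured from the current stack mapping, but inside it once
the limit is taken into account. A hinted mmap() without MAP_FIXED at
that address would then be expected to land somewhere else.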