MIME-Version: 1.0
In-Reply-To: <20170704183927.GH22013@1wt.eu>
References: <1499126133.2707.20.camel@decadent.org.uk> <CA+55aFzMX72+Kb=zNgjCf6UfPt+C+e7WDp_rpbSLuOVx1k7iqg@mail.gmail.com>
 <20170704084122.GC14722@dhcp22.suse.cz> <20170704093538.GF14722@dhcp22.suse.cz>
 <20170704094728.GB22013@1wt.eu> <20170704104211.GG14722@dhcp22.suse.cz>
 <20170704113611.GA4732@decadent.org.uk> <20170704155140.GC22013@1wt.eu>
 <20170704172247.GA6178@dhcp22.suse.cz> <CA+55aFyAunJqeoZfcOFjPS3ZQvm_zM2xztvwxkcJ5fhk1Xzmqw@mail.gmail.com>
 <20170704183927.GH22013@1wt.eu>
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Tue, 4 Jul 2017 11:47:37 -0700
Message-ID: <CA+55aFyjihU+73h07a4XwUdgZixk=EhU5yQFjaii4sbcymT2tQ@mail.gmail.com>
Subject: Re: [PATCH] mm: larger stack guard gap, between vmas
To: Willy Tarreau <w@1wt.eu>
Cc: Michal Hocko <mhocko@kernel.org>, Ben Hutchings <ben@decadent.org.uk>,
        Hugh Dickins <hughd@google.com>, Oleg Nesterov <oleg@redhat.com>,
        "Jason A. Donenfeld" <Jason@zx2c4.com>, Rik van Riel <riel@redhat.com>,
        Larry Woodman <lwoodman@redhat.com>,
        "Kirill A. Shutemov" <kirill@shutemov.name>,
        Tony Luck <tony.luck@intel.com>,
        "James E.J. Bottomley" <jejb@parisc-linux.org>,
        Helge Diller <deller@gmx.de>, James Hogan <james.hogan@imgtec.com>,
        Laura Abbott <labbott@redhat.com>, Greg KH <greg@kroah.com>,
        "security@kernel.org" <security@kernel.org>,
        linux-distros@vs.openwall.org,
        Qualys Security Advisory <qsa@qualys.com>,
        LKML <linux-kernel@vger.kernel.org>, Ximin Luo <infinity0@debian.org>
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1682
Lines: 38

On Tue, Jul 4, 2017 at 11:39 AM, Willy Tarreau <w@1wt.eu> wrote:
>
> But what is wrong with stopping the loop as soon as the distance gets
> larger than the stack_guard_gap ?

Absolutely nothing. But that's not the problem with the loop. Let's
say that you are using lots of threads, so that you know your stack
space is limited. What you do is to use MAP_FIXED a lot, and you lay
out your stacks fairly densely (with each other, but also possibly
with other mappings), with that PROT_NONE redzoning mapping in between
the "dense" allocations.
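
To make that layout concrete, here is a minimal userspace sketch of it,
not taken from any real runtime. The names (struct dense_stacks,
dense_stacks_create() and friends) are made up for illustration. It
reserves one PROT_NONE region first, so that the later MAP_FIXED calls
only ever replace its own reservation, and it does not use
MAP_GROWSDOWN, so it only shows the address-space layout, not the
growth path:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for a densely packed run of stacks. */
struct dense_stacks {
    char  *base;     /* start of the whole reservation */
    size_t page;     /* page size in bytes */
    size_t slot;     /* bytes per slot: one redzone page + the stack */
    size_t nstacks;  /* number of slots */
};

/* Lowest address of the PROT_NONE redzone page of slot i. */
char *dense_stacks_redzone(const struct dense_stacks *ds, size_t i)
{
    return ds->base + i * ds->slot;
}

/* Lowest address of the usable stack of slot i (just above its redzone). */
char *dense_stacks_stack(const struct dense_stacks *ds, size_t i)
{
    return dense_stacks_redzone(ds, i) + ds->page;
}

/* Lay out nstacks stacks of stack_pages pages each, back to back, with
 * one PROT_NONE page below each stack. Returns 0 on success, -1 on error. */
int dense_stacks_create(struct dense_stacks *ds, size_t nstacks,
                        size_t stack_pages)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || nstacks == 0 || stack_pages == 0)
        return -1;

    ds->page = (size_t)page;
    ds->slot = (stack_pages + 1) * ds->page;
    ds->nstacks = nstacks;

    /* Reserve everything as inaccessible; the redzones stay this way. */
    void *p = mmap(NULL, ds->slot * nstacks, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    ds->base = p;

    for (size_t i = 0; i < nstacks; i++) {
        char *want = dense_stacks_stack(ds, i);
        /* MAP_FIXED is safe here: it only replaces our own reservation. */
        p = mmap(want, stack_pages * ds->page, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        if (p == MAP_FAILED) {
            munmap(ds->base, ds->slot * nstacks);
            return -1;
        }
        assert(p == (void *)want);
    }
    return 0;
}

/* Release the whole reservation, stacks and redzones together. */
int dense_stacks_destroy(struct dense_stacks *ds)
{
    return munmap(ds->base, ds->slot * ds->nstacks);
}
```

With this layout the redzone of slot i+1 sits directly on top of the
stack of slot i, so there is no free address space anywhere near the
size of stack_guard_gap between neighbouring stacks.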

So when the kernel wants to grow the stack, it finds the PROT_NONE
redzone mapping - but there are possibly other mappings right under it,
so the stack_guard_gap still hits those other mappings.
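
As a toy model of the difference (this is NOT the kernel code, and the
struct and function names are invented for the example), compare a
check that keeps walking down past PROT_NONE mappings with one that
treats a PROT_NONE mapping directly below the stack as the
application's own end-of-stack marker:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a mapping: [start, end), accessible or not. */
struct toy_vma {
    uint64_t start;
    uint64_t end;
    bool accessible;   /* false models a PROT_NONE redzone */
};

/*
 * "Loop" policy: walk down through the mappings below the stack.
 * Stop once one is at least `gap` bytes away, skip PROT_NONE ones,
 * and refuse if an accessible one is closer than `gap`.
 * `below` is sorted by ascending address and lies under the stack.
 */
bool grow_ok_loop(const struct toy_vma *below, size_t n,
                  uint64_t new_start, uint64_t gap)
{
    for (size_t i = n; i-- > 0; ) {
        if (new_start < below[i].end)
            return false;             /* would overlap a mapping */
        if (new_start - below[i].end >= gap)
            return true;              /* far enough away: stop looking */
        if (below[i].accessible)
            return false;             /* real mapping inside the gap */
        /* PROT_NONE inside the gap: keep looking further down */
    }
    return true;
}

/*
 * "Marker" policy: only look at the nearest mapping below. If it is
 * PROT_NONE, take it as "this is where the stack is expected to end".
 */
bool grow_ok_marker(const struct toy_vma *below, size_t n,
                    uint64_t new_start, uint64_t gap)
{
    if (n == 0)
        return true;
    const struct toy_vma *prev = &below[n - 1];
    if (new_start < prev->end)
        return false;                 /* would overlap a mapping */
    if (!prev->accessible)
        return true;                  /* explicit redzone: allow */
    return new_start - prev->end >= gap;
}
```

With a neighbouring stack at [0x100000, 0x108000), a one-page redzone at
[0x108000, 0x109000) and a gap of 256 pages, growing the stack down to
0x109000 is refused by grow_ok_loop() and allowed by grow_ok_marker().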

And the fact that this seems to trigger with
 (a) 32-bit x86
 (b) Java
actually makes sense in the above scenario: that's _exactly_ when
you'd have dense mappings. Java is very thread-happy, and in a 32-bit
VM, the virtual address space allocation for stacks is a primary issue
with lots of threads.

Of course, the downside to this theory is that apparently the Java
problem is not confirmed to actually be due to this (Ben root-caused
the Rust thing on ppc64), but it still sounds like quite a reasonable
thing to do.

The problem with the Java issue may be that they do that "dense stack
mappings in VM space" (for all the usual "lots of threads, limited VM"
reasons), but they may *not* have that PROT_NONE redzoning at all.

So the patch under discussion works for Rust exactly *because* it does
its redzone to show "this is where I expect the stack to end". The
i386 Java load may simply not have that marker for us to use.

               Linus