Date: Thu, 22 Jun 2017 17:16:23 +0200
From: Oleg Nesterov
To: Hugh Dickins
Cc: Linus Torvalds, Cyrill Gorcunov, Andrey Vagin, Pavel Emelyanov,
	Dmitry Safonov, Andrew Morton, Adrian Reber, Michael Kerrisk,
	Willy Tarreau, kernel test robot, Michal Hocko, LKML, LKP,
	Larry Woodman, Rik van Riel
Subject: Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c
Message-ID: <20170622151623.GB762@redhat.com>
References: <20170621023552.GB32082@yexl-desktop>
	<20170621193338.GA29222@redhat.com>
	<20170621202751.GA29638@redhat.com>
	<20170621205617.GA29841@redhat.com>

On 06/21, Hugh Dickins wrote:
>
> On Wed, 21 Jun 2017, Linus Torvalds wrote:
> > On Wed, Jun 21, 2017 at 1:56 PM, Oleg Nesterov wrote:
> > >
> > > I understand. My point is that this check was invalidated by stack-guard-page
> > > a long ago, and this means that we add the user-visible change now.
> >
> > Yeah.
> > I guess we could consider it an *old* regression that got fixed,
> > but if people started relying on the regression...
> >
> > >> Do you have a pointer to the report for this regression? I must have missed it.
> > >
> > > See http://marc.info/?t=149794523000001&r=1&w=2
> >
> > Ok.
> >
> > And thinking about it, while that is a silly test-case, the notion of
> > "create top-down segment, then start populating it _before_ moving the
> > stack pointer into it" is actually perfectly valid.
> >
> > So I guess checking against the stack pointer is wrong in that case -
> > at least if the stack pointer isn't inside that vma to begin with.
> >
> > So yes, removing that check looks like the right thing to do for now.
> >
> > Do you want to send me the patch if you already have a commit message etc?
>
> I have a bit of a bad feeling about this.
>
> Perhaps it's just sentimental attachment to all those weird
> and ancient stack pointer checks in arch//fault.c.
>
> We have been inconsistent: cris frv m32r m68k microblaze mn10300
> openrisc powerpc tile um x86 have such checks, the others don't.
> So that's a good reason to delete them.

OK, I didn't bother to check other architectures, thanks...

> But at least at the moment those checks impose some sanity:
> just a page less than we had imagined for several years.
> Once we remove them, they cannot go back. Should we now
> complicate them with an extra page of slop?

Something like the patch below? Yes, I thought about this too. I simply do not
know.

Honestly, I do not even know why MAP_GROWSDOWN exists. I mean, I do not
understand how user-space can actually use it to get auto-growing, the usage
of MAP_GROWSDOWN in (say) criu is clear.

The main thread's stack can grow, but this is only because it is placed at the
right place, above mm->mmap_base in case of top-down layout.
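
For reference, a minimal user-space sketch of creating a MAP_GROWSDOWN area
with an initial size and populating it in the normal way. This is only an
illustration under assumptions: the helper names map_growsdown_area() and
prepopulate() are hypothetical and do not come from this thread, and the code
never touches memory below the mapping, so it says nothing about whether the
area actually grows.

```c
#define _GNU_SOURCE		/* for MAP_ANONYMOUS and MAP_GROWSDOWN */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the thread): create an anonymous, private
 * MAP_GROWSDOWN mapping of 'size' bytes. Returns NULL on failure.
 */
static void *map_growsdown_area(size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: populate the initial area by writing every byte.
 * Only addresses inside the mapping are touched, so this cannot trigger
 * stack expansion.
 */
static void prepopulate(void *area, size_t size, unsigned char fill)
{
	memset(area, fill, size);
}
```

A caller would map the area, call prepopulate() on it, and only then hand the
top of the area off as a stack pointer (for example to clone()); that last
step is left out here.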
> I'm not entirely persuaded by your pre-population argument:
> it's perfectly possible to prepare a MAP_GROWSDOWN area with
> an initial size, that's populated in a normal way, before handing
> off for stack expansion - isn't it?

Exactly.

> I'd be interested to hear more about that (redhat internal) bug
> report that Oleg mentions: whether it gives stronger grounds for
> making this sudden change than the CRIU testcase.

Probably not. Well, the customer reported multiple problems, but most of them
were caused by rhel-specific bugs. As for "MAP_GROWSDOWN does not grow", most
probably this was another test-case, not the real application. I will ask and
report back if this is not true.

In short, I agree with any decision. Even with "we do not care if we break some
artificial test-cases".

Oleg.

---
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1409,7 +1409,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 		bad_area(regs, error_code, address);
 		return;
 	}
-	if (error_code & PF_USER) {
+	if ((error_code & PF_USER) && (address + PAGE_SIZE < vma->vm_start)) {
 		/*
 		 * Accessing the stack below %sp is always a bug.
 		 * The large cushion allows instructions like enter