Date: Thu, 22 Jun 2017 18:05:00 +0300
From: Cyrill Gorcunov
To: Oleg Nesterov
Cc: Hugh Dickins, Andrey Vagin, LKML, Pavel Emelyanov, Dmitry Safonov, Andrew Morton, Adrian Reber
Subject: Re: [criu] 1M guard page ruined restore
Message-ID: <20170622150500.GA28378@uranus>
In-Reply-To: <20170622142300.GA762@redhat.com>

On Thu, Jun 22, 2017 at 04:23:00PM +0200, Oleg Nesterov wrote:
> Cyrill,
>
> I am replying to my own email because I got lost in numerous threads/emails
> connected to stack guard/gap problems. IIRC you confirmed that the 1st load
> doesn't fail and the patch fixes the problem. So everything is clear, and we
> will discuss this change in another thread.

Yes.

> But let me add that (imo) you should not change this test-case. You simply
> should not run it if kerndat_mm_guard_page_maps() detects the new kernel at
> startup.
>
> The new version makes no sense for criu, afaics. Yes, yes, thank you very
> much for this test-case, it found the kernel regression ;) But criu has
> nothing to do with this problem, and it is not clear right now if we are
> going to fix it or not.

To be fair the first reporter is Andrew Vagin :) He wrote the test and
poked me to look into it. If we're not going to fix it in the kernel then
sure -- we won't run it on new kernels (hell knows though what other
applications may fail; as Linus pointed out, it's perfectly valid to map
and autogrow the vma).

> With the recent kernel changes criu should never look outside of start-end
> region reported by /proc/maps; and restore doesn't even need to know if a
> GROWSDOWN region will actually grow or not, because (iiuc) you do not need
> to auto-grow the stack vma during restore, criu re-creates the whole vma
> with the same length using MAP_FIXED and it should never write below the
> addr returned by mmap(MAP_FIXED).

Yes, and we already do, thanks.

> So (afaics) the only complication is that the process can be dumped on
> a system running with (say) stack_guard_gap=4K kernel parameter, and then
> restored on another system running with stack_guard_gap=1M. In this case
> the application may fail after restore if it tries to auto-grow the stack,
> but this is unlikely and this is another story.

Yes, it's a different problem and it would be cool to be able to fetch this
value somehow (maybe via sysfs or something). Otherwise, if such a container
migration case happens, we simply find an error code in the restore log, and
I fear it won't be clear that the error happened exactly because the gap
settings differ.
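
For illustration only, here is a minimal sketch of one way userspace might
guess the gap until the kernel exports it somewhere: scan the kernel command
line for the stack_guard_gap= boot parameter. The helper names below are
hypothetical and not part of criu. The sketch assumes the parameter is given
in pages, that an absent parameter means the 256-page default, and that a
4096-byte buffer is enough for /proc/cmdline. It only sees what was passed
at boot, so it is a hint rather than a reliable interface.

```c
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 256 pages is the default when the parameter is not given. */
#define DEFAULT_STACK_GUARD_GAP_PAGES 256UL

/*
 * Hypothetical helper: find "stack_guard_gap=<n>" in a command line
 * string and return <n> (in pages). Malformed values are ignored and
 * the last well-formed occurrence is used (a choice of this sketch).
 */
static unsigned long parse_stack_guard_gap(const char *cmdline)
{
	static const char key[] = "stack_guard_gap=";
	const size_t klen = sizeof(key) - 1;
	unsigned long gap = DEFAULT_STACK_GUARD_GAP_PAGES;
	const char *p = cmdline;

	while ((p = strstr(p, key)) != NULL) {
		const char *v = p + klen;

		/* The key must start a word and the value must start with a digit. */
		if ((p == cmdline || p[-1] == ' ') && isdigit((unsigned char)*v)) {
			char *end;
			unsigned long val;

			errno = 0;
			val = strtoul(v, &end, 10);
			/* Accept only a purely decimal value ("4K" is rejected). */
			if (errno == 0 &&
			    (*end == '\0' || *end == ' ' || *end == '\n'))
				gap = val;
		}
		p = v;
	}
	return gap;
}

/* Hypothetical helper: read /proc/cmdline, fall back to the default. */
static unsigned long read_stack_guard_gap(void)
{
	char buf[4096];
	FILE *f = fopen("/proc/cmdline", "r");

	if (!f)
		return DEFAULT_STACK_GUARD_GAP_PAGES;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);
	return parse_stack_guard_gap(buf);
}
```

A dump could record the value returned by read_stack_guard_gap() next to
the image, and restore could compare it with the local one and print a
warning on mismatch, so a later failure is easier to attribute to the gap
settings.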