Date: Tue, 20 Jun 2017 12:51:16 +0200
From: Oleg Nesterov
To: Cyrill Gorcunov
Cc: Hugh Dickins, Andrey Vagin, LKML, Pavel Emelyanov, Dmitry Safonov, Andrew Morton
Subject: Re: [criu] 1M guard page ruined restore
Message-ID: <20170620105116.GA20974@redhat.com>
References: <20170620075206.GB1909@uranus.lan>
In-Reply-To: <20170620075206.GB1909@uranus.lan>

On 06/20, Cyrill Gorcunov wrote:
>
> | diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> | index f0c8b33..520802d 100644
> | --- a/fs/proc/task_mmu.c
> | +++ b/fs/proc/task_mmu.c
> | @@ -300,11 +300,7 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
> |
> |  	/* We don't show the stack guard page in /proc/maps */
> |  	start = vma->vm_start;
> | -	if (stack_guard_page_start(vma, start))
> | -		start += PAGE_SIZE;
> |  	end = vma->vm_end;
> | -	if (stack_guard_page_end(vma, end))
> | -		end -= PAGE_SIZE;
> |
> |  	seq_setwidth(m, 25 + sizeof(void *) * 6 - 1);
> |  	seq_printf(m, "%08lx-%08lx %c%c%c%c %08llx %02x:%02x %lu ",
>
> For which we of course are not ready because we've been implying the
> guard page is returned here so we adjust addresses locally when saving
> them into images.
>
> So now we need to figure out somehow if show_map_vma accounts [PAGE_SIZE|guard_area] or not,
> I guess we might use kernel version here but it won't be working fine on custom kernels,
> or kernels with the patch backported.

You can write a simple test. Just do mmap(MAP_GROWSDOWN) and look at
/proc/self/maps. If it reports vm_start + PAGE_SIZE rather than addr
returned by mmap, then the kernel is old.

> Second I guess we might need to detect @stack_guard_gap runtime as
> well

I do not think so. criu does not need to know about the new guard
area at all. It simply doesn't exist from user-space pov.

In fact, I think this should have been true even before this change,
just stack_guard_page_start() was not accurate and this is the reason
(I guess) you had to play with stack guard; the first page (hidden by
show_map_vma) can have a valid stack data, for example if the application
played with MAP_FIXED or munmap().

So I think you should simply disable, say, unmap_guard_pages() and most
of all other MAP_GROWSDOWN code in criu.

Oleg.
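
For reference, the mmap(MAP_GROWSDOWN) test suggested above could look
roughly like the sketch below. It is only an illustration, assuming a
Linux system with /proc mounted. The function name
maps_hides_guard_page() and the four-page mapping length are made up for
this example and are not taken from criu or the kernel. The mapping is
located in /proc/self/maps by its end address, since that end is reported
unchanged for a grows-down area in both cases.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (name chosen for this sketch only).
 *
 * Returns  1 if /proc/self/maps reports addr + PAGE_SIZE as the start
 *            of a MAP_GROWSDOWN mapping (old kernel, first page hidden),
 *          0 if it reports the address returned by mmap(),
 *         -1 on error or if the mapping could not be identified.
 */
int maps_hides_guard_page(void)
{
	unsigned long addr, start, end, page, len;
	char line[512];
	long pagesz;
	int ret = -1;
	FILE *f;
	void *p;

	pagesz = sysconf(_SC_PAGESIZE);
	if (pagesz <= 0)
		return -1;
	page = (unsigned long)pagesz;
	len = 4 * page;		/* arbitrary small size for the test area */

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	addr = (unsigned long)p;

	f = fopen("/proc/self/maps", "r");
	if (!f)
		goto out;

	while (fgets(line, sizeof(line), f)) {
		/* Each entry starts with "start-end" in hex. */
		if (sscanf(line, "%lx-%lx", &start, &end) != 2)
			continue;
		if (end != addr + len)
			continue;
		if (start == addr)
			ret = 0;	/* start shown as mmap() returned it */
		else if (start == addr + page)
			ret = 1;	/* first page hidden from maps */
		break;
	}
	fclose(f);
out:
	munmap(p, len);
	return ret;
}
```

A caller would run this once at startup and adjust the start addresses
read from /proc/pid/maps only when it returns 1. A result of -1 has to
be treated as "unknown" rather than as either kernel behaviour.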