Date: Wed, 21 Jun 2017 19:04:10 +0300
From: Cyrill Gorcunov
To: Oleg Nesterov
Cc: Hugh Dickins, Andrey Vagin, LKML, Pavel Emelyanov, Dmitry Safonov, Andrew Morton, Adrian Reber
Subject: Re: [criu] 1M guard page ruined restore
Message-ID: <20170621160410.GF31050@uranus>
References: <20170620075206.GB1909@uranus.lan> <20170621152256.GC31050@uranus> <20170621155730.GA32554@redhat.com>
In-Reply-To: <20170621155730.GA32554@redhat.com>

On Wed, Jun 21, 2017 at 05:57:30PM +0200, Oleg Nesterov wrote:
> (add Adrian)
>
> On 06/21, Cyrill Gorcunov wrote:
> >
> > The patches for criu are on the fly. Still one of the test case
> > start failing with the new kernels. Basically the test does
> > the following:
>
> Cyrill, please read the last email I sent you in another (private) discussion.
> Most probably you should throw out some tests which assume the kernel has the
> stack-guard-page hack, it was replaced by the stack-guard-hole hack ;)

Yes, thank you.

> >  - allocate growsdown memory area
> >  - touch first byte (which before the patch force the kernel
> >    to extend the stack allocating new page)
> >  - touch first-1 byte
> >
> > ---
> > int main(int argc, char **argv)
> > {
> > 	char *start_addr, *start_addr1, *fake_grow_down, *test_addr, *grow_down;
> > 	volatile char *p;
> >
> > 	start_addr = mmap(NULL, PAGE_SIZE * 10, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> > 	if (start_addr == MAP_FAILED) {
> > 		printf("Can't mal a new region");
> > 		return 1;
> > 	}
> > 	printf("start_addr %lx\n", start_addr);
> > 	munmap(start_addr, PAGE_SIZE * 10);
> >
> > 	fake_grow_down = mmap(start_addr + PAGE_SIZE * 5, PAGE_SIZE,
> > 			 PROT_READ | PROT_WRITE,
> > 			 MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED | MAP_GROWSDOWN, -1, 0);
> > 	if (fake_grow_down == MAP_FAILED) {
> > 		printf("Can't mal a new region");
> > 		return 1;
> > 	}
> > 	printf("start_addr %lx\n", fake_grow_down);
> >
> > 	p = fake_grow_down;
> > 	*p-- = 'c';
>
> I guess this works? I mean, *p-- = 'c' should not fail...

It fails.

> > 	*p = 'b';
>
> OK, now we need to expand the stack. This can fail or not. This depends on
> whether this vma (created by mmap(MAP_GROWSDOWN) has a stack_guard_gap hole
> between its ->vm_prev.
>
> > function get dropped off. Hugh, it is done on intent and
> > userspace programs have to extend stack manually?
>
> No. a MAP_GROWSDOWN area should grow automatically. Unless the hole between
> vm_prev becomes less than stack_guard_gap.
>
> This is the whole point of guard hole, or guard page we had before. Just the
> previous implementation was not accurate, that is why criu had to have some
> hacks to workaround.
>
> It no longer needs to know about guard hole/page/whatever. Just remove
> (conditionalize) all the MAP_GROWSDOWN code. Except, of course, you still
> need to record MAP_GROWSDOWN in vma_area->e->flags (_vmflag_match), in order
> to restore this vma correctly.
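
To make the rule quoted above concrete (a MAP_GROWSDOWN area grows automatically unless the hole to vm_prev would become smaller than stack_guard_gap), here is a rough userspace model of the check. It is only a sketch, not the kernel code: can_grow_down() and GUARD_GAP are names made up for illustration, and the 256-page (1M with 4K pages) gap is assumed as the default value.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE	4096UL

/* Assumption: default guard gap of 256 pages, i.e. 1M with 4K pages. */
#define GUARD_GAP	(256UL * PAGE_SIZE)

/*
 * Simplified model (hypothetical helper, not kernel code): may a
 * grows-down area be extended downwards so that it covers 'addr'?
 *
 * prev_end is the end address of the nearest mapping below the area,
 * or 0 if nothing is mapped below it.
 */
static bool can_grow_down(uintptr_t addr, uintptr_t prev_end, uintptr_t gap)
{
	/* The area grows in whole pages, so round the address down. */
	uintptr_t new_start = addr & ~(uintptr_t)(PAGE_SIZE - 1);

	/* The gap itself would wrap below address zero. */
	if (new_start < gap)
		return false;

	/* Nothing mapped below: no neighbour to keep away from. */
	if (prev_end == 0)
		return true;

	/* The hole left after growing must still be at least 'gap' bytes. */
	return new_start - gap >= prev_end;
}
```

With these assumptions, a neighbour that ends 2M below the touched address leaves enough room, while a neighbour that ends one page below it does not, and the access would fault instead of extending the area.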

Oleg, look, it seems I've been testing on the wrong VM :) (Sigh, with so
many opened at once it's easy to forget which one you're running in.)
Here is the complete code. It is supposed to _extend_ the stack but it fails
on the latest master + Hugh's

	[PATCH] mm: fix new crash in unmapped_area_topdown()

---
[root@fc2 criu]# ~/st2
start_addr 7fe6162a8000
start_addr 7fe6163d9000
Segmentation fault (core dumped)
---
#include <stdio.h>
#include <sys/mman.h>

#define PAGE_SIZE 4096

int main(int argc, char **argv)
{
	char *start_addr, *start_addr1, *fake_grow_down, *test_addr, *grow_down;
	volatile char *p;

	/* Reserve a 512-page range to learn a free address, then release it */
	start_addr = mmap(NULL, PAGE_SIZE * 512, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (start_addr == MAP_FAILED) {
		printf("Can't map a new region\n");
		return 1;
	}
	printf("start_addr %lx\n", (unsigned long)start_addr);
	munmap(start_addr, PAGE_SIZE * 512);

	start_addr += PAGE_SIZE * 300;

	fake_grow_down = mmap(start_addr + PAGE_SIZE * 5, PAGE_SIZE,
			 PROT_READ | PROT_WRITE,
			 MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED | MAP_GROWSDOWN, -1, 0);
	if (fake_grow_down == MAP_FAILED) {
		printf("Can't map a new region\n");
		return 1;
	}
	printf("start_addr %lx\n", (unsigned long)fake_grow_down);

	/* Touch the first byte, then the byte just below the area */
	p = fake_grow_down;
	*p-- = 'c';
	*p = 'b';

	/* overlap the guard page of fake_grow_down */
	test_addr = mmap(start_addr + PAGE_SIZE * 3, PAGE_SIZE,
			 PROT_READ | PROT_WRITE,
			 MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
	if (test_addr == MAP_FAILED) {
		printf("Can't map a new region\n");
		return 1;
	}
	printf("test_addr %lx\n", (unsigned long)test_addr);

	grow_down = mmap(start_addr + PAGE_SIZE * 2, PAGE_SIZE,
			 PROT_READ | PROT_WRITE,
			 MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED | MAP_GROWSDOWN, -1, 0);
	if (grow_down == MAP_FAILED) {
		printf("Can't map a new region\n");
		return 1;
	}
	printf("grow_down %lx\n", (unsigned long)grow_down);

	munmap(test_addr, PAGE_SIZE);

	if (fake_grow_down[0] != 'c' || *(fake_grow_down - 1) != 'b') {
		printf("%c %c\n", fake_grow_down[0], *(fake_grow_down - 1));
		return 1;
	}

	p = grow_down;
	*p-- = 'z';
	*p = 'x';

	return 0;
}
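
To see from userspace whether the area in the test above really got extended, one option is to look up its boundaries in /proc/self/maps before and after the write below it. The helper here is only a sketch: find_vma() is a made-up name, and it assumes the usual "start-end perms ..." line layout of /proc/self/maps on Linux.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: find the mapping that contains 'addr' by
 * scanning /proc/self/maps. Returns 0 and fills *start and *end on
 * success, -1 if the address is not mapped or the file can't be read.
 */
static int find_vma(const void *addr, uintptr_t *start, uintptr_t *end)
{
	/* Large enough for one maps line, including a long path name. */
	char line[4096 + 256];
	unsigned long s, e;
	uintptr_t a = (uintptr_t)addr;
	int ret = -1;
	FILE *f;

	f = fopen("/proc/self/maps", "r");
	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		/* Each line starts with "start-end" in hex. */
		if (sscanf(line, "%lx-%lx", &s, &e) != 2)
			continue;
		if (a >= s && a < e) {
			*start = s;
			*end = e;
			ret = 0;
			break;
		}
	}

	fclose(f);
	return ret;
}
```

In the test one could call find_vma(fake_grow_down, &start, &end) right after the mmap() and again after the `*p = 'b'` store: if the area grew, the reported start should have moved down by at least one page.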