Date: Wed, 19 Apr 2017 22:55:17 +0800
From: Baoquan He
To: Thomas Garnier
Cc: Jeff Moyer, Ingo Molnar, Dan Williams, LKML, linux-nvdimm@ml01.01.org
Subject: Re: KASLR causes intermittent boot failures on some systems
Message-ID: <20170419145517.GB2311@x1>
References: <20170419133630.GA2311@x1>

On 04/19/17 at 07:27am, Thomas Garnier wrote:
> On Wed, Apr 19, 2017 at 6:36 AM, Baoquan He wrote:
> > Hi all,
> >
> > I login in Jeff's system, and added debug code, no clue found. However
> > DaveY found he disabled page_offset randomization only and the efi issue
> > won't be seen on his system with kaslr enabled. I did it too on Jeff's
> > pmem system, it has the same result. I have rebooted several times, all
> > boot successfully. In the current code, no __PAGE_OFFSET_BASE is used
> > directly, don't know why it failed.
>
> Great! I still cannot repro it.
>
> >
> > Does anyone have any idea or hint I can try? I read pmem code about
> > the devm_nsio_enable/pmem_attach_disk/arch_add_memory, have no idea yet.
>
> I would test couple things:
>  - Set page_offset_base to 0 by default and set it to
> __PAGE_OFFSET_BASE in kernel_randomize_memory (without randomizing
> it). If it crashes on a low address, it might be due to using __va or
> PAGE_OFFSET in general before randomization is done.

Thanks, Thomas!

I changed the code as below; it should have the same effect as you
suggested.

@@ -140,6 +140,8 @@ void __init kernel_randomize_memory(void)
 		 * Select a random virtual address using the extra entropy
 		 * available.
 		 */
+		if (i == 0)
+			continue;
 		entropy = remain_entropy / (ARRAY_SIZE(kaslr_regions) - i);

I haven't seen a failure since applying the above change.

>  - Does any change in __PAGE_OFFSET lead to a crash? Or only when
> __PAGE_OFFSET is on a specific range. Given that you may have to
> reboot multiple times to get a crash, I assume that a specific range
> is the problem but might be worth checking.

Good point, will check.
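
For anyone following along, the effect of the debug change above can be modelled in user space. This is only a rough sketch under assumptions, not the kernel code: struct region, randomize_regions(), DEFAULT_BASE0, the region sizes and the use of rand() are all hypothetical stand-ins, and the loop body is a simplified approximation of the per-region entropy split in kernel_randomize_memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* All names and values below are hypothetical stand-ins, not kernel definitions. */
#define PUD_SIZE      (1ULL << 30)            /* assumed 1 GiB randomization granularity */
#define PUD_MASK      (~(PUD_SIZE - 1))
#define DEFAULT_BASE0 0xffff880000000000ULL   /* assumed default base of region 0 */

struct region {
    uint64_t base;  /* virtual base address, preset to the region's default */
    uint64_t size;  /* bytes reserved for the region */
};

/*
 * Walk the regions in order and give each one a random, PUD-aligned offset
 * taken from an equal share of the remaining entropy. When skip_first is
 * non-zero, region 0 is left untouched, which mirrors the
 * "if (i == 0) continue;" debug change.
 */
static void randomize_regions(struct region *r, size_t n, uint64_t vaddr,
                              uint64_t remain_entropy, int skip_first,
                              unsigned int seed)
{
    assert(r != NULL && n > 0);
    srand(seed);  /* rand() stands in for the kernel's PRNG */

    for (size_t i = 0; i < n; i++) {
        uint64_t entropy, rnd;

        /* Skipping here also skips the vaddr advance at the end of the loop. */
        if (skip_first && i == 0)
            continue;

        /* Equal share of what is left for the regions still to be placed. */
        entropy = remain_entropy / (n - i);
        rnd = ((uint64_t)rand() << 31) ^ (uint64_t)rand();
        entropy = (rnd % (entropy + 1)) & PUD_MASK;

        vaddr += entropy;
        r[i].base = vaddr;

        /* Jump over the region, then round up to the next PUD boundary. */
        vaddr = (vaddr + r[i].size + PUD_SIZE) & PUD_MASK;
        remain_entropy -= entropy;
    }
}
```

With skip_first set, a call such as randomize_regions(r, 3, DEFAULT_BASE0, 16ULL << 40, 1, 42) leaves r[0].base at its preset default, while the later regions still get randomized, 1 GiB-aligned bases. That is the "randomize everything except page_offset" configuration being tested here.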