Date: Fri, 25 Jan 2019 18:33:15 +0100
From: Michal Hocko
To: robert shteynfeld
Cc: Linus Torvalds, Mikhail Zaslonko, Linux List Kernel Mailing, Gerald Schaefer, Mikhail Gavrilov, Dave Hansen, Alexander Duyck, Andrew Morton, Pavel Tatashin, Steven Sistare, Daniel Jordan, Bob Picco
Subject: Re: kernel panic due to https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2830bf6f05fb3e05bc4743274b806c821807a684
Message-ID: <20190125173315.GC20411@dhcp22.suse.cz>
References: <20190125073704.GC3560@dhcp22.suse.cz> <20190125081924.GF3560@dhcp22.suse.cz> <20190125082952.GG3560@dhcp22.suse.cz> <20190125155810.GQ3560@dhcp22.suse.cz> <20190125163938.GA20411@dhcp22.suse.cz>
In-Reply-To: <20190125163938.GA20411@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri 25-01-19 17:39:38, Michal Hocko wrote:
> On Fri 25-01-19 11:16:30, robert shteynfeld wrote:
> > Attached is the dmesg from patched kernel.
>
> Your Node1 physical memory range precedes Node0 which is quite unusual
> but it shouldn't be a huge problem on its own. But memory ranges are
> not aligned to the memory section
>
> [ 0.286954] Early memory node ranges
> [ 0.286955] node 1: [mem 0x0000000000001000-0x0000000000090fff]
> [ 0.286955] node 1: [mem 0x0000000000100000-0x00000000dbdf8fff]
> [ 0.286956] node 1: [mem 0x0000000100000000-0x0000001423ffffff]
> [ 0.286956] node 0: [mem 0x0000001424000000-0x0000002023ffffff]
>
> As you can see the last pfn for the node1 is inside the section and
> Node0 starts right after. This is quite unusual as well. If for no other
> reasons then the memmap of those struct pages will be remote for one or
> the other. Actually I am not even sure we can handle that properly
> because we do expect 1:1 mapping between sections and nodes.
>
> Now it also makes some sense why 2830bf6f05fb ("mm, memory_hotplug:
> initialize struct pages for the full memory section") made any
> difference. We simply write over a potentially initialized struct page
> and blow up on that. I strongly suspect that the commit just uncovered
> a pre-existing problem. Let me think what we can do about that.

Apart from force-aligning the node's start, the only other option is to
revert 2830bf6f05fb and handle the underlying issue in the hotplug
code. I really wanted to avoid that because memory hotplug assumes that
a section belongs to a single node in way too many places. Maybe
somebody has a more clever idea, though.
-- 
Michal Hocko
SUSE Labs
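
For illustration only, the misalignment described in the quoted dmesg can be checked with a little arithmetic. The sketch below is not kernel code. It assumes x86_64 SPARSEMEM defaults (4 KiB pages, 128 MiB sections), which the mail does not state, and the helper names are made up for this example. It maps each early memory range to the sections it touches and reports any section that holds memory from more than one node.

```python
# Assumptions (not stated in the mail): x86_64 SPARSEMEM defaults,
# i.e. 4 KiB pages and 128 MiB memory sections.
PAGE_SHIFT = 12
SECTION_SIZE_BITS = 27
SECTION_SIZE = 1 << SECTION_SIZE_BITS

# Early memory node ranges from the quoted dmesg: (node, start, end inclusive).
RANGES = [
    (1, 0x0000000000001000, 0x0000000000090FFF),
    (1, 0x0000000000100000, 0x00000000DBDF8FFF),
    (1, 0x0000000100000000, 0x0000001423FFFFFF),
    (0, 0x0000001424000000, 0x0000002023FFFFFF),
]


def section_nr(addr):
    """Return the number of the memory section containing a physical address."""
    return addr >> SECTION_SIZE_BITS


def nodes_per_section(ranges):
    """Map each section number to the set of node ids with memory in it."""
    owners = {}
    for node, start, end in ranges:
        for sec in range(section_nr(start), section_nr(end) + 1):
            owners.setdefault(sec, set()).add(node)
    return owners


def shared_sections(ranges):
    """Return the sorted section numbers that span more than one node."""
    owners = nodes_per_section(ranges)
    return sorted(sec for sec, nodes in owners.items() if len(nodes) > 1)


# Offset of Node0's first byte within its section; 0 would mean aligned.
node0_start = 0x0000001424000000
offset_in_section = node0_start & (SECTION_SIZE - 1)

print("sections shared between nodes:", shared_sections(RANGES))
print("Node0 start offset in section (MiB):", offset_in_section >> 20)
print("Node0 start pfn:", hex(node0_start >> PAGE_SHIFT))
```

Under those assumptions the Node1/Node0 boundary falls 64 MiB into section 644, so that one section contains pages from both nodes, which is the case the 1:1 section-to-node expectation does not cover.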