Received: by 10.213.65.68 with SMTP id h4csp3287imn; Thu, 15 Mar 2018 07:55:06 -0700 (PDT) X-Google-Smtp-Source: AG47ELsI/mmhlcVv7Vfn4lmAKyr0EXbj6fBijpkV+JvNvg39vKB+rRn2q0yu11tCiUwWHOtUAQwZ X-Received: by 2002:a17:902:7045:: with SMTP id h5-v6mr8422203plt.217.1521125706669; Thu, 15 Mar 2018 07:55:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1521125706; cv=none; d=google.com; s=arc-20160816; b=c9lFjURsG/FNJMVuoe1U6fvJDiVUDKmusJaFDSTFPmysd7zIqK5X5Xv3gzDjFxka4R iZxZv0fJWDwmwNvv+FuW4Ln3QuKAmIbB4q/MdFo1XMqRay0D5r6L83jQUqir9MDtv+QF SAMN+AC/+Xkz320iyGHhdBZwlfrC554jw3fNQ8AWxML5NYC98qPO642nM0qC538dO5r6 8uKhTBzwdfilLhH9wQkl51QgDKorRPMh4s8VlAhn6tEo/GMFlRiDIyRGvTQZeYc5Ckof BdGbaobwYFZ/94M4tSw34vKrC2xZx5ZT2NzHKLBboH0denkSn1SBsas/c8SGgZp9WE9g ianw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:arc-authentication-results; bh=C054gAVi7xHyWvy2LAB8Icqn9EuQ0x7rw3M9b3fHUQM=; b=P7FIYhalQMLsD6zjHPaea/pNxNKUuQINYtJA5lxuf2mRwgDj2rtWzB9yBLZqOAHCrW tQ3KHV6jAIjyY27rxN6IE6rEo8gqIE+rwuKDtI2S5IeciVePeoGkFjf7EW1ErxTy43SP o8+f4p6kED+cOB9Q6+p8L4Kc8Jq9edpD+nKqHLrBEWMa7laR00Eum1z48Ucj2OME6Eim IxOigYZbHZLKVpXJVLDaxTG9GN+9zuIllvDfU7zMOfigXWRP/MIbvbGXcw0wRP3EwAx8 oNPmdlDwt2wJxUnRX/Xb7lEc0Zcrxjm6+jORbU6YSeEsbYQ+78PdKLuLD+8pQtWP9bHv KkMg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x75si3961669pff.339.2018.03.15.07.54.52; Thu, 15 Mar 2018 07:55:06 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752685AbeCOOx0 (ORCPT + 99 others); Thu, 15 Mar 2018 10:53:26 -0400 Received: from mail-oi0-f65.google.com ([209.85.218.65]:38565 "EHLO mail-oi0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752232AbeCOOxX (ORCPT ); Thu, 15 Mar 2018 10:53:23 -0400 Received: by mail-oi0-f65.google.com with SMTP id t69so5232475oih.5 for ; Thu, 15 Mar 2018 07:53:23 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=C054gAVi7xHyWvy2LAB8Icqn9EuQ0x7rw3M9b3fHUQM=; b=VXEPUEocSOCsPZqNQyTAI7PuOHGvXJXSUZ6VBj+sMW2mp6gcFq3DlQVL0NprVsR6Np sjLnf+5O23u0zrx+iZRbnRQlpv6FxtUJDDeJlfXyibk2kSA5POCkCmUCtWASfcPUj5xS 0+Clt+Cu6jek9IBOJ/JIqQvxZZovq7PDCOrRy4wOsFtT6EacL8pnw8wp6UBCoyOUYFAT DusCG4VRO5esC/28wnK6qIugB2HwrP30WoyuONE7WsHYNXZhQ1/pEKEnEED7kLgQZ4KV KXzR/fvMAxOeToz34JM4/HWTJStBThGCXibD1QkG0Vlf5gJjsSo3ecJGjzuLWufYsd/O PSpA== X-Gm-Message-State: AElRT7HXhC0tRCDmEiLDzLo+WccGNY1wj1DQ4nUIMIiOF2v6dlBgD1sO XeRK9uInRz2LPHX3OxcWRC4Do46JHeq5f7AczhJ8Xg== X-Received: by 10.202.93.136 with SMTP id r130mr5512550oib.84.1521125603370; Thu, 15 Mar 2018 07:53:23 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a9d:39f6:0:0:0:0:0 with HTTP; Thu, 15 Mar 2018 07:53:22 -0700 (PDT) In-Reply-To: <20180315115055.GD23100@dhcp22.suse.cz> References: <20180313224240.25295-1-neelx@redhat.com> <20180314141727.GE23100@dhcp22.suse.cz> <20180315115055.GD23100@dhcp22.suse.cz> From: Daniel Vacek Date: Thu, 15 Mar 2018 15:53:22 +0100 Message-ID: Subject: Re: [PATCH] mm/page_alloc: fix boot hang in memmap_init_zone To: Michal Hocko Cc: Naresh Kamboju , Sudeep Holla , Ard Biesheuvel , open list , linux-mm@kvack.org, Andrew Morton , Mel Gorman , Paul Burton , Pavel Tatashin , Vlastimil Babka , stable Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 15, 2018 at 12:50 PM, Michal Hocko wrote: > On Thu 15-03-18 02:30:41, Daniel Vacek wrote: >> On Wed, Mar 14, 2018 at 3:17 PM, Michal Hocko wrote: >> > On Tue 13-03-18 23:42:40, Daniel Vacek wrote: >> >> On some architectures (reported on arm64) commit 864b75f9d6b01 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") >> >> causes a boot hang. This patch fixes the hang making sure the alignment >> >> never steps back. >> > >> > I am sorry to be complaining again, but the code is so obscure that I >> >> No worries, I'm glad for any review. Which code exactly you do find >> obscure? This patch or my former fix or the original commit >> introducing memblock_next_valid_pfn()? Coz I'd agree the original >> commit looks pretty obscure... > > As mentioned in the other email, the whole going back and forth in the > same loop is just too ugly to live. It's not really supposed to go back, but I guess you understand. >> > would _really_ appreciate some more information about what is going >> > on here. memblock_next_valid_pfn will most likely return a pfn within >> > the same memblock and the alignment will move it before the old pfn >> > which is not valid - so the block has some holes. Is that correct? >> >> I do not understand what you mean by 'pfn within the same memblock'? > > Sorry, I should have said in the same pageblock > >> And by 'the block has some holes'? > > memblock_next_valid_pfn clearly returns pfn which is within a pageblock > and that is why we do not initialize pages in the begining of the block > while move_freepages_block does really expect the full pageblock to be > initialized properly. That is the fundamental problem, right? Yes, that's correct. >> memblock has types 'memory' (as usable memory) and 'reserved' (for >> unusable mem), if I understand correctly. > > We might not have struct pages for invalid pfns. That really depends on > the memory mode. Sure sparse mem model will usually allocate struct > pages for whole memory sections but that is not universally true and > adding such a suble assumption is simply wrong. This is gray area for me. But if I understand correctly this assumption comes from the code. It was already there and got broken hence I was trying to keep it. If anything needs redesigning I'm all for it. But I was just calming the fire here. I only didn't test on arm, which seems to be the only one different. > I suspect you are making strong assumptions based on a very specific > implementation which might be not true in general. That was the feeling > I've had since the patch was proposed for the first time. This is such a > cluttered area that I am not really sure myself, thoug. I understand. And again this is likely correct. I'll be glad for any assistance here. My limited knowledge is the primary cause for lack of relevant details I guess. What I checked looks like pfn_valid is a generic function used by all arches but arm, which seems to be the only one to implement CONFIG_HAVE_ARCH_PFN_VALID if I didn't miss anything. So if this config is enabled on arm, it uses it's own version of pfn_valid(). If not, I'd expect all arches behave the same. That's where my assumption comes from. > -- > Michal Hocko > SUSE Labs