In-Reply-To: <20180315101411.GA23100@dhcp22.suse.cz>
References: <20180314134431.13241-1-ard.biesheuvel@linaro.org> <20180314141323.GD23100@dhcp22.suse.cz> <20180314145450.GI23100@dhcp22.suse.cz> <20180315101411.GA23100@dhcp22.suse.cz>
From: Ard Biesheuvel
Date: Thu, 15 Mar 2018 10:17:24 +0000
Subject: Re: [PATCH] Revert "mm/page_alloc: fix memmap_init_zone pageblock alignment"
To: Michal Hocko
Cc: linux-arm-kernel, Linux Kernel Mailing List, Mark Rutland, Will Deacon, Catalin Marinas, Marc Zyngier, Daniel Vacek, Mel Gorman, Paul Burton, Pavel Tatashin, Vlastimil Babka, Andrew Morton, Linus Torvalds
X-Mailing-List: linux-kernel@vger.kernel.org

On 15 March 2018 at 10:14, Michal Hocko wrote:
> On Wed 14-03-18 15:54:16, Ard Biesheuvel wrote:
>> On 14 March 2018 at 14:54, Michal Hocko wrote:
>> > On Wed 14-03-18 14:35:12, Ard Biesheuvel wrote:
>> >> On 14 March 2018 at 14:13, Michal Hocko wrote:
>> >> > Does http://lkml.kernel.org/r/20180313224240.25295-1-neelx@redhat.com
>> >> > fix your issue? From the debugging info you provided it should because
>> >> > the patch prevents jumping backwards.
>> >> >
>> >>
>> >> The patch does fix the boot hang.
>> >>
>> >> But I am concerned that we are papering over a fundamental flaw in
>> >> memblock_next_valid_pfn().
>> >
>> > It seems that memblock_next_valid_pfn is doing the right thing here. It
>> > is the alignment which moves the pfn back AFAICS. I am not really
>> > impressed about the original patch either, to be completely honest.
>> > It just looks awfully tricky. I still didn't manage to wrap my head
>> > around the original issue though so I do not have much better ideas to
>> > be honest.
>>
>> So first of all, memblock_next_valid_pfn() never refers to its max_pfn
>> argument, which is odd but easily fixed.
>
> There is a patch to remove that parameter sitting in the mmotm tree.
>
>> Then, the whole idea of subtracting one so that the pfn++ will
>> produce the expected value is rather hacky,
>
> Absolutely agreed!
>
>> But the real problem is that rounding down pfn for the next iteration
>> is dodgy, because early_pfn_valid() isn't guaranteed to return true
>> for the rounded down value. I know it is probably fine in reality, but
>> dodgy as hell.
>
> Yes, that is what I meant when saying I was not impressed... I am always
> nervous when a loop makes jumps back and forth. I _think_ the main
> problem here is that we try to initialize a partial pageblock even
> though a part of it is invalid. We should simply ignore struct pages
> for those pfns. We don't do that and that is mostly because of the
> disconnect between what the page allocator and early init code refers to
> as a unit of memory to care about. I do not remember exactly why but I
> strongly suspect this is mostly a performance optimization on the page
> allocator side so that we do not have to check each and every pfn. Maybe
> we should signal partial pageblocks from an early code and drop the
> optimization in the page allocator init code.
>
>> The same applies to the call to early_pfn_in_nid() btw
>
> Why?

By 'the same' I mean it isn't guaranteed to return true for the rounded
down value *at the API level*. I understand it will be mostly fine in
reality, but juggling (in)valid PFNs like this is likely to end badly.
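
To make the concern concrete, here is a minimal user-space sketch. It is not the kernel code: the two-region memory layout, the 512-page pageblock size, and the *_model helper names are all assumptions made up for illustration. It only models the pattern under discussion, namely "find the next valid pfn, then round it down to a pageblock boundary".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed pageblock size in pages; illustration only, must be a power of two. */
#define PAGEBLOCK_NR_PAGES 512UL

/* A valid pfn range [start_pfn, end_pfn). */
struct region {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/*
 * Hypothetical memory map, sorted by address, with a hole between the
 * regions. The second region starts in the middle of a pageblock.
 */
static const struct region regions[] = {
	{ 0x1000, 0x1400 },
	{ 0x1530, 0x2000 },
};
#define NR_REGIONS (sizeof(regions) / sizeof(regions[0]))

/* Stand-in for a pfn validity check: true if pfn lies inside any region. */
static bool pfn_valid_model(unsigned long pfn)
{
	for (size_t i = 0; i < NR_REGIONS; i++)
		if (pfn >= regions[i].start_pfn && pfn < regions[i].end_pfn)
			return true;
	return false;
}

/* Stand-in for a "next valid pfn" lookup: smallest valid pfn > pfn. */
static unsigned long next_valid_pfn_model(unsigned long pfn)
{
	unsigned long candidate = pfn + 1;

	for (size_t i = 0; i < NR_REGIONS; i++) {
		if (candidate < regions[i].end_pfn)
			return candidate > regions[i].start_pfn ?
			       candidate : regions[i].start_pfn;
	}
	return ~0UL; /* no valid pfn left */
}

/* Round a pfn down to the start of its pageblock. */
static unsigned long round_down_pageblock(unsigned long pfn)
{
	/* Mask-based rounding only works for power-of-two block sizes. */
	assert((PAGEBLOCK_NR_PAGES & (PAGEBLOCK_NR_PAGES - 1)) == 0);
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}
```

With this made-up layout, stepping forward from pfn 0x1400 (the first pfn in the hole) finds 0x1530 as the next valid pfn. Rounding that down to the pageblock boundary yields 0x1400 again, which lies in the hole, so the validity check fails for it and the loop has made no forward progress.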