From: Alexander Duyck
Date: Mon, 4 May 2020 15:10:46 -0700
Subject: Re: [PATCH 5/7] mm: move zone iterator outside of deferred_init_maxorder()
To: Daniel Jordan
Cc: Alexander Duyck, Andrew Morton, Herbert Xu, Steffen Klassert,
    Alex Williamson, Dan Williams, Dave Hansen, David Hildenbrand,
    Jason Gunthorpe, Jonathan Corbet, Josh Triplett, Kirill Tkhai,
    Michal Hocko, Pavel Machek, Pavel Tatashin, Peter Zijlstra,
    Randy Dunlap, Shile Zhang, Tejun Heo, Zi Yan,
    linux-crypto@vger.kernel.org, linux-mm, LKML
In-Reply-To: <20200501024539.tnjuybydwe3r4u2x@ca-dmjordan1.us.oracle.com>
References: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
    <20200430201125.532129-6-daniel.m.jordan@oracle.com>
    <20200501024539.tnjuybydwe3r4u2x@ca-dmjordan1.us.oracle.com>
List-ID: linux-crypto.vger.kernel.org

On Thu, Apr 30, 2020 at 7:45 PM Daniel Jordan wrote:
>
> Hi Alex,
>
> On Thu, Apr 30, 2020 at 02:43:28PM -0700, Alexander Duyck wrote:
> > On 4/30/2020 1:11 PM, Daniel Jordan wrote:
> > > padata will soon divide up pfn ranges between threads when parallelizing
> > > deferred init, and deferred_init_maxorder() complicates that by using an
> > > opaque index in addition to start and end pfns. Move the index outside
> > > the function to make splitting the job easier, and simplify the code
> > > while at it.
> > >
> > > deferred_init_maxorder() now always iterates within a single pfn range
> > > instead of potentially multiple ranges, and advances start_pfn to the
> > > end of that range instead of the max-order block so partial pfn ranges
> > > in the block aren't skipped in a later iteration. The section alignment
> > > check in deferred_grow_zone() is removed as well since this alignment is
> > > no longer guaranteed. It's not clear what value the alignment provided
> > > originally.
> > >
> > > Signed-off-by: Daniel Jordan
> >
> > So part of the reason for splitting it up along section-aligned boundaries
> > was that we already had existing functionality in deferred_grow_zone that
> > was going in, pulling out a section-aligned chunk, and processing it to
> > prepare enough memory for other threads to keep running. I suspect the
> > section alignment was used because I believe that is normally also the
> > alignment for memory onlining.
>
> I think Pavel added that functionality, maybe he could confirm.
>
> My impression was that the reason deferred_grow_zone aligned the requested
> order up to a section was to make enough memory available to avoid being
> called on every allocation.
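For reference, the section rounding I have in mind works roughly like the
following. This is just a freestanding sketch with made-up names and an
illustrative section size, not the actual mm/page_alloc.c code:

    #include <stdio.h>

    /*
     * Illustrative value only; the real PAGES_PER_SECTION depends on the
     * arch/config (32768 pages, i.e. 128M with 4K pages, on x86-64).
     */
    #define SECTION_NR_PAGES    (1UL << 15)

    /*
     * Round an order-sized request up to a full section. The idea is that
     * growing the initialized area a whole section at a time keeps the
     * grow path from being re-entered on every allocation.
     */
    static unsigned long section_aligned_request(unsigned int order)
    {
        unsigned long nr_pages = 1UL << order;

        return (nr_pages + SECTION_NR_PAGES - 1) & ~(SECTION_NR_PAGES - 1);
    }

    int main(void)
    {
        printf("order-9 request  -> %lu pages\n", section_aligned_request(9));
        printf("order-16 request -> %lu pages\n", section_aligned_request(16));
        return 0;
    }

The exact constant doesn't matter for the argument; the point is only that
the request gets rounded up to a section so the caller isn't hitting the
grow path for every allocation.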
>
> > With this already breaking things up over multiple threads, how does this
> > work with deferred_grow_zone? Which thread is it trying to allocate from if
> > it needs to allocate some memory for itself?
>
> I may not be following your question, but deferred_grow_zone doesn't allocate
> memory during the multithreading in deferred_init_memmap because the latter
> sets first_deferred_pfn so that deferred_grow_zone bails early.

It has been a while since I looked at this code, so I forgot that
deferred_grow_zone is essentially blocked out once we start the per-node init.

> > Also, what is to prevent deferred_grow_zone from bailing out in the middle
> > of a max order page block if there is a hole in the middle of the block?
>
> deferred_grow_zone remains singlethreaded. It could stop in the middle of a
> max order block, but it can't run concurrently with deferred_init_memmap, as
> per above, so if deferred_init_memmap were to init and free the remaining
> part of the block, the previous portion would have already been initialized.

So we cannot stop in the middle of a max order block. That shouldn't be
possible, since part of the issue is that the buddy allocator will attempt to
access the buddy for a page, which can cause problems if it tries to merge the
page with one that is not initialized. So if your code allows that, then it is
definitely broken.

That was one of the reasons for all of the variable weirdness in
deferred_init_maxorder. I was going through and making certain that, while we
were initializing the range, we were freeing the pages in MAX_ORDER-aligned
blocks and skipping over whatever reserved blocks were there. Basically it was
handling the case where a single MAX_ORDER block could span multiple ranges.

On x86 this was all pretty straightforward and I don't believe we needed the
code, but I seem to recall there were other architectures with more complex
memory layouts at the time, and that was one of the reasons why I had to be
careful to wait until I had processed the full MAX_ORDER block before I could
start freeing the pages; otherwise it would start triggering memory
corruptions.
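To put the ordering constraint in code form, the invariant I was trying to
preserve was roughly the following. Again, just a freestanding sketch with
made-up helpers (init_pfn/free_pfn standing in for the struct page init and
the handoff to the buddy), not the real deferred_init_maxorder():

    #include <stdio.h>

    /*
     * Tiny illustrative block size so the trace stays readable; the real
     * MAX_ORDER_NR_PAGES was much larger (1024 pages, 4M, on x86-64 at
     * the time, if I remember right).
     */
    #define MAXORDER_NR_PAGES   16UL

    struct pfn_range { unsigned long start, end; };

    /* Stand-ins for initializing a page and handing it to the buddy. */
    static void init_pfn(unsigned long pfn) { printf("init %lu\n", pfn); }
    static void free_pfn(unsigned long pfn) { printf("free %lu\n", pfn); }

    /*
     * Within one MAX_ORDER-aligned block, initialize every pfn that falls
     * inside a valid range (skipping the holes) before freeing any of them,
     * so the buddy allocator never tries to merge against an uninitialized
     * buddy page.
     */
    static void init_then_free_block(unsigned long block_start,
                                     const struct pfn_range *ranges, int nr)
    {
        unsigned long block_end = block_start + MAXORDER_NR_PAGES;
        unsigned long pfn;
        int i;

        for (i = 0; i < nr; i++)        /* pass 1: init the whole block */
            for (pfn = ranges[i].start; pfn < ranges[i].end; pfn++)
                if (pfn >= block_start && pfn < block_end)
                    init_pfn(pfn);

        for (i = 0; i < nr; i++)        /* pass 2: only now free */
            for (pfn = ranges[i].start; pfn < ranges[i].end; pfn++)
                if (pfn >= block_start && pfn < block_end)
                    free_pfn(pfn);
    }

    int main(void)
    {
        /* One MAX_ORDER block split by a hole into two ranges. */
        struct pfn_range ranges[] = { { 0, 5 }, { 8, 16 } };

        init_then_free_block(0, ranges, 2);
        return 0;
    }

The part that matters is that no free happens until every valid pfn in the
block has been initialized, even when the block spans more than one range.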