Subject: Re: [PATCH v5 4/4] mm: Defer ZONE_DEVICE page initialization to the point where we init pgmap
From: David Hildenbrand
Organization: Red Hat GmbH
Date: Thu, 27 Sep 2018 17:41:19 +0200
To: Oscar Salvador, Michal Hocko
Cc: Alexander Duyck, linux-mm@kvack.org, akpm@linux-foundation.org,
    linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org,
    pavel.tatashin@microsoft.com, dave.jiang@intel.com,
    dave.hansen@intel.com, jglisse@redhat.com, rppt@linux.vnet.ibm.com,
    dan.j.williams@intel.com, logang@deltatee.com, mingo@kernel.org,
    kirill.shutemov@linux.intel.com
Message-ID: <27f1265b-a8b4-b203-077b-468d1e823cc8@redhat.com>
In-Reply-To: <20180927145035.GA21373@techadventures.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 27/09/2018 16:50, Oscar Salvador wrote:
> On Thu, Sep 27, 2018 at 03:13:29PM +0200, Michal Hocko wrote:
>> On Thu 27-09-18 14:25:37, Oscar Salvador wrote:
>>> On Thu, Sep 27, 2018 at 01:09:26PM +0200, Michal Hocko wrote:
>>>>> So there were a few things I wasn't sure we could pull outside of the
>>>>> hotplug lock. One specific example is the bits related to resizing the pgdat
>>>>> and zone. I wanted to avoid pulling those bits outside of the hotplug lock.
>>>>
>>>> Why would that be a problem? There are dedicated locks for resizing.
>>>
>>> It is true that move_pfn_range_to_zone() manages the locks for pgdat/zone resizing,
>>> but it also takes care of calling init_currently_empty_zone() in case the zone is empty.
>>> Couldn't that be a problem if we take move_pfn_range_to_zone() out of the lock?
>>
>> I would have to double check but is the hotplug lock really serializing
>> access to the state initialized by init_currently_empty_zone? E.g.
>> zone_start_pfn is a nice example of a state that is used outside of the
>> lock. zone's free lists are similar. So do we really need the hotplug
>> lock? And more broadly, what is the hotplug lock supposed to
>> serialize in general? Proper documentation would surely help to answer
>> these questions. There is way too much of "do not touch this code and
>> just make my particular hack" mindset which made the whole memory
>> hotplug a giant pile of mess. We really should start with some proper
>> engineering here finally.
>
> CC David
>
> David has been looking into this lately, he has even updated memory-hotplug.txt
> with some more documentation about the locking aspect [1].
> And with this change [2], the hotplug lock has been moved
> to the online/offline_pages.
>
> From what I see (I might be wrong), the hotplug lock is there
> to serialize the online/offline operations.

mem_hotplug_lock is especially relevant for users of
get_online_mems()/put_online_mems(). Anything that affects them cannot be
moved out of the lock. Everything else is theoretically serialized via
device_hotplug_lock now.

>
> In online_pages, we do (among other things):
>
> a) initialize the zone and its pages, and link them to the zone
> b) re-adjust zone/pgdat nr of pages (present, spanned, managed)
> c) check if the node changes with regard to N_MEMORY, N_HIGH_MEMORY or N_NORMAL_MEMORY
> d) fire notifiers
> e) rebuild the zonelists in case we got a new zone
> f) online memory sections and free the pages to the buddy allocator
> g) wake up kswapd/kcompactd in case we got a new node
>
> while in offline_pages we do the opposite.
>
> Hotplug lock here serializes the operations as a whole, online and offline memory,
> so they do not step on each other's feet.
>
> Having said that, we might be able to move some of those operations out of the hotplug lock.
> The device_hotplug_lock coming from every memblock (which is taken in device_online/device_offline) should protect
> us against some operations being made on the same memblock (e.g. touching the same pages).

Yes, very right.

-- 
Thanks,

David / dhildenb
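
To illustrate the get_online_mems()/put_online_mems() point above, here is a
rough, single-threaded userspace sketch in C. It is not kernel code: the
toy_rwlock type and the toy_* helpers are made up for illustration, and it
assumes that get_online_mems() takes mem_hotplug_lock on the read side while
online/offline take it on the write side. It only models who may hold the
lock at the same time; it never blocks.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of a reader/writer lock (hypothetical, illustration only).
 * It just records the current holders; it is not a real
 * synchronization primitive and must not be used as one.
 */
struct toy_rwlock {
	int readers;	/* number of active read-side holders */
	bool writer;	/* true while the single writer holds the lock */
};

/* Read side, standing in for get_online_mems(): fails while a writer is in. */
static inline bool toy_read_trylock(struct toy_rwlock *l)
{
	if (l->writer)
		return false;
	l->readers++;
	return true;
}

/* Read side release, standing in for put_online_mems(). */
static inline void toy_read_unlock(struct toy_rwlock *l)
{
	assert(l->readers > 0);
	l->readers--;
}

/*
 * Write side, standing in for the online/offline path (assumption):
 * fails while any reader or another writer holds the lock.
 */
static inline bool toy_write_trylock(struct toy_rwlock *l)
{
	if (l->writer || l->readers > 0)
		return false;
	l->writer = true;
	return true;
}

/* Write side release. */
static inline void toy_write_unlock(struct toy_rwlock *l)
{
	assert(l->writer);
	l->writer = false;
}
```

In this model, any number of readers can be in at once, but
toy_write_trylock() fails until every reader has called toy_read_unlock(),
and new readers fail while the writer is in. That is the reason anything
those readers depend on has to stay under the lock, while the rest can in
principle rely on device_hotplug_lock.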