Date: Mon, 29 Oct 2018 19:18:27 +0100
From: Michal Hocko
To: Alexander Duyck
Cc: Dan Williams, Linux MM, Andrew Morton, Linux Kernel Mailing List,
 linux-nvdimm, Pasha Tatashin, Dave Hansen, Jérôme Glisse,
 rppt@linux.vnet.ibm.com, Ingo Molnar, "Kirill A. Shutemov",
 yi.z.zhang@linux.intel.com
Subject: Re: [PATCH v5 4/4] mm: Defer ZONE_DEVICE page initialization to the point where we init pgmap
Message-ID: <20181029181827.GO32673@dhcp22.suse.cz>
References: <20181011085509.GS5873@dhcp22.suse.cz>
 <6f32f23c-c21c-9d42-7dda-a1d18613cd3c@linux.intel.com>
 <20181017075257.GF18839@dhcp22.suse.cz>
 <971729e6-bcfe-a386-361b-d662951e69a7@linux.intel.com>
 <20181029141210.GJ32673@dhcp22.suse.cz>
 <84f09883c16608ddd2ba88103f43ec6a1c649e97.camel@linux.intel.com>
 <20181029163528.GL32673@dhcp22.suse.cz>
 <18dfc5a0db11650ff31433311da32c95e19944d9.camel@linux.intel.com>
 <20181029172415.GM32673@dhcp22.suse.cz>
 <8e7a4311a240b241822945c0bb4095c9ffe5a14d.camel@linux.intel.com>
In-Reply-To: <8e7a4311a240b241822945c0bb4095c9ffe5a14d.camel@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon 29-10-18 10:42:33, Alexander Duyck wrote:
> On Mon, 2018-10-29 at 18:24 +0100, Michal Hocko wrote:
> > On Mon 29-10-18 10:01:28, Alexander Duyck wrote:
[...]
> > > So there end up being a few different issues with constructors. First
> > > in my mind is that it means we have to initialize the region of memory
> > > and cannot assume what the constructors are going to do for us. As a
> > > result we will have to initialize the LRU pointers, and then overwrite
> > > them with the pgmap and hmm_data.
> >
> > Why we would do that? What does really prevent you from making a fully
> > customized constructor?
>
> It is more an argument of complexity. Do I just pass a single pointer
> and write that value, or the LRU values in init, or do I have to pass a
> function pointer, some abstracted data, and then call said function
> pointer while passing the page and the abstracted data?

I thought you said that pgmap is the current common denominator for
zone device users. I really do not see what the problem is with doing

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 89d2a2ab3fe6..9105a4ed2c96 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5516,7 +5516,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 
 not_early:
 		page = pfn_to_page(pfn);
-		__init_single_page(page, pfn, zone, nid);
+		if (pgmap && pgmap->init_page)
+			pgmap->init_page(page, pfn, zone, nid, pgmap);
+		else
+			__init_single_page(page, pfn, zone, nid);
 		if (context == MEMMAP_HOTPLUG)
 			SetPageReserved(page);
 

That would require replacing altmap with pgmap throughout the call
chain. Altmap could then be renamed to something clearer

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 89d2a2ab3fe6..048e4cc72fdf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5474,8 +5474,8 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 	 * Honor reservation requested by the driver for this ZONE_DEVICE
 	 * memory
 	 */
-	if (altmap && start_pfn == altmap->base_pfn)
-		start_pfn += altmap->reserve;
+	if (pgmap && pgmap->get_memmap)
+		start_pfn = pgmap->get_memmap(pgmap, start_pfn);
 
 	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
 		/*

[...]

> If I have to implement the code to verify the slowdown I will, but I
> really feel like it is just going to be time wasted since we have seen
> this in other spots within the kernel.

Please try to understand that I am not trying to force you to write some
artificial benchmarks. All I really care about is that we have sane
interfaces with reasonable performance, especially for one-off things in
relatively slow paths. I fully recognize that ZONE_DEVICE begs for
better integration, but really, try to go incremental: unify the code
first and micro-optimize on top. Is that way too much to ask for?

Anyway, we have gone into details while the primary problem here was
that the hotplug lock doesn't scale, AFAIR. My question was why we
cannot pull move_pfn_range_to_zone out from under that lock and what
has to be done to achieve that. That is a fundamental thing to address
first. Then you can micro-optimize on top.
-- 
Michal Hocko
SUSE Labs
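
The two hooks proposed in the diffs above can be illustrated outside the
kernel. What follows is only a minimal userspace sketch under stated
assumptions: struct page and struct dev_pagemap are simplified stand-ins
rather than the kernel definitions, the init_page and get_memmap members
are the hypothetical callbacks from the diffs (they are not an existing
kernel API), and memmap_init_range, init_single_page, device_init_page,
device_get_memmap and the lru_initialized flag are names invented for
the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; NOT the kernel's struct page / struct dev_pagemap. */
struct dev_pagemap;

struct page {
	unsigned long pfn;
	int zone;
	int nid;
	struct dev_pagemap *pgmap;	/* set only by the device constructor */
	int lru_initialized;		/* models the LRU list setup of the generic path */
};

struct dev_pagemap {
	unsigned long base_pfn;		/* first pfn covered by the device */
	unsigned long reserve;		/* pfns reserved by the driver */
	/* Hypothetical hooks, named after the diffs in the mail. */
	void (*init_page)(struct page *page, unsigned long pfn, int zone,
			  int nid, struct dev_pagemap *pgmap);
	unsigned long (*get_memmap)(struct dev_pagemap *pgmap,
				    unsigned long start_pfn);
};

/* Generic constructor: models __init_single_page(), including LRU setup. */
void init_single_page(struct page *page, unsigned long pfn, int zone, int nid)
{
	assert(page != NULL);
	page->pfn = pfn;
	page->zone = zone;
	page->nid = nid;
	page->pgmap = NULL;
	page->lru_initialized = 1;
}

/* Device constructor: writes pgmap directly and skips the LRU setup. */
void device_init_page(struct page *page, unsigned long pfn, int zone,
		      int nid, struct dev_pagemap *pgmap)
{
	assert(page != NULL);
	page->pfn = pfn;
	page->zone = zone;
	page->nid = nid;
	page->pgmap = pgmap;
	page->lru_initialized = 0;
}

/* Same rule as the old altmap check: skip the driver's reservation. */
unsigned long device_get_memmap(struct dev_pagemap *pgmap,
				unsigned long start_pfn)
{
	if (start_pfn == pgmap->base_pfn)
		return start_pfn + pgmap->reserve;
	return start_pfn;
}

/*
 * Models the loop in memmap_init_zone(): use the pgmap hooks when they
 * are provided, fall back to the generic constructor otherwise.
 * map[0] corresponds to start_pfn. Returns the number of pages set up.
 */
unsigned long memmap_init_range(struct page *map, unsigned long start_pfn,
				unsigned long end_pfn, int zone, int nid,
				struct dev_pagemap *pgmap)
{
	unsigned long first = start_pfn;
	unsigned long pfn;

	if (pgmap && pgmap->get_memmap)
		first = pgmap->get_memmap(pgmap, first);

	for (pfn = first; pfn < end_pfn; pfn++) {
		struct page *page = &map[pfn - start_pfn];

		if (pgmap && pgmap->init_page)
			pgmap->init_page(page, pfn, zone, nid, pgmap);
		else
			init_single_page(page, pfn, zone, nid);
	}
	return end_pfn > first ? end_pfn - first : 0;
}
```

Calling memmap_init_range() with a NULL pgmap exercises the generic
path; passing a pgmap whose hooks point at device_init_page and
device_get_memmap skips the reserved pfns and never touches the LRU
state. The cost being debated in the thread is the one indirect call
per page in that loop.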