Date: Mon, 10 Sep 2018 16:41:52 +0200
From: Michal Hocko
To: Pasha Tatashin
Cc: "zaslonko@linux.ibm.com", Andrew Morton, LKML, Linux Memory Management List, "osalvador@suse.de", "gerald.schaefer@de.ibm.com"
Subject: Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary
Message-ID: <20180910144152.GL10951@dhcp22.suse.cz>
References: <20180910123527.71209-1-zaslonko@linux.ibm.com> <20180910131754.GG10951@dhcp22.suse.cz> <20180910135959.GI10951@dhcp22.suse.cz> <20180910141946.GJ10951@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon 10-09-18 14:32:16, Pavel Tatashin wrote:
> On Mon, Sep 10, 2018 at 10:19 AM
Michal Hocko wrote:
> >
> > On Mon 10-09-18 14:11:45, Pavel Tatashin wrote:
> > > Hi Michal,
> > >
> > > It is tricky, but probably can be done. Either change
> > > memmap_init_zone() or its caller to also cover the ends and starts of
> > > unaligned sections to initialize and reserve pages.
> > >
> > > The same thing would also need to be done in deferred_init_memmap() to
> > > cover the deferred init case.
> >
> > Well, I am not sure TBH. I have to think about that much more. Maybe it
> > would be much more simple to make sure that we will never add incomplete
> > memblocks and simply refuse them during the discovery. At least for now.
>
> On x86 memblocks can be up to 2G on machines with over 64G of RAM.

Sorry, I meant pageblock_nr_pages rather than memblocks.

> Also, memory size is way too easy to change via qemu arguments when VM
> starts. If we simply disable unaligned trailing memblocks, I am sure
> we would get tons of noise of missing memory.
>
> I think, adding check_hotplug_memory_range() would work to fix the
> immediate problem. But, we do need to figure out a better solution.
>
> memblock design is based on archaic assumption that hotplug units are
> physical dimms. VMs and hypervisors changed all of that, and we can
> have much finer hotplug requests on machines with huge DIMMs. Yet, we
> do not want to pollute sysfs with millions of tiny memory devices. I
> am not sure what a long term proper solution for this problem should
> be, but I see that linux hotplug/hotremove subsystems must be
> redesigned based on the new requirements. Not an easy task though.

Anyway, the sparse memory model is heavily based on memory sections, so
it makes some sense to have hotplug be section based as well. Memblocks
as a higher logical unit on top of that are kind of a hack. The userspace
API has never been properly thought through, I am afraid.
-- 
Michal Hocko
SUSE Labs
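
For readers following the thread, the kind of section-boundary check under discussion can be sketched roughly as below. This is a standalone userspace illustration, not the kernel's check_hotplug_memory_range(). The section size (128 MiB, i.e. 27 bits, as I believe x86_64 uses) and both helper names are assumptions made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 128 MiB sections (27 bits); the real value is per-architecture. */
#define EXAMPLE_SECTION_SIZE_BITS 27
#define EXAMPLE_SECTION_SIZE (UINT64_C(1) << EXAMPLE_SECTION_SIZE_BITS)

/*
 * Hypothetical helper: true only when the byte range [start, start + size)
 * is non-empty and both its start and its length are multiples of the
 * section size, i.e. the range covers whole sections and nothing else.
 */
static bool range_is_section_aligned(uint64_t start, uint64_t size)
{
    const uint64_t mask = EXAMPLE_SECTION_SIZE - 1;

    return size != 0 && (start & mask) == 0 && (size & mask) == 0;
}

/*
 * Hypothetical helper: number of bytes that spill into a trailing,
 * incomplete section when memory ends at physical address `end`.
 * Returns 0 when `end` sits exactly on a section boundary.
 */
static uint64_t trailing_partial_bytes(uint64_t end)
{
    return end & (EXAMPLE_SECTION_SIZE - 1);
}
```

Under these assumptions, a memory end that is one 4 KiB page past a section boundary gives trailing_partial_bytes() == 4096. That is the unaligned tail that would either have to be refused or have its struct pages initialized and reserved, as discussed above.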