Date: Tue, 26 Feb 2019 19:20:07 +0100
From: Michal Hocko
To: Qian Cai
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
Message-ID: <20190226182007.GH10588@dhcp22.suse.cz>
References: <20190225191710.48131-1-cai@lca.pw>
 <20190226123521.GZ10588@dhcp22.suse.cz>
 <4d4d3140-6d83-6d22-efdb-370351023aea@lca.pw>
 <20190226142352.GC10588@dhcp22.suse.cz>
 <1551203585.6911.47.camel@lca.pw>
 <20190226181648.GG10588@dhcp22.suse.cz>
In-Reply-To: <20190226181648.GG10588@dhcp22.suse.cz>

On Tue 26-02-19 19:16:48, Michal Hocko wrote:
> On Tue 26-02-19 12:53:05, Qian Cai wrote:
> > On Tue, 2019-02-26 at 15:23 +0100, Michal Hocko wrote:
> > > On Tue 26-02-19 09:16:30, Qian Cai wrote:
> > > >
> > > > On 2/26/19 7:35 AM, Michal Hocko wrote:
> > > > > On Mon 25-02-19 14:17:10, Qian Cai wrote:
> > > > > > When onlining memory pages, it calls kernel_unmap_linear_page().
> > > > > > However, it does not call kernel_map_linear_page() while offlining
> > > > > > memory pages. As a result, it triggers a panic below while onlining on
> > > > > > ppc64le as it checks if the pages are mapped before unmapping.
> > > > > > Therefore, let it call kernel_map_linear_page() when setting all pages
> > > > > > as reserved.
> > > > >
> > > > > This really begs for much more explanation. All the pages should be
> > > > > unmapped as they get freed AFAIR. So why do we need a special handling
> > > > > here when this path only offlines free pages?
> > > > >
> > > >
> > > > It sounds like this is exactly the point to explain the imbalance. When
> > > > offlining, every page has already been unmapped and marked reserved.
> > > > When onlining, it tries to free those reserved pages via
> > > > __online_page_free(). Since those pages are order 0, it goes through
> > > > free_unref_page(), which in turn calls kernel_unmap_linear_page() again
> > > > without having been mapped first.
> > >
> > > How is this any different from an initial page being freed to the
> > > allocator during the boot?
> > >
> >
> > At least for IBM POWER8, it does this during the boot,
> >
> > early_setup
> >   early_init_mmu
> >     hash__early_init_mmu
> >       htab_initialize [1]
> >         htab_bolt_mapping [2]
> >
> > where it effectively maps all memblock regions just like
> > kernel_map_linear_page(), so later mem_init() -> memblock_free_all() will
> > unmap them just fine.
> >
> > [1]
> > for_each_memblock(memory, reg) {
> >         base = (unsigned long)__va(reg->base);
> >         size = reg->size;
> >
> >         DBG("creating mapping for region: %lx..%lx (prot: %lx)\n",
> >                 base, size, prot);
> >
> >         BUG_ON(htab_bolt_mapping(base, base + size, __pa(base),
> >                 prot, mmu_linear_psize, mmu_kernel_ssize));
> > }
> >
> > [2] linear_map_hash_slots[paddr >> PAGE_SHIFT] = ret | 0x80;
>
> Thanks for the clarification. I would have expected that there is a
> generic path to do kernel_map_pages from an appropriate place. I am also
> wondering whether blowing up is actually the right thing to do. Is the
> ppc specific code correct? Isn't your patch simply working around a
> bogus condition?

Btw. what happens if the offlined pfn range is removed completely? Is
the range still mapped? What kind of consequences does this have?

Also when does this tweak happen on a completely new hotplugged memory
range?
-- 
Michal Hocko
SUSE Labs
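
The imbalance discussed in this thread can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not kernel
code: every toy_* name, NPAGES and SLOT_MAPPED are hypothetical and exist
only for this illustration. The slots[] array loosely mirrors the
"ret | 0x80" marker from [2], and toy_unmap_page() merely counts a failure
at the point where, according to the thread, ppc64le panics.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES      8     /* hypothetical size of the toy memory range */
#define SLOT_MAPPED 0x80  /* loosely mirrors the "ret | 0x80" marker in [2] */

/* Toy stand-in for a per-page "is mapped in the linear map" table. */
static unsigned char slots[NPAGES];

static void toy_reset(void)
{
	for (size_t pfn = 0; pfn < NPAGES; pfn++)
		slots[pfn] = 0;
}

static void toy_map_page(size_t pfn)
{
	slots[pfn] |= SLOT_MAPPED;
}

/* Returns false when asked to unmap a page that is not mapped. */
static bool toy_unmap_page(size_t pfn)
{
	if (!(slots[pfn] & SLOT_MAPPED))
		return false;
	slots[pfn] &= (unsigned char)~SLOT_MAPPED;
	return true;
}

/* Boot: every page is mapped up front, as the quoted loop in [1] does. */
static void toy_boot(void)
{
	for (size_t pfn = 0; pfn < NPAGES; pfn++)
		toy_map_page(pfn);
}

/*
 * Freeing pages to the allocator unmaps them in this model.
 * Returns the number of pages whose unmap failed.
 */
static int toy_free_range(size_t start, size_t end)
{
	int failed = 0;

	for (size_t pfn = start; pfn < end; pfn++)
		if (!toy_unmap_page(pfn))
			failed++;
	return failed;
}

/*
 * Offlining only touches pages that are already free (hence unmapped).
 * With remap == true it maps them again, which models the idea of the
 * patch under discussion; with remap == false nothing is mapped.
 */
static void toy_offline_range(size_t start, size_t end, bool remap)
{
	for (size_t pfn = start; pfn < end; pfn++)
		if (remap)
			toy_map_page(pfn);
}

/* Onlining frees the pages again, so it takes the unmap path. */
static int toy_online_range(size_t start, size_t end)
{
	return toy_free_range(start, end);
}
```

In this model, the boot-time free succeeds because every page was mapped
first, while offlining without remapping followed by onlining reports one
failed unmap per page. Whether remapping at offline time is the right fix,
or only works around a bogus check, is the open question in the thread and
is not something the model can answer.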