Date: Wed, 6 Jan 2021 11:42:55 +0100
From: Michal Hocko
To: David Hildenbrand
Cc: Dan Williams, linux-mm@kvack.org, Andrew Morton, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: Teach pfn_to_online_page() about ZONE_DEVICE section collisions
Message-ID: <20210106104255.GK13207@dhcp22.suse.cz>
References:
 <160990599013.2430134.11556277600719835946.stgit@dwillia2-desk3.amr.corp.intel.com>
 <785b9095-eca4-8100-33ea-6ae84e02a92e@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <785b9095-eca4-8100-33ea-6ae84e02a92e@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 06-01-21 10:56:19, David Hildenbrand wrote:
[...]
> Note that this is not sufficient in the general case. I already
> mentioned that we effectively override an already initialized memmap.
>
> ---
>
> [ SECTION ]
> Before:
> [ ZONE_NORMAL ][ Hole ]
>
> The hole has some node/zone (currently 0/0, discussions ongoing on how
> to optimize that to e.g., ZONE_NORMAL in this example) and is
> PG_reserved - looks like an ordinary memory hole.
>
> After memremap:
> [ ZONE_NORMAL ][ ZONE_DEVICE ]
>
> The already initialized memmap was converted to ZONE_DEVICE. Your
> slowpath will work.
>
> After memunmap (no poisoning):
> [ ZONE_NORMAL ][ ZONE_DEVICE ]
>
> The slow path is no longer working. pfn_to_online_page() might return
> something that is ZONE_DEVICE.
>
> After memunmap (poisoning):
> [ ZONE_NORMAL ][ POISONED ]
>
> The slow path is no longer working. pfn_to_online_page() might return
> something that will BUG_ON via page_to_nid() etc.
>
> ---
>
> Reason is that pfn_to_online_page() does not care about sub-sections. And
> for now, it didn't have to. If there was an online section, it either was
>
> a) Completely present. The whole memmap is initialized to sane values.
> b) Partially present. The whole memmap is initialized to sane values.
>
> memremap/memunmap messes with case b)

I do not see that we ever clear the newly added flag, and my
understanding is that removing the subsection would lead to
get_dev_pagemap() returning NULL, which would obviously need to be
checked for in pfn_to_online_page(). Or am I missing something, so that
the above is not the case and we could still get false positives?

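To make the four states above concrete, here is a toy userspace model,
not kernel code. Every toy_* name is made up for illustration. It
assumes a sticky per-section flag set by memremap and never cleared, a
per-subsection validity bit cleared by memunmap (standing in for
pfn_section_valid(), the check suggested in the quoted text that
follows), and a boolean standing in for "get_dev_pagemap() returned
non-NULL". Whether the real subsection map behaves this way for a boot
memmap hole is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: [0] is the ZONE_NORMAL part, [1] is the former hole. */
enum { TOY_SUBSECTIONS = 2 };

enum toy_zone { TOY_ZONE_NORMAL, TOY_ZONE_DEVICE, TOY_POISONED };

struct toy_section {
    bool online;                           /* section as a whole is online */
    bool tainted;                          /* sticky flag, set by memremap, never cleared */
    bool sub_valid[TOY_SUBSECTIONS];       /* stand-in for pfn_section_valid() */
    bool has_pgmap[TOY_SUBSECTIONS];       /* stand-in for get_dev_pagemap() != NULL */
    enum toy_zone memmap[TOY_SUBSECTIONS]; /* what the memmap entry currently says */
};

/* "Before": the whole memmap, hole included, holds sane values. */
static struct toy_section toy_section_init(void)
{
    struct toy_section s = {
        .online = true,
        .tainted = false,
        .sub_valid = { true, true },
        .has_pgmap = { false, false },
        .memmap = { TOY_ZONE_NORMAL, TOY_ZONE_NORMAL },
    };
    return s;
}

/* "After memremap": the subsection's memmap is converted to ZONE_DEVICE. */
static void toy_memremap(struct toy_section *s, size_t sub)
{
    assert(sub < TOY_SUBSECTIONS);
    s->tainted = true;
    s->sub_valid[sub] = true;
    s->has_pgmap[sub] = true;
    s->memmap[sub] = TOY_ZONE_DEVICE;
}

/* "After memunmap": the pagemap is gone, the memmap is stale or poisoned. */
static void toy_memunmap(struct toy_section *s, size_t sub, bool poison)
{
    assert(sub < TOY_SUBSECTIONS);
    s->sub_valid[sub] = false;
    s->has_pgmap[sub] = false;
    if (poison)
        s->memmap[sub] = TOY_POISONED;
    /* s->tainted intentionally stays set. */
}

/*
 * Returns the memmap entry considered "online", or NULL.
 * check_subsection selects whether the subsection bit is consulted first.
 */
static const enum toy_zone *toy_lookup(const struct toy_section *s,
                                       size_t sub, bool check_subsection)
{
    assert(sub < TOY_SUBSECTIONS);
    if (!s->online)
        return NULL;
    if (check_subsection && !s->sub_valid[sub])
        return NULL;
    if (!s->tainted)
        return &s->memmap[sub];  /* fast path */
    if (s->has_pgmap[sub])
        return NULL;             /* slow path: live device page */
    return &s->memmap[sub];      /* slow path: no pagemap found */
}
```

In this model, a lookup that relies only on the pagemap check hands
back the stale ZONE_DEVICE (or poisoned) entry once memunmap has run,
because the pagemap stand-in is already gone. With check_subsection
set, the same lookup returns NULL instead.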
> We'll have to further tweak pfn_to_online_page(). You'll have to also
> check pfn_section_valid() *at least* on the slow path. Less-hacky would
> be checking it also in the "somewhat-faster" path - that would cover
> silently overriding a memmap that's visible via pfn_to_online_page().
> Might slow down things a bit.
>
>
> Not completely opposed to this, but I would certainly still prefer just
> avoiding this corner case completely instead of patching around it. Thanks!

Well, I would love to have no surprises either. So far there has been no
actual argument why the pmem reserved space cannot be fully initialized.
On the other hand, making sure that pfn_to_online_page() is reliable
sounds like the right thing to do. And having an explicit check for zone
device there in a slow path makes sense to me.
-- 
Michal Hocko
SUSE Labs