Date: Tue, 2 Oct 2018 18:04:46 +0200
From: Michal Hocko
To: Michael Bringmann
Cc: Tyrel Datwyler, Thomas Falcon, Kees Cook, Mathieu Malaterre,
 Pavel Tatashin, Nicholas Piggin, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Mauricio Faria de Oliveira, Juliet Kim,
 Thiago Jung Bauermann, Nathan Fontenot, Andrew Morton,
 YASUAKI ISHIMATSU, linuxppc-dev@lists.ozlabs.org, Dan Williams,
 Oscar Salvador
Subject: Re: [PATCH] migration/mm: Add WARN_ON to try_offline_node
Message-ID: <20181002160446.GA18290@dhcp22.suse.cz>
References: <20181001185616.11427.35521.stgit@ltcalpine2-lp9.aus.stglabs.ibm.com>
 <20181001202724.GL18290@dhcp22.suse.cz>
 <20181002145922.GZ18290@dhcp22.suse.cz>
User-Agent: Mutt/1.10.1 (2018-07-13)

On Tue 02-10-18 10:14:49, Michael Bringmann wrote:
> On 10/02/2018 09:59 AM, Michal Hocko wrote:
> > On Tue 02-10-18 09:51:40, Michael Bringmann wrote:
> > [...]
> >> When the device-tree affinity attributes have changed for memory,
> >> the 'nid' affinity calculated points to a different node for the
> >> memory block than the one used to install it, previously on the
> >> source system. The newly calculated 'nid' affinity may not yet
> >> be initialized on the target system. The current memory tracking
> >> mechanisms do not record the node to which a memory block was
> >> associated when it was added. Nathan is looking at adding this
> >> feature to the new implementation of LMBs, but it is not there
> >> yet, and won't be present in earlier kernels without backporting a
> >> significant number of changes.
> >
> > Then the patch you have proposed here just papers over a real issue, no?
> > IIUC then you simply do not remove the memory if you lose the race.
>
> The problem occurs when removing memory after an affinity change
> references a node that was previously unreferenced. Other code
> in 'kernel/mm/memory_hotplug.c' deals with initializing an empty
> node when adding memory to a system. The 'removing memory' case is
> specific to systems that perform LPM and allow device-tree changes.
> The powerpc kernel does not have the option of accepting some PRRN
> requests and accepting others. It must perform them all.

I am sorry, but you are still too cryptic for me. Either there is a
correctness issue and the patch doesn't really fix anything, or the
final race doesn't make any difference and then the ppc code should be
explicit about that. Checking the node inside the hotplug core code just
looks like the wrong layer to mitigate an arch specific problem.

I am not saying the patch is a no-go, but if anything we want a big fat
comment explaining how this is possible, because right now it just points
to an incorrect API usage.

That being said, this sounds like a pretty much ppc specific problem and I
would _prefer_ it to be handled there (along with a big fat comment of
course).
-- 
Michal Hocko
SUSE Labs