Subject: Re: [v2 RFC PATCH 0/9] Another Approach to Use PMEM as NUMA Node
From: Yang Shi
To: Michal Hocko
Cc: mgorman@techsingularity.net, riel@surriel.com, hannes@cmpxchg.org, akpm@linux-foundation.org, dave.hansen@intel.com, keith.busch@intel.com, dan.j.williams@intel.com, fengguang.wu@intel.com, fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Date: Tue, 16 Apr 2019 16:18:37 -0700

>>>> Why cannot we start simple and build from there? In other words I do not
>>>> think we really need anything like N_CPU_MEM at all.
>>> In this patchset N_CPU_MEM is used to tell us what nodes are cpuless nodes.
>>> They would be the preferred demotion target. Of course, we could rely on
>>> firmware to just demote to the next best node, but it may be a "preferred"
>>> node, if so I don't see too much benefit achieved by demotion. Am I missing
>>> anything?
>> Why cannot we simply demote in the proximity order? Why do you make
>> cpuless nodes so special? If other close nodes are vacant then just use
>> them.

And I suppose we agree *not* to migrate from a PMEM node (cpuless node)
to any other node on the reclaim path, right? If so, we need to know
whether the current node is a DRAM node or a PMEM node. If it is a DRAM
node, do demotion; if it is a PMEM node, do swap. So N_CPU_MEM is used to
tell us whether the current node is a DRAM node or not.

> We could. But, this raises another question, would we prefer to just
> demote to the next fallback node (just try once), if it is contended,
> then just swap (i.e. DRAM0 -> PMEM0 -> Swap); or would we prefer to
> try all the nodes in the fallback order to find the first less
> contended one (i.e. DRAM0 -> PMEM0 -> DRAM1 -> PMEM1 -> Swap)?
>
>
> |------|     |------|                       |------|     |------|
> |PMEM0 | --- |DRAM0 | --- CPU0 --- CPU1 --- |DRAM1 | --- |PMEM1 |
> |------|     |------|                       |------|     |------|
>
> The first one sounds simpler, and the current implementation does so
> and this needs find out the closest PMEM node by recognizing cpuless
> node.
>
> If we prefer go with the second option, it is definitely unnecessary
> to specialize any node.
>
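
To make the two options above concrete, here is a toy Python model of the
decision logic. It is only a sketch and not kernel code: the node names come
from the diagram, while the fallback table, the HAS_CPU map (standing in for
an N_CPU_MEM-style check), the "contended" set, and all function names are
assumptions made up for illustration.

```python
# Toy model of the two demotion policies discussed above; NOT kernel code.
# FALLBACK_ORDER, HAS_CPU and the "contended" set are illustrative assumptions.

# Proximity (fallback) order as seen from DRAM0, following the diagram.
FALLBACK_ORDER = {"DRAM0": ["PMEM0", "DRAM1", "PMEM1"]}

# Stand-in for an N_CPU_MEM-style check: True means the node has CPUs (DRAM).
HAS_CPU = {"DRAM0": True, "DRAM1": True, "PMEM0": False, "PMEM1": False}


def reclaim_action(node):
    """DRAM (CPU-attached) nodes demote on reclaim; PMEM (cpuless) nodes swap."""
    return "demote" if HAS_CPU[node] else "swap"


def demote_once(src, contended):
    """Option 1: try only the closest cpuless node; if it is contended, swap."""
    for node in FALLBACK_ORDER[src]:
        if not HAS_CPU[node]:
            return "swap" if node in contended else node
    return "swap"


def demote_walk(src, contended):
    """Option 2: walk the whole fallback order and take the first
    uncontended node, without treating any node type specially."""
    for node in FALLBACK_ORDER[src]:
        if node not in contended:
            return node
    return "swap"


if __name__ == "__main__":
    busy = {"PMEM0"}
    print(reclaim_action("DRAM0"), reclaim_action("PMEM0"))
    print(demote_once("DRAM0", busy))
    print(demote_walk("DRAM0", busy))
```

In this model only demote_once() and reclaim_action() consult HAS_CPU;
demote_walk() never does, which mirrors the point that the second option
needs no special node type for picking a target. Whether reclaim on a PMEM
node should swap is a separate decision, and in this sketch it still relies
on the cpuless check.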