From: Dan Williams
Date: Mon, 25 Mar 2019 12:29:48 -0700
Subject: Re: [RFC PATCH 0/10] Another Approach to Use PMEM as NUMA Node
To: Brice Goglin
Cc: Yang Shi, Michal Hocko, Mel Gorman, Rik van Riel, Johannes Weiner, Andrew Morton, Dave Hansen, Keith Busch, Fengguang Wu, "Du, Fan", "Huang, Ying", Linux MM, Linux Kernel Mailing List
References: <1553316275-21985-1-git-send-email-yang.shi@linux.alibaba.com> <3df2bf0e-0b1d-d299-3b8e-51c306cdc559@inria.fr>
In-Reply-To: <3df2bf0e-0b1d-d299-3b8e-51c306cdc559@inria.fr>

On Mon, Mar 25, 2019 at 10:45 AM Brice Goglin wrote:
>
> On 25/03/2019 at 17:56, Dan Williams wrote:
> >
> > I'm generally against the concept that a "pmem" or "type" flag should
> > indicate anything about the expected performance of the address range.
> > The kernel should explicitly look to the HMAT for performance data and
> > not otherwise make type-based performance assumptions.
>
> Oh sorry, I didn't mean to have the kernel use such a flag to decide on
> placement, but rather to expose more information to userspace to clarify
> what all these nodes are about when userspace will decide where to
> allocate things.

I understand, but I'm concerned about the risk of userspace developing
vendor-specific or generation-specific policies around a coarse type
identifier. I think the lack of type specificity is a feature rather
than a gap, because it requires userspace to consider deeper
information.

Perhaps "path" might be a suitable replacement identifier rather than
type. I.e. memory that originates from an ACPI.NFIT root device is
likely "pmem".

> I understand that current NVDIMM-F are not slower than DDR and HMAT
> would better describe this than a flag. But I have seen so many buggy or
> dummy SLIT tables in the past that I wonder if we can expect HMAT to be
> widely available (and correct).

There's always a fear that the platform BIOS will try to game OS
behavior. However, that was the reason HMAT was defined to indicate
actual performance values rather than relative ones. It is hopefully
harder to game than the relative SLIT values, but I'll grant you it's
not impossible.

> Is there a safe fallback in case of missing or buggy HMAT? For instance,
> is DDR supposed to be listed before NVDIMM (or HBM) in SRAT?

One fallback might be to make some of these sysfs attributes writable
so userspace can correct the situation, but I'm otherwise unclear on
what you mean by "safe". If a platform has hard dependencies on
correctly enumerating memory performance capabilities then there's not
much the kernel can do if the HMAT is botched. I would expect the
general case is that the performance capabilities are a soft
dependency, i.e. things still work if the data is wrong.
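
As a rough illustration of the userspace side (not something taken from
the patch set): a placement policy can rank nodes by their reported
performance instead of keying off a type label, with a local override
table standing in for the "let userspace correct it" fallback. The
sketch assumes a kernel that exposes the
nodeN/access0/initiators/read_latency attribute described in the
kernel's numaperf documentation; read_latency_ns, rank_nodes and the
overrides table are hypothetical names used only for this example.

```python
from pathlib import Path

# Assumption: HMAT-derived attributes live under this directory as
# nodeN/access0/initiators/read_latency. They may be absent on a given
# kernel or platform, which is why every read is treated as optional.
SYSFS_NODE_ROOT = Path("/sys/devices/system/node")


def read_latency_ns(node_dir):
    """Return the node's reported read latency, or None if not exposed."""
    attr = Path(node_dir) / "access0" / "initiators" / "read_latency"
    try:
        return int(attr.read_text().strip())
    except (OSError, ValueError):
        return None


def rank_nodes(root=SYSFS_NODE_ROOT, overrides=None):
    """Order node names by read latency, lowest first.

    `overrides` is a hypothetical userspace-side correction table, e.g.
    {"node1": 300}, for platforms whose firmware data is missing or wrong.
    Nodes with no data at all sort last rather than being dropped, so the
    policy degrades instead of failing (a soft dependency).
    """
    overrides = overrides or {}
    ranked = []
    for node_dir in sorted(Path(root).glob("node[0-9]*")):
        latency = overrides.get(node_dir.name, read_latency_ns(node_dir))
        ranked.append((latency is None, latency or 0, node_dir.name))
    return [name for _, _, name in sorted(ranked)]
```

Nothing here looks at whether a node is "pmem" or DRAM; a node only
moves up or down the list because of the numbers it reports or the
correction supplied for it.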