From: Len Brown
Date: Thu, 28 Feb 2019 10:59:19 -0500
Subject: Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support
To: Peter Zijlstra
Cc: X86 ML, linux-kernel@vger.kernel.org, Len Brown, linux-doc@vger.kernel.org

On Tue, Feb 26, 2019 at 8:54 AM Peter Zijlstra wrote:

> > > It would've been nice to have the CPUID instruction 1F leaf reference
> > > 3B-3.9 in the SDM, and maybe mention this here too.
> >
> > I didn't mention SDM sections because they change -- leaving stale
> > pointers in our commit messages. The SDM is re-published 4 times per
> > year.
>
> Yah, I know. Which is why I keep all SDMs. So if you say, book 3 section
> 8 of Jul'17, I can find it :-)

The SDM is like software -- usually (but not always) you are better
off with the latest version :-)

> > Cache enumeration in Leaf-4 is totally unchanged.
> > ACPI NUMA tables are totally unchanged.
>
> Sure; and yet Sub-NUMA-Clustering broke stuff in interesting ways. I'm
> trying to get a feel for how these levels will interact with all that.
>
> Before that SNC stuff, caches had never spanned NODEs (and I still
> think that is 'creative' at best).

Yeah, SNC is sort of a curve ball.
I guess it made enough stuff run better that it is available as an option.
But it doesn't help everything, so it is disabled by default...

I think from a scheduler point of view, sticking with the output of CPUID.4
for the cache topology, and the ACPI tables for the node topology/distances,
is the right strategy.

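To make the CPUID.4 point concrete, here is a minimal sketch of how cache sharing can be derived from leaf 4 alone, independent of leaf B/1F. The bit positions follow the SDM's description of CPUID.4 EAX; the struct and the helper names (`cache_leaf4`, `decode_leaf4_eax`, `shares_cache`) are made up for illustration and are not the kernel's code.

```c
#include <stdint.h>

/* Decoded view of CPUID leaf 4 EAX (hypothetical struct for this sketch). */
struct cache_leaf4 {
    unsigned type;            /* EAX[4:0]: 0 = no more caches, 1 = data, 2 = instruction, 3 = unified */
    unsigned level;           /* EAX[7:5]: cache level, starting at 1 */
    unsigned max_sharing_ids; /* EAX[25:14] + 1: addressable logical-CPU IDs sharing this cache */
};

static struct cache_leaf4 decode_leaf4_eax(uint32_t eax)
{
    struct cache_leaf4 c;

    c.type = eax & 0x1f;
    c.level = (eax >> 5) & 0x7;
    c.max_sharing_ids = ((eax >> 14) & 0xfff) + 1;
    return c;
}

/* Smallest number of bits that can hold n distinct IDs. */
static unsigned ceil_log2(unsigned n)
{
    unsigned bits = 0;

    while ((1u << bits) < n)
        bits++;
    return bits;
}

/*
 * Two logical CPUs share a cache when their APIC IDs are equal once the
 * low "sharing" bits are shifted away. Returns 1 if shared, 0 otherwise.
 */
static int shares_cache(uint32_t apicid_a, uint32_t apicid_b,
                        unsigned max_sharing_ids)
{
    unsigned shift = ceil_log2(max_sharing_ids);

    return (apicid_a >> shift) == (apicid_b >> shift);
}
```

With a made-up L3 descriptor that reports 28 sharing IDs, APIC IDs 0x00 and 0x1f land in the same cache domain while 0x20 does not, and nothing in that calculation depends on die or package boundaries.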
> > From a scheduler point of view, imagine that a SKX system with 4 die
> > in 4 packages was mechanically re-designed so that those 4 die resided
> > in 2 double-sized packages.
> >
> > They may have tweaked the links between the die, but logically it is
> > identical and compatible, and the legacy kernel will function
> > properly.
>
> This example has LLC in die and yes that works.
>
> But I can imagine things like L2 in tile and L3 across tiles but within
> DIE and then it _might_ make sense to still consider the tile for
> scheduling.
>
> Another option is having the LLC off die; also not unheard of.
>
> And then there's many creative and slightly crazy ways this can all be
> combined :/

If any of those crazy things happen, CPUID.B/CPUID.1F are not going to
help software understand it -- CPUID.4 and the NUMA tables are the tools
of choice.

> > So the effect of Leaf B,1F is that it defines the scope of MSRs. e.g.
> > what processors does a die-scope MSR cover. That is why the rest of
> > the patch is about sysfs topology, and about package MSR scope.
> >
> > Yes, there will be more exotic MSR situations in future products --
> > the first ones are pretty simple -- something called a
> > package-scope-MSR in the SDM today becomes a die-scope-MSR in this
> > generation on a multi-die/package system.
>
> Yes :-(
>
> > It also reflects how many packages appear in sysfs, and this can
> > affect licensing of some kinds of software.
>
> That's just plain insanity and we should not let that affect our sysfs
> interfaces.

This change isn't made for compatibility with per-package licensing.
Indeed, vendors who license based on package count need to be made aware
that on a multi-die/package system, they'll see their package count
go _down_ as a result of this change.

Thankfully, I'm told that per-package licensing is quite rare -- most stuff
that cares has moved to per-CPU.

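As a rough illustration of both points (die-scope vs. package-scope, and the package count going down), here is a sketch that splits an x2APIC ID into die and package IDs using per-level shift widths of the kind CPUID.1F reports. The level-type numbers are the ones the SDM lists for leaf 1F; the structs, the function names, and the example shift values are invented for this sketch and are not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* CPUID.1F level types (ECX[15:8]) as listed in the SDM. */
enum {
    LEVEL_INVALID = 0, LEVEL_SMT = 1, LEVEL_CORE = 2,
    LEVEL_MODULE = 3, LEVEL_TILE = 4, LEVEL_DIE = 5
};

/* One CPUID.1F sub-leaf, reduced to two fields (hypothetical struct). */
struct topo_level {
    unsigned type;  /* ECX[15:8] */
    unsigned shift; /* EAX[4:0]: right-shift of the x2APIC ID to reach the next level up */
};

struct cpu_ids {
    uint32_t die_id;     /* die number within its package */
    uint32_t package_id; /* physical package number */
};

/* Assumes every shift is below 32 and that the die level, if present, is last. */
static struct cpu_ids ids_from_apicid(uint32_t x2apic_id,
                                      const struct topo_level *levels, size_t n)
{
    unsigned below_die = 0, top = 0;
    int have_die = 0;
    struct cpu_ids ids;

    for (size_t i = 0; i < n && levels[i].type != LEVEL_INVALID; i++) {
        if (levels[i].type == LEVEL_DIE)
            have_die = 1;
        else if (!have_die)
            below_die = levels[i].shift; /* widest level under the die */
        top = levels[i].shift;           /* last valid level reaches the package */
    }

    ids.package_id = x2apic_id >> top;
    ids.die_id = have_die
        ? (x2apic_id >> below_die) & ((1u << (top - below_die)) - 1)
        : 0;
    return ids;
}

/* Number of distinct values in ids[0..n). */
static size_t count_distinct(const uint32_t *ids, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        size_t j;

        for (j = 0; j < i; j++)
            if (ids[j] == ids[i])
                break;
        if (j == i)
            count++;
    }
    return count;
}
```

Take made-up shifts of SMT=1, CORE=5, DIE=6 and one CPU per die at x2APIC IDs 0x00, 0x20, 0x40, 0x60. A view that stops at the core-level shift (an assumption about how a leaf-B-only reader would see it) yields package IDs 0, 1, 2, 3, i.e. four packages. With the die level taken into account, the same CPUs come out as packages 0, 0, 1, 1 with die IDs 0, 1, 0, 1, i.e. two packages of two dies, and a die-scope MSR covers the CPUs that share both IDs.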
I think a good semantic side effect of this series is that it maintains
the invariant that a physical package and a socket are synonymous.
While we don't use the word "socket" in Linux anymore, the industry broadly
assumes that the two are synonyms. And people expect that a physical
package really is a physical package -- you can see it, buy it in a box,
and hold it in your hand.

Functionally, the bottom line is that it allows software to discover
topology levels that previously had to be discovered by looking
up family/model, which was sort of annoying.
The things that care are things that care about MSR scope.
Thankfully, the list of things that care about MSR scope is quite finite.

thanks,
-Len

--
Len Brown, Intel Open Source Technology Center