Received: by 2002:a05:6a10:6744:0:0:0:0 with SMTP id w4csp3108356pxu; Mon, 19 Oct 2020 04:17:58 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx7FAfxzzH1xuoFmsWeUk4QxvHYmwi7fVhuPgiNTc61gALLSzUR4F4TyeO3TUW6wphZt5W1 X-Received: by 2002:a17:907:94cb:: with SMTP id dn11mr16975226ejc.326.1603106278335; Mon, 19 Oct 2020 04:17:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1603106278; cv=none; d=google.com; s=arc-20160816; b=UmMR4DOc7TKFGmkt1WHMfBy50kmkpv4p80gBHrl0njqInjQm6A3247lOOQTyFJoc8q 4wA3yjsYVC7cElfvRSEg0wk6IyBfxrl0Tj0IaoiCHem/s74K+fjklOJ5vLHvCJU28bCA w80RDYHPkc8X0hGUrLSVkvj3U/sxS5Z3+vMexxf04zgwwclyBj3kpGJmgSBR83oGpzPt 2FxeOvwtZqor1789t4o4B7j7t84OnLvFm1G0ias6JQAqIusC0ZPXg7cEFMUTwF7rEZGL fbAdJnp9xL3ki6LOJq9inPWvLmjsamShXUIpJxW2sn7E0IUzRmOQOlZqRmbuBEc+EFac G/BA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-language:content-transfer-encoding :in-reply-to:mime-version:user-agent:date:message-id:autocrypt:from :references:cc:to:subject; bh=kSpXA0YIhxYnrvWZK+9t5/Mbgptges8Hosr30SeXebI=; b=MgrGeA1tjsYLKgqQkedFNB9UOA3U3JiHGtTWT7W9ECl7LSObOUSd9sBJggh7hHjEaO ERmMFajXwatLmWo/iWmT0ZyGuv4awtVp3Tn5tVcg7HGQqHLjMH/wDWEHaDtwvhY13Nzp aRZar0D5kZ6e1kDbN+2rVWcX7mfedGJJD8naS7sns4ae9eMTUuI6wXQbq0wNkB+Fv5GI kpYz3PrIsPfO1HzZFtw1jVC9RBENG+GPA5CmltuIDx+/L/Ilqx917yTbosuJPzs3JlgR QaK3YrmjVAAvDL3qksmreFxcPv8kSLV6UVTiho5vDzmxx22xTmF91Dnu6AvnOM5SJ/wl +Glg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id l15si7675359edr.209.2020.10.19.04.17.35; Mon, 19 Oct 2020 04:17:58 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728628AbgJSKH3 (ORCPT + 99 others); Mon, 19 Oct 2020 06:07:29 -0400 Received: from mail3-relais-sop.national.inria.fr ([192.134.164.104]:51919 "EHLO mail3-relais-sop.national.inria.fr" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727192AbgJSKH2 (ORCPT ); Mon, 19 Oct 2020 06:07:28 -0400 X-Greylist: delayed 429 seconds by postgrey-1.27 at vger.kernel.org; Mon, 19 Oct 2020 06:07:26 EDT X-IronPort-AV: E=Sophos;i="5.77,394,1596492000"; d="scan'208";a="362167706" Received: from nat-inria-interne-52-gw-01-bso.bordeaux.inria.fr (HELO [10.204.4.248]) ([194.199.1.52]) by mail3-relais-sop.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 19 Oct 2020 12:00:15 +0200 Subject: Re: [RFC PATCH] topology: Represent clusters of CPUs within a die. To: Jonathan Cameron , linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Len Brown , Greg Kroah-Hartman , Sudeep Holla , guohanjun@huawei.com, Will Deacon , linuxarm@huawei.com References: <20201016152702.1513592-1-Jonathan.Cameron@huawei.com> From: Brice Goglin Autocrypt: addr=Brice.Goglin@inria.fr; prefer-encrypt=mutual; keydata= mQINBFNg91oBEADMfOyfz9iilNPe1Yy3pheXLf5O/Vpr+gFJoXcjA80bMeSWBf4on8Mt5Fg/ jpVuNBhii0Zyq4Lip1I2ve+WQjfL3ixYQqvNRLgfw/FL0gNHSOe9dVFo0ol0lT+vu3AXOVmh AM4IrsOp2Tmt+w89Oyvu+xwHW54CJX3kXp4c7COz79A6OhbMEPQUreerTavSvYpH5pLY55WX qOSdjmlXD45yobQbMg9rFBy1BECrj4DJSpym/zJMFVnyC5yAq2RdPFRyvYfS0c491adD/iw9 eFZY1XWj+WqLSW8zEejdl78npWOucfin7eAKvov5Bqa1MLGS/2ojVMHXJN0qpStpKcueV5Px igX8i4O4pPT10xCXZ7R6KIGUe1FE0N7MLErLvBF6AjMyiFHix9rBG0pWADgCQUUFjc8YBKng nwIKl39uSpk5W5rXbZ9nF3Gp/uigTBNVvaLO4PIDw9J3svHQwCB31COsUWS1QhoLMIQPdUkk GarScanm8i37Ut9G+nB4nLeDRYpPIVBFXFD/DROIEfLqOXNbGwOjDd5RWuzA0TNzJSeOkH/0 qYr3gywjiE81zALO3UeDj8TaPAv3Dmu7SoI86Bl7qm6UOnSL7KQxZWuMTlU3BF3d+0Ly0qxv k1XRPrL58IyoHIgAVom0uUnLkRKHczdhGDpNzsQDJaO71EPp8QARAQABtCRCcmljZSBHb2ds aW4gPEJyaWNlLkdvZ2xpbkBpbnJpYS5mcj6JAjgEEwECACIFAlNg+aMCGwMGCwkIBwMCBhUI AgkKCwQWAgMBAh4BAheAAAoJEESRkPMjWr076RoQAJhJ1q5+wlHIf+YvM0N1V1hQyf+aL35+ BPqxlyw4H65eMWIN/63yWhcxrLwNCdgY1WDWGoiW8KVCCHwJAmrXukFvXjsvShLQJavWRgKH eea12T9XtLc6qY/DEi2/rZvjOCKsMjnc1CYW71jbofaQP6lJsmC+RPWrnL/kjZyVrVrg7/Jo GemLmi/Ny7nLAOt6uL0MC/Mwld14Yud57Qz6VTDGSOvpNacbkJtcCwL3KZDBfSDnZtSbeclY srXoMnFXEJJjKJ6kcJrZDYPrNPkgFpSId/WKJ5pZBoRsKH/w2OdxwtXKCYHksMCiI4+4fVFD WlmVNYzW8ZKXjAstLh+xGABkLVXs+0WjvC67iTZBXTmbYJ5eodv8U0dCIR/dxjK9wxVKbIr2 D+UVbGlfqUuh1zzL68YsOg3L0Xc6TQglKVl6RxX87fCU8ycIs9pMbXeRDoJohflo8NUDpljm zqGlZxBjvb40p37ReJ+VfjWqAvVh+6JLaMpeva/2K1Nvr9O/DOkSRNetrd86PslrIwz8yP4l FaeG0dUwdRdnToNz6E8lbTVOwximW+nwEqOZUs1pQNKDejruN7Xnorr7wVBfp6zZmFCcmlw9 8pSMV3p85wg6nqJnBkQNTzlljycBvZLVvqc6hPOSXpXf5tjkuUVWgtbCc8TDEQFx8Phkgda6 K1LNuQINBFNg91oBEADp3vwjw8tQBnNfYJNJMs6AXC8PXB5uApT1pJ0fioaXvifPNL6gzsGt AF53aLeqB7UXuByHr8Bmsz7BvwA06XfXXdyLQP+8Oz3ZnUpw5inDIzLpRbUuAjI+IjUtguIK AkU1rZNdCXMOqEwCaomRitwaiX9H7yiDTKCUaqx8yAuAQWactWDdyFii2FA7IwVlD/GBqMWV weZsMfeWgPumKB3jyElm1RpkzULrtKbu7MToMH2fmWqBtTkRptABkY7VEd8qENKJBZKJGisk Fk6ylp8VzZdwbAtEDDTGK00Vg4PZGiIGbQo8mBqbc63DY+MdyUEksTTu2gTcqZMm/unQUJA8 xB4JrTAyljo/peIt6lsQa4+/eVolfKL1t1C3DY8f4wMoqnZORagnWA2oHsLsYKvcnqzA0QtY IIb1S1YatV+MNMFf3HuN7xr/jWlfdt59quXiOHU3qxIzXJo/OfC3mwNW4zQWJkG233UOf6YE rmrSaTIBTIWF8CxGY9iXPaJGNYSUa6R/VJS09EWeZgRz9Gk3h5AyDrdo5RFN9HNwOj41o0cj eLDF69092Lg5p5isuOqsrlPi5imHKcDtrXS7LacUI6H0c8onWoH9LuW99WznEtFgPJg++TAv f9M2x57Gzl+/nYTB5/Kpl1qdPPC91zUipiKbnF5f8bQpol0WC+ovmQARAQABiQIfBBgBAgAJ BQJTYPdaAhsMAAoJEESRkPMjWr074+0P/iEcN27dx3oBTzoeGEBhZUVQRZ7w4A61H/vW8oO8 IPkZv9kFr5pCfIonmHEbBlg6yfjeHXwF5SF2ywWRKkRsFHpaFWywxqk9HWXu8cGR1pFsrwC3 EdossuVbEFNmhjHvcAo11nJ7JFzPTEnlPjE6OY9tEDwl+kp1WvyXqNk9bosaX8ivikhmhB47 7BA3Kv8uUE7UL6p7CBdqumaOFISi1we5PYE4P/6YcyhQ9Z2wH6ad2PpwAFNBwxSu+xCrVmaD skAwknf6UVPN3bt67sFAaVgotepx6SPhBuH4OSOxVHMDDLMu7W7pJjnSKzMcAyXmdjON05Sz SaILwfceByvHAnvcFh2pXK9U4E/SyWZDJEcGRRt79akzZxls52stJK/2Tsr0vKtZVAwogiaK uSp+m6BRQcVVhTo/Kq3E0tSnsTHFeIO6QFHKJCJv4FRE3Dmtz15lueihUBowsq9Hk+u3UiLo SmrMAZ6KgA4SQxB2p8/M53kNJl92HHc9nc//aCQDi1R71NyhtSx+6PyivoBkuaKYs+S4pHmt sFE+5+pkUNROtm4ExLen4N4OL6Kq85mWGf2f6hd+OWtn8we1mADjDtdnDHuv+3E3cacFJPP/ wFV94ZhqvW4QcyBWcRNFA5roa7vcnu/MsCcBoheR0UdYsOnJoEpSZswvC/BGqJTkA2sf Message-ID: <942b4d68-8d19-66d8-c84b-d17eba837e9a@inria.fr> Date: Mon, 19 Oct 2020 12:00:15 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0 MIME-Version: 1.0 In-Reply-To: <20201016152702.1513592-1-Jonathan.Cameron@huawei.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Content-Language: en-US Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Le 16/10/2020 à 17:27, Jonathan Cameron a écrit : > Both ACPI and DT provide the ability to describe additional layers of > topology between that of individual cores and higher level constructs > such as the level at which the last level cache is shared. > In ACPI this can be represented in PPTT as a Processor Hierarchy > Node Structure [1] that is the parent of the CPU cores and in turn > has a parent Processor Hierarchy Nodes Structure representing > a higher level of topology. > > For example Kunpeng 920 has clusters of 4 CPUs. These do not share > any cache resources, but the interconnect topology is such that > the cost to transfer ownership of a cacheline between CPUs within > a cluster is lower than between CPUs in different clusters on the same > die. Hence, it can make sense to deliberately schedule threads > sharing data to a single cluster. > > This patch simply exposes this information to userspace libraries > like hwloc by providing cluster_cpus and related sysfs attributes. > PoC of HWLOC support at [2]. > > Note this patch only handle the ACPI case. > > Special consideration is needed for SMT processors, where it is > necessary to move 2 levels up the hierarchy from the leaf nodes > (thus skipping the processor core level). > > Currently the ID provided is the offset of the Processor > Hierarchy Nodes Structure within PPTT. Whilst this is unique > it is not terribly elegant so alternative suggestions welcome. > > Note that arm64 / ACPI does not provide any means of identifying > a die level in the topology but that may be unrelate to the cluster > level. > > RFC questions: > 1) Naming > 2) Related to naming, do we want to represent all potential levels, > or this enough? On Kunpeng920, the next level up from cluster happens > to be covered by llc cache sharing, but in theory more than one > level of cluster description might be needed by some future system. > 3) Do we need DT code in place? I'm not sure any DT based ARM64 > systems would have enough complexity for this to be useful. > 4) Other architectures? Is this useful on x86 for example? Hello Jonathan Intel has CPUID registers to describe "tiles" and "modules" too (not used yet as far as I know). The list of levels could become quite long if any processor ever exposes those. If having multiple cluster levels is possible, maybe it's time to think about introducing some sort of generic levels: cluster0_id = your cluster_id cluster0_cpus/cpulist = your cluster_cpus/cpulis cluster0_type = would optionally contain hardware-specific info such as "module" or "tile" on x86 cluster_levels = 1 hwloc already does something like this for some "rare" levels such as s390 book/drawers (by the way, thanks a lot for the hwloc PoC, very good job), we call them "Groups" instead of "cluster" above. However I don't know if the Linux scheduler would like that. Is it better to have 10+ levels with static names, or a dynamic number of levels? Brice