From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman,
    James Morse, Babu Moger, Drew Fustini, Dave Martin
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, patches@lists.linux.dev,
    Tony Luck
Subject: [PATCH v18 06/17] x86/resctrl: Introduce snc_nodes_per_l3_cache
Date: Wed, 15 May 2024 15:23:14 -0700
Message-ID: <20240515222326.74166-7-tony.luck@intel.com>
In-Reply-To: <20240515222326.74166-1-tony.luck@intel.com>
References: <20240515222326.74166-1-tony.luck@intel.com>

Intel Sub-NUMA Cluster (SNC) is a feature that subdivides the CPU cores
and memory controllers on a socket into two or more groups. These are
presented to the operating system as NUMA nodes.

This may enable some workloads to have slightly lower latency to memory
as the memory controller(s) in an SNC node are electrically closer to the
CPU cores on that SNC node. This gain may be offset by lower bandwidth
since the memory accesses for each core can only be interleaved between
the memory controllers on the same SNC node.

Resctrl monitoring on an Intel system depends upon attaching RMIDs to tasks
to track L3 cache occupancy and memory bandwidth. There is an MSR that
controls how the RMIDs are shared between SNC nodes. The default mode
divides them numerically. E.g. when there are two SNC nodes on a socket,
the lower-numbered half of the RMIDs is given to the first node and the
remainder to the second node. This would be difficult to use with the
Linux resctrl interface as specific RMID values assigned to resctrl groups
are not visible to users.

The other mode divides the RMIDs and renumbers the ones on the second
SNC node to start from zero.

Even with this renumbering, SNC mode requires several changes in resctrl
behavior for correct operation.

Add a global integer "snc_nodes_per_l3_cache" that shows how many
SNC nodes share each L3 cache. When "snc_nodes_per_l3_cache" is "1",
SNC mode is either not implemented, or not enabled.

Update all places to take appropriate action when SNC mode is enabled:
1) The number of logical RMIDs per L3 cache available for use is the
   number of physical RMIDs divided by the number of SNC nodes.
2) Likewise the "mon_scale" value must be divided by the number of SNC
   nodes.
3) Disable the "-o mba_MBps" mount option in SNC mode
   because the monitoring is being done per SNC node, while the
   bandwidth allocation is still done at the L3 cache scope.
   Trying to use this feedback loop might result in contradictory
   changes to the throttling level coming from each of the SNC
   node bandwidth measurements.

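
The two RMID sharing modes described above can be pictured with a small
user-space model. This is only a sketch: the helper names are hypothetical
and do not exist in the kernel, and the contiguous per-node split in the
default mode is an assumption read from the "divides them numerically"
wording, not taken from hardware documentation.

```c
#include <assert.h>

/*
 * Illustrative model only. All helper names below are hypothetical and
 * are not part of the resctrl code.
 */

/* Logical RMIDs available to each SNC node sharing one L3 cache. */
static unsigned int rmids_per_snc_node(unsigned int num_phys_rmids,
				       unsigned int snc_nodes)
{
	assert(snc_nodes > 0);
	return num_phys_rmids / snc_nodes;
}

/*
 * Default sharing mode (assumed to be a contiguous split): first physical
 * RMID owned by SNC node "node".
 */
static unsigned int default_mode_first_rmid(unsigned int num_phys_rmids,
					    unsigned int snc_nodes,
					    unsigned int node)
{
	assert(node < snc_nodes);
	return node * rmids_per_snc_node(num_phys_rmids, snc_nodes);
}

/*
 * Renumbered mode: every node numbers its own RMIDs from zero, so the
 * first RMID is 0 regardless of which node is asked about.
 */
static unsigned int renumbered_mode_first_rmid(unsigned int node)
{
	(void)node;
	return 0;
}
```

Under these assumptions, 256 physical RMIDs shared by two SNC nodes give
each node 128 logical RMIDs. In the default mode the second node would
start at RMID 128, while in the renumbered mode both nodes start at 0,
which is why the same logical RMID value can be used on either node.
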
Signed-off-by: Tony Luck
---
 arch/x86/kernel/cpu/resctrl/internal.h | 2 ++
 arch/x86/kernel/cpu/resctrl/core.c     | 6 ++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 4 ++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 3 ++-
 4 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 135190e0711c..49440f194253 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -484,6 +484,8 @@ extern struct rdt_hw_resource rdt_resources_all[];
 extern struct rdtgroup rdtgroup_default;
 extern struct dentry *debugfs_resctrl;
 
+extern unsigned int snc_nodes_per_l3_cache;
+
 enum resctrl_res_level {
 	RDT_RESOURCE_L3,
 	RDT_RESOURCE_L2,
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 395bac851f6e..bfa9d3a429fd 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -331,6 +331,12 @@ static u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
 	return r->default_ctrl;
 }
 
+/*
+ * Number of SNC nodes that share each L3 cache.  Default is 1 for
+ * systems that do not support SNC, or have SNC disabled.
+ */
+unsigned int snc_nodes_per_l3_cache = 1;
+
 static void mba_wrmsr_intel(struct msr_param *m)
 {
 	struct rdt_hw_ctrl_domain *hw_dom = resctrl_to_arch_ctrl_dom(m->dom);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 89d7e6fcbaa1..0f66825a1ac9 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1022,8 +1022,8 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	int ret;
 
 	resctrl_rmid_realloc_limit = boot_cpu_data.x86_cache_size * 1024;
-	hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale;
-	r->num_rmid = boot_cpu_data.x86_cache_max_rmid + 1;
+	hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale / snc_nodes_per_l3_cache;
+	r->num_rmid = (boot_cpu_data.x86_cache_max_rmid + 1) / snc_nodes_per_l3_cache;
 	hw_res->mbm_width = MBM_CNTR_WIDTH_BASE;
 
 	if (mbm_offset > 0 && mbm_offset <= MBM_CNTR_WIDTH_OFFSET_MAX)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index cc31ede1a1e7..0923492a8bd0 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2346,7 +2346,8 @@ static bool supports_mba_mbps(void)
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
 
 	return (is_mbm_local_enabled() &&
-		r->alloc_capable && is_mba_linear());
+		r->alloc_capable && is_mba_linear() &&
+		snc_nodes_per_l3_cache == 1);
 }
 
 /*
-- 
2.44.0
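
For readers who want to check the arithmetic in the monitor.c and
rdtgroup.c hunks, the following stand-alone sketch mirrors the two
divisions and the extra mount-option condition. It is a user-space model
rather than kernel code: struct mon_config, snc_adjust() and
mba_mbps_allowed() are hypothetical names, and the CPUID-derived inputs
are passed in as plain parameters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the two values the patch rescales. */
struct mon_config {
	unsigned int mon_scale;
	unsigned int num_rmid;
};

/*
 * Mirror of the two divisions in rdt_get_mon_l3_config(): both the
 * occupancy scale and the RMID count are divided by the number of SNC
 * nodes per L3 cache. Integer division truncates, as it does in the patch.
 */
static struct mon_config snc_adjust(unsigned int occ_scale,
				    unsigned int max_rmid,
				    unsigned int snc_nodes)
{
	struct mon_config c;

	assert(snc_nodes > 0);
	c.mon_scale = occ_scale / snc_nodes;
	c.num_rmid = (max_rmid + 1) / snc_nodes;
	return c;
}

/*
 * Mirror of the supports_mba_mbps() condition: the existing three checks
 * plus the new requirement that SNC is not active (exactly one node).
 */
static bool mba_mbps_allowed(bool mbm_local_enabled, bool alloc_capable,
			     bool mba_linear, unsigned int snc_nodes)
{
	return mbm_local_enabled && alloc_capable && mba_linear &&
	       snc_nodes == 1;
}
```

As an illustration with made-up inputs, an occupancy scale of 65536 and a
maximum RMID of 255 with two SNC nodes yield a mon_scale of 32768 and
128 RMIDs, and mba_mbps_allowed() returns false for any node count other
than 1 even when the other three conditions hold.
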