Received: by 2002:a05:6500:1b8f:b0:1fa:5c73:8e2d with SMTP id df15csp272048lqb; Tue, 28 May 2024 15:21:00 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCX1Q/u2ElnRhtZ74t6HjxuvcZmrWctD6tkgKf/TjR6SVNWfRn6dZhRAzvRrHiTpBPZ8trOpDEn8MMwRTKuE7oH7MixIU/QoxDpcNIHC8g== X-Google-Smtp-Source: AGHT+IH8HtboK44yLcnVqFubH21LbZdxi/0zxipDHZyn5eka1PI1kLnkMcu1dWiH2aC5N/D12z3p X-Received: by 2002:a05:6214:449b:b0:6ab:9bc2:a1b3 with SMTP id 6a1803df08f44-6abcd0b51eamr173092026d6.41.1716934860144; Tue, 28 May 2024 15:21:00 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1716934860; cv=pass; d=google.com; s=arc-20160816; b=CGuWjOFJgKYUo7d6xaiJU2/7en592FE2EnHhnWLa/O+y3gOFPAq46TwvB8dKe5BCm5 eWhFnhU3aHlYxt2oLOFwu/eK17qDIGJ9Zw/GcyHBotACPFbP2l6T+/wkxDMSn+7bhknt kWK4eer3y5sSUEMiCQ6NmhzlWv/FbdZeT4zds8YkK6u9wUNQ7mzNx3is/lTqUUPdDrsg ycCmPjS7lWlXITD91nzdfWLsV0aD0TmE9Iygof/Aev20oXB54eXlxMsXgWkKUDQe5O56 UV8DaafCBlc9LthQofrZJHRnQvpNSK056hvdlEsWrumAC6K10Gig75gZtdmOWzv+K0Yz AjiQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=REIAqyu+05oBCK2nJJlx6H8cRzWFGpccrvvlF+oD0X0=; fh=qTx768JYq6oxdX3jfjUzjJGF83d0IKGtQzHdow3lchk=; b=NvhtGlRJhw9P8PD48qyUMERUQvKCNzGlIE4fHXvWQm/hVCqYu1vskKiFrfSLoJbYwn ppW7An3Q/RLP4wFg0MEwxXqWdngAkkgR8QhS+Kp/+Vve6fqJxj7c+nvQ58yDnlLUNHgK taUK7D0QE3N94D9C0FRRM+uKSZrK5o92nnJ3AUR/sJ6XW2LapvioWQycxuTf7Cr5rQ+E s1t+KXt8ChkEJ+YiO7xlt2qzrjK7X5lQNNPJmbHKb39iMdLAeKXJqgOJbfsN19gZcqny f0oiR2lrxUB4NfTjl8I9VZOAWhNvdsp7KKN2NtMPZVc6U0et8Fxq42hgq0xDSfpf/vQM srdA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="OAni/gMc"; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-193160-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-193160-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id 6a1803df08f44-6ac1625f225si114090586d6.359.2024.05.28.15.20.59 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 28 May 2024 15:21:00 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-193160-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="OAni/gMc"; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-193160-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-193160-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id DC8E31C245A3 for ; Tue, 28 May 2024 22:20:59 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 7226C13DDBF; Tue, 28 May 2024 22:20:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="OAni/gMc" Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A1F0513E889 for ; Tue, 28 May 2024 22:20:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.10 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716934822; cv=none; b=g3RjjWsYnv8Ex+YxZM3eXWfx9CMXCHAe+W/y3imGByW2q9uCx9RKHmntYqQYu/bmBzqYqMzIpEZjdNkugfIbxNkLorh8sOqBbGPiQJlplt6Len/E2HEc0mLE0M6pjEb9VWKthlliWXyOpCWTchja6pfI/vbSG9dRKc8jrmClFhY= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716934822; c=relaxed/simple; bh=gG4r96r2lntBDsHt+9wxOd7VjzgcPxv496sdhJQghAY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Aj5Zyjd87OUNysIT/OHRpof47aa+8+bfwGf3KfPqCAT0r4x5A3hr2GUP3IPCeu/JKatY7R5USBEo+lCKregnQBSzUZz/VarpE/AtXMM2x1Xs1Vur8FLXIR+NCB8z6N6MDCrzSu0lGPtJm6gEr9KqSBeEEvhCV7+JWOoFz7EMDN4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=OAni/gMc; arc=none smtp.client-ip=198.175.65.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1716934821; x=1748470821; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gG4r96r2lntBDsHt+9wxOd7VjzgcPxv496sdhJQghAY=; b=OAni/gMc/YbdKMpXvSgsjeC9Gie8/uFNcIjdqa/bmymUf/qJlr3Lc3Ug fU7TICgYh5AFYCeI6/4QMA+NHpA6Xb0Ve9ntS59yzMbE/At0gOP8Jc0M9 HS/rXWf5SFLap6hNm5WNR6q6hHooxDUghplTT3b0mji5IJlSfv+uNJ6VW 309ADSSt9JoGUszXrjPIlHEsZcYkwEft69ty/wiIyrlFTjee7B/TVIuln 99S0eMSIdcwo/9uUkSePqTGL6adEDBgpk7ZLSfgDMomlJ3r1xQRgJJlqo 994KKZYAMMl6gRYVYswFMKE5KtEQHMq7DhjPryDVarvrwk0xIHfQs0qTn Q==; X-CSE-ConnectionGUID: xRUxTQhxTU+nlkyjPezUVQ== X-CSE-MsgGUID: rT8l98LMQiCvQJU03922BA== X-IronPort-AV: E=McAfee;i="6600,9927,11085"; a="30812174" X-IronPort-AV: E=Sophos;i="6.08,196,1712646000"; d="scan'208";a="30812174" Received: from orviesa005.jf.intel.com ([10.64.159.145]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 May 2024 15:20:17 -0700 X-CSE-ConnectionGUID: /SZ2HQi1T4O9BV/wX228hg== X-CSE-MsgGUID: X7OhId77RWeXx0xELZWnyA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.08,196,1712646000"; d="scan'208";a="40090728" Received: from agluck-desk3.sc.intel.com ([172.25.222.70]) by orviesa005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 May 2024 15:20:16 -0700 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Maciej Wieczor-Retman , Peter Newman , James Morse , Babu Moger , Drew Fustini , Dave Martin Cc: x86@kernel.org, linux-kernel@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v19 06/20] x86/resctrl: Introduce snc_nodes_per_l3_cache Date: Tue, 28 May 2024 15:19:51 -0700 Message-ID: <20240528222006.58283-7-tony.luck@intel.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240528222006.58283-1-tony.luck@intel.com> References: <20240528222006.58283-1-tony.luck@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Intel Sub-NUMA Cluster (SNC) is a feature that subdivides the CPU cores and memory controllers on a socket into two or more groups. These are presented to the operating system as NUMA nodes. This may enable some workloads to have slightly lower latency to memory as the memory controller(s) in an SNC node are electrically closer to the CPU cores on that SNC node. This cost may be offset by lower bandwidth since the memory accesses for each core can only be interleaved between the memory controllers on the same SNC node. Resctrl monitoring on an Intel system depends upon attaching RMIDs to tasks to track L3 cache occupancy and memory bandwidth. There is an MSR that controls how the RMIDs are shared between SNC nodes. The default mode divides them numerically. E.g. when there are two SNC nodes on a socket the lower number half of the RMIDs are given to the first node, the remainder to the second node. This would be difficult to use with the Linux resctrl interface as specific RMID values assigned to resctrl groups are not visible to users. The other mode divides the RMIDs and renumbers the ones on the second SNC node to start from zero. Even with this renumbering SNC mode requires several changes in resctrl behavior for correct operation. Add a static global to arch/x86/kernel/cpu/resctrl/monitor.c to indicate how many SNC domains share an L3 cache instance. Initialize this to "1". Runtime detection of SNC mode will adjust this value. Update all places to take appropriate action when SNC mode is enabled: 1) The number of logical RMIDs per L3 cache available for use is the number of physical RMIDs divided by the number of SNC nodes. 2) Likewise the "mon_scale" value must be divided by the number of SNC nodes. 3) Add a function to convert from logical RMID values (assigned to tasks and loaded into the IA32_PQR_ASSOC MSR on context switch) to physical RMID values to load into IA32_QM_EVTSEL MSR when reading counters on each SNC node. Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/resctrl/monitor.c | 37 ++++++++++++++++++++++++--- 1 file changed, 33 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 89d7e6fcbaa1..b9b4d2b5ca82 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -97,6 +97,8 @@ unsigned int resctrl_rmid_realloc_limit; #define CF(cf) ((unsigned long)(1048576 * (cf) + 0.5)) +static int snc_nodes_per_l3_cache = 1; + /* * The correction factor table is documented in Documentation/arch/x86/resctrl.rst. * If rmid > rmid threshold, MBM total and local values should be multiplied @@ -185,10 +187,37 @@ static inline struct rmid_entry *__rmid_entry(u32 idx) return entry; } -static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val) +/* + * When Sub-NUMA Cluster (SNC) mode is not enabled, the physical RMID + * is the same as the logical RMID. + * + * When SNC mode is enabled the physical RMIDs are distributed across + * the SNC nodes. E.g. with two SNC nodes per L3 cache and 200 physical + * RMIDs are divided with 0..99 on the first node and 100..199 on + * the second node. Compute the value of the physical RMID to pass to + * resctrl_arch_rmid_read(). + * + * Caller is responsible to make sure execution running on a CPU in + * the domain to be read. + */ +static int logical_rmid_to_physical_rmid(int lrmid) +{ + struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; + int cpu = smp_processor_id(); + + if (snc_nodes_per_l3_cache == 1) + return lrmid; + + return lrmid + (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->num_rmid; +} + +static int __rmid_read(u32 lrmid, + enum resctrl_event_id eventid, u64 *val) { u64 msr_val; + int prmid; + prmid = logical_rmid_to_physical_rmid(lrmid); /* * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured * with a valid event code for supported resource type and the bits @@ -197,7 +226,7 @@ static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val) * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62) * are error bits. */ - wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid); + wrmsr(MSR_IA32_QM_EVTSEL, eventid, prmid); rdmsrl(MSR_IA32_QM_CTR, msr_val); if (msr_val & RMID_VAL_ERROR) @@ -1022,8 +1051,8 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r) int ret; resctrl_rmid_realloc_limit = boot_cpu_data.x86_cache_size * 1024; - hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale; - r->num_rmid = boot_cpu_data.x86_cache_max_rmid + 1; + hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale / snc_nodes_per_l3_cache; + r->num_rmid = (boot_cpu_data.x86_cache_max_rmid + 1) / snc_nodes_per_l3_cache; hw_res->mbm_width = MBM_CNTR_WIDTH_BASE; if (mbm_offset > 0 && mbm_offset <= MBM_CNTR_WIDTH_OFFSET_MAX) -- 2.45.0