From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Tony Luck, Xiaochen Shen, Borislav Petkov, Reinette Chatre, Sasha Levin
Subject: [PATCH 6.8 047/715] x86/resctrl: Implement new mba_MBps throttling heuristic
Date: Sun, 24 Mar 2024 18:23:46 -0400
Message-ID: <20240324223455.1342824-48-sashal@kernel.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240324223455.1342824-1-sashal@kernel.org>
References: <20240324223455.1342824-1-sashal@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
Content-Transfer-Encoding: 8bit

From: Tony Luck

[ Upstream commit c2427e70c1630d98966375fffc2b713ab9768a94 ]

The mba_MBps feedback loop increases throttling when a group is using
more bandwidth than the target set by the user in the schemata file, and
decreases throttling when below target.

To avoid possibly stepping throttling up and down on every poll a flag
"delta_comp" is set whenever throttling is changed to indicate that the
actual change in bandwidth should be recorded on the next poll in
"delta_bw". Throttling is only reduced if the current bandwidth plus
delta_bw is below the user target.

This algorithm works well if the workload has steady bandwidth needs.
But it can go badly wrong if the workload moves to a different phase
just as the throttling level changed. E.g. if the workload becomes
essentially idle right as throttling level is increased, the value
calculated for delta_bw will be more or less the old bandwidth level.
If the workload then resumes, Linux may never reduce throttling because
current bandwidth plus delta_bw is above the target set by the user.

Implement a simpler heuristic by assuming that in the worst case the
currently measured bandwidth is being controlled by the current level of
throttling. Compute how much it may increase if throttling is relaxed to
the next higher level. If that is still below the user target, then it
is ok to reduce the amount of throttling.

Fixes: ba0f26d8529c ("x86/intel_rdt/mba_sc: Prepare for feedback loop")
Reported-by: Xiaochen Shen
Signed-off-by: Tony Luck
Signed-off-by: Borislav Petkov (AMD)
Reviewed-by: Reinette Chatre
Tested-by: Xiaochen Shen
Link: https://lore.kernel.org/r/20240122180807.70518-1-tony.luck@intel.com
Signed-off-by: Sasha Levin
---
 arch/x86/kernel/cpu/resctrl/internal.h |  4 ---
 arch/x86/kernel/cpu/resctrl/monitor.c  | 42 ++++++--------------------
 2 files changed, 10 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index e3dc35a00a197..52e7e7deee106 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -295,14 +295,10 @@ struct rftype {
  * struct mbm_state - status for each MBM counter in each domain
  * @prev_bw_bytes: Previous bytes value read for bandwidth calculation
  * @prev_bw:	The most recent bandwidth in MBps
- * @delta_bw:	Difference between the current and previous bandwidth
- * @delta_comp:	Indicates whether to compute the delta_bw
  */
 struct mbm_state {
 	u64	prev_bw_bytes;
 	u32	prev_bw;
-	u32	delta_bw;
-	bool	delta_comp;
 };
 
 /**
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index acca577e2b066..3a6c069614eb8 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -440,9 +440,6 @@ static void mbm_bw_count(u32 rmid, struct rmid_read *rr)
 
 	cur_bw = bytes / SZ_1M;
 
-	if (m->delta_comp)
-		m->delta_bw = abs(cur_bw - m->prev_bw);
-	m->delta_comp = false;
 	m->prev_bw = cur_bw;
 }
 
@@ -520,11 +517,11 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 {
 	u32 closid, rmid, cur_msr_val, new_msr_val;
 	struct mbm_state *pmbm_data, *cmbm_data;
-	u32 cur_bw, delta_bw, user_bw;
 	struct rdt_resource *r_mba;
 	struct rdt_domain *dom_mba;
 	struct list_head *head;
 	struct rdtgroup *entry;
+	u32 cur_bw, user_bw;
 
 	if (!is_mbm_local_enabled())
 		return;
@@ -543,7 +540,6 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 
 	cur_bw = pmbm_data->prev_bw;
 	user_bw = dom_mba->mbps_val[closid];
-	delta_bw = pmbm_data->delta_bw;
 
 	/* MBA resource doesn't support CDP */
 	cur_msr_val = resctrl_arch_get_config(r_mba, dom_mba, closid, CDP_NONE);
@@ -555,49 +551,31 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 	list_for_each_entry(entry, head, mon.crdtgrp_list) {
 		cmbm_data = &dom_mbm->mbm_local[entry->mon.rmid];
 		cur_bw += cmbm_data->prev_bw;
-		delta_bw += cmbm_data->delta_bw;
 	}
 
 	/*
 	 * Scale up/down the bandwidth linearly for the ctrl group. The
 	 * bandwidth step is the bandwidth granularity specified by the
 	 * hardware.
-	 *
-	 * The delta_bw is used when increasing the bandwidth so that we
-	 * dont alternately increase and decrease the control values
-	 * continuously.
-	 *
-	 * For ex: consider cur_bw = 90MBps, user_bw = 100MBps and if
-	 * bandwidth step is 20MBps(> user_bw - cur_bw), we would keep
-	 * switching between 90 and 110 continuously if we only check
-	 * cur_bw < user_bw.
+	 * Always increase throttling if current bandwidth is above the
+	 * target set by user.
+	 * But avoid thrashing up and down on every poll by checking
+	 * whether a decrease in throttling is likely to push the group
+	 * back over target. E.g. if currently throttling to 30% of bandwidth
+	 * on a system with 10% granularity steps, check whether moving to
+	 * 40% would go past the limit by multiplying current bandwidth by
+	 * "(30 + 10) / 30".
 	 */
 	if (cur_msr_val > r_mba->membw.min_bw && user_bw < cur_bw) {
 		new_msr_val = cur_msr_val - r_mba->membw.bw_gran;
 	} else if (cur_msr_val < MAX_MBA_BW &&
-		   (user_bw > (cur_bw + delta_bw))) {
+		   (user_bw > (cur_bw * (cur_msr_val + r_mba->membw.min_bw) / cur_msr_val))) {
 		new_msr_val = cur_msr_val + r_mba->membw.bw_gran;
 	} else {
 		return;
 	}
 
 	resctrl_arch_update_one(r_mba, dom_mba, closid, CDP_NONE, new_msr_val);
-
-	/*
-	 * Delta values are updated dynamically package wise for each
-	 * rdtgrp every time the throttle MSR changes value.
-	 *
-	 * This is because (1)the increase in bandwidth is not perfectly
-	 * linear and only "approximately" linear even when the hardware
-	 * says it is linear.(2)Also since MBA is a core specific
-	 * mechanism, the delta values vary based on number of cores used
-	 * by the rdtgrp.
-	 */
-	pmbm_data->delta_comp = true;
-	list_for_each_entry(entry, head, mon.crdtgrp_list) {
-		cmbm_data = &dom_mbm->mbm_local[entry->mon.rmid];
-		cmbm_data->delta_comp = true;
-	}
 }
 
 static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int rmid)
-- 
2.43.0
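
For anyone who wants to poke at the new check outside the kernel, here is
a minimal user-space sketch of the decision logic. It is not the kernel
code: struct mba_params, old_can_relax(), new_can_relax() and
next_msr_val() are hypothetical names made up for illustration, and the
64-bit intermediate is an added precaution that the patch itself does not
use. Like the patch, it adds min_bw (not bw_gran) to cur_msr_val when
projecting the bandwidth at the next level.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the MBA resource fields; not a kernel type. */
struct mba_params {
	uint32_t min_bw;	/* lowest allowed throttle value, in percent */
	uint32_t bw_gran;	/* step between throttle values, in percent */
	uint32_t max_bw;	/* 100 means unthrottled */
};

/* Old rule: relax only if cur_bw plus the recorded delta is under target. */
static bool old_can_relax(uint32_t cur_bw, uint32_t delta_bw, uint32_t user_bw)
{
	return user_bw > cur_bw + delta_bw;
}

/*
 * New rule: assume cur_bw is limited by the current throttle value and
 * scale it up to the next level before comparing with the target.
 */
static bool new_can_relax(const struct mba_params *p, uint32_t cur_msr_val,
			  uint32_t cur_bw, uint32_t user_bw)
{
	uint64_t projected = (uint64_t)cur_bw * (cur_msr_val + p->min_bw) /
			     cur_msr_val;

	return user_bw > projected;
}

/* Returns the next throttle value, or cur_msr_val if nothing changes. */
static uint32_t next_msr_val(const struct mba_params *p, uint32_t cur_msr_val,
			     uint32_t cur_bw, uint32_t user_bw)
{
	/* Over target: always throttle harder, unless already at the floor. */
	if (cur_msr_val > p->min_bw && user_bw < cur_bw)
		return cur_msr_val - p->bw_gran;

	/* Under target: relax only if the projection stays under target. */
	if (cur_msr_val < p->max_bw &&
	    new_can_relax(p, cur_msr_val, cur_bw, user_bw))
		return cur_msr_val + p->bw_gran;

	return cur_msr_val;
}
```

With min_bw = bw_gran = 10 and a 1000 MBps target, a group measured at
600 MBps while throttled to 30% projects to 600 * 40 / 30 = 800 MBps, so
next_msr_val() returns 40. At 800 MBps the projection is 1066 MBps and the
value stays at 30. The old rule with a stale delta_bw of 1200 would refuse
to relax in both cases, which is the stuck state the commit message
describes.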