From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
	shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
	carl@os.amperecomputing.com, lcherian@marvell.com,
	bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
	baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
	peternewman@google.com, dfustini@baylibre.com,
	amitsinght@marvell.com
Subject: [PATCH v7 12/24] x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow
Date: Wed, 25 Oct 2023 18:03:33 +0000
Message-Id: <20231025180345.28061-13-james.morse@arm.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20231025180345.28061-1-james.morse@arm.com>
References: <20231025180345.28061-1-james.morse@arm.com>

The limbo and overflow code picks a CPU to use from the domain's list
of online CPUs. Work is then scheduled on these CPUs to maintain
the limbo list and any counters that may overflow.

cpumask_any() may pick a CPU that is marked nohz_full, which will
either penalise the work that CPU was dedicated to, or delay the
processing of limbo list or counters that may overflow. Perhaps
indefinitely. Delaying the overflow handling will skew the bandwidth
values calculated by mba_sc, which expects to be called once a second.

Add cpumask_any_housekeeping() as a replacement for cpumask_any()
that prefers housekeeping CPUs. This helper will still return
a nohz_full CPU if that is the only option. The CPU to use is
re-evaluated each time the limbo/overflow work runs. This ensures
the work will move off a nohz_full CPU once a housekeeping CPU is
available.

Tested-by: Shaopeng Tan
Tested-by: Peter Newman
Reviewed-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v3:
 * typos fixed

Changes since v4:
 * Made temporary variables unsigned

Changes since v5:
 * Restructured cpumask_any_housekeeping() to avoid later churn.

Changes since v6:
 * Update mbm_work_cpu/cqm_work_cpu when rescheduling.
---
 arch/x86/kernel/cpu/resctrl/internal.h | 24 ++++++++++++++++++++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 20 +++++++++++++-------
 2 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 521afa016b05..33e24fcc8dd0 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include

 #include

@@ -56,6 +57,29 @@
 /* Max event bits supported */
 #define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)

+/**
+ * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
+ *			        aren't marked nohz_full
+ * @mask:	The mask to pick a CPU from.
+ *
+ * Returns a CPU in @mask. If there are housekeeping CPUs that don't use
+ * nohz_full, these are preferred.
+ */
+static inline unsigned int cpumask_any_housekeeping(const struct cpumask *mask)
+{
+	unsigned int cpu, hk_cpu;
+
+	cpu = cpumask_any(mask);
+	if (!tick_nohz_full_cpu(cpu))
+		return cpu;
+
+	hk_cpu = cpumask_nth_andnot(0, mask, tick_nohz_full_mask);
+	if (hk_cpu < nr_cpu_ids)
+		cpu = hk_cpu;
+
+	return cpu;
+}
+
 struct rdt_fs_context {
 	struct kernfs_fs_context	kfc;
 	bool				enable_cdpl2;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index cf512d4d383e..718770aea2af 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -783,7 +783,6 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d,
 void cqm_handle_limbo(struct work_struct *work)
 {
 	unsigned long delay = msecs_to_jiffies(CQM_LIMBOCHECK_INTERVAL);
-	int cpu = smp_processor_id();
 	struct rdt_resource *r;
 	struct rdt_domain *d;

@@ -794,8 +793,11 @@ void cqm_handle_limbo(struct work_struct *work)

 	__check_limbo(d, false);

-	if (has_busy_rmid(d))
-		schedule_delayed_work_on(cpu, &d->cqm_limbo, delay);
+	if (has_busy_rmid(d)) {
+		d->cqm_work_cpu = cpumask_any_housekeeping(&d->cpu_mask);
+		schedule_delayed_work_on(d->cqm_work_cpu, &d->cqm_limbo,
+					 delay);
+	}

 	mutex_unlock(&rdtgroup_mutex);
 }
@@ -805,7 +807,7 @@ void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms)
 	unsigned long delay = msecs_to_jiffies(delay_ms);
 	int cpu;

-	cpu = cpumask_any(&dom->cpu_mask);
+	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
 	dom->cqm_work_cpu = cpu;

 	schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay);
@@ -815,7 +817,6 @@ void mbm_handle_overflow(struct work_struct *work)
 {
 	unsigned long delay = msecs_to_jiffies(MBM_OVERFLOW_INTERVAL);
 	struct rdtgroup *prgrp, *crgrp;
-	int cpu = smp_processor_id();
 	struct list_head *head;
 	struct rdt_resource *r;
 	struct rdt_domain *d;
@@ -839,7 +840,12 @@ void mbm_handle_overflow(struct work_struct *work)
 			update_mba_bw(prgrp, d);
 	}

-	schedule_delayed_work_on(cpu, &d->mbm_over, delay);
+	/*
+	 * Re-check for housekeeping CPUs. This allows the overflow handler to
+	 * move off a nohz_full CPU quickly.
+	 */
+	d->mbm_work_cpu = cpumask_any_housekeeping(&d->cpu_mask);
+	schedule_delayed_work_on(d->mbm_work_cpu, &d->mbm_over, delay);

 out_unlock:
 	mutex_unlock(&rdtgroup_mutex);
@@ -852,7 +858,7 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)

 	if (!static_branch_likely(&rdt_mon_enable_key))
 		return;
-	cpu = cpumask_any(&dom->cpu_mask);
+	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
 	dom->mbm_work_cpu = cpu;
 	schedule_delayed_work_on(cpu, &dom->mbm_over, delay);
 }
-- 
2.39.2
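
Illustration only, not part of the patch: the selection order used by
cpumask_any_housekeeping() can be tried outside the kernel with a rough
user-space sketch. Everything named here is an assumption made for the
example: a single 64-bit word stands in for struct cpumask, NR_CPUS,
mask_first() and any_housekeeping() are hypothetical names, and
mask_first() always returns the lowest set bit, which is only one of the
choices cpumask_any() is allowed to make. The empty-mask guard is also an
addition of the sketch (it avoids an out-of-range shift) and is not in the
kernel helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: one 64-bit word stands in for struct cpumask. */
#define NR_CPUS 64u

/*
 * Hypothetical stand-in for cpumask_any(): return the lowest set bit,
 * or NR_CPUS when the mask is empty.
 */
static unsigned int mask_first(uint64_t mask)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;

	return NR_CPUS;
}

/*
 * Same decision order as the helper in the patch:
 *  1. take any CPU from the mask;
 *  2. if it is not nohz_full, use it;
 *  3. otherwise look for a CPU in (mask & ~nohz_full);
 *  4. if there is none, fall back to the nohz_full CPU from step 1.
 */
static unsigned int any_housekeeping(uint64_t mask, uint64_t nohz_full)
{
	unsigned int cpu, hk_cpu;

	cpu = mask_first(mask);
	/* Sketch-only guard: an empty mask yields NR_CPUS. */
	if (cpu >= NR_CPUS || !(nohz_full & (UINT64_C(1) << cpu)))
		return cpu;

	hk_cpu = mask_first(mask & ~nohz_full);
	if (hk_cpu < NR_CPUS)
		cpu = hk_cpu;

	return cpu;
}
```

With a domain covering CPUs 2-5 (mask 0x3c) and CPUs 2-3 marked nohz_full
(0x0c), any_housekeeping() returns 4. If all four CPUs are nohz_full
(0x3c), it falls back to CPU 2, matching the "only option" case described
in the commit message.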