Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp6560168ybv; Wed, 12 Feb 2020 14:55:06 -0800 (PST) X-Google-Smtp-Source: APXvYqxFcrjvZUW34nbX6OS8n72idXYNU33hjO0BpSO4RkQii+pANRzGjJBonIeC14GeL9MaA/oV X-Received: by 2002:a05:6830:1304:: with SMTP id p4mr11264437otq.327.1581548105994; Wed, 12 Feb 2020 14:55:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1581548105; cv=none; d=google.com; s=arc-20160816; b=vQWzFqT27/bWTDLuvOyaWjvu/Ga6tGj8p4kbxnrhlb522NIyfxT0N/NHpe5HUqIGGe 5V7b9lIARyiGl7JGt3XYb+DEWlziFe1jldYx+v+IyzEHfdXhBRfISjvZh32/PhXWw+N/ 1dmQjxK1EgIbNCzSsVsCIo7FzNzDu5N0+/bZrZrTBjLAHul+aH07u3FehcZYj0VKQwHN xZ/+jOKRrgr0+zSFs+VrXDlS45CTjnXnOwtodyzBdsUSJte3gClCjjQmI+9hZh7qJw1w DuCCvjdmjIGmqeo/ol8L9dgH80e9C8+QSCQYTF2tvlkr29hF22kCihHa8Jmmgn/oiB6N Zuww== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=XdSgahM8vOv+p+H3Le/UO3p4DwZ5AHyokoot4BdaBh8=; b=PRY8yFXpoDTZm/anqBBY7uAWcVFCxaH0jvf6ovyv3nKh35Ai3gJI+FxG5MRY2WbkN/ R83Cz7nQO3L8P37mpaG2UWpqa0ep5Qqrb50189KchC+tbi4XwcevOiw40gqFnOLlega+ eSViJyFvdrLJdcCIY6V0PcNlFWw3xlPNTpr5guNEQWHxjaQgimwvEwwFTyW6YIL89DhB 8R7nQikPeJ4oKiF94KLWtyPK12lusAU/EkeBTQP4YgK1LksfWrf4Rerl+6QBow0qs+VA hR5q1dGSDXo9w6wrsc09be92CVD+AjfePTW6qaZLSmg1YHVBXXiuidLMg3KBXymC9mzk Qzpg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id n21si261617oic.0.2020.02.12.14.54.53; Wed, 12 Feb 2020 14:55:05 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729246AbgBLWxf (ORCPT + 99 others); Wed, 12 Feb 2020 17:53:35 -0500 Received: from mga09.intel.com ([134.134.136.24]:64157 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727692AbgBLWxe (ORCPT ); Wed, 12 Feb 2020 17:53:34 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 12 Feb 2020 14:53:33 -0800 X-IronPort-AV: E=Sophos;i="5.70,434,1574150400"; d="scan'208";a="237856126" Received: from rchatre-mobl.amr.corp.intel.com (HELO [10.24.14.116]) ([10.24.14.116]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 12 Feb 2020 14:53:33 -0800 Subject: Re: [PATCH] x86/resctrl: Preserve CDP enable over cpuhp To: James Morse , x86@kernel.org, linux-kernel@vger.kernel.org Cc: Fenghua Yu , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H . Peter Anvin" References: <20200212185359.163111-1-james.morse@arm.com> From: Reinette Chatre Message-ID: <8aab67d7-c13e-19f1-9bec-85b7cca55146@intel.com> Date: Wed, 12 Feb 2020 14:53:31 -0800 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.4.2 MIME-Version: 1.0 In-Reply-To: <20200212185359.163111-1-james.morse@arm.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi James, Thank you very much for catching this. On 2/12/2020 10:53 AM, James Morse wrote: > Resctrl assumes that all cpus are online when the filesystem is Please take care throughout to use CPU/CPUs > mounted, and that cpus remember their CDP-enabled state over cpu > hotplug. > > This goes wrong when resctrl's CDP-enabled state changes while all > the cpus in a domain are offline. > > When a domain comes online, enable (or disable!) CDP to match resctrl's > current setting. > > Fixes: 5ff193fbde20 ("x86/intel_rdt: Add basic resctrl filesystem support") > Signed-off-by: James Morse > > --- > > Seen on a 'Intel(R) Xeon(R) Gold 5120T CPU @ 2.20GHz' from lenovo, taking > all the cores in one package offline, umount/mount to toggle CDP then > bringing them back: the first core to come online still has the old > CDP state. > > This will get called more often than is desirable (worst:3/domain) > but this is better than on every cpu in the domain. Unless someone > can spot a better place to hook it in? From what I can tell this solution is indeed called for every CPU, and more so, for every capable resource associated with each CPU: resctrl_online_cpu() is called for each CPU and it in turn runs ... for_each_capable_rdt_resource(r) domain_add_cpu() ... from where the new code is called. > --- > arch/x86/kernel/cpu/resctrl/core.c | 21 +++++++++++++++++++++ > arch/x86/kernel/cpu/resctrl/internal.h | 3 +++ > arch/x86/kernel/cpu/resctrl/rdtgroup.c | 7 +++++-- > 3 files changed, 29 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c > index 89049b343c7a..1210cb65e6d3 100644 > --- a/arch/x86/kernel/cpu/resctrl/core.c > +++ b/arch/x86/kernel/cpu/resctrl/core.c > @@ -541,6 +541,25 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d) > return 0; > } > > +/* resctrl's use of CDP may have changed while this domain slept */ > +static void domain_reconfigure_cdp(void) > +{ > + bool cdp_enable; > + struct rdt_resource *r; (Please note that this area uses reverse-fir tree ordering.) > + > + lockdep_assert_held(&rdtgroup_mutex); > + > + r = &rdt_resources_all[RDT_RESOURCE_L2]; > + cdp_enable = !r->alloc_enabled; This logic can become confusing. Also remember that L2 or L3 resources supporting allocation are not required to support CDP. There are existing products that support allocation without supporting CDP. The goal is to configure CDP correctly on a resource that supports CDP and for that there are the L2DATA, L2CODE, L3DATA, and L3CODE resources. These resources have their "alloc_capable" set if they support CDP and "alloc_enabled" set when CDP is enabled. Would it be possible to have a helper to correctly enable/disable CDP only for resources that support CDP? This helper could have "cpu" in its name to distinguish it from the other system-wide helpers. > + if (r->alloc_capable) > + l2_qos_cfg_update(&cdp_enable); Since this will run on any system that supports L2 allocation it will attempt to disable CDP on a system that does not support CDP. I do not think this is the right thing to do. > + > + r = &rdt_resources_all[RDT_RESOURCE_L3]; > + cdp_enable = !r->alloc_enabled; > + if (r->alloc_capable) > + l3_qos_cfg_update(&cdp_enable); > +} > + > /* > * domain_add_cpu - Add a cpu to a resource's domain list. > * > @@ -578,6 +597,8 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r) > d->id = id; > cpumask_set_cpu(cpu, &d->cpu_mask); > > + domain_reconfigure_cdp(); > + domain_add_cpu() is called for each resource associated with each CPU. It seems that this reconfiguration could be moved up to resctrl_online_cpu() and not be run as many times. (One hint that this could be done is that this new function is not using any of the parameters passed from resctrl_online_cpu() to domain_add_cpu().) The re-configuring of CDP would still be done for each CPU as it comes online. > if (r->alloc_capable && domain_setup_ctrlval(r, d)) { > kfree(d); > return; > diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h > index 181c992f448c..29c92d3e93f5 100644 > --- a/arch/x86/kernel/cpu/resctrl/internal.h > +++ b/arch/x86/kernel/cpu/resctrl/internal.h > @@ -602,4 +602,7 @@ void __check_limbo(struct rdt_domain *d, bool force_free); > bool cbm_validate_intel(char *buf, u32 *data, struct rdt_resource *r); > bool cbm_validate_amd(char *buf, u32 *data, struct rdt_resource *r); > > +void l3_qos_cfg_update(void *arg); > +void l2_qos_cfg_update(void *arg); > + The new helper could be located in this same area with all the other CDP related functions and it will just be the one helper exported. > #endif /* _ASM_X86_RESCTRL_INTERNAL_H */ > diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c > index 064e9ef44cd6..e11356011a4a 100644 > --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c > +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c > @@ -1804,14 +1804,14 @@ mongroup_create_dir(struct kernfs_node *parent_kn, struct rdtgroup *prgrp, > return ret; > } > > -static void l3_qos_cfg_update(void *arg) > +void l3_qos_cfg_update(void *arg) > { > bool *enable = arg; > > wrmsrl(MSR_IA32_L3_QOS_CFG, *enable ? L3_QOS_CDP_ENABLE : 0ULL); > } > > -static void l2_qos_cfg_update(void *arg) > +void l2_qos_cfg_update(void *arg) > { > bool *enable = arg; > > @@ -1831,6 +1831,9 @@ static int set_cache_qos_cfg(int level, bool enable) > struct rdt_domain *d; > int cpu; > > + /* CDP state is restored during cpuhp, which takes this lock */ > + lockdep_assert_held(&rdtgroup_mutex); > + > if (level == RDT_RESOURCE_L3) > update = l3_qos_cfg_update; > else if (level == RDT_RESOURCE_L2) > Reinette