Date: Tue, 19 Jun 2018 17:33:27 -0700
From: tip-bot for Reinette Chatre
Message-ID:
Cc: tglx@linutronix.de, reinette.chatre@intel.com, mingo@kernel.org, linux-kernel@vger.kernel.org, hpa@zytor.com
Reply-To: tglx@linutronix.de, reinette.chatre@intel.com, linux-kernel@vger.kernel.org, mingo@kernel.org, hpa@zytor.com
In-Reply-To: <1282e07cd1a5291bc42dfcd117be12916e538ff2.1527593971.git.reinette.chatre@intel.com>
References: <1282e07cd1a5291bc42dfcd117be12916e538ff2.1527593971.git.reinette.chatre@intel.com>
To: linux-tip-commits@vger.kernel.org
Subject: [tip:x86/cache] x86/intel_rdt: Limit C-states dynamically when pseudo-locking active
Git-Commit-ID: f61050aefc0ca1c0b3e93114eadd0a910a66202b
X-Mailer: tip-git-log-daemon
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  f61050aefc0ca1c0b3e93114eadd0a910a66202b
Gitweb:     https://git.kernel.org/tip/f61050aefc0ca1c0b3e93114eadd0a910a66202b
Author:     Reinette Chatre
AuthorDate: Tue, 29 May 2018 05:58:03 -0700
Committer:  Thomas Gleixner
CommitDate: Wed, 20 Jun 2018 00:56:40 +0200

x86/intel_rdt: Limit C-states dynamically when pseudo-locking active

Deeper C-states impact cache content through shrinking of the cache or
flushing entire cache to memory before reducing power to the cache. Deeper
C-states will thus negatively impact the pseudo-locked regions.

To avoid impacting pseudo-locked regions C-states are limited on
pseudo-locked region creation so that cores associated with the
pseudo-locked region are prevented from entering deeper C-states.
This is accomplished by requesting a CPU latency target which will
prevent the core from entering C6 across all supported platforms.

Signed-off-by: Reinette Chatre
Signed-off-by: Thomas Gleixner
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/1282e07cd1a5291bc42dfcd117be12916e538ff2.1527593971.git.reinette.chatre@intel.com
---
 Documentation/x86/intel_rdt_ui.txt          |  4 +-
 arch/x86/kernel/cpu/intel_rdt.h             |  2 +
 arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c | 85 ++++++++++++++++++++++++++++-
 3 files changed, 87 insertions(+), 4 deletions(-)

diff --git a/Documentation/x86/intel_rdt_ui.txt b/Documentation/x86/intel_rdt_ui.txt
index bcd0a6d2fcf8..acac30b67c62 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/intel_rdt_ui.txt
@@ -461,8 +461,8 @@ in the cache via carefully configuring the CAT feature and controlling
 application behavior. There is no guarantee that data is placed in
 cache. Instructions like INVD, WBINVD, CLFLUSH, etc. can still evict
 “locked” data from cache. Power management C-states may shrink or
-power off cache. It is thus recommended to limit the processor maximum
-C-state, for example, by setting the processor.max_cstate kernel parameter.
+power off cache. Deeper C-states will automatically be restricted on
+pseudo-locked region creation.
 
 It is required that an application using a pseudo-locked region runs
 with affinity to the cores (or a subset of the cores) associated
diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index b8e490a43290..2d9cbb9d7a58 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -142,6 +142,7 @@ struct mongroup {
  *			region
  * @debugfs_dir:	pointer to this region's directory in the debugfs
  *			filesystem
+ * @pm_reqs:		Power management QoS requests related to this region
  */
 struct pseudo_lock_region {
 	struct rdt_resource *r;
@@ -155,6 +156,7 @@ struct pseudo_lock_region {
 	void *kmem;
 	unsigned int minor;
 	struct dentry *debugfs_dir;
+	struct list_head pm_reqs;
 };
 
 /**
diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
index 17ed2e9d4551..0d44dc1f7146 100644
--- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
+++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -172,6 +173,76 @@ static struct rdtgroup *region_find_by_minor(unsigned int minor)
 	return rdtgrp_match;
 }
 
+/**
+ * pseudo_lock_pm_req - A power management QoS request list entry
+ * @list: Entry within the @pm_reqs list for a pseudo-locked region
+ * @req: PM QoS request
+ */
+struct pseudo_lock_pm_req {
+	struct list_head list;
+	struct dev_pm_qos_request req;
+};
+
+static void pseudo_lock_cstates_relax(struct pseudo_lock_region *plr)
+{
+	struct pseudo_lock_pm_req *pm_req, *next;
+
+	list_for_each_entry_safe(pm_req, next, &plr->pm_reqs, list) {
+		dev_pm_qos_remove_request(&pm_req->req);
+		list_del(&pm_req->list);
+		kfree(pm_req);
+	}
+}
+
+/**
+ * pseudo_lock_cstates_constrain - Restrict cores from entering C6
+ *
+ * To prevent the cache from being affected by power management entering
+ * C6 has to be avoided. This is accomplished by requesting a latency
+ * requirement lower than lowest C6 exit latency of all supported
+ * platforms as found in the cpuidle state tables in the intel_idle driver.
+ * At this time it is possible to do so with a single latency requirement
+ * for all supported platforms.
+ *
+ * Since Goldmont is supported, which is affected by X86_BUG_MONITOR,
+ * the ACPI latencies need to be considered while keeping in mind that C2
+ * may be set to map to deeper sleep states. In this case the latency
+ * requirement needs to prevent entering C2 also.
+ */
+static int pseudo_lock_cstates_constrain(struct pseudo_lock_region *plr)
+{
+	struct pseudo_lock_pm_req *pm_req;
+	int cpu;
+	int ret;
+
+	for_each_cpu(cpu, &plr->d->cpu_mask) {
+		pm_req = kzalloc(sizeof(*pm_req), GFP_KERNEL);
+		if (!pm_req) {
+			rdt_last_cmd_puts("fail allocating mem for PM QoS\n");
+			ret = -ENOMEM;
+			goto out_err;
+		}
+		ret = dev_pm_qos_add_request(get_cpu_device(cpu),
+					     &pm_req->req,
+					     DEV_PM_QOS_RESUME_LATENCY,
+					     30);
+		if (ret < 0) {
+			rdt_last_cmd_printf("fail to add latency req cpu%d\n",
+					    cpu);
+			kfree(pm_req);
+			ret = -1;
+			goto out_err;
+		}
+		list_add(&pm_req->list, &plr->pm_reqs);
+	}
+
+	return 0;
+
+out_err:
+	pseudo_lock_cstates_relax(plr);
+	return ret;
+}
+
 /**
  * pseudo_lock_region_init - Initialize pseudo-lock region information
  * @plr: pseudo-lock region
@@ -239,6 +310,7 @@ static int pseudo_lock_init(struct rdtgroup *rdtgrp)
 		return -ENOMEM;
 
 	init_waitqueue_head(&plr->lock_thread_wq);
+	INIT_LIST_HEAD(&plr->pm_reqs);
 	rdtgrp->plr = plr;
 	return 0;
 }
@@ -1132,6 +1204,12 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
 	if (ret < 0)
 		return ret;
 
+	ret = pseudo_lock_cstates_constrain(plr);
+	if (ret < 0) {
+		ret = -EINVAL;
+		goto out_region;
+	}
+
 	plr->thread_done = 0;
 
 	thread = kthread_create_on_node(pseudo_lock_fn, rdtgrp,
@@ -1140,7 +1218,7 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
 	if (IS_ERR(thread)) {
 		ret = PTR_ERR(thread);
 		rdt_last_cmd_printf("locking thread returned error %d\n", ret);
-		goto out_region;
+		goto out_cstates;
 	}
 
 	kthread_bind(thread, plr->cpu);
@@ -1158,7 +1236,7 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
 		 * empty pseudo-locking loop.
 		 */
 		rdt_last_cmd_puts("locking thread interrupted\n");
-		goto out_region;
+		goto out_cstates;
 	}
 
 	if (!IS_ERR_OR_NULL(debugfs_resctrl)) {
@@ -1219,6 +1297,8 @@ out_minor:
 	pseudo_lock_minor_release(new_minor);
 out_debugfs:
 	debugfs_remove_recursive(plr->debugfs_dir);
+out_cstates:
+	pseudo_lock_cstates_relax(plr);
 out_region:
 	pseudo_lock_region_clear(plr);
 out:
@@ -1252,6 +1332,7 @@ void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp)
 		goto free;
 	}
 
+	pseudo_lock_cstates_relax(plr);
 	debugfs_remove_recursive(rdtgrp->plr->debugfs_dir);
 	device_destroy(pseudo_lock_class, MKDEV(pseudo_lock_major, plr->minor));
 	pseudo_lock_minor_release(plr->minor);
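
Illustration (not part of the patch): the changelog relies on a resume
latency request of 30 (microseconds, the unit used by
DEV_PM_QOS_RESUME_LATENCY) being lower than every supported platform's
C6 exit latency. The user-space sketch below models that selection rule
under loud assumptions: the struct, the table and the function name are
hypothetical, the exit latencies are made-up placeholders rather than
values from intel_idle, and "a state is allowed when its exit latency
does not exceed the limit" is a simplification of what a cpuidle
governor does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical idle-state descriptor, loosely modelled on a cpuidle table row. */
struct idle_state {
	const char *name;
	unsigned int exit_latency_us;
};

/*
 * Placeholder table ordered from shallowest to deepest state.
 * The latencies are illustrative only, not taken from any real platform.
 */
static const struct idle_state demo_states[] = {
	{ "C1",  2 },
	{ "C1E", 10 },
	{ "C6",  133 },
};

/*
 * Return the index of the deepest state whose exit latency does not exceed
 * limit_us, or -1 when no state qualifies. Assumes tbl is sorted by depth.
 */
static int deepest_allowed(const struct idle_state *tbl, size_t n,
			   unsigned int limit_us)
{
	int best = -1;
	size_t i;

	assert(tbl != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (tbl[i].exit_latency_us <= limit_us)
			best = (int)i;
	}
	return best;
}
```

With the placeholder table, deepest_allowed(demo_states, 3, 30) picks
index 1 (the "C1E" row), so the "C6" row is never chosen while the
30 microsecond limit is in place; raising the limit above 133 would
allow it again.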