Date: Thu, 31 Jan 2019 16:06:42 -0800
From: Matthias Kaehlcke
To: "Rafael J. Wysocki"
Cc: "Rafael J. Wysocki", Viresh Kumar, Linux PM, Linux Kernel Mailing List, Douglas Anderson
Subject: Re: [PATCH] cpufreq: Record stats when fast switching is enabled
Message-ID: <20190201000642.GP81583@google.com>
References: <20190131015139.126890-1-mka@chromium.org> <20190131183730.GN81583@google.com> <3268787.3OZuCagV1k@aspire.rjw.lan>
In-Reply-To: <3268787.3OZuCagV1k@aspire.rjw.lan>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Feb 01, 2019 at 12:34:32AM +0100, Rafael J. Wysocki wrote:
> On Thursday, January 31, 2019 7:37:30 PM CET Matthias Kaehlcke wrote:
> > On Thu, Jan 31, 2019 at 11:14:03AM +0100, Rafael J. Wysocki wrote:
> > > On Thu, Jan 31, 2019 at 11:07 AM Viresh Kumar wrote:
> > > >
> > > > On 31-01-19, 11:03, Rafael J. Wysocki wrote:
> > > > > On Thu, Jan 31, 2019 at 9:30 AM Viresh Kumar wrote:
> > > > > >
> > > > > > On 30-01-19, 17:51, Matthias Kaehlcke wrote:
> > > > > > > When fast switching is enabled currently no cpufreq stats are
> > > > > > > recorded and the corresponding sysfs attributes appear empty (see
> > > > > > > also commit 1aefc75b2449 ("cpufreq: stats: Make the stats code
> > > > > > > non-modular")).
> > > > > > >
> > > > > > > Record the stats after a successful fast switch and re-enable access
> > > > > > > through sysfs when fast switching is enabled. Since
> > > > > > > cpufreq_stats_update() can now be called in interrupt context (during
> > > > > > > a fast switch) disable local IRQs while holding the stats spinlock.
> > > > > > >
> > > > > > > Signed-off-by: Matthias Kaehlcke
> > > > > > > ---
> > > > > > > The change is so simple that I wonder if I'm missing some important
> > > > > > > reason why the stats can't/shouldn't be updated during/after a fast
> > > > > > > switch ...
> > > > > > >
> > > > > > > I would expect that holding the stats spinlock briefly in
> > > > > > > cpufreq_stats_update() shouldn't be a problem. In theory it would
> > > > > > > also be an option to have a per stats lock, though it seems overkill
> > > > > > > from my (possibly ignorant) point of view.
> > > > > > > ---
> > > > > > > drivers/cpufreq/cpufreq.c | 8 +++++++-
> > > > > > > drivers/cpufreq/cpufreq_stats.c | 11 +++--------
> > > > > > > 2 files changed, 10 insertions(+), 9 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > > > > > > index e35a886e00bcf..63aadb0bbddfe 100644
> > > > > > > --- a/drivers/cpufreq/cpufreq.c
> > > > > > > +++ b/drivers/cpufreq/cpufreq.c
> > > > > > > @@ -1857,9 +1857,15 @@ EXPORT_SYMBOL(cpufreq_unregister_notifier);
> > > > > > >  unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
> > > > > > >  					unsigned int target_freq)
> > > > > > >  {
> > > > > > > +	unsigned int freq;
> > > > > > > +
> > > > > > >  	target_freq = clamp_val(target_freq, policy->min, policy->max);
> > > > > > >
> > > > > > > -	return cpufreq_driver->fast_switch(policy, target_freq);
> > > > > > > +	freq = cpufreq_driver->fast_switch(policy, target_freq);
> > > > > > > +	if (freq)
> > > > > > > +		cpufreq_stats_record_transition(policy, freq);
> > > > > > > +
> > > > > > > +	return freq;
> > > > > > >  }
> > > > > > >  EXPORT_SYMBOL_GPL(cpufreq_driver_fast_switch);
> > > > > > >
> > > > > > > diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> > > > > > > index 1572129844a5b..21b919bfaeccf 100644
> > > > > > > --- a/drivers/cpufreq/cpufreq_stats.c
> > > > > > > +++ b/drivers/cpufreq/cpufreq_stats.c
> > > > > > > @@ -30,11 +30,12 @@ struct cpufreq_stats {
> > > > > > >  static void cpufreq_stats_update(struct cpufreq_stats *stats)
> > > > > > >  {
> > > > > > >  	unsigned long long cur_time = get_jiffies_64();
> > > > > > > +	unsigned long flags;
> > > > > > >
> > > > > > > -	spin_lock(&cpufreq_stats_lock);
> > > > > > > +	spin_lock_irqsave(&cpufreq_stats_lock, flags);
> > > > > > >  	stats->time_in_state[stats->last_index] += cur_time - stats->last_time;
> > > > > > >  	stats->last_time = cur_time;
> > > > > > > -	spin_unlock(&cpufreq_stats_lock);
> > > > > > > +	spin_unlock_irqrestore(&cpufreq_stats_lock, flags);
> > > > > > >  }
> > > > > >
> > > > > > The only problem that I can think of (or recall) is that this routine
> > > > > > also gets called when time_in_state sysfs file is read and that can
> > > > > > end up taking lock which the scheduler's hotpath will wait for.
> > > > >
> > > > > What about the extra locking overhead in the scheduler context?
> > > >
> > > > What about using READ_ONCE/WRITE_ONCE here ? Not sure if we really
> > > > need locking in this particular case.
> > >
> > > If that works, then fine, but ISTR some synchronization issues related to that.
> >
> > I also think there would be synchronization issues :(
> >
> > Is your main concern with the spin lock the contention case or the
> > general overhead of locking?
>
> The general overhead is bad enough. The contention case would be a
> disaster.
>
> > It would be really nice to have cpufreq stats with schedutil. We
> > initially considered a sysfs attribute to allow to temporarily disable
> > fast switching, but at closer sight this seems messy (would require
> > quite some rework in cpufreq_schedutil.c), besides not recording the
> > actual behavior.
> >
> > If another (rarely and only shortly held) lock in scheduler context
>
> This is a global spinlock and you'd like to take it on every frequency
> change for each policy. On x86, as a rule, there is a policy per logical
> CPU and systems with hundreds of these are not uncommon. Come on.

Thanks for helping me to get a better understanding of the problem. If
the global spinlock was the main issue, this could be fixed by having
a per stats/policy lock, but it seems there's more than that.

> > is a no-go deferred recording could be an option, if that can be
> > implemented without locks in scheduler context.
>
> Why do you need the stats at all in the fast switch case?

For the same reason as in the non-fast switch case, easy access to the
stats with existing tooling (or no tooling at all).

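To make the per stats/policy lock idea mentioned above a bit more
concrete, here is a rough userspace model in C11. It is only a sketch
under assumptions, not kernel code: struct freq_stats, stats_init(),
stats_update() and stats_record_transition() are hypothetical names, the
atomic_flag stands in for a kernel spinlock, and nothing here models IRQ
disabling. It only shows that each stats object can carry its own lock so
that policies do not contend with each other; it does not remove the
general locking overhead that was objected to.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_STATES 8	/* hypothetical upper bound on frequency states */

/* Hypothetical per-policy stats object that embeds its own lock. */
struct freq_stats {
	atomic_flag lock;		/* stand-in for a per-stats spinlock */
	unsigned int last_index;	/* index of the current frequency state */
	uint64_t last_time;		/* time of the last accounting update */
	uint64_t time_in_state[MAX_STATES];
};

/* Busy-wait until this stats object's lock is acquired. */
static inline void stats_lock(struct freq_stats *s)
{
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;
}

static inline void stats_unlock(struct freq_stats *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

void stats_init(struct freq_stats *s, unsigned int index, uint64_t now)
{
	assert(index < MAX_STATES);
	atomic_flag_clear(&s->lock);
	s->last_index = index;
	s->last_time = now;
	for (unsigned int i = 0; i < MAX_STATES; i++)
		s->time_in_state[i] = 0;
}

/* Charge the time elapsed since the last update to the current state. */
void stats_update(struct freq_stats *s, uint64_t now)
{
	stats_lock(s);
	s->time_in_state[s->last_index] += now - s->last_time;
	s->last_time = now;
	stats_unlock(s);
}

/* Charge elapsed time to the old state, then switch to the new one. */
void stats_record_transition(struct freq_stats *s, unsigned int new_index,
			     uint64_t now)
{
	assert(new_index < MAX_STATES);
	stats_lock(s);
	s->time_in_state[s->last_index] += now - s->last_time;
	s->last_time = now;
	s->last_index = new_index;
	stats_unlock(s);
}
```

With one such object per policy, a reader of one policy's stats only
delays updates for that same policy. Every transition still pays for an
atomic read-modify-write on the lock, which is the overhead side of the
objection.
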
> There is the cpu_frequency tracepoint that can be used to collect
> all data that you need. Why can't that be used?

It could be used, but requires non-standard tooling to process the
data and tracing must be enabled.

Could a CONFIG option make sense to enable it (off by default), or is
the overhead (with a per stats lock) so high that it would be
unreasonable to use it (I really don't have a good sense on this)?

Thanks

Matthias
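
For reference, the kind of post-processing that the tracepoint route
would need can be sketched as follows. This is only an illustration
under assumptions: struct freq_event is a hypothetical, already parsed
form of the frequency change events for a single CPU (it is not the
trace output format), the events are assumed to be sorted by time, and
time_in_state_from_events() is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parsed event: the CPU switched to freq_khz at timestamp_us. */
struct freq_event {
	uint64_t timestamp_us;
	unsigned int freq_khz;
};

/* One row of the resulting time-in-state table. */
struct state_time {
	unsigned int freq_khz;
	uint64_t time_us;
};

/*
 * Accumulate the time spent at each frequency from a time-sorted list of
 * events. The last event is charged up to end_us. Returns the number of
 * distinct frequencies written to table, or -1 if table is too small.
 */
int time_in_state_from_events(const struct freq_event *ev, size_t n_ev,
			      uint64_t end_us,
			      struct state_time *table, size_t table_cap)
{
	size_t n_states = 0;

	for (size_t i = 0; i < n_ev; i++) {
		/* The state lasts until the next event, or until end_us. */
		uint64_t until = (i + 1 < n_ev) ? ev[i + 1].timestamp_us : end_us;
		uint64_t duration = until - ev[i].timestamp_us;
		size_t j;

		/* Find the row for this frequency, if one exists already. */
		for (j = 0; j < n_states; j++)
			if (table[j].freq_khz == ev[i].freq_khz)
				break;

		if (j == n_states) {
			if (n_states == table_cap)
				return -1;
			table[j].freq_khz = ev[i].freq_khz;
			table[j].time_us = 0;
			n_states++;
		}
		table[j].time_us += duration;
	}
	return (int)n_states;
}
```

The result corresponds roughly to what time_in_state reports, but only
for the window in which tracing was enabled, and it needs a parser in
front of it to produce the events.
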