From: "Rafael J. Wysocki"
To: Matthias Kaehlcke
Cc: "Rafael J. Wysocki", Viresh Kumar, Linux PM, Linux Kernel Mailing List, Douglas Anderson
Subject: Re: [PATCH] cpufreq: Record stats when fast switching is enabled
Date: Fri, 01 Feb 2019 00:34:32 +0100
Message-ID: <3268787.3OZuCagV1k@aspire.rjw.lan>
In-Reply-To: <20190131183730.GN81583@google.com>
References: <20190131015139.126890-1-mka@chromium.org> <20190131183730.GN81583@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thursday, January 31, 2019 7:37:30 PM CET Matthias Kaehlcke wrote:
> On Thu, Jan 31, 2019 at 11:14:03AM +0100, Rafael J. Wysocki wrote:
> > On Thu, Jan 31, 2019 at 11:07 AM Viresh Kumar wrote:
> > >
> > > On 31-01-19, 11:03, Rafael J.
> > > Wysocki wrote:
> > > > On Thu, Jan 31, 2019 at 9:30 AM Viresh Kumar wrote:
> > > > >
> > > > > On 30-01-19, 17:51, Matthias Kaehlcke wrote:
> > > > > > When fast switching is enabled currently no cpufreq stats are
> > > > > > recorded and the corresponding sysfs attributes appear empty (see
> > > > > > also commit 1aefc75b2449 ("cpufreq: stats: Make the stats code
> > > > > > non-modular")).
> > > > > >
> > > > > > Record the stats after a successful fast switch and re-enable access
> > > > > > through sysfs when fast switching is enabled. Since
> > > > > > cpufreq_stats_update() can now be called in interrupt context (during
> > > > > > a fast switch) disable local IRQs while holding the stats spinlock.
> > > > > >
> > > > > > Signed-off-by: Matthias Kaehlcke
> > > > > > ---
> > > > > > The change is so simple that I wonder if I'm missing some important
> > > > > > reason why the stats can't/shouldn't be updated during/after a fast
> > > > > > switch ...
> > > > > >
> > > > > > I would expect that holding the stats spinlock briefly in
> > > > > > cpufreq_stats_update() shouldn't be a problem. In theory it would
> > > > > > also be an option to have a per stats lock, though it seems overkill
> > > > > > from my (possibly ignorant) point of view.
> > > > > > ---
> > > > > >  drivers/cpufreq/cpufreq.c       |  8 +++++++-
> > > > > >  drivers/cpufreq/cpufreq_stats.c | 11 +++--------
> > > > > >  2 files changed, 10 insertions(+), 9 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > > > > > index e35a886e00bcf..63aadb0bbddfe 100644
> > > > > > --- a/drivers/cpufreq/cpufreq.c
> > > > > > +++ b/drivers/cpufreq/cpufreq.c
> > > > > > @@ -1857,9 +1857,15 @@ EXPORT_SYMBOL(cpufreq_unregister_notifier);
> > > > > >  unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
> > > > > >  					unsigned int target_freq)
> > > > > >  {
> > > > > > +	unsigned int freq;
> > > > > > +
> > > > > >  	target_freq = clamp_val(target_freq, policy->min, policy->max);
> > > > > >
> > > > > > -	return cpufreq_driver->fast_switch(policy, target_freq);
> > > > > > +	freq = cpufreq_driver->fast_switch(policy, target_freq);
> > > > > > +	if (freq)
> > > > > > +		cpufreq_stats_record_transition(policy, freq);
> > > > > > +
> > > > > > +	return freq;
> > > > > >  }
> > > > > >  EXPORT_SYMBOL_GPL(cpufreq_driver_fast_switch);
> > > > > >
> > > > > > diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> > > > > > index 1572129844a5b..21b919bfaeccf 100644
> > > > > > --- a/drivers/cpufreq/cpufreq_stats.c
> > > > > > +++ b/drivers/cpufreq/cpufreq_stats.c
> > > > > > @@ -30,11 +30,12 @@ struct cpufreq_stats {
> > > > > >  static void cpufreq_stats_update(struct cpufreq_stats *stats)
> > > > > >  {
> > > > > >  	unsigned long long cur_time = get_jiffies_64();
> > > > > > +	unsigned long flags;
> > > > > >
> > > > > > -	spin_lock(&cpufreq_stats_lock);
> > > > > > +	spin_lock_irqsave(&cpufreq_stats_lock, flags);
> > > > > >  	stats->time_in_state[stats->last_index] += cur_time - stats->last_time;
> > > > > >  	stats->last_time = cur_time;
> > > > > > -	spin_unlock(&cpufreq_stats_lock);
> > > > > > +	spin_unlock_irqrestore(&cpufreq_stats_lock, flags);
> > > > > >  }
> > > > > >
> > > > >
> > > > > The only problem that I can think of (or recall) is that this routine
> > > > > also gets called when time_in_state sysfs file is read and that can
> > > > > end up taking lock which the scheduler's hotpath will wait for.
> > > >
> > > > What about the extra locking overhead in the scheduler context?
> > >
> > > What about using READ_ONCE/WRITE_ONCE here ? Not sure if we really
> > > need locking in this particular case.
> >
> > If that works, then fine, but ISTR some synchronization issues related to that.
>
> I also think there would be synchronization issues :(
>
> Is your main concern with the spin lock the contention case or the
> general overhead of locking?

The general overhead is bad enough.  The contention case would be a disaster.

> It would be really nice to have cpufreq stats with schedutil. We
> initially considered a sysfs attribute to allow to temporarily disable
> fast switching, but at closer sight this seems messy (would require
> quite some rework in cpufreq_schedutil.c), besides not recording the
> actual behavior.
>
> If another (rarely and only shortly held) lock in scheduler context

This is a global spinlock and you'd like to take it on every frequency change
for each policy.  On x86, as a rule, there is a policy per logical CPU and
systems with hundreds of these are not uncommon.  Come on.

> is a no-go deferred recording could be an option, if that can be
> implemented without locks in scheduler context.

Why do you need the stats at all in the fast switch case?

There is the cpu_frequency tracepoint that can be used to collect all data
that you need.  Why can't that be used?
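
For illustration only, deriving time-in-state data from the cpu_frequency
tracepoint in user space could look roughly like the sketch below.  It assumes
the ftrace text format, in which each such event reads
"<timestamp in seconds>: cpu_frequency: state=<kHz> cpu_id=<N>".  The names
struct cpu_stats, stats_account() and trace_feed() are hypothetical and exist
only in this example, and the MAX_* limits are arbitrary.

```c
#include <stdio.h>
#include <string.h>

#define MAX_CPUS	8	/* arbitrary limits for this sketch */
#define MAX_STATES	16

/* Hypothetical per-CPU accumulator, loosely mirroring time_in_state. */
struct cpu_stats {
	unsigned long freq[MAX_STATES];		/* frequencies seen, in kHz */
	double time_in_state[MAX_STATES];	/* seconds spent at freq[i] */
	unsigned int nr_states;
	unsigned int transitions;
	unsigned int last_index;
	double last_time;
	int seen;				/* set once the first event arrived */
};

/* Charge the time since the previous event to the frequency being left. */
static int stats_account(struct cpu_stats *s, double now, unsigned long freq)
{
	unsigned int i;

	/* Find the slot for the new frequency, adding one if needed. */
	for (i = 0; i < s->nr_states; i++)
		if (s->freq[i] == freq)
			break;
	if (i == s->nr_states) {
		if (s->nr_states == MAX_STATES)
			return -1;
		s->freq[s->nr_states++] = freq;
	}

	if (s->seen) {
		s->time_in_state[s->last_index] += now - s->last_time;
		if (s->last_index != i)
			s->transitions++;
	}
	s->last_index = i;
	s->last_time = now;
	s->seen = 1;
	return 0;
}

/*
 * Feed one line of ftrace text output.  Returns 0 if a cpu_frequency event
 * was accounted, 1 if the line is some other event, -1 if it is malformed.
 */
static int trace_feed(struct cpu_stats *stats, const char *line)
{
	static const char marker[] = ": cpu_frequency: ";
	const char *p = strstr(line, marker);
	const char *ts;
	unsigned long freq;
	unsigned int cpu;
	double now;

	if (!p)
		return 1;

	/* The timestamp is the space-delimited token right before the marker. */
	for (ts = p; ts > line && ts[-1] != ' '; ts--)
		;
	if (sscanf(ts, "%lf", &now) != 1)
		return -1;
	if (sscanf(p + sizeof(marker) - 1, "state=%lu cpu_id=%u", &freq, &cpu) != 2)
		return -1;
	if (cpu >= MAX_CPUS)
		return -1;

	return stats_account(&stats[cpu], now, freq);
}
```

Feeding trace_feed() the trace output line by line, with the power:cpu_frequency
event enabled, would yield per-CPU residency and transition counts without
adding any lock to the fast switch path.  The time spent in the last state up
to the end of the trace is not accounted in this sketch.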