Date: Wed, 10 Jun 2020 10:40:57 +0100
From: Ionela Voinescu
To: Sudeep Holla
Cc: "Rafael J. Wysocki", Viresh Kumar, Xiongfeng Wang, "Rafael J. Wysocki",
 Hanjun Guo, Linux PM, Linux Kernel Mailing List
Subject: Re: [Question]: about 'cpuinfo_cur_freq' shown in sysfs when the CPU is in idle state
Message-ID: <20200610094057.GA28144@arm.com>
References: <20200603075200.hbyofgcyiwocl565@vireshk-i7>
 <39d37e1b-7959-9a8f-6876-f2ed4c1dbc37@huawei.com>
 <20200604044140.xlv7h62jfowo3rxe@vireshk-i7>
 <20200604125822.GB12397@bogus>
In-Reply-To: <20200604125822.GB12397@bogus>

Hi guys,

Sorry for showing up late to the party, I was on holiday last week.

On Thursday 04 Jun 2020 at 13:58:22 (+0100), Sudeep Holla wrote:
> On Thu, Jun 04, 2020 at 12:42:06PM +0200, Rafael J. Wysocki wrote:
> > On Thu, Jun 4, 2020 at 6:41 AM Viresh Kumar wrote:
> > >
> > > On 04-06-20, 09:32, Xiongfeng Wang wrote:
> > > > On 2020/6/3 21:39, Rafael J. Wysocki wrote:
> > > > > The frequency value obtained by kicking the CPU out of idle
> > > > > artificially is bogus, though. You may as well return a random
> > > > > number instead.
> > > >
> > > > Yes, it may as well return a random number.
> > > >
> > > > > The frequency of a CPU in an idle state is in fact unknown in the
> > > > > case at hand, so returning 0 looks like the cleanest option to me.
> > > >
> > > > I am not sure how users will use 'cpuinfo_cur_freq' in sysfs. If I
> > > > return 0 when the CPU is idle, then while running a light load on
> > > > the CPU I will get a zero value for 'cpuinfo_cur_freq' whenever the
> > > > CPU is idle, and a non-zero value when it is not. Users may find it
> > > > odd that 'cpuinfo_cur_freq' switches between zero and non-zero
> > > > values. They may expect it to return the frequency at which the CPU
> > > > executes instructions, namely in the C0 state.
> > > > I am not so sure that users will look at 'cpuinfo_cur_freq' at all.
> > >
> > > This is what I was worried about as well. The sysfs interface needs
> > > to be robust. Returning a frequency on some readings and 0 on others
> > > doesn't look right to me either. It may break scripts (though I am
> > > not sure whether any scripts actually look at these values) because
> > > of the apparent randomness of the values returned.
> >
> > The only thing the scripts need to do is to skip zeros (or anything
> > less than the minimum hardware frequency, for that matter) coming
> > from that attribute.
> >
> > > On reading values locally from the CPU: I thought about the case
> > > where userspace could prevent a CPU from going idle just by reading
> > > its frequency from sysfs (and so waste power), but userspace can
> > > already do that by running arbitrary load on the CPUs.
> > >
> > > Can we do some sort of caching of the last frequency the CPU was
> > > running at before going idle? Then we could just check whether the
> > > CPU is idle and return the cached value.
> >
> > That is an option, but it looks like in this case the cpuinfo_cur_freq
> > attribute should not be present at all, as per the documentation.
>
> +1 for dropping the attribute.

I've been experimenting with some code quite recently that uses the
scheduler frequency scale factor to compute this hardware current rate
for CPPC.

On the scheduler tick, the scale factor is computed in
arch_scale_freq_tick() to give an indication of delivered performance,
using AMUs on arm64 [1] and APERF/MPERF on x86 [2]. Basically, this
scale factor caches the average delivered performance between the last
two scheduler ticks, on a capacity scale of 0-1024. All that would be
needed is to convert from the scheduler frequency scale to the CPPC
expected performance scale.
The gist of the code would be:

	delivered_perf = topology_get_freq_scale(cpu);
	delivered_perf *= fb_ctrs.reference_perf;
	delivered_perf = div64_u64(delivered_perf << SCHED_CAPACITY_SHIFT,
				   per_cpu(arch_max_freq_scale, cpu));

While this solution is not perfect, it would provide the best view of
the hardware "current" rate without the cost of waking up the CPU when
idle, scheduling additional work on the CPU, checking whether the CPU
is idle, and/or providing other caching mechanisms.

Do you think such an implementation could make cpuinfo_cur_freq worth
keeping? I'm happy to push the patches for this and discuss the details
there.

Thanks,
Ionela.

[1] https://lkml.org/lkml/2020/3/5/183
[2] https://lkml.org/lkml/2020/1/22/1039

> --
> Regards,
> Sudeep