Received: by 2002:a25:683:0:0:0:0:0 with SMTP id 125csp409122ybg; Fri, 12 Jun 2020 04:58:29 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy8nZ6L57KsZy/KswrOOOucTNOEczvoDkw0lnpwPVeOt3AqdgMb19+JMExaY8FTxy90Qv/T X-Received: by 2002:a17:906:4350:: with SMTP id z16mr12575697ejm.139.1591963109049; Fri, 12 Jun 2020 04:58:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1591963109; cv=none; d=google.com; s=arc-20160816; b=wmBkhpMdOKOiqspQQMi+oAaTL7kU1seiKExM21fqehhXoXyvA8VjDmuF9SxjSBst98 N49tq/qmj/yDeFxjB6dYUZW08O70ECNjOCWZgLUEcZyjyTe4OFfbvJZovZrzOSNgLtU5 hiWHuBArEVsCox5ReiHg4ERIwGS9d/zo1ErWWoDba3yi8x7JSIw6wXFRKmPo00wX9AFg 2R52CgRcWfZiw6qvat2IUGNsyL/Y+7U4TXoS//YStvknSaheQU2GzRPK5ceWpJgjJUA5 xZ88xYtR1XWUHA1wlSUyUiMHSJQEakjGbK2J6e1Le0Hs4UGyL5f5Gonl1K13Tjsd/SKU Ug0A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version; bh=bicOXJoaBdeTIG0+nGNRFjA7MnO99BifAFXX4dFEZGQ=; b=hLEGsF0/dmfNblluwJCpUS6/tMCqCudS+vbV62lZM/FRqDplW4VsV6+jcRJ5JgRL/g U0jvvAsC5yc++jlScjBtR2P8L8uurP7y/eIIucX5mMdbyq43+DO5QNeh9bXNPL+fYTaj tC4/Tx5Buv1aGKSeXZhjOEzvJH/55FUYf2Ka/3m4oUx/7nzQqo4Tp/BldE0aQxxYOdB0 fe6vS8FTc7KVXOSMVEyCqhKAKD1spo1ocAxwA9a7VTznfGmMqj+wHtTsTj+/3LUikXi5 aON/YEqAkzmXYUpb6qdmMkoJj/gTzMIcJF0Vvrni1ngmudbxLYQlY6qlv29wxSiGN4aK ZmfA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id e26si3850802ejc.211.2020.06.12.04.58.05; Fri, 12 Jun 2020 04:58:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726269AbgFLLzh (ORCPT + 99 others); Fri, 12 Jun 2020 07:55:37 -0400 Received: from mail-oi1-f181.google.com ([209.85.167.181]:39148 "EHLO mail-oi1-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725791AbgFLLzh (ORCPT ); Fri, 12 Jun 2020 07:55:37 -0400 Received: by mail-oi1-f181.google.com with SMTP id d67so8390979oig.6; Fri, 12 Jun 2020 04:55:36 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=bicOXJoaBdeTIG0+nGNRFjA7MnO99BifAFXX4dFEZGQ=; b=fzNOlKeEaahF/qKcL2cRfkb3SLT6Y2dBTKwUWZl8TP+K45uRchNGbYb+/KtmMRF3mD flalqFEk8nmedSzEQbt3IbMVWRbGAZBF0hWMaehkQ5TUDlqODkjsvr0hfOGPfvnxyN53 OtMMLlrTIaAqg5h+dWJIrWgXJW49WhbjgsqX+ltdb/qobx5hHSAHLJ3MQleHIq6YobuU 0YCZm+O3wsfuNpBVzAARG2jktvg+eEPRi3+b8vEtOX9AhQ5j77tD+rrvLIhlUNaSyMRL WCOoRs1WuSADn9rxOBfECrSdqJ5lRIdZzAtTd0LTw7M2B8H1WDKYz3MoYw5F2NwrCMzx 0uMg== X-Gm-Message-State: AOAM531Io86SY+OwvyPCNaZO+6bFDsVTkzq+c4D2EDf0xAlwjTUv8guu 5jC2pmhgPByJ9MGJY+mu7+0khIHvlX2dcHGvqKc= X-Received: by 2002:aca:ad88:: with SMTP id w130mr1897354oie.103.1591962935897; Fri, 12 Jun 2020 04:55:35 -0700 (PDT) MIME-Version: 1.0 References: <20200603075200.hbyofgcyiwocl565@vireshk-i7> <39d37e1b-7959-9a8f-6876-f2ed4c1dbc37@huawei.com> <20200604044140.xlv7h62jfowo3rxe@vireshk-i7> <20200604125822.GB12397@bogus> <20200610094057.GA28144@arm.com> <9d715cb0-0a94-0008-5653-e8d9ad1c5332@huawei.com> In-Reply-To: <9d715cb0-0a94-0008-5653-e8d9ad1c5332@huawei.com> From: "Rafael J. Wysocki" Date: Fri, 12 Jun 2020 13:55:20 +0200 Message-ID: Subject: Re: [Question]: about 'cpuinfo_cur_freq' shown in sysfs when the CPU is in idle state To: Xiongfeng Wang Cc: Ionela Voinescu , Sudeep Holla , "Rafael J. Wysocki" , Viresh Kumar , "Rafael J. Wysocki" , Hanjun Guo , Linux PM , Linux Kernel Mailing List Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 11, 2020 at 3:52 AM Xiongfeng Wang wrote: > > Hi Ionela, > > Thanks for your reply ! > > On 2020/6/10 17:40, Ionela Voinescu wrote: > > Hi guys, > > > > Sorry for showing up late to the party, I was on holiday last week. > > > > On Thursday 04 Jun 2020 at 13:58:22 (+0100), Sudeep Holla wrote: > >> On Thu, Jun 04, 2020 at 12:42:06PM +0200, Rafael J. Wysocki wrote: > >>> On Thu, Jun 4, 2020 at 6:41 AM Viresh Kumar wrote: > >>>> > >>>> On 04-06-20, 09:32, Xiongfeng Wang wrote: > >>>>> On 2020/6/3 21:39, Rafael J. Wysocki wrote: > >>>>>> The frequency value obtained by kicking the CPU out of idle > >>>>>> artificially is bogus, though. You may as well return a random number > >>>>>> instead. > >>>>> > >>>>> Yes, it may return a randowm number as well. > >>>>> > >>>>>> > >>>>>> The frequency of a CPU in an idle state is in fact unknown in the case > >>>>>> at hand, so returning 0 looks like the cleanest option to me. > >>>>> > >>>>> I am not sure about how the user will use 'cpuinfo_cur_freq' in sysfs. If I > >>>>> return 0 when the CPU is idle, when I run a light load on the CPU, I will get a > >>>>> zero value for 'cpuinfo_cur_freq' when the CPU is idle. When the CPU is not > >>>>> idle, I will get a non-zero value. The user may feel odd about > >>>>> 'cpuinfo_cur_frreq' switching between a zero value and a non-zero value. They > >>>>> may hope it can return the frequency when the CPU execute instructions, namely > >>>>> in C0 state. I am not so sure about the user will look at 'cpuinfo_cur_freq'. > >>>> > >>>> This is what I was worried about as well. The interface to sysfs needs > >>>> to be robust. Returning frequency on some readings and 0 on others > >>>> doesn't look right to me as well. This will break scripts (I am not > >>>> sure if some scripts are there to look for these values) with the > >>>> randomness of values returned by it. > >>> > >>> The only thing the scripts need to do is to skip zeros (or anything > >>> less than the minimum hw frequency for that matter) coming from that > >>> attribute. > >>> > >>>> On reading values locally from the CPU, I thought about the case where > >>>> userspace can prevent a CPU going into idle just by reading its > >>>> frequency from sysfs (and so waste power), but the same can be done by > >>>> userspace to run arbitrary load on the CPUs. > >>>> > >>>> Can we do some sort of caching of the last frequency the CPU was > >>>> running at before going into idle ? Then we can just check if cpu is > >>>> idle and so return cached value. > >>> > >>> That is an option, but it looks like in this case the cpuinfo_cur_freq > >>> attribute should not be present at all, as per the documentation. > >>> > >> > >> +1 for dropping the attribute. > >> > > > > I've been experimenting with some code quite recently that uses the > > scheduler frequency scale factor to compute this hardware current rate > > for CPPC. > > > > On the scheduler tick, the scale factor is computed in > > arch_scale_freq_tick() to give an indication on delivered performance, > > using AMUs on arm64 [1] and APERF/MPERF on x86 [2]. Basically, this scale > > factor has the cached value of the average delivered performance between > > the last two scheduler ticks, on a capacity scale: 0-1024. All that would > > be needed is to convert from the scheduler frequency scale to the CPPC > > expected performance scale. > > > > The gist of the code would be: > > > > delivered_perf = topology_get_freq_scale(cpu); > > delivered_perf *= fb_ctrs.reference_perf; > > delivered_perf = div64_u64(delivered_perf << SCHED_CAPACITY_SHIFT, > > per_cpu(arch_max_freq_scale, cpu)); > > > > While this solution is not perfect, it would provide the best view of > > the hardware "current" rate without the cost of waking up the CPU when > > idle, scheduling additional work on the CPU, doing checks on whether > > the CPU is idle and/or providing other caching mechanisms. > > I think it's a good idea. It's just that the value is a average frequency in the > last two scheduler ticks, not an instantaneous frequency. What > 'cppc_cpufreq_get_rate()' get is also not an instantaneous frequency. It's a > average frequency in 2us. I check the interval between two frequency updates on > my machine. It's about 10ms. So the frequency will change at least one time in > two scheduler ticks if HZ is 1000. > > I am not sure how people will use 'cpuinfo_cur_freq'. They may not expect a very > accurate frequency. How about we return this value when CPU is idle? Or just > return 0 in idle is better ? According to the documentation, if this attribute is present, it should return the exact frequency of the CPU (with a caveat that it may change already when user space is able to consume the value), and which is what the users may expect. As I said before, IMO the CPPC driver should not cause the core to expose cpuinfo_cur_freq at all. Thanks!