Date: Thu, 15 Apr 2021 16:27:59 +0100
From: Vincent Donnefort
To: Quentin Perret
Cc: peterz@infradead.org, rjw@rjwysocki.net, viresh.kumar@linaro.org,
    vincent.guittot@linaro.org, linux-kernel@vger.kernel.org,
    ionela.voinescu@arm.com, lukasz.luba@arm.com, dietmar.eggemann@arm.com
Subject: Re: [PATCH] PM / EM: Inefficient OPPs detection
Message-ID: <20210415152758.GD391924@e120877-lin.cambridge.arm.com>
References: <1617901829-381963-1-git-send-email-vincent.donnefort@arm.com>
    <1617901829-381963-2-git-send-email-vincent.donnefort@arm.com>
    <20210415141207.GA391924@e120877-lin.cambridge.arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 15, 2021 at 03:04:34PM +0000, Quentin Perret wrote:
> On Thursday 15 Apr 2021 at 15:12:08 (+0100), Vincent Donnefort wrote:
> > On Thu, Apr 15, 2021 at 01:12:05PM +0000, Quentin Perret wrote:
> > > Hi Vincent,
> > >
> > > On Thursday 08 Apr 2021 at 18:10:29 (+0100), Vincent Donnefort wrote:
> > > > Some SoCs, such as the sd855 have OPPs within the same performance domain,
> > > > whose cost is higher than others with a higher frequency. Even though
> > > > those OPPs are interesting from a cooling perspective, it makes no sense
> > > > to use them when the device can run at full capacity. Those OPPs handicap
> > > > the performance domain, when choosing the most energy-efficient CPU and
> > > > are wasting energy. They are inefficient.
> > > >
> > > > Hence, add support for such OPPs to the Energy Model, which creates for
> > > > each OPP a performance state. The Energy Model can now be read using the
> > > > regular table, which contains all performance states available, or using
> > > > an efficient table, where inefficient performance states (and by
> > > > extension, inefficient OPPs) have been removed.
> > > >
> > > > Currently, the efficient table is used in two paths. Schedutil, and
> > > > find_energy_efficient_cpu(). We have to modify both paths in the same
> > > > patch so they stay synchronized. The thermal framework still relies on
> > > > the original table and hence, DevFreq devices won't create the efficient
> > > > table.
> > > >
> > > > As used in the hot-path, the efficient table is a lookup table, generated
> > > > dynamically when the perf domain is created. The complexity of searching
> > > > a performance state is hence changed from O(n) to O(1). This also
> > > > speeds-up em_cpu_energy() even if no inefficient OPPs have been found.
> > >
> > > Interesting. Do you have measurements showing the benefits on wake-up
> > > duration?
> > > I remember doing so by hacking the wake-up path to force tasks
> > > into feec()/compute_energy() even when overutilized, and then running
> > > hackbench. Maybe something like that would work for you?
> >
> > I'll give a try and see if I get improved numbers.
> >
> > >
> > > Just want to make sure we actually need all that complexity -- while
> > > it's good to reduce the asymptotic complexity, we're looking at a rather
> > > small problem (max 30 OPPs or so I expect?), so other effects may be
> > > dominating. Simply skipping inefficient OPPs could be implemented in a
> > > much simpler way I think.
> >
> > I could indeed just skip the perf state if marked as ineffective. But the idea
> > was to avoid bringing another for loop in this hot-path.
>
> Right, though it would just extend a little bit the existing loop, so
> the overhead is unlikely to be noticeable.

In the case where we keep the cpufreq_table resolution, it's a whole new loop
that we would bring. In the case where we rely only on the EM resolution and
bypass the cpufreq_table, though, it would be even. But with the look-up table,
we're winning everywhere :-)

Anyway, I'll see if I can measure any improvement here.

-- 
Vincent

>
> > Also, not covered by this patch but probably we could get rid of the EM
> > complexity limit as the table resolution is way faster with this change.
>
> Probably yeah. I was considering removing it since eb92692b2544
> ("sched/fair: Speed-up energy-aware wake-ups") but ended up keeping it
> as it's entirely untested on large systems. But maybe we can reconsider.
>
> Thanks,
> Quentin
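
For illustration only, the two options discussed in this thread (skipping
inefficient states inside the existing loop versus a pre-built look-up table)
can be sketched in user-space C as below. This is not the code from the patch.
Every name here (struct perf_state, struct eff_lut, mark_inefficient(),
find_ps_linear(), build_lut(), find_ps_lut(), LUT_STEP_KHZ, LUT_SIZE) is
hypothetical, and the sketch assumes a table sorted by ascending frequency
whose frequencies are multiples of LUT_STEP_KHZ.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical, simplified performance state (not the kernel's struct). */
struct perf_state {
	unsigned long frequency; /* kHz */
	unsigned long cost;      /* energy cost, arbitrary units */
	int inefficient;         /* 1 if a higher-frequency state is cheaper */
};

/* Assumed LUT geometry: one slot per 100 MHz step, 32 slots. */
#define LUT_STEP_KHZ 100000UL
#define LUT_SIZE 32

struct eff_lut {
	const struct perf_state *slot[LUT_SIZE];
};

/*
 * Flag every state whose cost is higher than the cost of some state with a
 * higher frequency. The table must be sorted by ascending frequency.
 */
void mark_inefficient(struct perf_state *table, size_t n)
{
	unsigned long min_cost = ULONG_MAX;

	for (size_t i = n; i-- > 0; ) {
		table[i].inefficient = table[i].cost > min_cost;
		if (table[i].cost < min_cost)
			min_cost = table[i].cost;
	}
}

/* Option 1: O(n) search that simply skips inefficient states. */
const struct perf_state *find_ps_linear(const struct perf_state *table,
					size_t n, unsigned long freq)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		if (table[i].inefficient)
			continue;
		if (table[i].frequency >= freq)
			return &table[i];
	}
	/* The highest state is never flagged, so it is a safe fallback. */
	return &table[n - 1];
}

/* Option 2: build the table once, when the states are registered. */
void build_lut(struct eff_lut *lut, const struct perf_state *table, size_t n)
{
	size_t i = 0;

	assert(n > 0);
	for (size_t idx = 0; idx < LUT_SIZE; idx++) {
		unsigned long f = idx * LUT_STEP_KHZ;

		/* Advance to the first efficient state at or above f. */
		while (i < n - 1 &&
		       (table[i].inefficient || table[i].frequency < f))
			i++;
		lut->slot[idx] = &table[i];
	}
}

/* O(1) lookup: round the request up to the next step and index the table. */
const struct perf_state *find_ps_lut(const struct eff_lut *lut,
				     unsigned long freq)
{
	size_t idx = (freq + LUT_STEP_KHZ - 1) / LUT_STEP_KHZ;

	if (idx >= LUT_SIZE)
		idx = LUT_SIZE - 1;
	return lut->slot[idx];
}
```

Under those assumptions both lookups return the same state for a given
request. The look-up table trades a small amount of memory and a one-time
build cost for a search that no longer depends on the number of states, which
is the trade-off being weighed above against a table of roughly 30 entries.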