Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753227AbcDASPL (ORCPT ); Fri, 1 Apr 2016 14:15:11 -0400
Received: from mail-pf0-f182.google.com ([209.85.192.182]:35581 "EHLO
        mail-pf0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751124AbcDASPG (ORCPT );
        Fri, 1 Apr 2016 14:15:06 -0400
Subject: Re: [PATCH v6 7/7][Resend] cpufreq: schedutil: New governor based on
 scheduler utilization data
To: "Rafael J. Wysocki", Peter Zijlstra
References: <7262976.zPkLj56ATU@vostro.rjw.lan>
 <6666532.7ULg06hQ7e@vostro.rjw.lan> <56F5E1F2.5090100@linaro.org>
 <56F97548.1030903@linaro.org>
 <20160331122445.GJ3408@twins.programming.kicks-ass.net>
Cc: "Rafael J. Wysocki", Linux PM list, Juri Lelli,
 ACPI Devel Maling List, Linux Kernel Mailing List,
 Srinivas Pandruvada, Viresh Kumar, Vincent Guittot,
 Michael Turquette, Ingo Molnar
From: Steve Muckle
X-Enigmail-Draft-Status: N1110
Message-ID: <56FEBAA7.3050305@linaro.org>
Date: Fri, 1 Apr 2016 11:15:03 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.7.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2082
Lines: 46

On 03/31/2016 05:32 AM, Rafael J. Wysocki wrote:
> On Thu, Mar 31, 2016 at 2:24 PM, Peter Zijlstra wrote:
>> On Mon, Mar 28, 2016 at 11:17:44AM -0700, Steve Muckle wrote:
>>> The scenario I'm contemplating is that while a CPU-intensive task is
>>> running a thermal interrupt goes off. The driver for this thermal
>>> interrupt responds by capping fmax. If this happens just after the tick,
>>> it seems possible that we could wait a full tick before changing the
>>> frequency. Given a 10ms tick it could be rather annoying for thermal
>>> management algorithms on some platforms (I'm familiar with a few).
>>
>> So I'm blissfully unaware of all the thermal stuffs we have; but it
>> looks like its somehow bolten onto cpufreq without feedback.
>>
>> The thing I worry about is thermal scaling the CPU back past where RT/DL
>> tasks can still complete in time. It should not be able to do that, or
>> rather, missing deadlines because thermal is about as useful as
>> rebooting the device.

I'd agree that impacting RT/DL activity because of throttling may be as
bad as a reset, but that seems worst case. There could be some graceful
shutdown or notification/alarm that can be done. Or a platform can
simply choose to reset. Shouldn't we try to give the system designer the
option of doing something in software (by throttling the CPUs as low as
necessary to continue operation) rather than giving up and relying on a
hardware reset?

> Right.  If thermal throttling kicks in, the game is pretty much over.
>
> That's why ideas float about taking the thermal constraints into
> account upfront, but that's a different discussion entirely.

Current mainstream mobile platforms frequently throttle during normal
operation. I think it's important to have a robust throttling mechanism
at least until the more proactive thermal management scheme is fully
developed and proves to be equally capable (if and when that happens).

>> I guess I'm saying is, the whole cpufreq/thermal 'interface' needs work
>> anyhow.
>
> Yes, it does.

Agreed!

thanks,
Steve