Date: Thu, 13 Jul 2017 07:11:22 +0200 (CEST)
From: Thomas Gleixner
To: "dbasehore ."
cc: LKML, Ingo Molnar, Rajneesh Bhardwaj, x86@kernel.org, Platform Driver, "Rafael J . Wysocki", Len Brown, Linux-pm mailing list, Peter Zijlstra
Subject: Re: [PATCH v5 5/5] intel_idle: Add S0ix validation
References: <20170708000303.21863-1-dbasehore@chromium.org> <20170708000303.21863-5-dbasehore@chromium.org>

On Wed, 12 Jul 2017, dbasehore . wrote:
> On Wed, Jul 12, 2017 at 3:16 PM, Thomas Gleixner wrote:
> > There are more issues with this: If there is a hrtimer scheduled on that
> > last CPU which enters the idle freeze state and that timer is 10 minutes
> > away, then the check timer can't be programmed and the system will happily
> > stay for 10 minutes in some shallow C state without notice. Not really
> > useful.
>
> Are hrtimers not suspended after timekeeping_suspend is called?

They are. As I said, I forgot about the inner workings and that check for
state != shutdown confused me even more, as it just looked like this might
be a valid state.

> > You know upfront whether the i915 power wells (or whatever other machinery)
> > is not powered off to allow the system to enter a specific power state. If
> > you think hard enough about creating infrastructure which allows you to
> > register power related facilities and then check them in that idle freeze
> > enter state, then you get immediate information WHY this happens and not
> > just the by chance notification about the fact that it happened.
>
> It's not always something that can be checked by software. There was
> one case where an ordering for powering down audio hardware prevented
> proper PC10 entry, but there didn't seem to be any way to check that.
> Hardware watchdogs also have the same lack of clarity, but most if not
> all desktop and mobile processors ship with one. Overall, this seems
> to be the best that can be done at this point in freeze, and we can't
> really rely on every part of the system properly validating its state
> in its suspend operation.

So if I understand correctly, this is the last resort of catching problems
which can't be detected upfront or are caused by a software bug.

I'm fine with that, but please explain and document it properly. The current
explanation is confusing at best.

Thanks,

	tglx
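
For illustration only, the "register power related facilities and check
them on freeze entry" idea quoted above could be sketched roughly as below.
This is plain userspace C, not kernel code and not part of the patch series.
Every name in it (power_facility_register, power_facilities_check,
flag_is_off, MAX_FACILITIES) is hypothetical and invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: one entry per facility that can block a deep idle state. */
struct power_facility {
	const char *name;
	bool (*is_off)(void *ctx);	/* true if the facility is powered down */
	void *ctx;
};

#define MAX_FACILITIES 16	/* arbitrary limit chosen for the sketch */

static struct power_facility facilities[MAX_FACILITIES];
static size_t nr_facilities;

/* Register a facility. Returns 0 on success, -1 on bad input or full table. */
int power_facility_register(const char *name, bool (*is_off)(void *), void *ctx)
{
	if (!name || !is_off || nr_facilities >= MAX_FACILITIES)
		return -1;

	facilities[nr_facilities++] = (struct power_facility){ name, is_off, ctx };
	return 0;
}

/*
 * Walk all registered facilities before entering the freeze state and
 * report each one that is still powered. Returns the number of blockers.
 */
size_t power_facilities_check(FILE *out)
{
	size_t blockers = 0;

	for (size_t i = 0; i < nr_facilities; i++) {
		if (!facilities[i].is_off(facilities[i].ctx)) {
			fprintf(out, "freeze blocked: %s still powered\n",
				facilities[i].name);
			blockers++;
		}
	}
	return blockers;
}

/* Hypothetical callback: ctx points at a bool that is true when powered off. */
bool flag_is_off(void *ctx)
{
	return *(bool *)ctx;
}
```

A caller would register one callback per facility and run
power_facilities_check() right before the freeze entry. A non-zero return
names the blockers up front. As noted in the quoted reply, some failures
(the audio power-down ordering case) cannot be seen by software at all, so
a check like this would complement the timer-based validation as a last
resort rather than replace it.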