Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752728AbbKJNAl (ORCPT ); Tue, 10 Nov 2015 08:00:41 -0500 Received: from mail-yk0-f170.google.com ([209.85.160.170]:35890 "EHLO mail-yk0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751284AbbKJNAj (ORCPT ); Tue, 10 Nov 2015 08:00:39 -0500 MIME-Version: 1.0 In-Reply-To: References: <87h9kulkfg.wl%kuninori.morimoto.gx@renesas.com> <87fv0elibh.wl%kuninori.morimoto.gx@renesas.com> <87d1vili9c.wl%kuninori.morimoto.gx@renesas.com> Date: Tue, 10 Nov 2015 14:00:38 +0100 Message-ID: Subject: Re: [PATCH 2/2] thermal: rcar_thermal: use pm_runtime_put_sync() From: Ulf Hansson To: Geert Uytterhoeven Cc: Kuninori Morimoto , Zhang Rui , Eduardo Valentin , Linux-SH , Linux-Kernel , Linux PM list , Cao Minh Hiep , =?UTF-8?B?RHVuZ++8muS6uuOCvQ==?= , Alan Stern , "Rafael J. Wysocki" Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4837 Lines: 114 +Rafael, Alan On 10 November 2015 at 11:10, Geert Uytterhoeven wrote: > Hi Ulf, > > On Tue, Nov 10, 2015 at 10:57 AM, Ulf Hansson wrote: >> On 10 November 2015 at 09:18, Geert Uytterhoeven wrote: >>> On Tue, Nov 10, 2015 at 3:12 AM, Kuninori Morimoto >>> wrote: >>>> From: Kuninori Morimoto >>>> >>>> It is using pm_runtime_get_sync() on probe(). Let's use >>>> pm_runtime_put_sync() instead of pm_runtime_put(). Otherwise thermal >>>> sensor doesn't work after unbind/re-bind >>>> >>>> Signed-off-by: Kuninori Morimoto >>>> --- >>>> drivers/thermal/rcar_thermal.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c >>>> index 13d01ed..f7cf2d7 100644 >>>> --- a/drivers/thermal/rcar_thermal.c >>>> +++ b/drivers/thermal/rcar_thermal.c >>>> @@ -373,7 +373,7 @@ static int rcar_thermal_remove(struct platform_device *pdev) >>>> thermal_zone_device_unregister(priv->zone); >>>> } >>>> >>>> - pm_runtime_put(dev); >>>> + pm_runtime_put_sync(dev); >>>> pm_runtime_disable(dev); >> >> For the reasons explained by Geert, this is to me also a "workaround". >> >> I would replace pm_runtime_put() and pm_runtime_disable() with a call >> to pm_runtime_force_suspend(). >> >> In that way, you will make sure you device get runtime suspended >> (clock domain will gate the clock). Additionally, the runtime PM >> status will properly reflect the status of the device. > > That still sounds like a workaround to me, which we have to apply to all > drivers relying on Runtime PM? Definitely not all drivers, but those that runs pm_runtime_get_sync() during ->probe() and expects the ->runtime_resume() callback to always be invoked because of that. I guess we need to check upon which drivers that may suffer from this. I wouldn't be surprised if at least a subset of those cases we find, are poorly designed from PM point of view and won't even probe successfully unless CONFIG_PM is set. Whatever that means... > >>> With a bit more debugging info, this is the difference between the failing >>> and the "fixed" cases: >>> >>> unbind: >>> >>> +rcar_thermal e61f0000.thermal: pm_clk_suspend() >>> +renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal OFF >>> rcar_thermal e61f0000.thermal: removing from PM domain clock-controller >>> pm_genpd_remove_device: Remove e61f0000.thermal from clock-controller >>> -renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal OFF >>> >>> bind: >>> >>> rcar_thermal e61f0000.thermal: adding to PM domain clock-controller >>> __pm_genpd_add_device: Add e61f0000.thermal to clock-controller >>> rcar_thermal e61f0000.thermal: Clock thermal con_id (null) managed by >>> runtime PM. >>> -rcar_thermal e61f0000.thermal: thermal sensor was broken >>> +rcar_thermal e61f0000.thermal: pm_clk_resume() >>> +renesas-cpg-mssr e6150000.clock-controller: MSTP 522/thermal ON >>> rcar_thermal e61f0000.thermal: 1 sensor probed >>> >>> In the failing case, pm_clk_suspend() is not called, and turning off the >>> module clock is thus delayed until removal of the device from the clock >>> domain. >>> But as pm_clk_suspend() wasn't called, the device isn't correctly resumed on >>> rebind, and the module clock is never re-enabled, leading to a failure. >>> >>> Ulf, what do you think? >> >> I totally agree on your analyse. >> >> The problem is that the runtime PM status of the device isn't >> correctly updated at ->remove(). The effect is that the the >> pm_runtime_get_sync() in ->probe() at re-bind will *not* trigger the >> ->runtime_resume() callbacks to be invoked, as the runtime PM core >> believes the device is already runtime resumed. > > So that's where it should be fixed? That would be a more generic approach, although I am not sure how the driver/PM core should be able to take the correct decision in this phase. Devices may be runtime PM managed also without a driver bound. Perhaps when __device_release_driver() finds a bounded driver for the device, it could after all actions been performed to unbind the driver, check if runtime PM is enabled. If it isn't, it could set the runtime PM status to suspended!? I have no idea if that would introduce other issues as it would kind of force the runtime PM status of the device to suspend, without actually knowing if it's the correct thing to do. Kind regards Uffe -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/