From: "Rafael J. Wysocki"
Date: Thu, 21 Oct 2021 19:05:28 +0200
Subject: Re: [PATCH 2/2] PM: sleep: Fix runtime PM based cpuidle support
To: Ulf Hansson
Cc: Daniel Lezcano, Linux PM, Maulik Shah, Peter Zijlstra,
    Vincent Guittot, Len Brown, Bjorn Andersson, Linux ARM,
    Linux Kernel Mailing List
References: <20210929144451.113334-1-ulf.hansson@linaro.org>
    <20210929144451.113334-3-ulf.hansson@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 21, 2021 at 6:41 PM Rafael J. Wysocki wrote:
>
> On Thu, Oct 21, 2021 at 6:29 PM Ulf Hansson wrote:
> >
> > On Thu, 21 Oct 2021 at 17:46, Rafael J. Wysocki wrote:
> > >
> > > On Thu, Oct 21, 2021 at 5:09 PM Rafael J. Wysocki wrote:
> > > >
> > > > On Thu, Oct 21, 2021 at 4:05 PM Ulf Hansson wrote:
> > > > >
> > > > > On Thu, 21 Oct 2021 at 15:45, Rafael J. Wysocki wrote:
> > > > > >
> > > > > > On Thu, Oct 21, 2021 at 1:49 PM Ulf Hansson wrote:
> > > > > > >
> > > > > > > On Wed, 20 Oct 2021 at 20:18, Rafael J. Wysocki wrote:
> > > > > > > >
> > > > > > > > On Wed, Sep 29, 2021 at 4:44 PM Ulf Hansson wrote:
> > > > > > > > >
> > > > > > > > > In the cpuidle-psci case, runtime PM in combination with the generic PM
> > > > > > > > > domain (genpd), may be used when entering/exiting an idlestate. More
> > > > > > > > > precisely, genpd relies on runtime PM to be enabled for the attached device
> > > > > > > > > (in this case it belongs to a CPU), to properly manage the reference
> > > > > > > > > counting of its PM domain.
> > > > > > > > >
> > > > > > > > > This works fine most of the time, but during system suspend in the
> > > > > > > > > dpm_suspend_late() phase, the PM core disables runtime PM for all devices.
> > > > > > > > > Beyond this point and until runtime PM becomes re-enabled in the
> > > > > > > > > dpm_resume_early() phase, calls to pm_runtime_get|put*() will fail.
> > > > > > > > >
> > > > > > > > > To make sure the reference counting in genpd becomes correct, we need to
> > > > > > > > > prevent cpuidle-psci from using runtime PM when it has been disabled for
> > > > > > > > > the device. Therefore, let's move the call to cpuidle_pause() from
> > > > > > > > > dpm_suspend_noirq() to dpm_suspend_late() - and cpuidle_resume() from
> > > > > > > > > dpm_resume_noirq() into dpm_resume_early().
> > > > > > > > >
> > > > > > > > > Diagnosed-by: Maulik Shah
> > > > > > > > > Suggested-by: Maulik Shah
> > > > > > > > > Signed-off-by: Ulf Hansson
> > > > > > > > > ---
> > > > > > > > >  drivers/base/power/main.c | 6 ++----
> > > > > > > > >  1 file changed, 2 insertions(+), 4 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > > > > > > > > index cbea78e79f3d..1c753b651272 100644
> > > > > > > > > --- a/drivers/base/power/main.c
> > > > > > > > > +++ b/drivers/base/power/main.c
> > > > > > > > > @@ -747,8 +747,6 @@ void dpm_resume_noirq(pm_message_t state)
> > > > > > > > >
> > > > > > > > >         resume_device_irqs();
> > > > > > > > >         device_wakeup_disarm_wake_irqs();
> > > > > > > > > -
> > > > > > > > > -       cpuidle_resume();
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > >  /**
> > > > > > > > > @@ -870,6 +868,7 @@ void dpm_resume_early(pm_message_t state)
> > > > > > > > >         }
> > > > > > > > >         mutex_unlock(&dpm_list_mtx);
> > > > > > > > >         async_synchronize_full();
> > > > > > > > > +       cpuidle_resume();
> > > > > > > > >         dpm_show_time(starttime, state, 0, "early");
> > > > > > > > >         trace_suspend_resume(TPS("dpm_resume_early"), state.event, false);
> > > > > > > > >  }
> > > > > > > > > @@ -1336,8 +1335,6 @@ int dpm_suspend_noirq(pm_message_t state)
> > > > > > > > >  {
> > > > > > > > >         int ret;
> > > > > > > > >
> > > > > > > > > -       cpuidle_pause();
> > > > > > > > > -
> > > > > > > > >         device_wakeup_arm_wake_irqs();
> > > > > > > > >         suspend_device_irqs();
> > > > > > > > >
> > > > > > > > > @@ -1467,6 +1464,7 @@ int dpm_suspend_late(pm_message_t state)
> > > > > > > > >         int error = 0;
> > > > > > > > >
> > > > > > > > >         trace_suspend_resume(TPS("dpm_suspend_late"), state.event, true);
> > > > > > > > > +       cpuidle_pause();
> > > > > > > > >         mutex_lock(&dpm_list_mtx);
> > > > > > > > >         pm_transition = state;
> > > > > > > > >         async_error = 0;
> > > > > > > > > --
> > > > > > > >
> > > > > > > > Well, this is somewhat heavy-handed and it affects even the systems
> > > > > > > > that don't really need to pause cpuidle at all in the suspend path.
> > > > > > >
> > > > > > > Yes, I agree.
> > > > > > >
> > > > > > > Although, I am not really changing the behaviour in regards to this.
> > > > > > > cpuidle_pause() is already being called in dpm_suspend_noirq(), for
> > > > > > > everybody today.
> > > > > >
> > > > > > Yes, it is, but pausing it earlier will cause more energy to be spent,
> > > > > > potentially.
> > > > > >
> > > > > > That said, there are not too many users of suspend_late callbacks in
> > > > > > the tree, so it may not matter too much.
> > > > > >
> > > > > > > >
> > > > > > > > Also, IIUC you don't need to pause cpuidle completely, but make it
> > > > > > > > temporarily avoid idle states potentially affected by this issue. An
> > > > > > > > additional CPUIDLE_STATE_DISABLED_ flag could be used for that I
> > > > > > > > suppose and it could be set via cpuidle_suspend() called from the core
> > > > > > > > next to cpufreq_suspend().
> > > > > > >
> > > > > > > cpuidle_suspend() would then need to go and fetch the cpuidle driver
> > > > > > > instance, which in some cases is one driver per CPU. Doesn't that get
> > > > > > > rather messy?
> > > > > >
> > > > > > Per-CPU variables are used for that, so it is quite straightforward.
> > > > > >
> > > > > > > Additionally, since find_deepest_state() is being called for
> > > > > > > cpuidle_enter_s2idle() too, we would need to treat the new
> > > > > > > CPUIDLE_STATE_DISABLED_ flag in a special way, right?
> > > > > >
> > > > > > No, it already checks "disabled".
> > > > >
> > > > > Yes, but that would be wrong.
> > > >
> > > > Hmmm.
> > > >
> > > > > The use case I want to support, for cpuidle-psci, is to allow all idle
> > > > > states in suspend-to-idle,
> > > >
> > > > So does PM-runtime work in suspend-to-idle? How?
> > > >
> > > > > but prevent those that rely on runtime PM
> > > > > (after it has been disabled) for the regular idle path.
> > > >
> > > > Do you have a special suspend-to-idle handling of those states that
> > > > doesn't require PM-runtime?
> > >
> > > Regardless, pausing cpuidle in the suspend-to-idle path simply doesn't
> > > make sense at all, so this needs to be taken care of in the first
> > > place.
> >
> > Right, I do agree, don't get me wrong. But, do we really want to treat
> > s2-to-idle differently, compared to s2-to-ram in regards to this?
> >
> > Wouldn't it be a lot easier to let cpuidle drivers to opt-out for
> > cpuidle_pause|resume(), no matter whether it's for s2-to-idle or
> > s2-to-ram?
>
> I don't think so.
>
> Suspend-to-idle resume cpuidle after pausing it which is just plain
> confusing and waste of energy and the fact that the system-wide
> suspend flow interferes with using PM-runtime for implementing cpuidle
> callbacks at the low level really is an orthogonal problem.
>
> > >
> > > The problem with PM-runtime being unavailable after dpm_suspend()
> > > needs to be addressed in a different way IMO, because it only affects
> > > one specific use case.
> >
> > It's one specific case so far, but we have the riscv driver on its
> > way, which would suffer from the same problem.
>
> So perhaps they should be advised about this issue.
>
> > Anyway, an option is to figure out what platforms and cpuidle drivers,
> > that really needs cpuidle_pause|resume() at this point and make an
> > opt-in solution instead.
>
> None of them need to pause cpuidle for suspend-to-idle AFAICS.
>
> Some may want it in the non-s2idle suspend path, but I'm not sure
> about the exact point where cpuidle needs to be paused in this case.
> Possibly before offlining the nonboot CPUs.
>
> > This could then be used by runtime PM based
> > cpuidle drivers as well. Would that be a way forward?
>
> The PM-runtime case should be addressed directly IMO, we only need to
> figure out how to do that.
>
> I'm wondering how you are dealing with the case when user space
> prevents pd_dev from suspending via sysfs, for that matter.

Or what happens if rpm_suspend() returns -EAGAIN, because someone has
started to resume the device right after its reference counter went
down to 0.

It looks to me like the problem is there regardless of the whole
interference with system suspend.
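
To make the cases above concrete, here is a minimal userspace sketch.
It is a toy model, not kernel code: struct toy_dev, toy_put_suspend(),
toy_get_resume(), toy_idle_enter() and the resume_racing flag are all
hypothetical names made up for illustration, and the return values are
only loosely modeled on how the runtime PM core is described in this
thread (-EACCES while runtime PM is disabled, -EAGAIN when a resume
races with the suspend, and a plain 0 without suspending when another
reference remains). Actual behavior may differ between kernel versions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy userspace model; none of these names are kernel APIs. */
struct toy_dev {
	int disable_depth;	/* > 0 models runtime PM being disabled */
	int usage;		/* models the runtime PM usage counter */
	bool resume_racing;	/* models a resume starting right after usage hit 0 */
	bool suspended;		/* what the PM domain would observe */
};

/* Models a "drop reference and suspend" request made on idle entry. */
static int toy_put_suspend(struct toy_dev *d)
{
	if (d->usage > 0)
		d->usage--;
	if (d->usage > 0)
		return 0;	/* another reference remains: no error, no suspend */
	if (d->disable_depth > 0)
		return -EACCES;	/* runtime PM disabled, e.g. in the late suspend phase */
	if (d->resume_racing)
		return -EAGAIN;	/* someone started resuming the device */
	d->suspended = true;
	return 0;
}

/* Models a "take reference and resume" request made on idle exit. */
static int toy_get_resume(struct toy_dev *d)
{
	d->usage++;
	if (d->disable_depth > 0)
		return -EACCES;
	d->suspended = false;
	return 0;
}

/*
 * Idle entry in the toy model: report that the domain may power off
 * only if the put succeeded AND the device really ended up suspended.
 */
static bool toy_idle_enter(struct toy_dev *d)
{
	return toy_put_suspend(d) == 0 && d->suspended;
}

/* Walks through the four cases discussed in the thread. */
static void toy_self_check(void)
{
	struct toy_dev normal = { .usage = 1 };
	struct toy_dev late = { .disable_depth = 1, .usage = 1 };
	struct toy_dev pinned = { .usage = 2 };	/* e.g. held active by user space */
	struct toy_dev racing = { .usage = 1, .resume_racing = true };

	assert(toy_idle_enter(&normal) && normal.suspended);
	assert(toy_get_resume(&normal) == 0 && !normal.suspended);

	assert(!toy_idle_enter(&late) && !late.suspended);
	assert(!toy_idle_enter(&pinned) && !pinned.suspended);
	assert(!toy_idle_enter(&racing) && !racing.suspended);
}
```

Only the first case leaves the device suspended. In the other three the
idle path cannot assume that the domain reference was dropped, and two
of them have nothing to do with system suspend. The "pinned" case does
not even return an error, which is why toy_idle_enter() checks the
resulting state and not just the return value.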