Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp295160pxj; Thu, 17 Jun 2021 02:58:08 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwiR1s7+Z3MKothv9oldPqNl2UTDHwdQ9kU8z3xtrPHG28j9HDKHuuID/BIXYazo24KNFGi X-Received: by 2002:a02:c906:: with SMTP id t6mr3857865jao.117.1623923888670; Thu, 17 Jun 2021 02:58:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1623923888; cv=none; d=google.com; s=arc-20160816; b=ZrvAR6XpACz5/MYRvPDVLUT5qBcCGV1ueTusp/lRNk7i2j/VZZDBHNlX70/C4juq6L TxF/bMh1lgFgB0Bl8TJkP6VuP5Sc4JeUrFyHQCkO99ac6+nNkjTt3pz8Bdx3VWfN65fq jAfiLcrKts1izQQr/1e3EoUe2cJqgE1HUpkeOZQcw0FZmrMRtPi18VA6S2sq1fuDIbdl /RskF9rYpvjm/2Wy/ZSPAfa9SzOllUr96kRdspj+Zv59AW6Uqtu/4HXyfqHDryw7dbR2 ow91k349W64DxCw702EDH0J+vruUhC7nYtCXiUcn/H3/samNUQ9Tz5k8RP7AyMaeaaCO OQig== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=wXHJ7ss3OtBtqBAeFZEzYKlhCeBHxtiwTpmWUd/NWDM=; b=AidLmdYwDd2o7zkBz/Of1T+5rQ5CkIcZ/8LpE6VVAKpeiX5cZclXjvHf7HogXtKKG/ PqgRlsjUj5C1CjezSee4iQrkQGE4k9Oulx6GaJGhpVlg0p+AJSh4Im/qxAB58+MSO7ZT QMtG2bapQjYOfuq8jqThjgGUzJkt5Hjc3a7I+ymA/a2k6HD7mHXXblAzlRxb7z0hJkjf 2zbtMnlB/7DaTGYMutgnAz+0k6yEOigq/Ps17SRll9aWK+lDp+fZ8x9z+4pv2S5JTBqW S2+VdZlIlGQbirZg/mgMlkM2v6EiTsT9RJ/lXbqd2kqPkCQYTm7i0WVjkLEyDVicu2kG pycg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Z4Rj3kpk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id t7si5597609jai.102.2021.06.17.02.57.57; Thu, 17 Jun 2021 02:58:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Z4Rj3kpk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231361AbhFQJJx (ORCPT + 99 others); Thu, 17 Jun 2021 05:09:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48014 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230473AbhFQJJw (ORCPT ); Thu, 17 Jun 2021 05:09:52 -0400 Received: from mail-vk1-xa33.google.com (mail-vk1-xa33.google.com [IPv6:2607:f8b0:4864:20::a33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 80A91C06175F for ; Thu, 17 Jun 2021 02:07:43 -0700 (PDT) Received: by mail-vk1-xa33.google.com with SMTP id k71so1219749vka.11 for ; Thu, 17 Jun 2021 02:07:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=wXHJ7ss3OtBtqBAeFZEzYKlhCeBHxtiwTpmWUd/NWDM=; b=Z4Rj3kpkc5jZPGLcD2bVuGjbTkUQHbRePGY25Bfpmh+ykfEXHyhUouBKBNIaffz0Ql ycQXFOy/i36aor27toKh+Qw13Q8tmHElBuSC06bznCg+E1AnbimyDtH3uNbQc23bLO4M br6RuaCIaKLXQtEVqZV3sjPCTeE1yPFz50G70O/9+ejnWPbjpFxluoOyNTmhnh7Np1em 5Zr1D5bSE6mkbGVEXWZQmGpdXx0t6dWhIwOevL9yzJuy0EDAx3fEnew97M805rOKnXZO CbzXFLQ6QDjOmkviHWRnZZVe+HkByejaZVFcy+dnrpy2KIlvcQOV7qkAs6o+mEN/yYqS c03g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=wXHJ7ss3OtBtqBAeFZEzYKlhCeBHxtiwTpmWUd/NWDM=; b=AIP9HP6JwAUyUjMNfKBFnJQNO45erII6rZV8Mv+yJ0/4o4DV5F4k0sJ1d8j5ZbJVwn VB366qBe02kPtMJAc4HHZCqMKOXRqPsv6enjDYg11UojeU5ms7DZIqgdNq6gK9xvk1+J RrOlkEJkZ4/6f8CGosw4fH6DlumvpvkGBm9MLwVIMNCireueKVj1Fny9jmsPzvgfkSAO doEGEJhS77w2Hncj8OvyAegYO6Joq4p653+oBDgAkR2/kb8P8QpRbrLA/SasOZh+f8wB 8rl7H9xi91fRVqUxtAybXoa/TiVq/OoFGp0XzeuHJhAlABuLdYLtzBu9Ak/8Nn23UOBx cgtw== X-Gm-Message-State: AOAM531TNXO4n5FKFIjxDYR78MHOOAqJP+3sP1B5qg9bgwm+djzcqqyB JEDNBhZik1VOQsOOaA3hu9CoK7naRYgiwb8NcJszSA== X-Received: by 2002:a1f:9542:: with SMTP id x63mr2943949vkd.15.1623920862482; Thu, 17 Jun 2021 02:07:42 -0700 (PDT) MIME-Version: 1.0 References: <20210611101540.3379937-1-dmitry.baryshkov@linaro.org> <20210611101540.3379937-3-dmitry.baryshkov@linaro.org> In-Reply-To: From: Ulf Hansson Date: Thu, 17 Jun 2021 11:07:06 +0200 Message-ID: Subject: Re: [PATCH 2/2] PM: domain: use per-genpd lockdep class To: Bjorn Andersson Cc: Dmitry Baryshkov , Mark Brown , Stephen Boyd , "Rafael J. Wysocki" , Kevin Hilman , Greg Kroah-Hartman , Linux PM , Linux Kernel Mailing List , Rajendra Nayak Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org + Rajendra On Tue, 15 Jun 2021 at 17:55, Bjorn Andersson wrote: > > On Tue 15 Jun 05:17 CDT 2021, Ulf Hansson wrote: > > > + Mark > > > > On Fri, 11 Jun 2021 at 16:34, Dmitry Baryshkov > > wrote: > > > > > > Added Stephen to Cc list > > > > > > On Fri, 11 Jun 2021 at 16:50, Ulf Hansson wrote: > > > > > > > > On Fri, 11 Jun 2021 at 12:15, Dmitry Baryshkov > > > > wrote: > > > > > > > > > > In case of nested genpds it is easy to get the following warning from > > > > > lockdep, because all genpd's mutexes share same locking class. Use the > > > > > per-genpd locking class to stop lockdep from warning about possible > > > > > deadlocks. It is not possible to directly use genpd nested locking, as > > > > > it is not the genpd code calling genpd. There are interim calls to > > > > > regulator core. > > > > > > > > > > [ 3.030219] ============================================ > > > > > [ 3.030220] WARNING: possible recursive locking detected > > > > > [ 3.030221] 5.13.0-rc3-00054-gf8f0a2f2b643-dirty #2480 Not tainted > > > > > [ 3.030222] -------------------------------------------- > > > > > [ 3.030223] kworker/u16:0/7 is trying to acquire lock: > > > > > [ 3.030224] ffffde0eabd29aa0 (&genpd->mlock){+.+.}-{3:3}, at: genpd_lock_mtx+0x18/0x2c > > > > > [ 3.030236] > > > > > [ 3.030236] but task is already holding lock: > > > > > [ 3.030236] ffffde0eabcfd6d0 (&genpd->mlock){+.+.}-{3:3}, at: genpd_lock_mtx+0x18/0x2c > > > > > [ 3.030240] > > > > > [ 3.030240] other info that might help us debug this: > > > > > [ 3.030240] Possible unsafe locking scenario: > > > > > [ 3.030240] > > > > > [ 3.030241] CPU0 > > > > > [ 3.030241] ---- > > > > > [ 3.030242] lock(&genpd->mlock); > > > > > [ 3.030243] lock(&genpd->mlock); > > > > > [ 3.030244] > > > > > [ 3.030244] *** DEADLOCK *** > > > > > [ 3.030244] > > > > > [ 3.030244] May be due to missing lock nesting notation > > > > > [ 3.030244] > > > > > [ 3.030245] 6 locks held by kworker/u16:0/7: > > > > > [ 3.030246] #0: ffff6cca00010938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1f0/0x730 > > > > > [ 3.030252] #1: ffff8000100c3db0 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x1f0/0x730 > > > > > [ 3.030255] #2: ffff6cca00ce3188 (&dev->mutex){....}-{3:3}, at: __device_attach+0x3c/0x184 > > > > > [ 3.030260] #3: ffffde0eabcfd6d0 (&genpd->mlock){+.+.}-{3:3}, at: genpd_lock_mtx+0x18/0x2c > > > > > [ 3.030264] #4: ffff8000100c3968 (regulator_ww_class_acquire){+.+.}-{0:0}, at: regulator_lock_dependent+0x6c/0x1b0 > > > > > [ 3.030270] #5: ffff6cca00a59158 (regulator_ww_class_mutex){+.+.}-{3:3}, at: regulator_lock_recursive+0x94/0x1d0 > > > > > [ 3.030273] > > > > > [ 3.030273] stack backtrace: > > > > > [ 3.030275] CPU: 6 PID: 7 Comm: kworker/u16:0 Not tainted 5.13.0-rc3-00054-gf8f0a2f2b643-dirty #2480 > > > > > [ 3.030276] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT) > > > > > [ 3.030278] Workqueue: events_unbound deferred_probe_work_func > > > > > [ 3.030280] Call trace: > > > > > [ 3.030281] dump_backtrace+0x0/0x1a0 > > > > > [ 3.030284] show_stack+0x18/0x24 > > > > > [ 3.030286] dump_stack+0x108/0x188 > > > > > [ 3.030289] __lock_acquire+0xa20/0x1e0c > > > > > [ 3.030292] lock_acquire.part.0+0xc8/0x320 > > > > > [ 3.030294] lock_acquire+0x68/0x84 > > > > > [ 3.030296] __mutex_lock+0xa0/0x4f0 > > > > > [ 3.030299] mutex_lock_nested+0x40/0x50 > > > > > [ 3.030301] genpd_lock_mtx+0x18/0x2c > > > > > [ 3.030303] dev_pm_genpd_set_performance_state+0x94/0x1a0 > > > > > [ 3.030305] reg_domain_enable+0x28/0x4c > > > > > [ 3.030308] _regulator_do_enable+0x420/0x6b0 > > > > > [ 3.030310] _regulator_enable+0x178/0x1f0 > > > > > [ 3.030312] regulator_enable+0x3c/0x80 > > > > > > > > At a closer look, I am pretty sure that it's the wrong code design > > > > that triggers this problem, rather than that we have a real problem in > > > > genpd. To put it simply, the code in genpd isn't designed to work like > > > > this. We will end up in circular looking paths, leading to deadlocks, > > > > sooner or later if we allow the above code path. > > > > > > > > To fix it, the regulator here needs to be converted to a proper PM > > > > domain. This PM domain should be assigned as the parent to the one > > > > that is requested to be powered on. > > > > > > This more or less resembles original design, replaced per review > > > request to use separate regulator > > > (https://lore.kernel.org/linux-arm-msm/160269659638.884498.4031967462806977493@swboyd.mtv.corp.google.com/, > > > https://lore.kernel.org/linux-arm-msm/20201023131925.334864-1-dmitry.baryshkov@linaro.org/). > > > > Thanks for the pointers. In hindsight, it looks like the > > "regulator-fixed-domain" DT binding wasn't the right thing. > > > > Fortunately, it looks like the problem can be quite easily fixed, by > > moving to a correct model of the domain hierarchy. > > > > Can you give some pointers to how we actually fix this? > > The problem that lead us down this path is that drivers/clk/qcom/gdsc.c > describes power domains, which are parented by domains provided by > drivers/soc/qcom/rpmhpd.c. > > But I am unable to find a way for the gdsc driver to get hold of the > struct generic_pm_domain of the resources exposed by the rpmhpd driver. You don't need a handle to the struct generic_pm_domain, to assign a parent/child domain. Instead you can use of_genpd_add_subdomain(), which takes two "struct of_phandle_args*" corresponding to the parent/child device nodes of the genpd providers and then let genpd internally do the look up. As an example, you may have a look at how the PM domain topology in drivers/cpuidle/cpuidle-psci-domain.c are being created. > > > The second thing that Dmitry's regulator driver does is to cast the > appropriate performance state vote on the rpmhpd resource, but I _think_ > we can do that using OPP tables in the gdsc client's node... Yes, it looks like using an OPP table and to specify a "required-opps", at some device node is the right thing to do. In this case, I wonder if the "required-opps" belongs to the genpd provider node of the new power-domain (as it seems like it only supports one fixed performance state when it's powered on). On the other hand, specifying this at the consumer node should work as well, I think. Actually, this relates to the changes [1] that Rajendra is working on with "assigned-performance-state" (that we agreed to move to OPP/required-opps) for genpd. > > > Beyond this, perhaps we should consider removing the > > "regulator-fixed-domain" DT property, as to avoid similar problems > > from cropping up? > > > > Currently there's a single user upstream, but we have the exact same > problem in at least 3-4 platforms with patches in the pipeline. > > In order to avoid breakage with existing DT I would prefer to see > regulator-fixed-domain to live for at least one kernel release beyond > the introduction of the other model. Yes, this seems reasonable to me. As Mark suggested, let's mark the regulator-fixed-domain DT property as deprecated and remove it once we have the new solution in place. [...] Kind regards Uffe