Date: Tue, 17 Jul 2018 20:32:11 +0200
From: Lukas Wunner
To: Lyude Paul
Cc: nouveau@lists.freedesktop.org, David Airlie, linux-kernel@vger.kernel.org,
    dri-devel@lists.freedesktop.org, Ben Skeggs, linux-pm@vger.kernel.org
Subject: Re: [Nouveau] [PATCH 1/5] drm/nouveau: Prevent RPM callback recursion in suspend/resume paths
Message-ID: <20180717183211.GB18363@wunner.de>
References: <20180716235936.11268-1-lyude@redhat.com> <20180716235936.11268-2-lyude@redhat.com> <20180717071641.GA5411@wunner.de> <20180717182041.GA18363@wunner.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jul 17, 2018 at 02:24:31PM -0400, Lyude Paul wrote:
> On Tue, 2018-07-17 at 20:20 +0200, Lukas Wunner wrote:
> > Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
> > wants it in resumed state, so is waiting forever for the device to
> > runtime suspend in order to resume it again immediately afterwards.
> >
> > The deadlock in the stack trace you've posted could be resolved using
> > the technique I used in d61a5c106351 by adding the following to
> > include/linux/pm_runtime.h:
> >
> > static inline bool pm_runtime_status_suspending(struct device *dev)
> > {
> > 	return dev->power.runtime_status == RPM_SUSPENDING;
> > }
> >
> > static inline bool is_pm_work(struct device *dev)
> > {
> > 	struct work_struct *work = current_work();
> >
> > 	return work && work->func == dev->power.work;
> > }
> >
> > Then adding this to nvkm_i2c_aux_acquire():
> >
> > 	struct device *dev = pad->i2c->subdev.device->dev;
> >
> > 	if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
> > 		ret = pm_runtime_get_sync(dev);
> > 		if (ret < 0 && ret != -EACCES)
> > 			return ret;
> > 	}
> >
> > But here's the catch: This only works for an *async* runtime suspend.
> > It doesn't work for pm_runtime_put_sync(), pm_runtime_suspend() etc,
> > because then the runtime suspend is executed in the context of the caller,
> > not in the context of dev->power.work.
> >
> > So it's not a full solution, but hopefully something that gets you
> > going. I'm not really familiar with the code paths leading to
> > nvkm_i2c_aux_acquire() to come up with a full solution off the top
> > of my head I'm afraid.
>
> OK-I was considering doing something similar to that commit beforehand but I
> wasn't sure if I was going to just be hacking around an actual issue. That
> doesn't seem to be the case.
> This is very helpful and hopefully I should be able
> to figure something out from this, thanks!

In some cases, the function acquiring the runtime PM ref is only called
from a couple of places and then it would be feasible and appropriate
to add a bool parameter to the function telling it to acquire the ref
or not. So the function is told using a parameter which context it's
running in: In the runtime_suspend code path or some other code path.

The technique to use current_work() is an alternative approach to
figure out the context if passing in an additional parameter is not
feasible for some reason. That was the case with d61a5c106351.
That approach only works for work items though.

Lukas
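
The two ways of telling the function its context, as described above, can
be compared in a small userspace sketch. This is not kernel code: every
type and function name below (fake_device, fake_pm_get_sync,
acquire_with_flag, acquire_autodetect, current_work_item) is a
hypothetical stand-in, and the model simply treats any "get" during
RPM_SUSPENDING as the deadlock case. The context check is written as a
comparison of work item addresses, which is an assumption about how such
a helper would test for the PM work item.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; all names are hypothetical. */
enum rpm_status { RPM_ACTIVE, RPM_SUSPENDING, RPM_SUSPENDED };

struct work_item { int unused; };

struct fake_device {
	enum rpm_status status;
	int usage_count;          /* runtime PM references held */
	struct work_item pm_work; /* models dev->power.work */
};

/* Models current_work(): the work item being executed, or NULL. */
static struct work_item *current_work_item;

/*
 * Models pm_runtime_get_sync(). Returns -1 where, in this simplified
 * model, the real call would wait forever because the device is in
 * the middle of suspending.
 */
static int fake_pm_get_sync(struct fake_device *dev)
{
	if (dev->status == RPM_SUSPENDING)
		return -1;
	dev->usage_count++;
	dev->status = RPM_ACTIVE;
	return 0;
}

/* Approach 1: the caller states its context with a bool parameter. */
static int acquire_with_flag(struct fake_device *dev, bool take_ref)
{
	assert(dev != NULL);
	if (take_ref)
		return fake_pm_get_sync(dev);
	return 0; /* runtime_suspend path: do not touch the ref */
}

/* Approach 2: infer the context from the currently running work item. */
static bool in_pm_work(const struct fake_device *dev)
{
	return current_work_item == &dev->pm_work;
}

static int acquire_autodetect(struct fake_device *dev)
{
	assert(dev != NULL);
	if (in_pm_work(dev) && dev->status == RPM_SUSPENDING)
		return 0; /* async suspend in progress: skip the ref */
	return fake_pm_get_sync(dev);
}
```

With the flag, the suspend path passes false and never takes the ref,
whichever context it runs in. The autodetecting variant only recognises
the suspend path when current_work_item points at the device's own work
item, so in this model a synchronous suspend (current_work_item == NULL)
still ends up in the deadlock case.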