Received: by 2002:ac0:8c8e:0:0:0:0:0 with SMTP id r14csp1177789ima; Wed, 6 Feb 2019 15:17:07 -0800 (PST) X-Google-Smtp-Source: AHgI3Ia0cH93OC0NJHlQXlVX8jIcUaadGmLXfDumH+VJ7Uu9wU5pBU+nIHdMK31X7k5zfEkFKiue X-Received: by 2002:a65:6242:: with SMTP id q2mr12017218pgv.245.1549495026924; Wed, 06 Feb 2019 15:17:06 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1549495026; cv=none; d=google.com; s=arc-20160816; b=DZQZlXdYLVM1oAkfI2nbIBecNs0QlkHz2NzIPAX646PRuJdBj7uth7PLJ7410JK4Fx rQ5xgyb+EA3qe42XMQ36ej4CLJaPcff9zhboEU42Fksf5oELdUHrnq3Q0vw7LB3Eb6Db 13j65I+XV7O0W2n0ztS8qy+WWvgL1Oj3YiXGVKONuUrrsGxG08AGGUy2zm1GlTZE8laH wLeFFw5u4rIWrDNPNYEYwG80q6Kx+nvPMYefiRdx3PbbLPazCMkAix1nrCDXUr2mrgTb YCAFbm1qf3uD+I3wbEgVTvuZXB8anDAFOeRl4nhkxwMQA2zeRQbdtZ4QsoEoxoMy3p9T 8KBA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version; bh=niW9xzR8HXazKHjHyB5B+Vq1LdApwoX++73+TW5nM0I=; b=AiGtLqOgZSdnw2jn0gjo9ohe0Gs8xR8s0DC+6HxkWgBs8+kAy5RG+fpnjO3y7IThK2 406BuxDfoLd5OEqL0YPZ8nqHIrr189p+0t3kaIKBhdNhiLXgpq1lQyjonk07PF61mGrE R7AFB46KcyIR7NenpX1+UJhyEkLHLakHWdJTSbnSYAH7wYMOnlcrHpNhZUssdFhEk0iw JxV92ZRx+8+2VTCSLtxk0aqi9jexE76VqFau9MoimGp1R4RNqt73y+j8C5jkq9RC7O1j 6by9Ig6ijdzPuJqTa4uT7mmbN7zd7+1M6aEG46alaNhBqbGUTqcmNwdrtbBg7VYZ5wGq LxWw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c64si7032203pfg.239.2019.02.06.15.16.40; Wed, 06 Feb 2019 15:17:06 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726436AbfBFXQh (ORCPT + 99 others); Wed, 6 Feb 2019 18:16:37 -0500 Received: from mail-ot1-f66.google.com ([209.85.210.66]:43022 "EHLO mail-ot1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726161AbfBFXQg (ORCPT ); Wed, 6 Feb 2019 18:16:36 -0500 Received: by mail-ot1-f66.google.com with SMTP id a11so15064062otr.10; Wed, 06 Feb 2019 15:16:35 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=niW9xzR8HXazKHjHyB5B+Vq1LdApwoX++73+TW5nM0I=; b=LuWtodFvFk79ayCas3yxZeaQmXLf67BZ9WeI/+jOA7cQ9LYVtMczDw4eiPabsW9z60 kMlaYInwD7jlG9p5Gm+IwhdfPkn3YBj8zh80Onxmmbv5k51tdveZlMzHMcJMR71/vunt enZnE+6abaByMqgEvWtAo3r4j/ChEV9I1rpIIXLoz1Eya9hT8jxK/7NXT/rUiKOQJHFi MrU1zIIFDGEJrwzsg2i3RWrBDjuvsrDDfH1Sx31y+IF37kuhTQSQjTxpAJDQPrX5MBqu GgHg7lxz2/7s/DeOdqfWv4Rv0srGvwmz9jJQkkJiDWhGmDYIaPHP8xXQXmBG55lpOh0d RkCA== X-Gm-Message-State: AHQUAub8Lnloz48CIzF+ifPitiGXXbM/r7mrDj7Eoi7+bJhIJtqJtChP XOmm9aBKmalJcdZUuJGL/1/As+b0Pel1IjkXipA= X-Received: by 2002:aca:3d42:: with SMTP id k63mr954685oia.95.1549494995207; Wed, 06 Feb 2019 15:16:35 -0800 (PST) MIME-Version: 1.0 References: <1952449.TVsm6CJCTy@aspire.rjw.lan> <24600564.pMaimVIQbW@aspire.rjw.lan> In-Reply-To: From: "Rafael J. Wysocki" Date: Thu, 7 Feb 2019 00:16:24 +0100 Message-ID: Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag To: Ulf Hansson Cc: "Rafael J. Wysocki" , "Rafael J. Wysocki" , Greg Kroah-Hartman , LKML , Linux PM , Daniel Vetter , Lukas Wunner , Andrzej Hajda , Russell King - ARM Linux , Lucas Stach , Linus Walleij , Thierry Reding , Laurent Pinchart , Marek Szyprowski , Joerg Roedel Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Feb 6, 2019 at 2:02 PM Ulf Hansson wrote: > > On Wed, 6 Feb 2019 at 13:10, Rafael J. Wysocki wrote: > > > > On Wed, Feb 6, 2019 at 12:24 PM Ulf Hansson wrote: > > > > > > On Wed, 6 Feb 2019 at 10:56, Rafael J. Wysocki wrote: > > > > > > > > On Tue, Feb 5, 2019 at 12:27 PM Rafael J. Wysocki wrote: > > > > > > > > > > On Tuesday, February 5, 2019 9:15:49 AM CET Ulf Hansson wrote: > > > > > > On Mon, 4 Feb 2019 at 12:45, Rafael J. Wysocki wrote: > > > > > > > > > > > > > > On Mon, Feb 4, 2019 at 12:40 PM Rafael J. Wysocki wrote: > > > > > > > > > > > > > > > > On Fri, Feb 1, 2019 at 4:18 PM Ulf Hansson wrote: > > > > > > > > [cut] > > > > > > > > > > > > > > > > > > For example, if the consumer device is suspended after the > > > > > > > device_link_add() that incremented the supplier's PM-runtime count and > > > > > > > then resumed again, the rpm_active refcount will be greater than one > > > > > > > because of the last resume and not because of the initial link > > > > > > > creation. In that case, dropping the supplier's PM-runtime count on > > > > > > > link deletion may not work as expected. > > > > > > > > > > > > I see what your are saying and I must admit, by looking at the code, > > > > > > that it has turned into being rather complicated. Assuming of good > > > > > > reasons, of course. > > > > > > > > > > > > Anyway, I will play a little bit more with my tests to see what I can find out. > > > > > > > > > > > > > > > > > > > > > Arguably, device_link_del() could be made automatically drop the > > > > > > > > supplier's PM-runtime count by one if the link's rpm_active refcount > > > > > > > > is not one, but there will be failing scenarios in that case too > > > > > > > > AFAICS. > > > > > > > > > > > > Let's see. > > > > > > > > > > So for the record, below is the (untested) patch I'm thinking about. > > > > > > > > > > Having considered this for some time, I think that it would be better to > > > > > try to drop the supplier's PM-runtime usage counter on link removal even if > > > > > the link doesn't go away then. That would be more consistent at least IMO. > > > > > > > > So I can't convince myself that this is the case. > > > > > > > > Either way, if there are two callers of device_link_add() for one > > > > consumer-supplier pair trying to add a stateless link between them and > > > > one of these callers passes DL_FLAG_RPM_ACTIVE set in the flags to it, > > > > there may be issues regardless of what device_link_del() and > > > > device_link_remove() do. However, if they decrement the link's > > > > rpm_active refcount (and possibly the supplier's PM-runtime usage > > > > counter too), the supplier may be suspended prematurely, whereas in > > > > the other case (no decrementation of rpm_active, which how the code > > > > works after this series) it may just be prevented from suspending. To > > > > me, the former is worse than the latter. > > > > > > Well, I would say it sucks in both cases. :-) > > > > > > > > > > > Moreover, there is a workaround for the latter issue that seems to be > > > > easy enough: it is sufficient to let the consumer runtime suspend > > > > after calling device_link_add() to create the link (with > > > > DL_FLAG_RPM_ACTIVE set) and before trying to remove it. > > > > > > I get your point, but unfortunate I don't think it's that simple. > > > > > > For example, someone (like a child) may prevent runtime suspend for > > > the consumer. Hence, also the supplier is prevented from being runtime > > > suspended. > > > > Well, in that case the supplier should not be suspended until the > > consumer can be suspended too. > > > > IOW, if you call device_link_del() in that case, it would be a bug if > > it allowed the supplier suspend. > > Well, maybe the "child" was a bad example. The point is, the driver > doesn't control the RPM child count and nor the RPM usage count for > its consumer device - solely by itself. > > Someone, like the driver/PM core can at some point decide to increase > the runtime PM usage count for example for the consumer device. For > whatever good reason. If PM-runtime is enabled for the consumer and there is a good enough reason why it cannot suspend, then the supplier cannot suspend too. Again, allowing the supplier to suspend in that situation would be a bug. However, the supplier can be allowed to suspend if PM-runtime is not enabled for the consumer and it is not operational (and the latter needs to be known to whoever wants to allow the supplier to suspend). > > > > > So, if you want to push this responsibility to the driver, then I > > > think we need make __pm_runtime_set_status() to respect device links, > > > similar to how it already deals with child/parents. > > > > > > In that way, the driver could call pm_runtime_set_suspended(), before > > > dropping the device link in ->probe(), which would allow the supplier > > > to also become runtime suspended. > > > > I guess you mean that runtime PM would be disabled for the consumer at > > that point? > > Yes. > > Calling pm_runtime_set_suspended() should be a part of the error path > in the driver, which includes disabling runtime PM as well (if it > enabled it in the first place of course). OK I admit that it would make sense to change __pm_runtime_set_status() to take device links into account. > > > > > I did a quick research of users of device links, unless I am mistaken, > > > this seems like an okay approach. > > > > > > What do you think? > > > > Well, I think I need to know the exact use case you have in mind. :-) > > The use case is simply a generic driver that fails to probe by > returning -EPROBE_DEFER. So it's hypothetical, but I often tests > common sequences by using my RPM testdriver, to make sure it all works > as expected. > The below sequence is common, so then I have added the use of device > links, to see how this plays. And it doesn't. > > ->probe() > ... > > pm_runtime_get_noresume() > pm_runtime_set_active() > pm_runtime_enable() I don't think you can do the above safely without ensuring that the supplier is not suspended upfront. So you need pm_runtim_get_sync(supplier); before the above, and then -> > > ... > > device_link_add(con, supp, DL_FLAG_STATELESS |DL_FLAG_PM_RUNTIME | > DL_FLAG_RPM_ACTIVE); > > we got some errors... > goto err: -> it is not necessary to add the link before the point at which you know that you are not going to return any errors (except when the link creation itself fails). So then you do device_link_add(con, supp, DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE); ... pm_runtime_put_noidle(supplier); and the link prevents the supplier from suspending until the consumer is suspended. > > ... > > err: > pm_runtime_put_noidle() > pm_runtime_disable() > pm_runtime_set_suspended() > > device_link_remove() > > return err; In particular, device links really should not be added and removed before returning -EPROBE_DEFER IMO, because adding them causes dpm_list to be re-ordered and that may be quite heavy if the consumer has children etc.