Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp254856imm; Mon, 9 Jul 2018 00:49:22 -0700 (PDT) X-Google-Smtp-Source: AAOMgpeKC7ynr0GcUUIthmKkVfu2+qVAkUHM1q7zupqnOMpi+nmo+as7b5gAp+wxy91lDRaHyufY X-Received: by 2002:a62:a119:: with SMTP id b25-v6mr18666104pff.163.1531122562563; Mon, 09 Jul 2018 00:49:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1531122562; cv=none; d=google.com; s=arc-20160816; b=bdAwzDNAlLmiOeKwfLNQ85TP6JBhW5U5x5OepvesOPJI7zYsx+LXV+EUhZwyPYx8S+ 8JEhCbQUqvYrJQ2F/p/iZEVRjNFMRdlMGAuik95VBbSdPwsiJ0jCDLdpWNKCydTY75eb GlirKHU5z5OH2ZLKnL0FMoDt47U1KLfg+xAbuhcWsrDXnkwCveVIb0I02xM+9xHWWc87 orxzamKdtkUwEsD8QeY0YuT9gW9pooqPIvNAziih66uE7Xq5pL9Dfc4/7EKo3vBn0ja1 2emXOh7vt9Vjvqkj8xHi31+98/5Y6dT8AevW/orYLZg3F/J6RSzc6b/dB8GREnHonM7g 1TAg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=T8grvEUDr7RJB6foZbXavBPzIr6qB+PdXUyIWnzc0ZA=; b=yJQYQ5IM6WMaPjCZjwu9+L2R+rUYeOiXR1WCI7VGHpvpenUlzg95VH6iLM0oVER4Sb 2OkInb+cyhXYZhZw4ocKM1h6FxIdGl6GsfoJS0RshJGbGjIKXMt8LkvYK2phvYCcHEwE DbLoL9K+9htJJ4a6RPpZfXAiuj07/1doaXGRBaZv/dkE/0mWrrZBc37WxfT3+v6MDxbl g2knz3PNhZbCSY73ZCnyEBr3kuRqBiUKJZcu1rz0kXlO9wE28lT1CX9ypLX5KVU3q9I3 BUBuKl/RC9z6HdM2mmbFikpGaVpLTa9tyt5mbeG2xefzUMo9v/kBXc3eON55ZCttaoaT i1YA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=fa0MNyIX; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u15-v6si15007541pfa.28.2018.07.09.00.49.08; Mon, 09 Jul 2018 00:49:22 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=fa0MNyIX; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754214AbeGIHsZ (ORCPT + 99 others); Mon, 9 Jul 2018 03:48:25 -0400 Received: from mail-oi0-f65.google.com ([209.85.218.65]:33839 "EHLO mail-oi0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751169AbeGIHsW (ORCPT ); Mon, 9 Jul 2018 03:48:22 -0400 Received: by mail-oi0-f65.google.com with SMTP id 13-v6so34168758ois.1; Mon, 09 Jul 2018 00:48:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=T8grvEUDr7RJB6foZbXavBPzIr6qB+PdXUyIWnzc0ZA=; b=fa0MNyIXZhglp7IUMWr10DuoydIyHEHb87Kz+nu2PQ6w9+eItW0FgeQ1AA9481d+hd ak8RfjG6l+1Cb4V7EzDcT+pVfwaG/gqqm3iitr5ecpPZRJQnlPhTzKYkWjF6Vi5v+iNs 8O+fvrAqzphn05n2R7h+V4ezu5ITcaKqjJAJQ2y9sMVG324p2FbmYSHAc8qgT8nDTnGN JoiKA1L+CRRZsWFuXPFELwZ3RCowQ7s8Z9prAPLaQSxFI/gngQv7kSZRLdXstp+kHvfX 3Wcxd1AUBQBNgk7rJ21Q+LZyuv4J6qN6OpcIRTRt/wZHAHDUyDSYON0B5SjHFpnm0fXl yCcQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=T8grvEUDr7RJB6foZbXavBPzIr6qB+PdXUyIWnzc0ZA=; b=KpkOkqPFqPjOeNzB7pM8EKZThoc/Fzakz2I1yRo9JBQxfdZJLyX3oyMBR8fAOZO3Pl ysN5CkzmLYrk7iyfO52tdQu/4pSQtvQdtilqQjQXNG35Pi7KT3GzBSISvfvkn04j9nQ9 GIiy3YRtHG8fcuBJkgNY5FE2Sv/GAI2nLXq1DBb2iEkNFL25PCnqrIKlOkGsHlVJbPSL NkzW3+NwJptWyEFEorYE7TdI7556/RTgGYzdWUNEhfRUKnuawrEmDHCBoMo5lya56obJ we6NW7yKnVCH7PWIBXItOVcLTGbLyqBUjnjX7OeK9rHXfg3eui76dvMDi/i3WI+yLZ/K H/Ww== X-Gm-Message-State: APt69E1gl7lZPvpCUUexRdXoT65Pp9VjkvgHfp28UdSWEkGEOrS8D2QR HbuFufQqKemYeaWBDPJ/lzMfREn3k4CkZzYBv84= X-Received: by 2002:aca:6910:: with SMTP id e16-v6mr20813353oic.358.1531122501858; Mon, 09 Jul 2018 00:48:21 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a9d:63d2:0:0:0:0:0 with HTTP; Mon, 9 Jul 2018 00:48:21 -0700 (PDT) In-Reply-To: References: <1530600642-25090-1-git-send-email-kernelfans@gmail.com> <4685360.VNmeYLh0dQ@aspire.rjw.lan> <6281446.GoJLz6Hq6C@aspire.rjw.lan> <20180706083603.GA9063@wunner.de> From: "Rafael J. Wysocki" Date: Mon, 9 Jul 2018 09:48:21 +0200 X-Google-Sender-Auth: ma7aVGO_U9RHN1gbLdETK5EKzoM Message-ID: Subject: Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset To: Pingfan Liu Cc: Rafael Wysocki , Lukas Wunner , Linux Kernel Mailing List , Greg Kroah-Hartman , Grygorii Strashko , Christoph Hellwig , Bjorn Helgaas , Dave Young , Linux PCI , linuxppc-dev , Linux PM , Kishon Vijay Abraham I Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jul 9, 2018 at 8:48 AM, Pingfan Liu wrote: > On Sun, Jul 8, 2018 at 4:25 PM Rafael J. Wysocki wrote: >> >> On Sat, Jul 7, 2018 at 6:24 AM, Pingfan Liu wrote: >> > On Fri, Jul 6, 2018 at 9:55 PM Pingfan Liu wrote: >> >> >> >> On Fri, Jul 6, 2018 at 4:47 PM Rafael J. Wysocki wrote: >> >> > >> >> > On Fri, Jul 6, 2018 at 10:36 AM, Lukas Wunner wrote: >> >> > > [cc += Kishon Vijay Abraham] >> >> > > >> >> > > On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote: >> >> > >> OK, so calling devices_kset_move_last() from really_probe() clearly is >> >> > >> a mistake. >> >> > >> >> >> > >> I'm not really sure what the intention of it was as the changelog of >> >> > >> commit 52cdbdd49853d doesn't really explain that (why would it be >> >> > >> insufficient without that change?) >> >> > > >> >> > > It seems 52cdbdd49853d fixed an issue with boards which have an MMC >> >> > > whose reset pin needs to be driven high on shutdown, lest the MMC >> >> > > won't be found on the next boot. >> >> > > >> >> > > The boards' devicetrees use a kludge wherein the reset pin is modelled >> >> > > as a regulator. The regulator is enabled when the MMC probes and >> >> > > disabled on driver unbind and shutdown. As a result, the pin is driven >> >> > > low on shutdown and the MMC is not found on the next boot. >> >> > > >> >> > > To fix this, another kludge was invented wherein the GPIO expander >> >> > > driving the reset pin unconditionally drives all its pins high on >> >> > > shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c >> >> > > (commit adc284755055, "gpio: pcf857x: restore the initial line state >> >> > > of all pcf lines"). >> >> > > >> >> > > For this kludge to work, the GPIO expander's ->shutdown hook needs to >> >> > > be executed after the MMC expander's ->shutdown hook. >> >> > > >> >> > > Commit 52cdbdd49853d achieved that by reordering devices_kset according >> >> > > to the probe order. Apparently the MMC probes after the GPIO expander, >> >> > > possibly because it returns -EPROBE_DEFER if the vmmc regulator isn't >> >> > > available yet, see mmc_regulator_get_supply(). >> >> > > >> >> > > Note, I'm just piecing the information together from git history, >> >> > > I'm not responsible for these kludges. (I'm innocent!) >> >> > >> >> > Sure enough. :-) >> >> > >> >> > In any case, calling devices_kset_move_last() in really_probe() is >> >> > plain broken and if its only purpose was to address a single, arguably >> >> > kludgy, use case, let's just get rid of it in the first place IMO. >> >> > >> >> Yes, if it is only used for a single use case. >> >> >> > Think it again, I saw other potential issue with the current code. >> > device_link_add->device_reorder_to_tail() can break the >> > "supplier<-consumer" order. During moving children after parent's >> > supplier, it ignores the order of child's consumer. >> >> What do you mean? >> > The drivers use device_link_add() to build "supplier<-consumer" order > without knowing each other. Hence there is the following potential > odds: (consumerX, child_a, ...) (consumer_a,..) (supplierX), where > consumer_a consumes child_a. Well, what's the initial state of the list? > When device_link_add()->device_reorder_to_tail() moves all descendant of > consumerX to the tail, it breaks the "supplier<-consumer" order by > "consumer_a <- child_a". That depends on what the initial ordering of the list is and please note that circular dependencies are explicitly assumed to be not present. The assumption is that the initial ordering of the list reflects the correct suspend (or shutdown) order without the new link. Therefore initially all children are located after their parents and all known consumers are located after their suppliers. If a new link is added, the new consumer goes to the end of the list and all of its children and all of its consumers go after it. device_reorder_to_tail() is recursive, so for each of the devices that went to the end of the list, all of its children and all of its consumers go after it and so on. Now, that operation doesn't change the order of any of the parent<-child or supplier<-consumer pairs that get moved and since all of the devices that depend on any device that get moved go to the end of list after it, the only devices that don't go to the end of list are guaranteed to not depend on any of them (they may be parents or suppliers of the devices that go to the end of the list, but not their children or suppliers). > And we need recrusion to resolve the item in > (consumer_a,..), each time when moving a consumer behind its supplier, > we may break "parent<-child". I don't see this as per the above. Say, device_reorder_to_tail() moves a parent after its child. This means that device_reorder_to_tail() was not called for the child after it had been called for the parent, but that is not true, because it is called for all of the children of each device that gets moved *after* moving that device. >> > Beside this, essentially both devices_kset_move_after/_before() and >> > device_pm_move_after/_before() expose the shutdown order to the >> > indirect caller, and we can not expect that the caller can not handle >> > it correctly. It should be a job of drivers core. >> >> Arguably so, but that's how those functions were designed and the >> callers should be aware of the limitation. >> >> If they aren't, there is a bug in the caller. >> > If we consider device_move()-> device_pm_move_after/_before() more > carefully like the above description, then we can hide the detail from > caller. And keep the info of the pm order inside the core. Yes, we can. My point is that we have not been doing that so far and the current callers of those routines are expected to know that. We can do that to make the life of *future* callers easier (and maybe to simplify the current ones), but currently the caller is expected to do the right thing. >> > It is hard to extract high dimension info and pack them into one dimension >> > linked-list. >> >> Well, yes and no. >> > For "hard", I means that we need two interleaved recursion to make the > order correct. Otherwise, I think it is a bug or limitation. So the limitation is that circular dependencies may not exist, because if they did, there would be no suitable suspend/shutdown ordering between devices. >> We know it for a fact that there is a linear ordering that will work. >> It is inefficient to figure it out every time during system suspend >> and resume, for one and that's why we have dpm_list. >> > Yeah, I agree that iterating over device tree may hurt performance. I > guess the iterating will not cost the majority of the suspend time, > comparing to the device_suspend(), which causes hardware's sync. But > data is more persuasive. Besides the performance, do you have other > concern till now? I simply think that there should be one way to iterate over devices for both system-wide PM and shutdown. The reason why it is not like that today is because of the development history, but if it doesn't work and we want to fix it, let's just consolidate all of that. Now, system-wide suspend resume sometimes iterates the list in the reverse order which would be hard without having a list, wouldn't it? >> Now, if we have it for suspend and resume, it can also be used for shutdown. >> > Yes, I do think so. OK Thanks, Rafael