Date: Thu, 3 May 2018 14:11:26 -0500
From: Bjorn Helgaas
To: Joseph Salisbury
Cc: "Rafael J. Wysocki", "Rafael J. Wysocki", Len Brown, Bjorn Helgaas,
    ACPI Devel Maling List, Linux PCI, "linux-kernel@vger.kernel.org",
    1745646@bugs.launchpad.net, Mika Westerberg
Subject: Re: [Regression] PCI / PM: Simplify device wakeup settings code
Message-ID: <20180503191126.GA15790@bhelgaas-glaptop.roam.corp.google.com>
References: <56a8953c-d833-837c-57d5-fe758d4db02a@canonical.com>
    <1f67f00a-8141-f9af-2120-c78f7cfecb1d@canonical.com>
    <20180501195501.GB11698@bhelgaas-glaptop.roam.corp.google.com>
    <59bc04f8-e819-46c0-651d-a00eef4d34f8@canonical.com>
In-Reply-To: <59bc04f8-e819-46c0-651d-a00eef4d34f8@canonical.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 03, 2018 at 02:29:02PM -0400, Joseph Salisbury wrote:
> On 05/02/2018 06:41 AM, Rafael J. Wysocki wrote:
> > On Tue, May 1, 2018 at 9:55 PM, Bjorn Helgaas wrote:
> >> On Tue, May 01, 2018 at 10:34:29AM +0200, Rafael J. Wysocki wrote:
> >>> On Mon, Apr 30, 2018 at 4:22 PM, Joseph Salisbury
> >>> wrote:
> >>>> On 04/16/2018 11:58 AM, Rafael J. Wysocki wrote:
> >>>>> On Mon, Apr 16, 2018 at 5:31 PM, Joseph Salisbury
> >>>>> wrote:
> >>>>>> On 04/13/2018 05:34 PM, Rafael J. Wysocki wrote:
> >>>>>>> On Fri, Apr 13, 2018 at 7:56 PM, Joseph Salisbury
> >>>>>>> wrote:
> >>>>>>>> Hi Rafael,
> >>>>>>>>
> >>>>>>>> A kernel bug report was opened against Ubuntu [0]. After a kernel
> >>>>>>>> bisect, it was found that reverting the following two commits resolved
> >>>>>>>> this bug:
> >>>>>>>>
> >>>>>>>> 0ce3fcaff929 ("PCI / PM: Restore PME Enable after config space restoration")
> >>>>>>>> 0847684cfc5f ("PCI / PM: Simplify device wakeup settings code")
> >>>>>>>>
> >>>>>>>> This is a regression introduced in v4.13-rc1 and still exists in
> >>>>>>>> mainline. The bug causes the battery to drain when the system is
> >>>>>>>> powered down and unplugged, which does not happen prior to these two
> >>>>>>>> commits.
> >>>>>>> What system and what do you mean by "powered down"? How much time
> >>>>>>> does it take for the battery to drain now?
> >>>>>> By powered down, the bug reporter is saying physically powered off and
> >>>>>> unplugged. The system is an HP laptop:
> >>>>>>
> >>>>>> dmi.chassis.vendor: HP
> >>>>>> dmi.product.family: 103C_5335KV HP Notebook
> >>>>>> dmi.product.name: HP Notebook
> >>>>>> vendor_id : GenuineIntel
> >>>>>> cpu family : 6
> >>>>>>
> >>>>>>
> >>>>>>>> The bisect actually pointed to commit de3ef1e, but reverting
> >>>>>>>> these two commits fixes the issue.
> >>>>>>>>
> >>>>>>>> I was hoping to get your feedback, since you are the patch author. Do
> >>>>>>>> you think gathering any additional data will help diagnose this issue,
> >>>>>>>> or would it be best to submit a revert request?
> >>>>>>> First, reverting these is not an option or you will break systems
> >>>>>>> relying on them now. 4.13 is three releases back at this point.
> >>>>>>>
> >>>>>>> Second, your issue appears to be related to the suspend/shutdown path
> >>>>>>> whereas commit 0ce3fcaff929 is mostly about resume, so presumably the
> >>>>>>> change in pci_enable_wake() causes the problem to happen. Can you try
> >>>>>>> to revert this one alone and see if that helps?
> >>>>>> A test kernel with commits 0ce3fcaff929 and de3ef1eb1cd0 reverted was
> >>>>>> tested. However, the test kernel still exhibited the bug.
> >>>>> So essentially the bisection result cannot be trusted.
> >>>> We performed some more testing and confirmed just a revert of the
> >>>> following commit resolves the bug:
> >>>>
> >>>> 0847684cfc5f0 ("PCI / PM: Simplify device wakeup settings code")
> >>> Thanks for confirming this!
> >>>
> >>>> Can you think of any suggestions to help debug further?
> >>> The root cause of the regression is likely the change in
> >>> pci_enable_wake() removing the device_may_wakeup() check from it.
> >>>
> >>> Probably, one of the drivers in the platform calls pci_enable_wake()
> >>> directly from its ->shutdown() callback and that causes the device to
> >>> be set up for system wakeup which in turn causes the power draw while
> >>> the system is off to increase.
> >>>
> >>> I would look at the PCI drivers used on that platform to find which of
> >>> them call pci_enable_wake() directly from ->shutdown() and I would
> >>> make these calls conditional on device_may_wakeup().
> >> I took a quick look with
> >>
> >>   git grep -E "pci_enable_wake\(.*[^0]\);|device_may_wakeup"
> >>
> >> and didn't notice any pci_enable_wake() callers that called
> >> device_may_wakeup() first.
> > I've just looked at a bunch of network drivers doing that.
> >
> > It looks like I may need to restore __pci_enable_wake() with an extra
> > "runtime" argument for internal use.
> >
> > Joseph, can you ask the reporter to test Bjorn's patch, please?
>
> The bug reporter has tested Bjorn's patch. It did in fact resolve the
> bug. Thanks for the quick help, Rafael and Bjorn!

Just as a word of caution, I think Rafael said my patch was not the
right fix because it would break something else. So I would wait for
a better patch from Rafael before actually resolving this issue.

Bjorn
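
For readers following the thread, the guard Rafael suggests above (making a
driver's ->shutdown() call to pci_enable_wake() conditional on
device_may_wakeup()) can be sketched roughly as below. This is a userspace
mock written only for illustration. It is not kernel code and it is not the
patch discussed in this thread: struct mock_device and every mock_* and
shutdown_* name are hypothetical stand-ins, and the real pci_enable_wake() and
device_may_wakeup() operate on kernel types that are not modelled here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a device; NOT the kernel's struct device. */
struct mock_device {
    bool may_wakeup; /* what device_may_wakeup() would report (assumption) */
    bool wake_armed; /* whether the device is set up to wake the system */
};

/* Stand-in for device_may_wakeup(): reports the wakeup policy flag. */
static bool mock_device_may_wakeup(const struct mock_device *dev)
{
    return dev->may_wakeup;
}

/*
 * Stand-in for pci_enable_wake() as the thread describes it after
 * 0847684cfc5f: no device_may_wakeup() check inside, so the caller's
 * request is applied as given.
 */
static int mock_pci_enable_wake(struct mock_device *dev, bool enable)
{
    dev->wake_armed = enable;
    return 0;
}

/* Suspected problem pattern: ->shutdown() arms wakeup unconditionally. */
static void shutdown_unguarded(struct mock_device *dev)
{
    mock_pci_enable_wake(dev, true);
}

/* Suggested pattern: arm wakeup only if the device is allowed to wake. */
static void shutdown_guarded(struct mock_device *dev)
{
    mock_pci_enable_wake(dev, mock_device_may_wakeup(dev));
}
```

With may_wakeup cleared, shutdown_unguarded() leaves the mock device armed
while shutdown_guarded() leaves it disarmed, which is the difference in
power-off behaviour the thread is chasing.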