Received: by 2002:a05:6a10:1a4d:0:0:0:0 with SMTP id nk13csp87120pxb; Mon, 7 Feb 2022 07:00:00 -0800 (PST) X-Google-Smtp-Source: ABdhPJw8N8DOR1wMm5lh+GBgtFflzPXb7y/Z/bBvhlDJW+oCG3P9Bv9CQs5kfn8/Zn3ETAM8ZrzY X-Received: by 2002:a17:902:e293:: with SMTP id o19mr2668183plc.5.1644246000038; Mon, 07 Feb 2022 07:00:00 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1644246000; cv=none; d=google.com; s=arc-20160816; b=CO2JAtkAFn3USNkG0NwP59lwL92lyozubLFv8mSrToWpqR2rWUvy1umCNn1gWzFWKr yFuNOLpP5e9n/P4Vl0kQhO8TQfKEAn6He3TLpD9ZbmykeuyEE7HHb/28Np3aOIeFZAvw aes7dOvsJlc5rZRsF+YGNbFlknxO9Fhyt5nL54GQW1Ff4Os+moh//uOuHqDKFT7cnodV AneYPEXV0lpDDE2paB8A80+vue0ucrHeXfwZ0u520zNhAYz30j4lA1n+n+HFeG35329P P2NvJ6badYczKdOI0AGVUZ++Ovuyapc6S5TWOxU7t7DL6DCDMJftBWg0gwvYshhEOPpF 0nHQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=zeU0QmNXuQ4rCeHAx8rSm27z/lk8zIJSmzXfgT3QU2g=; b=iALLSKMrZNH+bppaPZ3WLS7LxaHJTkScAq/GKhQQn1utly2SaCkkzK3x908J6RQ9lu dFQ6Y1kWZD8r1h/m/e+B0vdW9XKT4QK9wuFNhLP9aAgDjP3t1D/wQSyQCezYuwflJeiK KQGuUO5M1dWOcBeYHf0FhtUMvexshWkISipuMKggBhdZcaUsxD72cyZ8GPnRuZX1r50/ xDaqEp3c7PbDM/Gn6EIzddr+3u5HZeqEazVpvgwHHJtBUOseg+aBfqIvWcUINAG73SnE gGh9igMUStJXepXVl/dIrlHx0ZCG/505kY2p+bhTqXV0Hw6Su1CosZqf2lPssnd7j+P2 EXew== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e188si9937548pgc.209.2022.02.07.06.59.48; Mon, 07 Feb 2022 06:59:59 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1353682AbiBCTHe (ORCPT + 99 others); Thu, 3 Feb 2022 14:07:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38508 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234813AbiBCTHe (ORCPT ); Thu, 3 Feb 2022 14:07:34 -0500 Received: from bmailout3.hostsharing.net (bmailout3.hostsharing.net [IPv6:2a01:4f8:150:2161:1:b009:f23e:0]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EB4B4C061714; Thu, 3 Feb 2022 11:07:33 -0800 (PST) Received: from h08.hostsharing.net (h08.hostsharing.net [83.223.95.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "*.hostsharing.net", Issuer "RapidSSL TLS DV RSA Mixed SHA256 2020 CA-1" (verified OK)) by bmailout3.hostsharing.net (Postfix) with ESMTPS id 1F698100D9407; Thu, 3 Feb 2022 20:07:32 +0100 (CET) Received: by h08.hostsharing.net (Postfix, from userid 100393) id F0C032D934; Thu, 3 Feb 2022 20:07:31 +0100 (CET) Date: Thu, 3 Feb 2022 20:07:31 +0100 From: Lukas Wunner To: Bjorn Helgaas , Liguang Zhang Cc: Kuppuswamy Sathyanarayanan , Amey Narkhede , linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] PCI: pciehp: clear cmd_busy bit when Command Completed in polling mode Message-ID: <20220203190731.GA24120@wunner.de> References: <20211111054258.7309-1-zhangliguang@linux.alibaba.com> <20211126173309.GA12255@wunner.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20211126173309.GA12255@wunner.de> User-Agent: Mutt/1.10.1 (2018-07-13) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Bjorn, the below patch is marked "Changes Requested" in patchwork: https://patchwork.kernel.org/project/linux-pci/patch/20211111054258.7309-1-zhangliguang@linux.alibaba.com/ I think that might be erroneous because the patch is correct, I've provided a Reviewed-by and no change requests are recorded in patchwork or the mailing list archive. If you've got a few minutes to spare, could you double-check the state in patchwork and provide Liguang Zhang with the changes you'd want (if any)? Thanks! Lukas On Fri, Nov 26, 2021 at 06:33:09PM +0100, Lukas Wunner wrote: > On Thu, Nov 11, 2021 at 01:42:58PM +0800, Liguang Zhang wrote: > > This patch fixes this problem that on driver probe from system startup, > > pciehp checks the Presence Detect State bit in the Slot Status register > > to bring up an occupied slot or bring down an unoccupied slot. If empty > > slot's power status is on, turn power off. The Hot-Plug interrupt isn't > > requested yet, so avoid triggering a notification by calling > > pcie_disable_notification(). > > > > Both the CCIE and HPIE bits are masked in pcie_disable_notification(), > > when we issue a hotplug command, pcie_wait_cmd() will polling the > > Command Completed bit instead of waiting for an interrupt. But cmd_busy > > bit was not cleared when Command Completed which results in timeouts > > like this in pciehp_power_off_slot() and pcie_init_notification(): > > > > pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x01c0 > > (issued 2264 msec ago) > > pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x05c0 > > (issued 2288 msec ago) > > > > Signed-off-by: Liguang Zhang > > Fixes: a5dd4b4b0570 ("PCI: pciehp: Wait for hotplug command completion where necessary") > Link: https://bugzilla.kernel.org/show_bug.cgi?id=215143 > Reviewed-by: Lukas Wunner > Cc: stable@vger.kernel.org # v4.2+ > > Thanks a lot, that's a really good catch. > > It's a somewhat intricate bug, so I'll try to explain in my own words: > > If notification is disabled (HPIE or CCIE not set in the Slot Status > register), we rely on pcie_poll_cmd() to poll for Command Completed. > But once it's signaled, we neglect to clear ctrl->cmd_busy. > (Normally it is cleared by the hardirq handler pciehp_isr() if > notification is enabled.) > > The result is that starting with the second Slot Control write, > pciehp will gratuitously wait for a command to finish which has > already finished and it will incorrectly report a timeout. > > The bug was originally introduced in 2015 by commit a5dd4b4b0570 > ("PCI: pciehp: Wait for hotplug command completion where necessary"), > but didn't manifest itself because the first Slot Control Write already > enabled notification and from that point on the hardirq handler would > clear ctrl->cmd_busy. However I think the bug may have manifested > itself with pciehp_poll_mode=1. > > It wasn't until commit 4e6a13356f1c ("PCI: pciehp: Deduplicate presence > check on probe & resume") that multiple consecutive Slot Control writes > were performed on ->probe with notification disabled, so that's the > commit which first exposed the bug with pciehp_poll_mode=0. > > Thanks, > > Lukas > > > --- > > drivers/pci/hotplug/pciehp_hpc.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c > > index 83a0fa119cae..8698aefc6041 100644 > > --- a/drivers/pci/hotplug/pciehp_hpc.c > > +++ b/drivers/pci/hotplug/pciehp_hpc.c > > @@ -98,6 +98,8 @@ static int pcie_poll_cmd(struct controller *ctrl, int timeout) > > if (slot_status & PCI_EXP_SLTSTA_CC) { > > pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, > > PCI_EXP_SLTSTA_CC); > > + ctrl->cmd_busy = 0; > > + smp_mb(); > > return 1; > > } > > msleep(10); > > -- > > 2.19.1.6.gb485710b