Received: by 10.192.165.148 with SMTP id m20csp361996imm; Wed, 2 May 2018 01:22:54 -0700 (PDT) X-Google-Smtp-Source: AB8JxZrV04FwRnC5imLKJ1kFgBU9Q4NDodzTyGbkqxAic/ThsEjFD/Dep++3GjB3udqr4bUbjGaW X-Received: by 2002:a63:65c4:: with SMTP id z187-v6mr7795060pgb.180.1525249374212; Wed, 02 May 2018 01:22:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1525249374; cv=none; d=google.com; s=arc-20160816; b=yiLlENLteiYjcIrFcq+ppyJizc9hoLxiGvdlLtddGVQGijNVD2xFSqlT75Kw0qLST1 WfQzZm6b+FJixa5mt6wG/Yytn/Oakh/tvEGQRF6MyZiOqnKQ3RcECfeBOVou0WOPepNa VKVrTI7jZnR2OcxLrn7qv4RADqvrJfaNXRpulnHpos8AjdO95Wz1Q0B6xSxQM92kJxnb NmqGpZASyGs642OE83NVPS3OIDOmWX3gzOZjBl241PJNFbFZAtDcPFJ5bK/zabvWfMZg RF4VuIEiSHQ0GXje14H7XFO8kE4+TyJ4nLYX0B9aCk8kQ2JtKVHq8V8HM7dU7Z1AOaD3 4qgA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=alZX/Mw58eCK+oWpp4hBhvv9Pzzkjb8qNPf/HeAj2Qw=; b=xE3/VpcTTkWlEnpsCiN2PJxjPqG1/T6KRHhjz16K7LdM/Tq9kAmD3TrACeVKQdFe5b d0tSl3GbBhTUqyiJF1H7qoJXsqUplZ6AEMigWHjZeh+MBNyKGnTg2mdTxWkalT7XJ5HO Xial8MitV6BCdSTHH8a+x+LT0OpHR45ysFC7M7k52542kABUk0B/Cwshi3JUAcLyORAE 8lf1rB4Ua69H2duUbZhK3J926XXG/M1qtmexzltOqR9PHUllYm5wQaTFF4MWqC3HM2xN Hj1fWF8pCb/MK873ADIbc87KYdBsHLeBm9dhHZf8PC1pufsGTJYJfyumgS6ZKPbFd6oC T69Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=E1x8ZC3M; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l8-v6si9211159pgr.187.2018.05.02.01.22.40; Wed, 02 May 2018 01:22:54 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=E1x8ZC3M; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751441AbeEBIWK (ORCPT + 99 others); Wed, 2 May 2018 04:22:10 -0400 Received: from mail-io0-f194.google.com ([209.85.223.194]:39029 "EHLO mail-io0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751171AbeEBIWG (ORCPT ); Wed, 2 May 2018 04:22:06 -0400 Received: by mail-io0-f194.google.com with SMTP id r9-v6so16496894iod.6 for ; Wed, 02 May 2018 01:22:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=alZX/Mw58eCK+oWpp4hBhvv9Pzzkjb8qNPf/HeAj2Qw=; b=E1x8ZC3Mx0k58ycvIJs1VIxH/gskwhduNDsSZUqsA7dEKtav0GPWEXUVW/rnmhEDVx O/LmJ9lmkBwD0362dl9qOOWu38d25Llc0Pixxx0gDZTFwJxi3vxaYfbWnhnczJbiQNGX 0vWTAW0uTd9giarR3LmfqCpRZMWnWx4TMFXqs= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=alZX/Mw58eCK+oWpp4hBhvv9Pzzkjb8qNPf/HeAj2Qw=; b=bbzDmP6zC3jvEiok9vNw8aJDoAYi8mQrtMFtyN7p3Ww5hmfwVeyqaCZu/XwtyneEaf rpxix+Iw5EBf/UWDA/2lgJj7qPqBxOj9yW2hLR5BI/ZDH6Dhfd8DyoX26sXJgSA9DlMe VcDdq3ngOOj9yKwRbxtsFAoaLpSrp6ALmei155rxLctpI3B7shnTKVADGnTw7pnh1jim qoWBZb6gB4gC2YFjsaVw/8E604p+1zRiqpboh/v4KyBhMbaWnCfnyXVxutF6MQfJ9NiR nzEgiwUF2cAWtXwYcouUnNb3vAJA/rYr/OeHoMOE/CVKo7qCOTyd6mTDutVpomKMSFfN bIXw== X-Gm-Message-State: ALQs6tCWzof4cq6dc+eW/GpCrMtKe3p7knTWc4L0M5y2s+2naW4ZkU80 pzpUrgZ6CqtWZOi9kGJgo92tBsGMS2z51PZaHyvGWg== X-Received: by 2002:a6b:545:: with SMTP id 66-v6mr19590040iof.173.1525249325722; Wed, 02 May 2018 01:22:05 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.187.134 with HTTP; Wed, 2 May 2018 01:22:05 -0700 (PDT) In-Reply-To: <20180502080631.sriwquzcky6t24c2@localhost> References: <20180502080631.sriwquzcky6t24c2@localhost> From: Ard Biesheuvel Date: Wed, 2 May 2018 10:22:05 +0200 Message-ID: Subject: Re: Regression due to "Workaround for uPD72020x USB3 chips" To: Domenico Andreoli Cc: Mathias Nyman , Bjorn Helgaas , Marc Zyngier , Greg Kroah-Hartman , Linux Kernel Mailing List Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2 May 2018 at 10:06, Domenico Andreoli wrote: > Dear all, > > my home machine stopped to boot starting from kernel version 4.12.7. > > The last message I read is about resetting some USB3 bus. It's 100% > reproducible also with any recent kernel up to 4.17.0-rc3. > > I bisected down to the following commit: > > commit 0e1f0eaed6c20db41ff61e024b361ee3ec9d686c (tag: my_broken_xhci) > Author: Marc Zyngier > Date: Tue Aug 1 20:11:08 2017 -0500 > > xhci: Reset Renesas uPD72020x USB controller for 32-bit DMA issue > Hello Domenico, Long time no see :-) Please refer to the following thread https://marc.info/?l=linux-usb&m=151872295419486 for some discussion on this topic, and a possible workaround. Regards, Ard. > commit 8466489ef5ba48272ba4fa4ea9f8f403306de4c7 upstream. > > The Renesas uPD72020x XHCI controller seems to suffer from a really > annoying bug, where it may retain some of its DMA programming across a XHCI > reset, and despite the driver correctly programming new DMA addresses. > This is visible if the device has been using 64-bit DMA addresses, and is > then switched to using 32-bit DMA addresses. The top 32 bits of the > address (now zero) are ignored are replaced by the 32 bits from the > *previous* programming. Sticking with 64-bit DMA always works, but doesn't > seem very appropriate. > > A PCI reset of the device restores the normal functionality, which is done > at probe time. Unfortunately, this has to be done before any quirk has > been discovered, hence the intrusive nature of the fix. > > Tested-by: Ard Biesheuvel > Signed-off-by: Marc Zyngier > Signed-off-by: Bjorn Helgaas > Acked-by: Mathias Nyman > Signed-off-by: Greg Kroah-Hartman > > diff --git drivers/usb/host/pci-quirks.c drivers/usb/host/pci-quirks.c > index 5f4ca7890435..c8f38649f749 100644 > --- drivers/usb/host/pci-quirks.c > +++ drivers/usb/host/pci-quirks.c > @@ -1157,3 +1157,23 @@ static void quirk_usb_early_handoff(struct pci_dev *pdev) > } > DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_ANY_ID, PCI_ANY_ID, > PCI_CLASS_SERIAL_USB, 8, quirk_usb_early_handoff); > + > +bool usb_xhci_needs_pci_reset(struct pci_dev *pdev) > +{ > + /* > + * Our dear uPD72020{1,2} friend only partially resets when > + * asked to via the XHCI interface, and may end up doing DMA > + * at the wrong addresses, as it keeps the top 32bit of some > + * addresses from its previous programming under obscure > + * circumstances. > + * Give it a good wack at probe time. Unfortunately, this > + * needs to happen before we've had a chance to discover any > + * quirk, or the system will be in a rather bad state. > + */ > + if (pdev->vendor == PCI_VENDOR_ID_RENESAS && > + (pdev->device == 0x0014 || pdev->device == 0x0015)) > + return true; > + > + return false; > +} > +EXPORT_SYMBOL_GPL(usb_xhci_needs_pci_reset); > diff --git drivers/usb/host/pci-quirks.h drivers/usb/host/pci-quirks.h > index 655994480198..5582cbafecd4 100644 > --- drivers/usb/host/pci-quirks.h > +++ drivers/usb/host/pci-quirks.h > @@ -15,6 +15,7 @@ void usb_asmedia_modifyflowcontrol(struct pci_dev *pdev); > void usb_enable_intel_xhci_ports(struct pci_dev *xhci_pdev); > void usb_disable_xhci_ports(struct pci_dev *xhci_pdev); > void sb800_prefetch(struct device *dev, int on); > +bool usb_xhci_needs_pci_reset(struct pci_dev *pdev); > #else > struct pci_dev; > static inline void usb_amd_quirk_pll_disable(void) {} > diff --git drivers/usb/host/xhci-pci.c drivers/usb/host/xhci-pci.c > index 1ef622ededfd..cefa223f9f08 100644 > --- drivers/usb/host/xhci-pci.c > +++ drivers/usb/host/xhci-pci.c > @@ -285,6 +285,13 @@ static int xhci_pci_probe(struct pci_dev *dev, const struct pci_device_id *id) > > driver = (struct hc_driver *)id->driver_data; > > + /* For some HW implementation, a XHCI reset is just not enough... */ > + if (usb_xhci_needs_pci_reset(dev)) { > + dev_info(&dev->dev, "Resetting\n"); > + if (pci_reset_function_locked(dev)) > + dev_warn(&dev->dev, "Reset failed"); > + } > + > /* Prevent runtime suspending between USB-2 and USB-3 initialization */ > pm_runtime_get_noresume(&dev->dev); > > --- > > Commenting out the call to pci_reset_function_locked() makes 4.17.0-rc3 > boot again so I think this bug qualifies as regression. > > I investigated a bit, the culprit is pci_reset_secondary_bus(). > > pci_reset_function_locked -> > __pci_reset_function_locked -> > pci_reset_bridge_secondary_bus -> > pcibios_reset_secondary_bus -> > pci_reset_secondary_bus ->>>>>> here the system dies/hangs/stops > > I can read the printk I put right before PCI_BRIDGE_CTL_BUS_RESET is > written in PCI_BRIDGE_CONTROL. I cannot read the printk I put right > after. > > It seems my system doesn't like that PCI reset at all. I cannot swear > that it is completely frozen, some disk activity might be happening > shortly after, but my only option is to power cycle. > > I cannot debug easily. I tried to boot with a patched module and just > attempted to work at it on a live machine but as soon I unload the > module I'm left without keyboard/mouse (apparently all the accessible > USB ports are going through the xhci-pci module) and at the moment I > cannot go via network. > > This is the output of lspci -vt: > > -[0000:00]-+-00.0 Intel Corporation 4th Gen Core Processor DRAM Controller > +-01.0-[01]----00.0 NVIDIA Corporation GM108M [GeForce 840M] > +-02.0 Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller > +-03.0 Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller > +-14.0 Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI > +-16.0 Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 > +-1a.0 Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 > +-1b.0 Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller > +-1c.0-[02]-- > +-1c.2-[03]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller > +-1c.3-[04]----00.0 Realtek Semiconductor Co., Ltd. RTS5229 PCI Express Card Reader > +-1c.4-[05]----00.0 Realtek Semiconductor Co., Ltd. RTL8821AE 802.11ac PCIe Wireless Network Adapter > +-1c.5-[06]----00.0 Renesas Technology Corp. uPD720202 USB 3.0 Host Controller > +-1d.0 Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 > +-1f.0 Intel Corporation C220 Series Chipset Family H81 Express LPC Controller > +-1f.2 Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] > \-1f.3 Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller > > The call to pci_reset_secondary_bus() is correctly applied to -1c.5-. > > I suspect that -1c.5- might benefit from having PCI_DEV_FLAGS_NO_BUS_RESET > in its dev->dev_flags but I'm not sure that it's the proper fix and how > exactly it could end up there. > > Any suggestion on how to proceed further? > > Please ask more info if needed. > > Kind regards, > Domenico > > -- > 3B10 0CA1 8674 ACBA B4FE FCD2 CE5B CF17 9960 DE13