Date: Wed, 2 May 2018 10:06:31 +0200
From: Domenico Andreoli
To: Mathias Nyman, Bjorn Helgaas
Cc: Ard Biesheuvel, Marc Zyngier, Greg Kroah-Hartman, linux-kernel@vger.kernel.org
Subject: Regression due to "Workaround for uPD72020x USB3 chips"
Message-ID: <20180502080631.sriwquzcky6t24c2@localhost>
X-Mailing-List: linux-kernel@vger.kernel.org

Dear all,

my home machine stopped booting starting from kernel version 4.12.7. The
last message I read is about resetting some USB3 bus. It's 100%
reproducible, also with any recent kernel up to 4.17.0-rc3.

I bisected down to the following commit:

commit 0e1f0eaed6c20db41ff61e024b361ee3ec9d686c (tag: my_broken_xhci)
Author: Marc Zyngier
Date:   Tue Aug 1 20:11:08 2017 -0500

    xhci: Reset Renesas uPD72020x USB controller for 32-bit DMA issue

    commit 8466489ef5ba48272ba4fa4ea9f8f403306de4c7 upstream.

    The Renesas uPD72020x XHCI controller seems to suffer from a really
    annoying bug, where it may retain some of its DMA programming across a
    XHCI reset, and despite the driver correctly programming new DMA
    addresses. This is visible if the device has been using 64-bit DMA
    addresses, and is then switched to using 32-bit DMA addresses. The top
    32 bits of the address (now zero) are ignored are replaced by the 32
    bits from the *previous* programming. Sticking with 64-bit DMA always
    works, but doesn't seem very appropriate.

    A PCI reset of the device restores the normal functionality, which is
    done at probe time. Unfortunately, this has to be done before any
    quirk has been discovered, hence the intrusive nature of the fix.

    Tested-by: Ard Biesheuvel
    Signed-off-by: Marc Zyngier
    Signed-off-by: Bjorn Helgaas
    Acked-by: Mathias Nyman
    Signed-off-by: Greg Kroah-Hartman

diff --git drivers/usb/host/pci-quirks.c drivers/usb/host/pci-quirks.c
index 5f4ca7890435..c8f38649f749 100644
--- drivers/usb/host/pci-quirks.c
+++ drivers/usb/host/pci-quirks.c
@@ -1157,3 +1157,23 @@ static void quirk_usb_early_handoff(struct pci_dev *pdev)
 }
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_ANY_ID, PCI_ANY_ID,
 			PCI_CLASS_SERIAL_USB, 8, quirk_usb_early_handoff);
+
+bool usb_xhci_needs_pci_reset(struct pci_dev *pdev)
+{
+	/*
+	 * Our dear uPD72020{1,2} friend only partially resets when
+	 * asked to via the XHCI interface, and may end up doing DMA
+	 * at the wrong addresses, as it keeps the top 32bit of some
+	 * addresses from its previous programming under obscure
+	 * circumstances.
+	 * Give it a good wack at probe time. Unfortunately, this
+	 * needs to happen before we've had a chance to discover any
+	 * quirk, or the system will be in a rather bad state.
+	 */
+	if (pdev->vendor == PCI_VENDOR_ID_RENESAS &&
+	    (pdev->device == 0x0014 || pdev->device == 0x0015))
+		return true;
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(usb_xhci_needs_pci_reset);
diff --git drivers/usb/host/pci-quirks.h drivers/usb/host/pci-quirks.h
index 655994480198..5582cbafecd4 100644
--- drivers/usb/host/pci-quirks.h
+++ drivers/usb/host/pci-quirks.h
@@ -15,6 +15,7 @@ void usb_asmedia_modifyflowcontrol(struct pci_dev *pdev);
 void usb_enable_intel_xhci_ports(struct pci_dev *xhci_pdev);
 void usb_disable_xhci_ports(struct pci_dev *xhci_pdev);
 void sb800_prefetch(struct device *dev, int on);
+bool usb_xhci_needs_pci_reset(struct pci_dev *pdev);
 #else
 struct pci_dev;
 static inline void usb_amd_quirk_pll_disable(void) {}
diff --git drivers/usb/host/xhci-pci.c drivers/usb/host/xhci-pci.c
index 1ef622ededfd..cefa223f9f08 100644
--- drivers/usb/host/xhci-pci.c
+++ drivers/usb/host/xhci-pci.c
@@ -285,6 +285,13 @@ static int xhci_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 
 	driver = (struct hc_driver *)id->driver_data;
 
+	/* For some HW implementation, a XHCI reset is just not enough... */
+	if (usb_xhci_needs_pci_reset(dev)) {
+		dev_info(&dev->dev, "Resetting\n");
+		if (pci_reset_function_locked(dev))
+			dev_warn(&dev->dev, "Reset failed");
+	}
+
 	/* Prevent runtime suspending between USB-2 and USB-3 initialization */
 	pm_runtime_get_noresume(&dev->dev);
 
---

Commenting out the call to pci_reset_function_locked() makes 4.17.0-rc3
boot again, so I think this bug qualifies as a regression.

I investigated a bit; the culprit is pci_reset_secondary_bus().

pci_reset_function_locked
 -> __pci_reset_function_locked
  -> pci_reset_bridge_secondary_bus
   -> pcibios_reset_secondary_bus
    -> pci_reset_secondary_bus
     ->>>>>> here the system dies/hangs/stops

I can see the printk I put right before PCI_BRIDGE_CTL_BUS_RESET is
written to PCI_BRIDGE_CONTROL, but not the one I put right after. It
seems my system doesn't like that PCI reset at all.
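
To make the failing step easier to discuss, here is a minimal userspace sketch of what I understand the bridge reset to do: set the bus reset bit in the bridge control register, then clear it again. It is only a simulation under stated assumptions. struct fake_bridge, cfg_read16(), cfg_write16() and sim_reset_secondary_bus() are hypothetical names working on a fake config space array, the sleeps of the real function are left out, and the two constants are the values I believe pci_regs.h defines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed to match include/uapi/linux/pci_regs.h. */
#define PCI_BRIDGE_CONTROL        0x3e
#define PCI_BRIDGE_CTL_BUS_RESET  0x40

/* Hypothetical stand-in for a bridge: 256 bytes of fake config space. */
struct fake_bridge {
	uint8_t cfg[256];
	int writes;	/* number of config writes seen so far */
};

/* Read a little-endian 16-bit register from the fake config space. */
static uint16_t cfg_read16(const struct fake_bridge *b, int off)
{
	return (uint16_t)(b->cfg[off] | (b->cfg[off + 1] << 8));
}

/* Write a little-endian 16-bit register and count the access. */
static void cfg_write16(struct fake_bridge *b, int off, uint16_t val)
{
	b->cfg[off] = (uint8_t)(val & 0xff);
	b->cfg[off + 1] = (uint8_t)(val >> 8);
	b->writes++;
}

/*
 * Assert, then deassert, the secondary bus reset bit. The delays of the
 * real kernel function are omitted. Returns the bridge control value
 * observed while the reset bit was asserted.
 */
static uint16_t sim_reset_secondary_bus(struct fake_bridge *b)
{
	uint16_t ctrl = cfg_read16(b, PCI_BRIDGE_CONTROL);
	uint16_t asserted;

	fprintf(stderr, "before setting PCI_BRIDGE_CTL_BUS_RESET\n");
	ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
	/* On my machine nothing is printed after the equivalent write. */
	cfg_write16(b, PCI_BRIDGE_CONTROL, ctrl);
	asserted = cfg_read16(b, PCI_BRIDGE_CONTROL);
	fprintf(stderr, "after setting PCI_BRIDGE_CTL_BUS_RESET\n");

	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
	cfg_write16(b, PCI_BRIDGE_CONTROL, ctrl);
	return asserted;
}
```

The two fprintf() calls stand in for the two printk calls mentioned above. In the simulation both are printed; on the real hardware only the first one shows up.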
I cannot swear that it is completely frozen; some disk activity might be
happening shortly after, but my only option is to power cycle.

I cannot debug this easily. I tried booting with a patched module and
working on it on a live machine, but as soon as I unload the module I'm
left without keyboard/mouse (apparently all the accessible USB ports go
through the xhci-pci module), and at the moment I cannot go in via the
network.

This is the output of lspci -vt:

-[0000:00]-+-00.0  Intel Corporation 4th Gen Core Processor DRAM Controller
           +-01.0-[01]----00.0  NVIDIA Corporation GM108M [GeForce 840M]
           +-02.0  Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller
           +-03.0  Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller
           +-14.0  Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI
           +-16.0  Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1
           +-1a.0  Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2
           +-1b.0  Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller
           +-1c.0-[02]--
           +-1c.2-[03]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           +-1c.3-[04]----00.0  Realtek Semiconductor Co., Ltd. RTS5229 PCI Express Card Reader
           +-1c.4-[05]----00.0  Realtek Semiconductor Co., Ltd. RTL8821AE 802.11ac PCIe Wireless Network Adapter
           +-1c.5-[06]----00.0  Renesas Technology Corp. uPD720202 USB 3.0 Host Controller
           +-1d.0  Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1
           +-1f.0  Intel Corporation C220 Series Chipset Family H81 Express LPC Controller
           +-1f.2  Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode]
           \-1f.3  Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller

The call to pci_reset_secondary_bus() is correctly applied to -1c.5-.
I suspect that -1c.5- might benefit from having PCI_DEV_FLAGS_NO_BUS_RESET
in its dev->dev_flags, but I'm not sure that it's the proper fix, nor how
exactly the flag would end up there.

Any suggestion on how to proceed further? Please ask for more info if
needed.

Kind regards,
Domenico

-- 
3B10 0CA1 8674 ACBA B4FE FCD2 CE5B CF17 9960 DE13
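
P.S. To make the PCI_DEV_FLAGS_NO_BUS_RESET idea concrete, this is roughly the behaviour I have in mind, written as a userspace mock rather than kernel code. Everything here is an assumption for illustration: struct mock_pci_dev, the mock_* helpers and MOCK_VENDOR_ID_RENESAS are hypothetical, the flag value is simply chosen for the mock, and where the real PCI core checks the flag is exactly the part I am unsure about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the kernel headers are authoritative. */
#define PCI_DEV_FLAGS_NO_BUS_RESET  (1u << 6)
#define MOCK_VENDOR_ID_RENESAS      0x1912

/* Hypothetical, trimmed-down stand-in for struct pci_dev. */
struct mock_pci_dev {
	uint16_t vendor;
	uint16_t device;
	unsigned int dev_flags;
};

/* Mock quirk: mark the device so that bus resets are refused. */
static void mock_quirk_no_bus_reset(struct mock_pci_dev *dev)
{
	dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
}

/* Same match as usb_xhci_needs_pci_reset() in the commit above. */
static bool mock_needs_pci_reset(const struct mock_pci_dev *dev)
{
	return dev->vendor == MOCK_VENDOR_ID_RENESAS &&
	       (dev->device == 0x0014 || dev->device == 0x0015);
}

/*
 * Mock of the reset decision: returns 0 if a bus reset would be issued,
 * -1 if it is skipped (no match) or refused because of the flag.
 */
static int mock_try_bus_reset(const struct mock_pci_dev *dev)
{
	if (!mock_needs_pci_reset(dev))
		return -1;
	if (dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET)
		return -1;
	return 0;
}
```

With the flag clear, mock_try_bus_reset() reports that the reset would go ahead for device 0x0015. Once mock_quirk_no_bus_reset() has run, the same call is refused. Whether refusing the reset leaves the DMA workaround of the original commit ineffective on my board is a separate question that this mock does not answer.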