Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp1008730ybv; Wed, 5 Feb 2020 19:05:59 -0800 (PST) X-Google-Smtp-Source: APXvYqyslmHHNrtc9g/m6X3vXTFkIEz2HXv0ft2O4xjMFb+JnZUi0qVyCbaIs0jmzP45V/H2Y8Bf X-Received: by 2002:a9d:3c6:: with SMTP id f64mr29919781otf.334.1580958359809; Wed, 05 Feb 2020 19:05:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1580958359; cv=none; d=google.com; s=arc-20160816; b=d1OjyfHCM+IT+iFUGg9O0O7hkmx0CHgRWwsSRwLNb0aOYMTEDZBwQ+bARylctNhcgS aj6/+/XGAwApDSLfK56pdraBvLb3Umf2J4SsouLXSYxyllDqQvOcnpJGFjGN7qxaY2vD ukST1IkCEfr7lxm6fGBXyf617WGVSlCbLy1A6z7eCkaM2iopvyiKz4woyUWoCPojx6FE TpxvuR43cmMInNyE+Wjw6DyCJ+/e9mbJ1O3SBgrmEJdKnM0KyDP/PmzszI1TkMHOJKmv a4C7FPrMNA3HvP5Cb7RzITqSjCgwVU22R/nOPIyQHIuPOd+5e6ZEPUw3ENAkiy1kZHkC trVA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:message-id:date:references :in-reply-to:subject:cc:to:from:dkim-signature; bh=OQk36ohF+OVoNaFspLClMboPueoTBjXIt7GNw5bavGA=; b=t+otSJLZwZfr3DMlYweagUnjJpnO/lFkORUmVPcdBNu1o/aBDV4PxS4BkWamjJdAbm Fk9CMURUquUeZETw60DXuGjslW3VaZblDQVVS0yeZCiVxy8Pfk52OYOLI82iVZb2iYBQ IRAzsRVLSIjoOHDJWp1GXkvrc3gJshTNSkGrLqSJIHmz7Rv3Ml2w2UitKJpDUZuxby/g cOCY8wt6o2g4cOsAYUQifGNDN3Dq4CTwV+xsRnEP02QUhN4/Z7QWCw0rCJxwAnzNJZeg faesbgtgXLOWpixKsd8ShGtvFUTkbphmBeZHDXizy3mYLx/ux8XlHRPg0bctAoIhWwHy SSfw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ellerman.id.au header.s=201909 header.b=bTjvcohw; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l26si1139015oti.152.2020.02.05.19.05.47; Wed, 05 Feb 2020 19:05:59 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@ellerman.id.au header.s=201909 header.b=bTjvcohw; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727747AbgBFDEj (ORCPT + 99 others); Wed, 5 Feb 2020 22:04:39 -0500 Received: from bilbo.ozlabs.org ([203.11.71.1]:42831 "EHLO ozlabs.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727307AbgBFDEj (ORCPT ); Wed, 5 Feb 2020 22:04:39 -0500 Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 48Cjvm083Hz9sRm; Thu, 6 Feb 2020 14:04:35 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ellerman.id.au; s=201909; t=1580958276; bh=3GiIB+ybAW+Dkfv7Blu7I604G11RcvKHUbdqalJe2bI=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=bTjvcohw1EBO03bDjNlvCHI6iEvcqDZ4AS4PSAjuGmlc2gr2UaQqLlF1CzQc9rct/ 0NQWwYYGud0SZ+mPCsV/jJW0T1MvGxmZXUWYN8ao8IK0L1pjxvs4m3AUV+aqIWTTjD mWPOEu9QplUmGKz71voVlgf8Lj19kltfU7Z3WRn8sOKxx0wYvH6sX+oq1lThXM/lsl hB2nAwiH0k9AVx4QMsVmO0fC7lRuEK7gjxPck2NvombIxoTHKbUfekIssuhnA4Kmr6 GtHQsdqVxfI4Kxxtp2OEHtX1MU+0HtQ/78SOVjw2vvH8IRTSkzw6sXZ4q2wwI9ZPC+ 3XD6Nh3IoBCRw== From: Michael Ellerman To: Joel Stanley , Mathias Nyman , linux-usb@vger.kernel.org Cc: Linux Kernel Mailing List Subject: Re: TI PCIe xHCI and kexec In-Reply-To: References: Date: Thu, 06 Feb 2020 14:04:32 +1100 Message-ID: <87zhdw1kn3.fsf@mpe.ellerman.id.au> MIME-Version: 1.0 Content-Type: text/plain Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Joel Stanley writes: > I'm supporting a system that uses Linux-as-a-bootloader to load a > distro kernel via kexec, The systems have a TI TUSB73x0 PCIe > controller which goes out to lunch after a kexec. This is the distro > (post-kexec) kernel: > > [ 0.235411] pci 0003:01:00.0: xHCI HW did not halt within 16000 > usec status = 0x0 > [ 1.037298] xhci_hcd 0003:01:00.0: xHCI Host Controller > [ 1.037367] xhci_hcd 0003:01:00.0: new USB bus registered, assigned > bus number 1 > [ 1.053481] xhci_hcd 0003:01:00.0: Host halt failed, -110 > [ 1.053523] xhci_hcd 0003:01:00.0: can't setup: -110 > [ 1.053565] xhci_hcd 0003:01:00.0: USB bus 1 deregistered > [ 1.053629] xhci_hcd 0003:01:00.0: init 0003:01:00.0 fail, -110 > [ 1.053703] xhci_hcd: probe of 0003:01:00.0 failed with error -110 > > There were some fixes made a few years back to improve the situation, > but we've still had to carry some form of the patch below in the > bootloader kernel. I would like to rework it so it can be merged. > > diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c > index dbac0fa9748d..eaa94456dd9d 100644 > --- a/drivers/usb/host/xhci.c > +++ b/drivers/usb/host/xhci.c > @@ -789,6 +789,9 @@ void xhci_shutdown(struct usb_hcd *hcd) > xhci_dbg_trace(xhci, trace_xhci_dbg_init, > "xhci_shutdown completed - status = %x", > readl(&xhci->op_regs->status)); > + > + /* TI XHCI controllers do not come back after kexec without this hack */ > + pci_reset_function_locked(to_pci_dev(hcd->self.sysdev)); > } > EXPORT_SYMBOL_GPL(xhci_shutdown); > > I would like some advice on how to implement it in a way that is > acceptable. Would a quirk on the pci id in xhci_shutdown be ok? Doing something on shutdown only helps the kexec case, where the first kernel shuts down cleanly. kdump kernels will still hit the issue, because there the first kernel has crashed and done no (or very little) shutdown. The ideal solution is to have the driver/bus code in the second kernel detect that the device is fubar and do something to reset it then. But I don't know enough about USB or this device to know if that's really feasible. I realise that doesn't actually solve the problem for the linux-as-bootloader case, because there you want to make it work even if the kernel you're loading hasn't been updated, so you have to do something on shutdown. cheers > [ 1.037638] xhci_hcd 0003:01:00.0: xHCI Host Controller > [ 1.037695] xhci_hcd 0003:01:00.0: new USB bus registered, assigned > bus number 1 > [ 1.037789] xhci_hcd 0003:01:00.0: xHCI capability registers at > (ptrval): > [ 1.037792] xhci_hcd 0003:01:00.0: CAPLENGTH AND HCIVERSION 0x960020: > [ 1.037794] xhci_hcd 0003:01:00.0: CAPLENGTH: 0x20 > [ 1.037796] xhci_hcd 0003:01:00.0: HCIVERSION: 0x96 > [ 1.037798] xhci_hcd 0003:01:00.0: HCSPARAMS 1: 0x8000840 > [ 1.037800] xhci_hcd 0003:01:00.0: Max device slots: 64 > [ 1.037802] xhci_hcd 0003:01:00.0: Max interrupters: 8 > [ 1.037804] xhci_hcd 0003:01:00.0: Max ports: 8 > [ 1.037806] xhci_hcd 0003:01:00.0: HCSPARAMS 2: 0xc0000f1 > [ 1.037808] xhci_hcd 0003:01:00.0: Isoc scheduling threshold: 1 > [ 1.037810] xhci_hcd 0003:01:00.0: Maximum allowed segments in > event ring: 15 > [ 1.037813] xhci_hcd 0003:01:00.0: HCSPARAMS 3 0x7ff000a: > [ 1.037815] xhci_hcd 0003:01:00.0: Worst case U1 device exit latency: 10 > [ 1.037816] xhci_hcd 0003:01:00.0: Worst case U2 device exit latency: 2047 > [ 1.037819] xhci_hcd 0003:01:00.0: HCC PARAMS 0x270f06d: > [ 1.037821] xhci_hcd 0003:01:00.0: HC generates 64 bit addresses > [ 1.037823] xhci_hcd 0003:01:00.0: HC hasn't Contiguous Frame ID Capability > [ 1.037825] xhci_hcd 0003:01:00.0: HC can't generate Stopped - > Short Package event > [ 1.037826] xhci_hcd 0003:01:00.0: FIXME: more HCCPARAMS debugging > [ 1.037829] xhci_hcd 0003:01:00.0: RTSOFF 0x4a0: > [ 1.037831] xhci_hcd 0003:01:00.0: xHCI operational registers at > (ptrval): > [ 1.037833] xhci_hcd 0003:01:00.0: USBCMD 0x0: > [ 1.037835] xhci_hcd 0003:01:00.0: HC is being stopped > [ 1.037837] xhci_hcd 0003:01:00.0: HC has finished hard reset > [ 1.037839] xhci_hcd 0003:01:00.0: Event Interrupts disabled > [ 1.037841] xhci_hcd 0003:01:00.0: Host System Error Interrupts disabled > [ 1.037843] xhci_hcd 0003:01:00.0: HC has finished light reset > [ 1.037846] xhci_hcd 0003:01:00.0: USBSTS 0x0: > [ 1.037847] xhci_hcd 0003:01:00.0: Event ring is empty > [ 1.037849] xhci_hcd 0003:01:00.0: No Host System Error > [ 1.037851] xhci_hcd 0003:01:00.0: HC is running > [ 1.037854] xhci_hcd 0003:01:00.0: (ptrval) port status reg = 0x2a0 > [ 1.037857] xhci_hcd 0003:01:00.0: (ptrval) port power reg = 0x0 > [ 1.037860] xhci_hcd 0003:01:00.0: (ptrval) port link reg = 0x0 > [ 1.037862] xhci_hcd 0003:01:00.0: (ptrval) port reserved reg = 0x0 > [ 1.037865] xhci_hcd 0003:01:00.0: (ptrval) port status reg = 0x2a0 > [ 1.037868] xhci_hcd 0003:01:00.0: (ptrval) port power reg = 0x0 > [ 1.037870] xhci_hcd 0003:01:00.0: (ptrval) port link reg = 0x0 > [ 1.037873] xhci_hcd 0003:01:00.0: (ptrval) port reserved reg = 0x0 > [ 1.037876] xhci_hcd 0003:01:00.0: (ptrval) port status reg = 0xe03 > [ 1.037878] xhci_hcd 0003:01:00.0: (ptrval) port power reg = 0x0 > [ 1.037881] xhci_hcd 0003:01:00.0: (ptrval) port link reg = 0x0 > [ 1.037884] xhci_hcd 0003:01:00.0: (ptrval) port reserved reg = 0x0 > [ 1.037886] xhci_hcd 0003:01:00.0: (ptrval) port status reg = 0xe03 > [ 1.037889] xhci_hcd 0003:01:00.0: (ptrval) port power reg = 0x0 > [ 1.037892] xhci_hcd 0003:01:00.0: (ptrval) port link reg = 0x0 > [ 1.037894] xhci_hcd 0003:01:00.0: (ptrval) port reserved reg = 0x0 > [ 1.037897] xhci_hcd 0003:01:00.0: (ptrval) port status reg = 0x2a0 > [ 1.037900] xhci_hcd 0003:01:00.0: (ptrval) port power reg = 0x0 > [ 1.037902] xhci_hcd 0003:01:00.0: (ptrval) port link reg = 0x0 > [ 1.037905] xhci_hcd 0003:01:00.0: (ptrval) port reserved reg = 0x0 > [ 1.037908] xhci_hcd 0003:01:00.0: (ptrval) port status reg = 0x2a0 > [ 1.037910] xhci_hcd 0003:01:00.0: (ptrval) port power reg = 0x0 > [ 1.037913] xhci_hcd 0003:01:00.0: (ptrval) port link reg = 0x0 > [ 1.037916] xhci_hcd 0003:01:00.0: (ptrval) port reserved reg = 0x0 > [ 1.037918] xhci_hcd 0003:01:00.0: (ptrval) port status reg = 0x2a0 > [ 1.037921] xhci_hcd 0003:01:00.0: (ptrval) port power reg = 0x0 > [ 1.037924] xhci_hcd 0003:01:00.0: (ptrval) port link reg = 0x0 > [ 1.037926] xhci_hcd 0003:01:00.0: (ptrval) port reserved reg = 0x0 > [ 1.037929] xhci_hcd 0003:01:00.0: (ptrval) port status reg = 0x2a0 > [ 1.037931] xhci_hcd 0003:01:00.0: (ptrval) port power reg = 0x0 > [ 1.037934] xhci_hcd 0003:01:00.0: (ptrval) port link reg = 0x0 > [ 1.037937] xhci_hcd 0003:01:00.0: (ptrval) port reserved reg = 0x0 > [ 1.037939] xhci_hcd 0003:01:00.0: // Halt the HC > [ 1.053945] xhci_hcd 0003:01:00.0: Host halt failed, -110 > [ 1.053987] xhci_hcd 0003:01:00.0: can't setup: -110 > [ 1.054032] xhci_hcd 0003:01:00.0: USB bus 1 deregistered > [ 1.054099] xhci_hcd 0003:01:00.0: init 0003:01:00.0 fail, -110 > [ 1.054165] xhci_hcd: probe of 0003:01:00.0 failed with error -110