Date: Fri, 3 Jan 2014 11:54:55 -0800
From: Sarah Sharp
To: walt
Cc: Alan Stern, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, stable@vger.kernel.org, David Laight, Mark Lord, linux-usb@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst
Message-ID: <20140103195455.GA4193@xanatos>
References: <20131218211219.461663463@linuxfoundation.org> <20131218211220.412278148@linuxfoundation.org> <52C32BB0.90600@gmail.com> <20140102191510.GA9621@xanatos> <52C6D9F1.9000709@gmail.com>
In-Reply-To: <52C6D9F1.9000709@gmail.com>

On Fri, Jan 03, 2014 at 07:40:33AM -0800, walt wrote:
> On 01/02/2014 11:15 AM, Sarah Sharp wrote:
> > On Tue, Dec 31, 2013 at 12:40:16PM -0800, walt wrote:
> >> On 12/18/2013 01:11 PM, Greg Kroah-Hartman wrote:
> >>> 3.12-stable review patch.  If anyone has any objections, please let me know.
> >>>
> >>> ------------------
> >>>
> >>> From: David Laight
> >>>
> >>> commit 35773dac5f862cb1c82ea151eba3e2f6de51ec3e upstream.
> >>>
> >>> Section 4.11.7.1 of rev 1.0 of the xhci specification states that a link TRB
> >>> can only occur at a boundary between underlying USB frames (512 bytes for
> >>> high speed devices).
> >>>
> >>> If this isn't done the USB frames aren't formatted correctly and, for example,
> >>> the USB3 ethernet ax88179_178a card will stop sending...
> >>
> >>
> >> Unfortunately this patch causes a regression when copying large files to my
> >> outboard USB3 drive.  (Nothing at all to do with networking.)
> >
> > Do you have CONFIG_USB_DEBUG turned on for 3.13?  If so, you should see
> > dmesg output from this statement shortly before your drive fails:
> >
> >         if (num_trbs >= TRBS_PER_SEGMENT) {
> >                 xhci_err(xhci, "Too many fragments %d, max %d\n",
> >                                 num_trbs, TRBS_PER_SEGMENT - 1);
> >                 return -ENOMEM;
> >         }
>
> Well, the answers depend on whether the usb3 drive uses logical volumes or not
> (lvm2), which I can't explain.  What I've described so far is with lvm2.
>
> When using lvm2 on the usb3 drive, turning on USB_DEBUG has *no* effect -- the
> console prints two or three lines stating that the ext4 journal has quit and
> the drive is remounted ro.  That particular drive stays wedged until the next
> reboot, but no other ill effects to the system.

Odd.  In 3.12 xHCI has dynamic debugging, and turning on CONFIG_USB_DEBUG
should turn on debugging by default, so it's confusing that you didn't
see any messages.  Can I see your .config from /boot/?  Also, did you try
capturing dmesg with `tail -f /var/log/kern.log` or just dmesg?  Perhaps
you need to run `sudo dmesg -n 7`?

> OTOH, when I put a disk with just an ordinary ext4 partition in the usb3 dock,
> (no logical volumes) the copy failure becomes catastrophic, with kernel panic
> messages, leaving the system unresponsive and needing a hard reset to recover.
>
> I also tried your other suggestion:
>
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 4265b48..1a6a43d 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -4714,7 +4714,7 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
>         int                     retval;
>
>         /* Accept arbitrarily long scatter-gather lists */
> -       hcd->self.sg_tablesize = ~0;
> +       hcd->self.sg_tablesize = 31;
>
>         /* support to build packet from discontinuous buffers */
>         hcd->self.no_sg_constraint = 1;
>
> Sadly it didn't fix the problem.  Did I get the patch right?

Yes, you did.  So perhaps the patch triggers a different bug.  I can't
tell until I see xHCI debugging output.

> Thanks for your help, and I'm happy to try more ideas, as always.

Thanks for your patience. :)

Sarah Sharp
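
As a reading aid for the thread above, here is a minimal sketch in C of the
two constraints being discussed: a link TRB should only fall on a packet
boundary, and a transfer with too many fragments is rejected.  This is not
the xhci driver code.  The function names, the SKETCH_TRBS_PER_SEGMENT
value, and the simplified model of one TRB per fragment are assumptions made
for illustration; the 512-byte figure is the one quoted from the patch
description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical segment size for this sketch only; the real driver
 * defines its own TRBS_PER_SEGMENT. */
#define SKETCH_TRBS_PER_SEGMENT 64

/*
 * Simplified model: each fragment occupies one TRB, and a link TRB would
 * be placed after the first `trbs_before_link` fragments.  Returns true
 * if the bytes queued before the link TRB end on a packet boundary, i.e.
 * the link TRB would not split a packet.
 */
bool link_on_packet_boundary(const size_t *frag_len, size_t num_frags,
                             size_t trbs_before_link, size_t max_packet)
{
        size_t bytes = 0;

        assert(max_packet > 0);

        /* The whole transfer fits before the link TRB: nothing is split. */
        if (trbs_before_link >= num_frags)
                return true;

        for (size_t i = 0; i < trbs_before_link; i++)
                bytes += frag_len[i];

        return bytes % max_packet == 0;
}

/*
 * Same shape as the check quoted in the thread: a transfer that needs as
 * many TRBs as a whole segment holds cannot be queued.
 */
bool too_many_fragments(size_t num_trbs)
{
        return num_trbs >= SKETCH_TRBS_PER_SEGMENT;
}
```

For example, with fragments of 300, 724 and 512 bytes and a 512-byte packet
size, a link TRB after the first fragment would split a packet (300 bytes
queued), while one after the second fragment would not (1024 bytes queued).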