2014-01-20 19:35:20

by Sarah Sharp

Subject: Re: [BUGREPORT] Linux USB 3.0

Hi Markus,

I'm the xHCI driver maintainer, and it helps to Cc me on USB 3.0 bug
reports.

On Sat, Dec 28, 2013 at 07:24:20AM +0100, Markus Rechberger wrote:
> just received following log snippset:

Please state which kernel version you (or your customer) are running.
You've reported issues with several different kernel versions, so which
kernel are you running for this particular snippet?

> Dec 27 23:23:50 solist kernel: [ 36.118245] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr

These messages might be harmless. The 3.0 kernel contains a fix for
Intel Panther Point xHCI hosts that suppresses those messages, commit
ad808333d8201d53075a11bc8dd83b81f3d68f0b "Intel xhci: Ignore spurious
successful event."

A later commit extends that to all xHCI 1.0 hosts, commit
07f3cb7c28bf3f4dd80bfb136cf45810c46ac474 "usb: host: xhci: Enable
XHCI_SPURIOUS_SUCCESS for all controllers with xhci 1.0" That was
queued for 3.11 and marked to be backported into stable kernels as old
as 3.0.

> the previous bug report of that user:
> https://bugzilla.kernel.org/show_bug.cgi?id=65021 xhci: complete USB freeze

Hmm, Greg didn't assign that bug to me, so I missed it, sorry.

> On Fri, Dec 27, 2013 at 8:59 PM, Markus Rechberger <[email protected]> wrote:
> > Seems like DH87RL was working with 3.2.0-55-generic-pae unfortunately
> > we don't have such a board for testing and customer patience is
> > limited to bisect the kernel.
> >
> > Does anyone have a clue what modification could have killed USB 3.0
> > support within those releases?
> > It does not seem to be SG support.

3.2 was the kernel where the Intel EHCI to xHCI port switchover code
went in. Without that code, all ports will remain under the EHCI host,
and USB 3.0 devices will work at USB 2.0 speeds. I suspect the USB
device triggers an issue with the xHCI driver, and 3.2 only works
because the device is on an EHCI port without the switchover code.

> > On Fri, Dec 27, 2013 at 6:18 PM, Markus Rechberger <[email protected]> wrote:
> >> I just got another USB 3.0 bugreport, the entire system crashed. That
> >> particular customer already filed a bugreport in November 2013 that
> >> his system is in a bad state when using some USB 2.0 media devices
> >> which even have opensource drivers built into the kernel.
> >>
> >> USB 3.0 support with Linux seems to be a disaster with Linux 3.6.12.
> >> The affected board is an Intel DH87RL board.

Why are they running 3.6.12 in particular? That's not a supported
stable kernel.

> >> On Wed, Dec 25, 2013 at 8:18 AM, Markus Rechberger
> >> <[email protected]> wrote:
> >>> A customer using a device with USBDEVFS is reporting following
> >>> backtrace (it seems to be a rather generic issue related to linux usb
> >>> 3.0 in general):
> >>> According to him this problem is reproducible as soon as he starts the
> >>> data transfer, is there anything known about that?
> >>>
> >>> He is using 3.12.0-031200-generic

So at this point you've reported three separate bugs, all with the same
symptom, but different kernel versions? Are these all from the same bug
reporter, or a different bug reporter?

You've got me seriously confused right now. Please keep one bug report
to one mail thread, and get the original bug reporter to start that
thread. If this is from one bug reporter, please state the current
kernel they are running, and send dmesg showing the issue with
CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING turned on (you may
also need to turn on CONFIG_DYNAMIC_DEBUG in later kernels). Please
attach the dmesg as a file, since your mail client line-wraps.
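
Before collecting the dmesg, it can help to confirm that the options named above are actually enabled in the running kernel's configuration. The following is only a minimal sketch: it assumes the configuration is available as plain text (on many distributions it sits in a file such as /boot/config-<version>, which is an assumption about the reporter's system), and the function name is made up for illustration.

```python
import re

# Options named in the mail above. Where the config text comes from
# (for example a /boot/config-<version> file) is an assumption.
WANTED = (
    "CONFIG_USB_DEBUG",
    "CONFIG_USB_XHCI_HCD_DEBUGGING",
    "CONFIG_DYNAMIC_DEBUG",
)


def config_status(config_text, wanted=WANTED):
    """Map each wanted option to its value, or 'not set' if it is absent."""
    status = {name: "not set" for name in wanted}
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.+)$", line.strip())
        if match and match.group(1) in status:
            status[match.group(1)] = match.group(2)
    return status


# Hypothetical config excerpt, used only to exercise the function.
sample = """\
CONFIG_USB_DEBUG=y
# CONFIG_USB_XHCI_HCD_DEBUGGING is not set
CONFIG_DYNAMIC_DEBUG=y
"""

result = config_status(sample)
for name, value in result.items():
    print(f"{name}: {value}")
```

Any option reported as "not set" would need a kernel rebuilt with it enabled before the requested debug output can appear in dmesg.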

> >>> Dec 24 14:22:39 homenas kernel: [ 1469.818460] xhci_hcd 0000:0f:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] xhci_hcd 0000:0f:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
> >>> Dec 24 14:30:39 homenas kernel: last message repeated 16 times
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] xhci_hcd 0000:0f:00.0: WARN Successful completion on short TX
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] xhci_hcd 0000:0f:00.0: WARN Successful completion on short TX
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] xhci_hcd 0000:0f:00.0: URB transfer length is wrong, xHC issue? req. len = 46080, act. len = 1382400
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] IP: [] finish_td+0x13f/0x250
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] PGD 0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Oops: 0000 [#1] SMP
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Modules linked in:
> >>> videodev pci_stub vboxpci(OF) vboxnetadp(OF) vboxnetflt(OF)
> >>> vboxdrv(OF) dm_crypt snd_hda_codec_ca0132 snd_hda_intel snd_hda_codec
> >>> snd_hwdep snd_pcm snd_seq_midi dm_multipath psmouse scsi_dh
> >>> snd_rawmidi serio_raw sb_edac snd_seq_midi_event edac_core snd_seq
> >>> snd_timer snd_seq_device lpc_ich snd bnep rfcomm soundcore
> >>> snd_page_alloc bluetooth mei_me mei mac_hid ppdev nfsd w83627ehf
> >>> hwmon_vid nfs_acl auth_rpcgss coretemp nfs fscache lockd lp parport
> >>> sunrpc raid10 raid456 async_pq async_xor async_memcpy
> >>> async_raid6_recov async_tx raid0 multipath linear btrfs raid6_pq xor
> >>> libcrc32c osst st raid1 tg3 mptsas firewire_ohci ptp mxm_wmi
> >>> firewire_core ahci mptscsih pps_core crc_itu_t libahci mpt2sas mptbase
> >>> wmi scsi_transport_sas raid_class [last unloaded: vmnet]
> >>>
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GF O 3.12.0-031200-generic #201311031935
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X79 Extreme9, BIOS P3.30 01/28/2013
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] task: ffffffff81c144a0 ti: ffffffff81c00000 task.ti: ffffffff81c00000
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RIP: 0010:[] [] finish_td+0x13f/0x250

It would help if your client could reproduce this oops on their machine,
and then run markup_oops.pl to find out exactly where the driver is
oopsing. I suspect it has to do with the bad completion length in the
line above, but it could be unrelated.
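
As a rough illustration of what can be read from the log even before markup_oops.pl is run, the sketch below pulls the function name, offset, and function size out of the RIP line, and the two lengths out of the "URB transfer length is wrong" line. It does not replace markup_oops.pl and says nothing about the cause; the helper names are invented for this example.

```python
import re


def parse_symbol(text):
    """Return (function, offset, size) from a 'name+0xOFF/0xSIZE' token."""
    match = re.search(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text)
    if match is None:
        return None
    return match.group(1), int(match.group(2), 16), int(match.group(3), 16)


def parse_lengths(text):
    """Return (requested, actual) from the URB length warning."""
    match = re.search(r"req\. len = (\d+), act\. len = (\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Lines copied from the log quoted in this thread.
rip_line = "RIP: 0010:[] [] finish_td+0x13f/0x250"
len_line = ("URB transfer length is wrong, xHC issue? "
            "req. len = 46080, act. len = 1382400")

func, offset, size = parse_symbol(rip_line)
requested, actual = parse_lengths(len_line)

print(f"{func}: offset {offset} of {size} bytes")
print(f"requested {requested}, actual {actual}, ratio {actual / requested}")
```

For this log the reported actual length is exactly 30 times the requested length. That is only an arithmetic observation and may or may not be related to the oops.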

> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RSP: 0018:ffff88102fc03ca8 EFLAGS: 00010046
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RAX: ffff880f865d2b10 RBX: ffff880f865d2b00 RCX: 0000000000000006
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RDX: ffff880f865d2b10 RSI: 0000000000000007 RDI: 0000000000000046
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RBP: ffff88102fc03d08 R08: 000000000000000a R09: 0000000000000000
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] R10: 00000000000006fd R11: 00000000000006fc R12: ffff880fd2de0000
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] R13: ffff880fd32b1780 R14: 0000000000000000 R15: ffff880fd5c5f000
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] FS: 0000000000000000(0000) GS:ffff88102fc00000(0000) knlGS:0000000000000000
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] CR2: 0000000000000004 CR3: 0000000001c0d000 CR4: 00000000000407f0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Stack:
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] ffff88102fc03ce8 ffff880fd0bc8000 ffff88102fc03d00 ffff880fd268d1a0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] ffff88102fc03df4 0000000100000002 ffff880fd32b1780 ffff880f865d2b00
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] ffff880fd268d1a0 ffff880fd5c5f000 ffff880fd2de0000 ffff880fd2c497b0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Call Trace:
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] process_bulk_intr_td+0x116/0x2d0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] handle_tx_event+0x656/0xb50
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] ? __queue_work+0x3b0/0x3c0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] ? call_timer_fn+0x46/0x160
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] xhci_handle_event+0x1db/0x2a0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] ? run_timer_softirq+0x1b2/0x300
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] xhci_irq+0x120/0x1f0
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] xhci_msi_irq+0x11/0x20
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] handle_irq_event_percpu+0x5d/0x210
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] handle_irq_event+0x48/0x70
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] ? native_apic_msr_eoi_write+0x14/0x20
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] handle_edge_irq+0x77/0x110

Sarah Sharp


2014-02-03 23:08:09

by Markus Rechberger

Subject: Re: [BUGREPORT] Linux USB 3.0

Hi Sarah,

On Mon, Jan 20, 2014 at 8:35 PM, Sarah Sharp
<[email protected]> wrote:
> Hi Markus,
>
> I'm the xHCI driver maintainer, and it helps to Cc me on USB 3.0 bug
> reports.
>
> On Sat, Dec 28, 2013 at 07:24:20AM +0100, Markus Rechberger wrote:
>> just received following log snippset:
>
> Please state which kernel version you (or your customer) is running.
> You've reported issues with several different kernel versions, so which
> kernel are you running for this particular snippet?
>
>> Dec 27 23:23:50 solist kernel: [ 36.118245] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr
>
> These messages might be harmless. The 3.0 kernel contains a fix for
> Intel Panther Point xHCI hosts that suppresses those messages, commit
> ad808333d8201d53075a11bc8dd83b81f3d68f0b "Intel xhci: Ignore spurious
> successful event."
>
> A later commit extends that to all xHCI 1.0 hosts, commit
> 07f3cb7c28bf3f4dd80bfb136cf45810c46ac474 "usb: host: xhci: Enable
> XHCI_SPURIOUS_SUCCESS for all controllers with xhci 1.0" That was
> queued for 3.11 and marked to be backported into stable kernels as old
> as 3.0.
>
>> the previous bug report of that user:
>> https://bugzilla.kernel.org/show_bug.cgi?id=65021 xhci: complete USB freeze
>
> Hmm, Greg didn't assign that bug to me, so I missed it, sorry.
>
>> On Fri, Dec 27, 2013 at 8:59 PM, Markus Rechberger <[email protected]> wrote:
>> > Seems like DH87RL was working with 3.2.0-55-generic-pae unfortunately
>> > we don't have such a board for testing and customer patience is
>> > limited to bisect the kernel.
>> >
>> > Does anyone have a clue what modification could have killed USB 3.0
>> > support within those releases?
>> > It does not seem to be SG support.
>
> 3.2 was the kernel where the Intel EHCI to xHCI port switchover code
> went in. Without that code, all ports will remain under the EHCI host,
> and USB 3.0 devices will work at USB 2.0 speeds. I suspect the USB
> device triggers an issue with the xHCI driver, and 3.2 only works
> because the device is on an EHCI port without the switchover code.
>
>> > On Fri, Dec 27, 2013 at 6:18 PM, Markus Rechberger <[email protected]> wrote:
>> >> I just got another USB 3.0 bugreport, the entire system crashed. That
>> >> particular customer already filed a bugreport in November 2013 that
>> >> his system is in a bad state when using some USB 2.0 media devices
>> >> which even have opensource drivers built into the kernel.
>> >>
>> >> USB 3.0 support with Linux seems to be a disaster with Linux 3.6.12.
>> >> The affected board is an Intel DH87RL board.
>
> Why are they running 3.6.12 in particular? That's not a supported
> stable kernel.
>

Our customers run all kinds of Linux kernels. The drivers use USBFS
(devio.c) to interface with USB.
It seems you are already in contact with one customer who is using the
DH87RL board.
Just today another user on our forum reported the problem with
3.12.9-2-ARCH.
Synology NAS users also seem to be affected by the issue with USB 2.0
devices on USB 3.0 ports.


>> >> On Wed, Dec 25, 2013 at 8:18 AM, Markus Rechberger
>> >> <[email protected]> wrote:
>> >>> A customer using a device with USBDEVFS is reporting following
>> >>> backtrace (it seems to be a rather generic issue related to linux usb
>> >>> 3.0 in general):
>> >>> According to him this problem is reproducible as soon as he starts the
>> >>> data transfer, is there anything known about that?
>> >>>
>> >>> He is using 3.12.0-031200-generic
>
> So at this point you've reported three separate bugs, all with the same
> symptom, but different kernel versions? Are these all from the same bug
> reporter, or a different bug reporter?
>
> You've got me seriously confused right now. Please keep one bug report
> to one mail thread, and get the original bug reporter to start that
> thread. If this is from one bug reporter, please state the current
> kernel they are running, and send dmesg showing the issue with
> CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING turned on (you may
> also need to turn on CONFIG_DYNAMIC_DEBUG in later kernels). Please
> attach the dmesg as a file, since your mail client line-wraps.
>
>> >>> Dec 24 14:22:39 homenas kernel: [ 1469.818460] xhci_hcd 0000:0f:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] xhci_hcd 0000:0f:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
>> >>> Dec 24 14:30:39 homenas kernel: last message repeated 16 times
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] xhci_hcd 0000:0f:00.0: WARN Successful completion on short TX
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] xhci_hcd 0000:0f:00.0: WARN Successful completion on short TX
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] xhci_hcd 0000:0f:00.0: URB transfer length is wrong, xHC issue? req. len = 46080, act. len = 1382400
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] IP: [] finish_td+0x13f/0x250
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] PGD 0
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Oops: 0000 [#1] SMP
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Modules linked in:
>> >>> videodev pci_stub vboxpci(OF) vboxnetadp(OF) vboxnetflt(OF)
>> >>> vboxdrv(OF) dm_crypt snd_hda_codec_ca0132 snd_hda_intel snd_hda_codec
>> >>> snd_hwdep snd_pcm snd_seq_midi dm_multipath psmouse scsi_dh
>> >>> snd_rawmidi serio_raw sb_edac snd_seq_midi_event edac_core snd_seq
>> >>> snd_timer snd_seq_device lpc_ich snd bnep rfcomm soundcore
>> >>> snd_page_alloc bluetooth mei_me mei mac_hid ppdev nfsd w83627ehf
>> >>> hwmon_vid nfs_acl auth_rpcgss coretemp nfs fscache lockd lp parport
>> >>> sunrpc raid10 raid456 async_pq async_xor async_memcpy
>> >>> async_raid6_recov async_tx raid0 multipath linear btrfs raid6_pq xor
>> >>> libcrc32c osst st raid1 tg3 mptsas firewire_ohci ptp mxm_wmi
>> >>> firewire_core ahci mptscsih pps_core crc_itu_t libahci mpt2sas mptbase
>> >>> wmi scsi_transport_sas raid_class [last unloaded: vmnet]
>> >>>
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GF O 3.12.0-031200-generic #201311031935
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X79 Extreme9, BIOS P3.30 01/28/2013
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] task: ffffffff81c144a0 ti: ffffffff81c00000 task.ti: ffffffff81c00000
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RIP: 0010:[] [] finish_td+0x13f/0x250
>
> It would help if your client could reproduce this oops on their machine,
> and then run markup_oops.pl to find out exactly where the driver is
> oopsing. I suspect it has to do with the bad completion length in the
> line above, but it could be unrelated.
>

Well, we can try, but these are regular end customers who just want
things to work.

Would it help if we shipped a USB DVB
(DVB-C/T/T2/AnalogTV(PAL/NTSC)/FM Radio/S-Video/Composite) stick to
you?
You should at least be able to use NTSC in the US with a video recorder,
or composite with a camera.
Since our devices use USBFS, all the critical code is in userspace, so
no crash should actually be possible.
In reality, however, the Linux USB stack causes freezes and other
problems when xhci is active.

The device transfers around 20mb/sec via USBFS for AnalogTV, so it's a
good target for testing.

Markus

>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RSP: 0018:ffff88102fc03ca8 EFLAGS: 00010046
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RAX: ffff880f865d2b10 RBX: ffff880f865d2b00 RCX: 0000000000000006
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RDX: ffff880f865d2b10 RSI: 0000000000000007 RDI: 0000000000000046
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RBP: ffff88102fc03d08 R08: 000000000000000a R09: 0000000000000000
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] R10: 00000000000006fd R11: 00000000000006fc R12: ffff880fd2de0000
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] R13: ffff880fd32b1780 R14: 0000000000000000 R15: ffff880fd5c5f000
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] FS: 0000000000000000(0000) GS:ffff88102fc00000(0000) knlGS:0000000000000000
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] CR2: 0000000000000004 CR3: 0000000001c0d000 CR4: 00000000000407f0
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Stack:
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] ffff88102fc03ce8 ffff880fd0bc8000 ffff88102fc03d00 ffff880fd268d1a0
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] ffff88102fc03df4 0000000100000002 ffff880fd32b1780 ffff880f865d2b00
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] ffff880fd268d1a0 ffff880fd5c5f000 ffff880fd2de0000 ffff880fd2c497b0
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Call Trace:
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] process_bulk_intr_td+0x116/0x2d0
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] handle_tx_event+0x656/0xb50
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] ? __queue_work+0x3b0/0x3c0
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] ? call_timer_fn+0x46/0x160
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] xhci_handle_event+0x1db/0x2a0
>> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] [] ? run_timer_softirq+0x1b2/0x300
>> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] xhci_irq+0x120/0x1f0
>> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] xhci_msi_irq+0x11/0x20
>> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] handle_irq_event_percpu+0x5d/0x210
>> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] handle_irq_event+0x48/0x70
>> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] ? native_apic_msr_eoi_write+0x14/0x20
>> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076] [] handle_edge_irq+0x77/0x110
>
> Sarah Sharp

2014-02-04 09:32:45

by David Laight

Subject: RE: [BUGREPORT] Linux USB 3.0

From: Markus Rechberger
> >> Dec 27 23:23:50 solist kernel: [ 36.118245] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA
> ptr
> >
> > These messages might be harmless. The 3.0 kernel contains a fix for
> > Intel Panther Point xHCI hosts that suppresses those messages, commit
> > ad808333d8201d53075a11bc8dd83b81f3d68f0b "Intel xhci: Ignore spurious
> > successful event."
> >
> > A later commit extends that to all xHCI 1.0 hosts, commit
> > 07f3cb7c28bf3f4dd80bfb136cf45810c46ac474 "usb: host: xhci: Enable
> > XHCI_SPURIOUS_SUCCESS for all controllers with xhci 1.0" That was
> > queued for 3.11 and marked to be backported into stable kernels as old
> > as 3.0.

I see the same error message on the 0.96 ASMedia controller when
the rx buffers for the ax88179_178a driver cross 64k boundaries.

So this isn't confined to 1.0 controllers.
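
To make the 64k condition concrete, here is a small sketch of how one might test whether a buffer crosses a 64 KiB boundary and split it into pieces that do not. It is a model of the arithmetic only, not the xhci driver's code; the addresses are made up and the function names are hypothetical.

```python
BOUNDARY = 64 * 1024  # 64 KiB


def crosses_64k(addr, length):
    """True if the byte range [addr, addr + length) spans a 64 KiB boundary."""
    if length <= 0:
        return False
    return addr // BOUNDARY != (addr + length - 1) // BOUNDARY


def split_at_64k(addr, length):
    """Split a buffer into (addr, size) chunks that each stay inside one 64 KiB window."""
    chunks = []
    while length > 0:
        room = BOUNDARY - (addr % BOUNDARY)  # bytes left before the next boundary
        size = min(room, length)
        chunks.append((addr, size))
        addr += size
        length -= size
    return chunks


# Made-up buffer: starts 0x100 bytes below a boundary and is 0x300 bytes long.
chunks = split_at_64k(0x1FF00, 0x300)
for start, size in chunks:
    print(hex(start), hex(size), crosses_64k(start, size))
```

A buffer that already fits inside one window comes back as a single chunk, so the split is safe to apply unconditionally.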

David


2014-02-08 09:00:25

by Markus Rechberger

Subject: Re: [BUGREPORT] Linux USB 3.0

On Tue, Feb 4, 2014 at 10:31 AM, David Laight <[email protected]> wrote:
> From: Markus Rechberger
>> >> Dec 27 23:23:50 solist kernel: [ 36.118245] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA
>> ptr
>> >
>> > These messages might be harmless. The 3.0 kernel contains a fix for
>> > Intel Panther Point xHCI hosts that suppresses those messages, commit
>> > ad808333d8201d53075a11bc8dd83b81f3d68f0b "Intel xhci: Ignore spurious
>> > successful event."
>> >
>> > A later commit extends that to all xHCI 1.0 hosts, commit
>> > 07f3cb7c28bf3f4dd80bfb136cf45810c46ac474 "usb: host: xhci: Enable
>> > XHCI_SPURIOUS_SUCCESS for all controllers with xhci 1.0" That was
>> > queued for 3.11 and marked to be backported into stable kernels as old
>> > as 3.0.
>
> I see the same error message on the 0.96 ASMedia controller when
> the rx buffers for the ax88179_178a driver cross 64k boundaries.
>
> So this isn't confined to 1.0 controllers.
>

Sarah,

since there has been no response yet, is there anyone at Intel
dedicated to working on USB 3.0?
We are also getting more and more negative feedback about USB 3.0 on
Linux.

Best Regards,
Markus

2014-02-08 13:06:27

by Markus Rechberger

Subject: Re: [BUGREPORT] Linux USB 3.0

The next one, just today (unfortunately it's in German):
http://support.sundtek.com/index.php/topic,1505.msg11020.html#msg11020

This user is running Ubuntu with Linux 3.13.0-8-generic.
The system seems to freeze completely after some time.
Since the driver uses the usbdevfs interface, the problem is in the usbcore.

On Sat, Feb 8, 2014 at 10:00 AM, Markus Rechberger
<[email protected]> wrote:
> On Tue, Feb 4, 2014 at 10:31 AM, David Laight <[email protected]> wrote:
>> From: Markus Rechberger
>>> >> Dec 27 23:23:50 solist kernel: [ 36.118245] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA
>>> ptr
>>> >
>>> > These messages might be harmless. The 3.0 kernel contains a fix for
>>> > Intel Panther Point xHCI hosts that suppresses those messages, commit
>>> > ad808333d8201d53075a11bc8dd83b81f3d68f0b "Intel xhci: Ignore spurious
>>> > successful event."
>>> >
>>> > A later commit extends that to all xHCI 1.0 hosts, commit
>>> > 07f3cb7c28bf3f4dd80bfb136cf45810c46ac474 "usb: host: xhci: Enable
>>> > XHCI_SPURIOUS_SUCCESS for all controllers with xhci 1.0" That was
>>> > queued for 3.11 and marked to be backported into stable kernels as old
>>> > as 3.0.
>>
>> I see the same error message on the 0.96 ASMedia controller when
>> the rx buffers for the ax88179_178a driver cross 64k boundaries.
>>
>> So this isn't confined to 1.0 controllers.
>>
>
> Sarah,
>
> since there is no response yet, is there anyone at Intel dedicated at
> working on USB 3.0?
> We are also getting more and more negative USB 3.0 feedback with Linux
>
> Best Regards,
> Markus

2014-02-09 23:15:57

by Robert Hancock

Subject: Re: [BUGREPORT] Linux USB 3.0

On 08/02/14 03:00 AM, Markus Rechberger wrote:
> On Tue, Feb 4, 2014 at 10:31 AM, David Laight <[email protected]> wrote:
>> From: Markus Rechberger
>>>>> Dec 27 23:23:50 solist kernel: [ 36.118245] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA
>>> ptr
>>>>
>>>> These messages might be harmless. The 3.0 kernel contains a fix for
>>>> Intel Panther Point xHCI hosts that suppresses those messages, commit
>>>> ad808333d8201d53075a11bc8dd83b81f3d68f0b "Intel xhci: Ignore spurious
>>>> successful event."
>>>>
>>>> A later commit extends that to all xHCI 1.0 hosts, commit
>>>> 07f3cb7c28bf3f4dd80bfb136cf45810c46ac474 "usb: host: xhci: Enable
>>>> XHCI_SPURIOUS_SUCCESS for all controllers with xhci 1.0" That was
>>>> queued for 3.11 and marked to be backported into stable kernels as old
>>>> as 3.0.
>>
>> I see the same error message on the 0.96 ASMedia controller when
>> the rx buffers for the ax88179_178a driver cross 64k boundaries.
>>
>> So this isn't confined to 1.0 controllers.
>>
>
> Sarah,
>
> since there is no response yet, is there anyone at Intel dedicated at
> working on USB 3.0?
> We are also getting more and more negative USB 3.0 feedback with Linux

Still, nobody appears to have provided the debugging information that
was requested. There is not much that can be done upstream to debug
things based only on vague reports, especially when current kernel
versions are not being used.

2014-02-11 18:29:51

by Markus Rechberger

Subject: Re: [BUGREPORT] Linux USB 3.0

On Mon, Feb 10, 2014 at 12:15 AM, Robert Hancock <[email protected]> wrote:
> On 08/02/14 03:00 AM, Markus Rechberger wrote:
>>
>> On Tue, Feb 4, 2014 at 10:31 AM, David Laight <[email protected]>
>> wrote:
>>>
>>> From: Markus Rechberger
>>>>>>
>>>>>> Dec 27 23:23:50 solist kernel: [ 36.118245] xhci_hcd 0000:00:14.0:
>>>>>> ERROR Transfer event TRB DMA
>>>>
>>>> ptr
>>>>>
>>>>>
>>>>> These messages might be harmless. The 3.0 kernel contains a fix for
>>>>> Intel Panther Point xHCI hosts that suppresses those messages, commit
>>>>> ad808333d8201d53075a11bc8dd83b81f3d68f0b "Intel xhci: Ignore spurious
>>>>> successful event."
>>>>>
>>>>> A later commit extends that to all xHCI 1.0 hosts, commit
>>>>> 07f3cb7c28bf3f4dd80bfb136cf45810c46ac474 "usb: host: xhci: Enable
>>>>> XHCI_SPURIOUS_SUCCESS for all controllers with xhci 1.0" That was
>>>>> queued for 3.11 and marked to be backported into stable kernels as old
>>>>> as 3.0.
>>>
>>>
>>> I see the same error message on the 0.96 ASMedia controller when
>>> the rx buffers for the ax88179_178a driver cross 64k boundaries.
>>>
>>> So this isn't confined to 1.0 controllers.
>>>
>>
>> Sarah,
>>
>> since there is no response yet, is there anyone at Intel dedicated at
>> working on USB 3.0?
>> We are also getting more and more negative USB 3.0 feedback with Linux
>
>
> Still nobody appears to have provided the requested debugging information
> that was requested. So there is not much that can be done upstream to debug
> things based only on vague reports, especially when not using current kernel
> versions.
>

Next kernel crash report, this time a Synology NAS System:
http://support.sundtek.com/index.php/topic,1511.0.html

2014-02-11 18:43:29

by Bjørn Mork

Subject: Re: [BUGREPORT] Linux USB 3.0

Markus Rechberger <[email protected]> writes:

> Next kernel crash report, this time a Synology NAS System:
> http://support.sundtek.com/index.php/topic,1511.0.html

There is no etxhci_hcd driver in the mainline kernel...


Feb 11 18:50:41 DiskStation kernel: [103740.405521] Backtrace:
Feb 11 18:50:41 DiskStation kernel: [103740.408095] [<7f2d8f2c>] (find_trb_seg+0x0/0x54 [etxhci_hcd]) from [<7f2d9ac0>] (etxhci_find_new_dequeue_state+0x5c/0x200 [etxhci_hcd])
Feb 11 18:50:41 DiskStation kernel: [103740.420389] r4:9675fd44
Feb 11 18:50:41 DiskStation kernel: [103740.423046] [<7f2d9a64>] (etxhci_find_new_dequeue_state+0x0/0x200 [etxhci_hcd]) from [<7f2d4520>] (etxhci_cleanup_stalled_ring+0x50/0x140 [etxhci_hcd])
Feb 11 18:50:41 DiskStation kernel: [103740.436749] [<7f2d44d0>] (etxhci_cleanup_stalled_ring+0x0/0x140 [etxhci_hcd]) from [<7f2d46e0>] (etxhci_endpoint_reset+0xd0/0x100 [etxhci_hcd])
Feb 11 18:50:41 DiskStation kernel: [103740.449738] r7:bc0e9830 r6:965b6360 r5:bc0e9800 r4:be34cc00
Feb 11 18:50:41 DiskStation kernel: [103740.455595] [<7f2d4610>] (etxhci_endpoint_reset+0x0/0x100 [etxhci_hcd]) from [<7f086c00>] (usb_hcd_reset_endpoint+0x2c/0x80 [usbcore])
Feb 11 18:50:41 DiskStation kernel: [103740.467837] [<7f086bd4>] (usb_hcd_reset_endpoint+0x0/0x80 [usbcore]) from [<7f088ff0>] (usb_enable_endpoint+0x70/0x74 [usbcore])
Feb 11 18:50:41 DiskStation kernel: [103740.479558] [<7f088f80>] (usb_enable_endpoint+0x0/0x74 [usbcore]) from [<7f08903c>] (usb_enable_interface+0x48/0x5c [usbcore])
Feb 11 18:50:41 DiskStation kernel: [103740.491066] r8:00000001 r7:be34cc00 r6:be8ff368 r5:00000001 r4:0000002c
Feb 11 18:50:41 DiskStation kernel: [103740.497750] r3:00000001
Feb 11 18:50:41 DiskStation kernel: [103740.500522] [<7f088ff4>] (usb_enable_interface+0x0/0x5c [usbcore]) from [<7f08942c>] (usb_set_interface+0x1c8/0x22c [usbcore])
Feb 11 18:50:41 DiskStation kernel: [103740.512032] r8:be04c860 r7:bdc4d600 r6:00000000 r5:be8ff368 r4:be34cc00
Feb 11 18:50:41 DiskStation kernel: [103740.518715] r3:be8ff368
Feb 11 18:50:41 DiskStation kernel: [103740.521495] [<7f089264>] (usb_set_interface+0x0/0x22c [usbcore]) from [<7f0902c0>] (usbdev_ioctl+0xf40/0x1cac [usbcore])
Feb 11 18:50:41 DiskStation kernel: [103740.532512] [<7f08f380>] (usbdev_ioctl+0x0/0x1cac [usbcore]) from [<800db114>] (do_vfs_ioctl+0xa8/0x8bc)
Feb 11 18:50:41 DiskStation kernel: [103740.542109] [<800db06c>] (do_vfs_ioctl+0x0/0x8bc) from [<800db968>] (sys_ioctl+0x40/0x64)
Feb 11 18:50:41 DiskStation kernel: [103740.550396] r9:9675e000 r8:8000e388 r7:00000017 r6:80085504 r5:2f4f54f4
Feb 11 18:50:41 DiskStation kernel: [103740.557079] r4:bc10b540
Feb 11 18:50:41 DiskStation kernel: [103740.559822] [<800db928>] (sys_ioctl+0x0/0x64) from [<8000e1e0>] (ret_fast_syscall+0x0/0x30)
Feb 11 18:50:41 DiskStation kernel: [103740.568282] r7:00000036 r6:00000000 r5:00000000 r4:2f4f64d8
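
One way to spot this kind of thing quickly is to list the module names that appear in square brackets in a backtrace and compare them with the modules one expects to see. The sketch below does that for two lines taken from the trace above; the "expected" set is purely a hypothetical input chosen for the example, not a list of mainline modules.

```python
import re
from collections import Counter

# Module names appear as [name]; timestamps start with a digit and
# addresses start with '<', so neither matches this pattern.
MODULE_RE = re.compile(r"\[([A-Za-z_]\w*)\]")


def modules_in_trace(trace_text):
    """Count how often each bracketed module name occurs in a backtrace."""
    return Counter(MODULE_RE.findall(trace_text))


def unexpected_modules(trace_text, expected):
    """Return module names seen in the trace that are not in 'expected'."""
    return sorted(set(modules_in_trace(trace_text)) - set(expected))


# Two lines copied from the backtrace above (log prefix shortened).
trace = """\
[103740.408095] [<7f2d8f2c>] (find_trb_seg+0x0/0x54 [etxhci_hcd]) from [<7f2d9ac0>] (etxhci_find_new_dequeue_state+0x5c/0x200 [etxhci_hcd])
[103740.455595] [<7f2d4610>] (etxhci_endpoint_reset+0x0/0x100 [etxhci_hcd]) from [<7f086c00>] (usb_hcd_reset_endpoint+0x2c/0x80 [usbcore])
"""

found = modules_in_trace(trace)
# The expected set is an assumption supplied by whoever reads the trace.
odd = unexpected_modules(trace, expected={"usbcore", "xhci_hcd"})

print(dict(found))
print(odd)
```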


Bjørn

2014-02-11 18:44:00

by Greg Kroah-Hartman

Subject: Re: [BUGREPORT] Linux USB 3.0

On Tue, Feb 11, 2014 at 07:29:47PM +0100, Markus Rechberger wrote:
> On Mon, Feb 10, 2014 at 12:15 AM, Robert Hancock <[email protected]> wrote:
> > On 08/02/14 03:00 AM, Markus Rechberger wrote:
> >>
> >> On Tue, Feb 4, 2014 at 10:31 AM, David Laight <[email protected]>
> >> wrote:
> >>>
> >>> From: Markus Rechberger
> >>>>>>
> >>>>>> Dec 27 23:23:50 solist kernel: [ 36.118245] xhci_hcd 0000:00:14.0:
> >>>>>> ERROR Transfer event TRB DMA
> >>>>
> >>>> ptr
> >>>>>
> >>>>>
> >>>>> These messages might be harmless. The 3.0 kernel contains a fix for
> >>>>> Intel Panther Point xHCI hosts that suppresses those messages, commit
> >>>>> ad808333d8201d53075a11bc8dd83b81f3d68f0b "Intel xhci: Ignore spurious
> >>>>> successful event."
> >>>>>
> >>>>> A later commit extends that to all xHCI 1.0 hosts, commit
> >>>>> 07f3cb7c28bf3f4dd80bfb136cf45810c46ac474 "usb: host: xhci: Enable
> >>>>> XHCI_SPURIOUS_SUCCESS for all controllers with xhci 1.0" That was
> >>>>> queued for 3.11 and marked to be backported into stable kernels as old
> >>>>> as 3.0.
> >>>
> >>>
> >>> I see the same error message on the 0.96 ASMedia controller when
> >>> the rx buffers for the ax88179_178a driver cross 64k boundaries.
> >>>
> >>> So this isn't confined to 1.0 controllers.
> >>>
> >>
> >> Sarah,
> >>
> >> since there is no response yet, is there anyone at Intel dedicated at
> >> working on USB 3.0?
> >> We are also getting more and more negative USB 3.0 feedback with Linux
> >
> >
> > Still nobody appears to have provided the requested debugging information
> > that was requested. So there is not much that can be done upstream to debug
> > things based only on vague reports, especially when not using current kernel
> > versions.
> >
>
> Next kernel crash report, this time a Synology NAS System:
> http://support.sundtek.com/index.php/topic,1511.0.html

That kernel has a closed source kernel module loaded, so no community
member can look at it, sorry. Please get support from the company that
wrote that module.

greg k-h

2014-02-11 19:32:30

by Markus Rechberger

Subject: Re: [BUGREPORT] Linux USB 3.0

On Tue, Feb 11, 2014 at 7:45 PM, Greg KH <[email protected]> wrote:
> On Tue, Feb 11, 2014 at 07:29:47PM +0100, Markus Rechberger wrote:
>> On Mon, Feb 10, 2014 at 12:15 AM, Robert Hancock <[email protected]> wrote:
>> > On 08/02/14 03:00 AM, Markus Rechberger wrote:
>> >>
>> >> On Tue, Feb 4, 2014 at 10:31 AM, David Laight <[email protected]>
>> >> wrote:
>> >>>
>> >>> From: Markus Rechberger
>> >>>>>>
>> >>>>>> Dec 27 23:23:50 solist kernel: [ 36.118245] xhci_hcd 0000:00:14.0:
>> >>>>>> ERROR Transfer event TRB DMA
>> >>>>
>> >>>> ptr
>> >>>>>
>> >>>>>
>> >>>>> These messages might be harmless. The 3.0 kernel contains a fix for
>> >>>>> Intel Panther Point xHCI hosts that suppresses those messages, commit
>> >>>>> ad808333d8201d53075a11bc8dd83b81f3d68f0b "Intel xhci: Ignore spurious
>> >>>>> successful event."
>> >>>>>
>> >>>>> A later commit extends that to all xHCI 1.0 hosts, commit
>> >>>>> 07f3cb7c28bf3f4dd80bfb136cf45810c46ac474 "usb: host: xhci: Enable
>> >>>>> XHCI_SPURIOUS_SUCCESS for all controllers with xhci 1.0" That was
>> >>>>> queued for 3.11 and marked to be backported into stable kernels as old
>> >>>>> as 3.0.
>> >>>
>> >>>
>> >>> I see the same error message on the 0.96 ASMedia controller when
>> >>> the rx buffers for the ax88179_178a driver cross 64k boundaries.
>> >>>
>> >>> So this isn't confined to 1.0 controllers.
>> >>>
>> >>
>> >> Sarah,
>> >>
>> >> since there is no response yet, is there anyone at Intel dedicated at
>> >> working on USB 3.0?
>> >> We are also getting more and more negative USB 3.0 feedback with Linux
>> >
>> >
>> > Still nobody appears to have provided the requested debugging information
>> > that was requested. So there is not much that can be done upstream to debug
>> > things based only on vague reports, especially when not using current kernel
>> > versions.
>> >
>>
>> Next kernel crash report, this time a Synology NAS System:
>> http://support.sundtek.com/index.php/topic,1511.0.html
>
> That kernel has a closed source kernel module loaded, no community
> member can look at it, sorry, please get support from the company that
> wrote that module.
>

I'm going to collect all the xHCI issues we get here as a reference.
Unfortunately we're busy with our own hardware, so we don't have the
time to dig into USB 3.0 kernel issues at the moment. All we can do is
collect the feedback and maybe help translate between German and
English. So if someone (e.g. Intel) wants to volunteer to fix some
issues, just drop me a line. As Sarah indicated, several issues have
already been mentioned in this thread.

Markus